Panic Mode On (77) Server Problems?
Message boards : Number crunching : Panic Mode On (77) Server Problems?
| Author | Message |
|---|---|
Thanks again. I had already come across that link, but the information I did see on there is above my league to be honest ;). But I'll nose around further and see where it may lead me to. Regarding the servers of the SETI project: are we to assume then the system (software of the SETI project servers) "lacks" some kind of security mechanism preventing these problems? I mean, the more VHARs they send out, the more traffic they generate, because "us crunchers" chew right through them at very high speeds, hence asking lots of new units in return, thus choking up traffic eventually for everybody. It seems the system is unable to maintain some kind of balance between the 3 main types of units being sent out to the crunchers: that of course would mean it would have to be able to select data from different tapes, better yet, multiple tapes holding a different type of recording (as you've specified earlier) to maintain a balanced mixture of units being sent out. If I understand it correctly, right now, the data from the different tapes that have been split, being sent out now, all are 'basketweave' mode recordings, leading inevitably to the current problems (shorties storm). Please feel free to comment whether I see things right or wrong ;) Kind regards ____________ To boldly crunch ... | |
| ID: 1290388 · | |
|
| |
| ID: 1290446 · | |
|
Where can I find info on how the data is collected at Arecibo and how it's "given" to SETI? Someone spoke about tapes..... | |
| ID: 1290453 · | |
|
well, | |
| ID: 1290454 · | |
Well, if you weren't English I would ask you to translate.......to English ____________ By your command !!! | |
| ID: 1290455 · | |
|
| |
| ID: 1290460 · | |
Yes, things haven't come back right for the last 3 outages now so it seems that something is being overlooked somewhere. Cheers. ____________ | |
| ID: 1290464 · | |
|
10/2/2012 5:29:03 PM | | Internet access OK - project servers may be temporarily down. | |
| ID: 1290467 · | |
Yes, things haven't come back right for the last 3 outages now so it seems that something is being overlooked somewhere. Or all the shorties presently in the system have found yet another limitation. We had, what- a couple of months? where there were barely any shorties at all, and still quite a few VLARs being sent out. Now we've got shorties, lots & lots of shorties. So for one of my cards, instead of doing 3 WUs every 15-20min it's now doing 3 every 4-5min. Over 3 times the throughput. Could be the systems are hitting their limit of RAM, or they need more HDDs for more I/O. Whatever it is, the problems- inability to upload & frequent "Project has no tasks available", "No tasks sent" or "Scheduler reached timeout" messages- all seem to have started around the time all the shorties started coming through en masse. EDIT- one system finally managed to upload enough WUs to request more- and now the Scheduler is just timing out on the request. By the time it can request more work again, there'll be too many uploads backed up for it to be able to do so. ____________ Grant Darwin NT. | |
| ID: 1290468 · | |
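As a rough check of the throughput figures quoted in the post above (3 WUs every 15-20 min before, 3 every 4-5 min now), here is a minimal Python sketch. The function name `wus_per_hour` is made up for illustration; only the batch size and the minute ranges come from the post.

```python
def wus_per_hour(batch_size, minutes_per_batch):
    """Work units finished per hour when batch_size WUs complete every minutes_per_batch minutes."""
    return batch_size * 60.0 / minutes_per_batch

# Figures quoted in the post: one card running 3 WUs at a time.
before = [wus_per_hour(3, m) for m in (15, 20)]  # before the shorties arrived
after = [wus_per_hour(3, m) for m in (4, 5)]     # during the shorties storm

# Most conservative and most generous comparison of the two ranges.
lowest_ratio = min(after) / max(before)
highest_ratio = max(after) / min(before)

print(f"before: {min(before):.0f}-{max(before):.0f} WUs/hour")
print(f"after:  {min(after):.0f}-{max(after):.0f} WUs/hour")
print(f"speed-up: {lowest_ratio:.1f}x to {highest_ratio:.1f}x")
```

On those numbers the card goes from roughly 9-12 WUs/hour to 36-45 WUs/hour, so "over 3 times the throughput" holds even in the most conservative pairing, and each extra result is also one more upload and scheduler request hitting the servers.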
What is the explanation behind a "shorties storm"? They don't seem to originate from the same tapes? Yet almost anything being sent out are VHAR units. Is this a server problem? I'm curious to read some info on this ;) Work flows slowly at best right now, better than nothing at all of course. Hi Richard, Thank you for this information. I have been crunching SETI work for 12 1/5 years and I did not know this fact. I thought it had to do with the angle that the antenna was pointing at, relative to the sky, and that this has some issues with not picking up the same amount of information and was degraded for some reason or other. Many thanks for this info, Arvid ____________ Arvid Almstrom | |
| ID: 1290470 · | |
|
I have just had a record of WU's being processed on my computer, 6 out of 8 slots were processing. Unfortunately, it only lasted about 1 1/2 minutes and I am now back to 0 of 8. | |
| ID: 1290475 · | |
|
Interestingly Matt has posted in the Technical News thread http://setiathome.berkeley.edu/forum_thread.php?id=69594 and seems to be totally unaware that there have been any problems at all!! The download servers have been trading off for a bit - we are now currently settled on using vader and georgem as the download server pair. As well, I just moved from apache to nginx on those servers. I think it's working well, but if any of you notice weird behavior let me know! ____________ | |
| ID: 1290491 · | |
|
Things are certainly borked upload wise- prior to the outage & since whatever was done on Sunday there had been a steady stream of work being returned- 100,000 results per hour. | |
| ID: 1290511 · | |
|
Sounds like we're in for one heck of a ride. Shorties galore over here, but uploads and downloads are hit and miss still. | |
| ID: 1290520 · | |
I have just had a record of WU's being processed on my computer, 6 out of 8 slots were processing. Unfortunately, it only lasted about 1 1/2 minutes and I am now back to 0 of 8. Would it be practical to have the AP Splitter program run as a different type of workunit? | |
| ID: 1290530 · | |
|
| |
| ID: 1290532 · | |
Not sure I understand the question- AP WUs are a different type of WU, that's why they are processed by a different programme to the MultiBeam WUs. ____________ Grant Darwin NT. | |
| ID: 1290533 · | |
well, I'd expect clive to mean that Einstein@Home workunits will be running tonight, and the heat from that will be warming the house. | |
| ID: 1290537 · | |
Things are certainly borked upload wise- prior to the outage & since whatever was done on Sunday there had been a steady stream of work being returned- 100,000 results per hour. It's now down to less than 45,000/hr, and I notice that the splitters are unable to get much above 25/s & so the ready to send buffer has actually dropped down to 1,700, and continues to fall. So even if people eventually do upload all of their completed work, and a Scheduler request finally does go through- there won't be any work left to allocate. Something is very, very wrong. ____________ Grant Darwin NT. | |
| ID: 1290550 · | |
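To put the figures from the post above side by side, here is a small Python sketch. It assumes the "25/s" is the rate at which the splitters create new results and that the ready-to-send buffer drains at the difference between the send rate and that creation rate. The send rates in the loop are hypothetical values chosen only for illustration; the thread does not give the real demand.

```python
def seconds_until_empty(buffer_size, create_per_s, send_per_s):
    """Seconds until the ready-to-send buffer drains, or None if it is not shrinking."""
    net_drain = send_per_s - create_per_s
    if net_drain <= 0:
        return None
    return buffer_size / net_drain

BUFFER = 1700      # ready-to-send results, from the post
SPLIT_RATE = 25.0  # results created per second, from the post (assumed meaning)

print(f"splitter output: {SPLIT_RATE * 3600:,.0f} results/hour")

# Hypothetical send rates, NOT taken from the thread.
for send_rate in (26.0, 30.0, 40.0):
    secs = seconds_until_empty(BUFFER, SPLIT_RATE, send_rate)
    print(f"send {send_rate:.0f}/s -> buffer empty in {secs / 60:.1f} min")
```

Under these assumptions, 25/s is about 90,000 results per hour, and a buffer of 1,700 is only about a minute of splitter output, so even a small excess of demand over creation would empty it within minutes.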
Interestingly Matt has posted in the Technical News thread http://setiathome.berkeley.edu/forum_thread.php?id=69594 and seems to be totally unaware that there have been any problems at all!! Yes, and this is very curious. Very, very curious. Unless Matt has isolated himself completely, he must be aware that the project is not functioning correctly and hasn't for some time now. I thought about offering to send him a "cruncher" to use at home (something basic with a GPU) so he could have a user experience, first hand. But then I supposed if he has no trusted confidant here who can get his attention, a cruncher on the shelf at home probably would have the same success wrestling him away from whatever else he is doing. Obviously something is wrong, yet he's asked for reports of anomalies. In the BOOK, Jurassic Park, Crichton wrote-in a reason nobody on the island was aware of the number of loose and rampaging dinosaurs. Being unaware that they could breed, the monitoring system "listened-for" 6 of this, 8 of that, 3 of the other. Once that number was accounted-for, the system quit counting. (It stopped looking, so never picked-up on the fact that there were 18 of this, 12 of that, and 46 of the other.) I wonder if Matt (et al) aren't falling victim to the same sort of thing: The 100Mb pipe is full and there are uploads, downloads, and results created and received; therefore everything must be working as well as it can (or some inadequately-deductive equivalent of that). It is exactly as you say: "...and seems to be totally unaware..." Our usual grousing about the servers' speed and inadequate bandwidth may have made him deaf to our complaints. I think he has dozens, if not hundreds, of people who would be thrilled by an opportunity to make him acutely aware of what is happening, if we only had the means. | |
| ID: 1290567 · | |
| Copyright © 2013 University of California |