Panic Mode On (28) Server problems
Author | Message |
---|---|
Joined: 14 Jun 99 Posts: 468 Credit: 53,129,336 RAC: 0 |
Same here in the land of OZ, 1 or 2 wu's upload but no luck updating, calm down and go roll in some catnip lol |
kittyman Joined: 9 Jul 00 Posts: 51525 Credit: 1,018,363,574 RAC: 1,004 |
Same here in the land of OZ, 1 or 2 wu's upload but no luck updating, calm down and go roll in some catnip lol Catnip, hell......I wanna CRUNCH something.... LOL. But uploading and reporting would do for now. What I wanna know is just where the problem is....and is it gonna be addressed soon? "Time is simply the mechanism that keeps everything from happening all at once." |
Labriskus Joined: 20 Feb 07 Posts: 11 Credit: 745,920 RAC: 0 |
Erhm not sure if anyone noticed the news page; "Projects are down due to a server closet air conditioning failure. We have to power down most of our computers until this is fixed." 17 Feb 2010 2:36:55 UTC http://setiathome.berkeley.edu/index.php Apologies if this has already been pointed out, but this could be what's going on :) |
kittyman Joined: 9 Jul 00 Posts: 51525 Credit: 1,018,363,574 RAC: 1,004 |
Erhm not sure if anyone noticed the news page; My friend, if the AC had not been fixed, we would not be talking right now.......the servers would still be down. There is a comms problem that existed before the AC failure, and still persists. "Time is simply the mechanism that keeps everything from happening all at once." |
Labriskus Joined: 20 Feb 07 Posts: 11 Credit: 745,920 RAC: 0 |
Erhm not sure if anyone noticed the news page; I see ........ then I stand corrected :) Still, it's a nice chance to give the PC a clean :) |
kittyman Joined: 9 Jul 00 Posts: 51525 Credit: 1,018,363,574 RAC: 1,004 |
Erhm not sure if anyone noticed the news page; I suspect many dust bunnies are meeting their maker about now. "Time is simply the mechanism that keeps everything from happening all at once." |
jangliss Joined: 14 Jan 04 Posts: 1 Credit: 7,544 RAC: 0 |
My friend, if the AC had not been fixed, we would not be talking right now.......the servers would still be down. You'd think they'd update the front page to reflect that the A/C is back up again. Weird that half the servers are still reporting "OK", and yet I've seen numerous complaints about not being able to report updates. All I keep getting is "retry in {some time}". I thought it might have been my connection, but I tested from 3 different locations and still get it. Sniffing the traffic, I can see the server throwing a 500 error when the data is reported. I'd be more than happy to post my traffic if it helps anybody resolve the issues. |
kittyman Joined: 9 Jul 00 Posts: 51525 Credit: 1,018,363,574 RAC: 1,004 |
My friend, if the AC had not been fixed, we would not be talking right now.......the servers would still be down. That's the whole durned point..... I have not had anybody confirm or deny whether this problem is within the Seti servers or is with their connection to the outside world. I am gonna go to sleep right now, should I not freeze because the Cuda cards aren't throwing off much heat at the moment, and hope when I wake up it was all just a bad kitty dream. WHERE ARE THE FISH??? "Time is simply the mechanism that keeps everything from happening all at once." |
Matthew S. McCleary Joined: 9 Sep 99 Posts: 121 Credit: 2,288,242 RAC: 0 |
It's situations such as this -- regardless of what the actual cause is -- that chase people away from crunching for SETI@home. Simply acknowledging that a problem exists and a solution is being looked for, whether the problem is Berkeley's or elsewhere, goes a long way towards calming everyone's nerves. We're not getting that, though, obviously. |
Joined: 16 Apr 00 Posts: 1296 Credit: 45,357,093 RAC: 0 |
With all due respect to Ned and Pappa, the Cricket Graphs don't lie. There has been a steady, overall reduction in throughput going back a week; well before the cooling went out in the closet. There are occasional upward spikes, to be sure, but the trend is obvious. Two weeks ago, everything was chugging along just fine and this thread was practically dormant. We understand about weekly outages and emergencies like the A/C. But something else is clearly not right. ???? Join the PACK! |
Roundel Joined: 1 Feb 06 Posts: 21 Credit: 6,850,211 RAC: 0 |
Not sure if others have gone through since that 1 went through last night. But I'm now dry on a few machines and almost dry overall across the fleet. Can't upload, and now getting errors that there are no jobs available on a dry machine. Oh well, all the hardware can take a much-needed rest. |
Roundel Joined: 1 Feb 06 Posts: 21 Credit: 6,850,211 RAC: 0 |
Well, that's interesting, especially if you look at the monthly range. I hadn't noticed any connectivity problems until this whole situation arose at the beginning of the week. I wonder if a router or switch has been dying a slow death and finally gave up the ghost. |
PhonAcq Joined: 14 Apr 01 Posts: 1656 Credit: 30,658,217 RAC: 1 |
Yes, over the month there is an obvious trend, but look at the yearly chart; the recent performance is in the noise! (But don't tell Matt or he may defer fixing the problem to work on other issues.) |
PhonAcq Joined: 14 Apr 01 Posts: 1656 Credit: 30,658,217 RAC: 1 |
Monitoring my upload process, I see very few making it through at present. What is frustrating is that I see a lot that get as far as 100% uploaded, only to be rejected and queued up to try again. The last bit of handshaking fails and causes the system to repeat work (the upload) that appears to have been completed. This is not a new observation. Because it obviously takes bandwidth and server resources to execute this type of failure, and because the behavior has been around 'forever', has any effort been made to remedy it? |
Dorphas Joined: 16 May 99 Posts: 118 Credit: 8,007,247 RAC: 0 |
don't know what this may mean in the bigger picture, but i just had one machine upload about 50 workunits....but i can't get them to report at all. |
Highlander Joined: 5 Oct 99 Posts: 167 Credit: 37,987,668 RAC: 16 |
My rumor: I think the last big power outage in the Bay Area damaged the ISP's hardware, and the ISP has set up a 10 Mbit link for emergency use. But this is really only my guess about the situation. Whatever it really is, I hope it can all be fixed in the near future (many ULs waiting on my side ^^). - Performance is not a simple linear function of the number of CPUs you throw at the problem. - |
Joined: 23 May 99 Posts: 4292 Credit: 72,971,319 RAC: 0 |
The only way I can get any to upload is to keep pressing buttons...This project backoff is for the birds, I would rather see them fix the problem than cripple the client. Some get thru but then the project wants to back off for 2 hours like that is going to do anything but delay the problem. Official Abuser of Boinc Buttons... And no good credit hound! |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14690 Credit: 200,643,578 RAC: 874 |
The only way I can get any to upload is to keep pressing buttons...This project backoff is for the birds, I would rather see them fix the problem than cripple the client. Some get thru but then the project wants to back off for 2 hours like that is going to do anything but delay the problem. I did a bit of button-pushing this morning, and got one machine down to one upload pending (it only had about a dozen in total, so I wasn't adding much to the load!). Nothing on the reporting front, until it tried again of its own accord while I was on the phone at 15:24. SETI@home 18/02/2010 15:24:41 Requesting 718981 seconds of new work, and reporting 10 completed tasks SETI@home 18/02/2010 15:24:56 Scheduler RPC succeeded [server version 611] SETI@home 18/02/2010 15:24:56 Message from server: (Project has no jobs available) Says it all, really. |
Dave Joined: 29 Mar 02 Posts: 778 Credit: 25,001,396 RAC: 0 |
I know it makes us feel good, and I'm the same, but remember that all this manual button-pushing actually makes things worse because it puts more load on the server. The backoffs, though annoying, are there to spread the load across the thousands of clients out there. |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14690 Credit: 200,643,578 RAC: 874 |
Yes, over the month there is an obvious trend, but look at the yearly chart; the recent performance is in the noise! (But don't tell Matt or he may defer fixing the problem to work on other issues.) Up to and including Week 3 on the monthly chart, they were only splitting tapes for MultiBeam work - they were wrestling with major Astropulse database problems. Astropulse splitting restarted during Week 4, and accounts for the higher average throughput since then (there hasn't been a regular supply of AP work since last May, and AP-crunchers' caches are drier than Death Valley). Every AP unit split gets gobbled up instantly. It's gone quiet again on the AP front now, because all loaded tapes have been split. Other peaks and troughs relate to the variety in Angle Range for the MB work recently: if a recording was made during a high AR sky survey, the resulting WUs are processed (and hence downloaded) at four(-ish) times the rate of other ARs. And the flatline since Monday is another story entirely..... |
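
On the backoff point made earlier in the thread (the delays exist to spread retries across thousands of clients instead of letting them all hit the server at once), here is a minimal sketch of capped exponential backoff with random jitter. It is not BOINC's actual algorithm: the function name `backoff_delay`, the 60-second base and the exponent limit are assumptions made for illustration, and only the 2-hour cap echoes the delay reported above.

```python
import random


def backoff_delay(attempt, base=60.0, cap=7200.0, rng=random):
    """Return a randomized wait in seconds before retry number `attempt` (0-based).

    Hypothetical helper for illustration only; not taken from the BOINC client.
    """
    # Limit the exponent so very large attempt counts cannot overflow a float.
    exponent = min(attempt, 30)
    # The ceiling doubles with every failed attempt, up to `cap` (2 hours here).
    ceiling = min(cap, base * (2 ** exponent))
    # "Full jitter": draw uniformly from [0, ceiling] so that clients which
    # failed at the same moment do not all retry at the same moment.
    return rng.uniform(0.0, ceiling)


if __name__ == "__main__":
    rng = random.Random()
    for attempt in range(8):
        print(f"attempt {attempt}: wait about {backoff_delay(attempt, rng=rng):.0f} s")
```

Pressing the retry button by hand effectively resets `attempt` to zero, which is why it feels faster for one client and adds load for everyone else when many people do it.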
©2025 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.