Panic Mode On (56) Server problems?

Starman Joined: 15 May 99 Posts: 204 Credit: 81,351,915 RAC: 25	Message 1156646 - Posted: 27 Sep 2011, 15:11:27 UTC Well, my "work" machine is dead in the water. It hasn't been able to upload since yesterday afternoon, even using the 72.52.96.30 proxy!

W-K 666 Volunteer tester Joined: 18 May 99 Posts: 19059 Credit: 40,757,560 RAC: 67	Message 1156651 - Posted: 27 Sep 2011, 15:39:26 UTC - in response to Message 1156646. Has anybody tried using programs like Tor to enable using proxies? I know it is designed to provide security, hide users, etc. I haven't any experience myself, but it may be worth a try, though it might need a knowledgeable person to set up. I just don't know.

kittyman Volunteer tester Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004	Message 1156652 - Posted: 27 Sep 2011, 15:39:48 UTC No ups or downs right now... One outage....coming up. "Freedom is just Chaos, with better lighting." Alan Dean Foster

Dave Joined: 29 Mar 02 Posts: 778 Credit: 25,001,396 RAC: 0	Message 1156656 - Posted: 27 Sep 2011, 21:45:16 UTC Aaand we're through

Cosmic_Ocean Joined: 23 Dec 00 Posts: 3027 Credit: 13,516,867 RAC: 13	Message 1156657 - Posted: 27 Sep 2011, 21:45:57 UTC I got myself 13 or 14 APs just before things went down earlier today. Don't know what the DL speed was since I was sleeping, but I woke up to find that I had a nice list of "waiting to run." Linux laptop: record uptime: 1511d 20h 19m (ended due to the power brick giving-up)

Lint trap Joined: 30 May 03 Posts: 871 Credit: 28,092,319 RAC: 0	Message 1156658 - Posted: 27 Sep 2011, 21:46:27 UTC - in response to Message 1156656. Seems like it... Lt

Richard Haselgrove Volunteer tester Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874	Message 1156662 - Posted: 27 Sep 2011, 21:50:58 UTC Remember how the server problems started when task estimates jumped up to several hours? Preliminary observations are that estimates for work issued since the outage have started to move back towards normality. I haven't got enough yet to measure how big the change is, but the last plan we heard about was to make a five-fold step change this first time. So, for the time being, you may get five times as much work as you expect - you have been warned ;-) Of course, the quota limits of 50/CPU and 400/GPU should still be in place to stop things going off scale - but I haven't been able to check that yet.
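
To put rough numbers on the warning in the post above, here is a minimal sketch. It assumes the client simply asks for enough tasks to fill a cache measured in seconds, dividing by the per-task estimate, and that the limit is a plain per-device cap. The function name, the one-day cache and the 1,200-second task length are hypothetical; only the five-fold factor and the 50/CPU and 400/GPU limits come from the post. How the limits are really applied is not stated in the thread.

```python
def tasks_fetched(cache_seconds, true_runtime, underestimate_factor,
                  per_device_limit, devices=1):
    """Hypothetical model: tasks needed to fill the cache, capped per device."""
    # Estimate the client sees when estimates are too small by the given factor.
    estimated_runtime = true_runtime / underestimate_factor
    wanted = int(cache_seconds // estimated_runtime)
    return min(wanted, per_device_limit * devices)


cache = 24 * 3600  # assumed one-day cache, in seconds

normal = tasks_fetched(cache, 1200.0, 1.0, per_device_limit=400)
skewed = tasks_fetched(cache, 1200.0, 5.0, per_device_limit=400)
capped = tasks_fetched(cache, 1200.0, 5.0, per_device_limit=50)
```

Under these assumptions the skewed estimate asks for five times as many tasks as the normal one, and the smaller 50-task limit is what stops the request from growing further.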

Claggy Volunteer tester Joined: 5 Jul 99 Posts: 4654 Credit: 47,537,079 RAC: 4	Message 1156664 - Posted: 27 Sep 2011, 22:04:17 UTC - in response to Message 1156662. Last modified: 27 Sep 2011, 22:06:12 UTC I can confirm that too. I snipped my two Astropulse tasks out of my client_state.xml and got them resent; their estimated runtimes matched what I expected. I then got all my recent MB WUs resent; they now match WUs I got just before the bodged changeset. Now going to do my remaining shorties to get my DCF back up to one. Claggy

Richard Haselgrove Volunteer tester Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874	Message 1156671 - Posted: 27 Sep 2011, 22:22:12 UTC - in response to Message 1156664. Last modified: 27 Sep 2011, 22:29:07 UTC ALL THE WAY in one jump?? !! What did the DCF get down to on that box? If it was below 0.1, won't we be in -177 territory? One of my 9800GTs is showing DCF=0.0391 - I'd better go and nudge it into a fetch. Back soon..... Edit - got some already. Showing 1 minute 15 seconds for a shorty, 4 m 15 s for mid-AR - I'd expect 4 minutes plus / around 20 minutes respectively, so we're not getting the whole DCF correction back in one go. Phew - may be a bit rocky, but not enough for errors. Panic over.

Lint trap Joined: 30 May 03 Posts: 871 Credit: 28,092,319 RAC: 0	Message 1156674 - Posted: 27 Sep 2011, 22:27:58 UTC My DCF has skyrocketed up to .19xx from where it was this morning, between .01xxx and .02xxx. GPU APs still have longer estimates than CPU APs, though... No new work received since the outage ended. Lt

Claggy Volunteer tester Joined: 5 Jul 99 Posts: 4654 Credit: 47,537,079 RAC: 4	Message 1156679 - Posted: 27 Sep 2011, 22:41:12 UTC - in response to Message 1156671. Last modified: 27 Sep 2011, 22:45:36 UTC I've always had flops in my app_info's (except for the HD5770), so DCF didn't go down as low as others' did:
27/09/2011 23:00:02 SETI@home [dcf] DCF: 0.175808->1.307273, raw_ratio 1.307273, adj_ratio 7.435800
And I've been doing pre- and post-change WUs (suspending old WUs): first doing old WUs with DCF ~1, then doing the post-dodgy-changeset WUs with DCF ~0.18, then swapping back to doing old WUs again. Now I'm rushing through my resent shorties in ~2 minutes, DCF climbing fast, APR rates look normal.
Edit: Richard, remember I'm running Jason's special Boinc.exe, and I haven't had any AP since before the bodged changeset, so DCF there was ~1. Claggy
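
The figures in the [dcf] log line quoted in this post can be cross-checked with a short sketch. The relation tested here (adj_ratio equals raw_ratio divided by the old DCF, and the new DCF equals raw_ratio) is inferred from this one logged line only; it is an assumption, not something the thread or the BOINC source is quoted as stating. The function name and regular expression are illustrative.

```python
import re

LOG_LINE = ("27/09/2011 23:00:02 SETI@home [dcf] DCF: 0.175808->1.307273, "
            "raw_ratio 1.307273, adj_ratio 7.435800")


def parse_dcf_line(line):
    """Return (old_dcf, new_dcf, raw_ratio, adj_ratio) from a [dcf] log line."""
    pattern = r"DCF: ([\d.]+)->([\d.]+), raw_ratio ([\d.]+), adj_ratio ([\d.]+)"
    match = re.search(pattern, line)
    if match is None:
        raise ValueError("not a [dcf] line")
    return tuple(float(group) for group in match.groups())


old_dcf, new_dcf, raw_ratio, adj_ratio = parse_dcf_line(LOG_LINE)

# Assumed relation: the adjusted ratio is the raw ratio scaled by the old DCF.
implied_adj_ratio = raw_ratio / old_dcf
print(f"implied adj_ratio = {implied_adj_ratio:.4f}, logged = {adj_ratio:.4f}")
```

For this line the implied and logged values agree to the precision printed, which fits the jump from about 0.18 straight to about 1.3 described in the post.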

Richard Haselgrove Volunteer tester Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874	Message 1156682 - Posted: 27 Sep 2011, 22:52:34 UTC - in response to Message 1156679. I'm deliberately running without flops so I can watch what the server's doing cleanly.
That DCF of 0.0391 was set by a mid-AR which completed in 1,279.42 seconds. The new estimate of 4:15 (255 seconds) is almost exactly five times smaller. I've completed the first task from the new estimate set - a shorty, so we wouldn't expect DCF to reset all the way, but it's come up to 0.1310 - that's fair enough. I'd expect fast Fermis to scrape above 0.02, so once they get and complete their first single WU, I would expect them to start fetching normally again - but still to have DCF well below 0.1, so it won't be safe to remove the cap completely next week. We'll need at least one more interim step.
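
The "almost exactly five times smaller" figure in this post can be checked directly from the two numbers given. The second calculation is a guess at what the old, uncorrected estimate must have been; it assumes DCF was set to actual runtime divided by the uncorrected estimate, which the post does not state.

```python
actual_runtime = 1279.42      # seconds, the mid-AR task from the post
new_estimate = 4 * 60 + 15    # the 4:15 estimate from the post, in seconds
old_dcf = 0.0391              # DCF set by that same task

# How many times smaller the new estimate is than the real runtime.
ratio = actual_runtime / new_estimate

# Assumption: DCF = actual runtime / uncorrected estimate, so the old
# uncorrected estimate would have been actual runtime / DCF.
implied_old_estimate_hours = actual_runtime / old_dcf / 3600

print(f"runtime / new estimate = {ratio:.2f}")
print(f"implied old uncorrected estimate = {implied_old_estimate_hours:.1f} h")
```

If that assumption holds, the old estimate works out to roughly nine hours, which would be in line with the "several hours" estimates mentioned earlier in the thread.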

Cosmic_Ocean Joined: 23 Dec 00 Posts: 3027 Credit: 13,516,867 RAC: 13	Message 1156717 - Posted: 28 Sep 2011, 0:47:07 UTC Definitely looks like ETAs are way down. Looks like ~33% of what it should be normally.

Gary Charpentier Volunteer tester Joined: 25 Dec 00 Posts: 30649 Credit: 53,134,872 RAC: 32	Message 1156719 - Posted: 28 Sep 2011, 0:55:57 UTC Here we go again ...
Tue Sep 27 17:54:00 2011 SETI@home Temporarily failed download of 23jn11ae.3946.24198.15.10.62: HTTP error
Tue Sep 27 17:54:00 2011 SETI@home Backing off 1 min 0 sec on download of 23jn11ae.3946.24198.15.10.62
Tue Sep 27 17:54:02 2011 SETI@home Temporarily failed download of 23jn11ae.3946.24198.15.10.99: HTTP error
Tue Sep 27 17:54:02 2011 SETI@home Backing off 1 min 0 sec on download of 23jn11ae.3946.24198.15.10.99

zoom3+1=4 Volunteer tester Joined: 30 Nov 03 Posts: 65745 Credit: 55,293,173 RAC: 49	Message 1156721 - Posted: 28 Sep 2011, 1:00:02 UTC - in response to Message 1156719. I had that happen to me a few hours back; about a minute ago, though, I was able to snag 14 WUs for the 2 GTX295 cards in this PC. I have a cache of 1.25 days. The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's

Terror Australis Volunteer tester Joined: 14 Feb 04 Posts: 1817 Credit: 262,693,308 RAC: 44	Message 1156741 - Posted: 28 Sep 2011, 2:54:47 UTC And the Network is UP!!! I'm now able to get through without a proxy. Not getting a lot of work due to the logjam, but when I do, the downloads "hoot", with peak speeds of up to 30 KB/s. Hopefully the fix is in. DCFs are all over the place, from 0.09 to 1.4. (Don't know why, I disabled DCF correction in my Rescheduling program.) Estimated GPU crunching times on new units look pretty good but CPU times are about X3. T.A.

Wandering Willie Volunteer tester Joined: 19 Aug 99 Posts: 136 Credit: 2,127,073 RAC: 0	Message 1156775 - Posted: 28 Sep 2011, 8:15:27 UTC Quick question: I just received 30 short WUs after the outage, all resends of time-outs for 27/09/2011. Should I leave these for an hour or two to let the replica database catch up, or will they be okay to crunch? (13,358 seconds) Deadline for these is 11/10/2011. Michael

W-K 666 Volunteer tester Joined: 18 May 99 Posts: 19059 Credit: 40,757,560 RAC: 67	Message 1156777 - Posted: 28 Sep 2011, 8:27:28 UTC - in response to Message 1156775. Last modified: 28 Sep 2011, 8:28:00 UTC I'd crunch them, and if you're worried about replica catch-up, then hit the activity menu and suspend network activity. That will suspend everything on the network, so no requests or d/loads either.

Wandering Willie Volunteer tester Joined: 19 Aug 99 Posts: 136 Credit: 2,127,073 RAC: 0	Message 1156778 - Posted: 28 Sep 2011, 8:36:26 UTC - in response to Message 1156777. Thank you. It was just in case they had already been completed. Michael

kittyman Volunteer tester Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004	Message 1156810 - Posted: 28 Sep 2011, 12:40:36 UTC Last modified: 28 Sep 2011, 12:47:29 UTC Current server status shows replica DB caught up... Looks like AP work is going out again, so bandwidth and downloads are stuffed. No MB left to split until the AP stuff goes out....my GPUs are gonna starve. Hang in there, kitties.

©2024 University of California

SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.