Panic Mode On (35) Server problems

hiamps Volunteer tester Send message Joined: 23 May 99 Posts: 4292 Credit: 72,971,319 RAC: 0	Message 1012348 - Posted: 5 Jul 2010, 19:43:47 UTC Something just doesn't add up right... Results ready to send 183,652 1,640 9m Current result creation rate 27.8027/sec 0.2383/sec 5m 7/5/2010 12:40:16 PM SETI@home Requesting new tasks for GPU 7/5/2010 12:40:20 PM SETI@home Scheduler request completed: got 0 new tasks 7/5/2010 12:40:20 PM SETI@home Message from server: Project has no tasks available 7/5/2010 12:40:36 PM SETI@home Sending scheduler request: To fetch work. 7/5/2010 12:40:36 PM SETI@home Requesting new tasks for GPU 7/5/2010 12:40:43 PM SETI@home Scheduler request completed: got 0 new tasks 7/5/2010 12:40:43 PM SETI@home Message from server: Project has no tasks available 7/5/2010 12:40:58 PM SETI@home Sending scheduler request: To fetch work. 7/5/2010 12:40:58 PM SETI@home Requesting new tasks for GPU 7/5/2010 12:41:07 PM SETI@home Scheduler request completed: got 0 new tasks 7/5/2010 12:41:07 PM SETI@home Message from server: Project has no tasks available Official Abuser of Boinc Buttons... And no good credit hound! ID: 1012348 ·

[B^S] madmac Volunteer tester Send message Joined: 9 Feb 04 Posts: 1175 Credit: 4,754,897 RAC: 0	Message 1012349 - Posted: 5 Jul 2010, 19:45:52 UTC I have downloaded some WUs. I just checked and I have got "client detached" when I have not detached. Is there a problem? ID: 1012349 ·

kittyman Volunteer tester Send message Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004	Message 1012352 - Posted: 5 Jul 2010, 19:51:43 UTC - in response to Message 1012348. Last modified: 5 Jul 2010, 19:52:21 UTC Something just doesn't add up right... Results ready to send 183,652 1,640 9m Current result creation rate 27.8027/sec 0.2383/sec 5m 7/5/2010 12:40:16 PM SETI@home Requesting new tasks for GPU 7/5/2010 12:40:20 PM SETI@home Scheduler request completed: got 0 new tasks 7/5/2010 12:40:20 PM SETI@home Message from server: Project has no tasks available 7/5/2010 12:40:36 PM SETI@home Sending scheduler request: To fetch work. 7/5/2010 12:40:36 PM SETI@home Requesting new tasks for GPU 7/5/2010 12:40:43 PM SETI@home Scheduler request completed: got 0 new tasks 7/5/2010 12:40:43 PM SETI@home Message from server: Project has no tasks available 7/5/2010 12:40:58 PM SETI@home Sending scheduler request: To fetch work. 7/5/2010 12:40:58 PM SETI@home Requesting new tasks for GPU 7/5/2010 12:41:07 PM SETI@home Scheduler request completed: got 0 new tasks 7/5/2010 12:41:07 PM SETI@home Message from server: Project has no tasks available Hiamps buddy.....this has been discussed sooooooooo many times here before. Surprised you have never caught it. There are lots of WUs ready to send, but they only get loaded into the feeder one batch at a time. And are being sent out as fast as the bandwidth will allow. Then the feeder has to be refilled. When you connect and the feeder is between cycles, you get the no tasks available message. You just have to happen to connect at the moment that the feeder has just been refilled to get some for your rig. "Freedom is just Chaos, with better lighting." Alan Dean Foster ID: 1012352 ·
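The feeder behaviour described in the post above can be sketched as a toy simulation. This is only an illustration, not the real BOINC feeder: the slot count, refill interval and request rate are made-up numbers, and `simulate_feeder` is a hypothetical name. It just shows why a request that lands between refills gets "no tasks available" even when plenty of results are ready to send.

```python
import random

def simulate_feeder(slots=100, refill_every=5, requests_per_tick=40,
                    ticks=20, seed=0):
    """Toy model of a fixed-size feeder cache (all numbers are assumptions).

    The cache is topped up every `refill_every` ticks. Each scheduler
    request takes 1 to 5 tasks if any are left, otherwise it is refused.
    Returns (served_requests, refused_requests).
    """
    rng = random.Random(seed)
    available = 0
    served = refused = 0
    for tick in range(ticks):
        if tick % refill_every == 0:
            available = slots          # feeder refills the cache
        for _ in range(requests_per_tick):
            want = rng.randint(1, 5)
            if available == 0:
                refused += 1           # "Project has no tasks available"
            else:
                available -= min(want, available)
                served += 1
    return served, refused

if __name__ == "__main__":
    served, refused = simulate_feeder()
    print(f"served={served} refused={refused}")
```

With these made-up defaults, demand outruns the cache in every refill cycle, so most requests are refused, and whether a given host gets work depends mainly on when it happens to connect.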

JohnDK Volunteer tester Send message Joined: 28 May 00 Posts: 1222 Credit: 451,243,443 RAC: 1,127	Message 1012353 - Posted: 5 Jul 2010, 19:51:59 UTC - in response to Message 1012348. Something just doesn't add up right... Results ready to send 183,652 1,640 9m Current result creation rate 27.8027/sec 0.2383/sec 5m Something about those Results ready to send: they are sent to the download servers at a rate that can't keep up with demand, and the download servers are therefore empty at times. I think the status for that is missing, I think because it changes so quickly all the time that it wouldn't make sense to show it. (This is my attempt at explaining something I know just about nothing about :) ) ID: 1012353 ·

Cosmic_Ocean Send message Joined: 23 Dec 00 Posts: 3027 Credit: 13,516,867 RAC: 13	Message 1012386 - Posted: 5 Jul 2010, 21:05:49 UTC Okay, I tried reading through the past 50 posts and just skipped to the end. I suppose since I'm running AP-only, I could possibly provide some insight to those spikes. I was able to very rarely get work (0 and 1 tasks) during the low periods, but during the spikes is when I was able to get three new WUs at a time, and could just about get one every time I asked for one. The Catch-22 is since the bandwidth is maxed, there are HTTP errors and slow download rates, so BOINC won't ask for new work until the downloads are actually in progress instead of "connect failed." So it's possible that those spikes were in fact AP distribution since it was a lot easier to get them during the spikes. Linux laptop: record uptime: 1511d 20h 19m (ended due to the power brick giving-up) ID: 1012386 ·

hiamps Volunteer tester Send message Joined: 23 May 99 Posts: 4292 Credit: 72,971,319 RAC: 0	Message 1012392 - Posted: 5 Jul 2010, 21:21:14 UTC - in response to Message 1012352. Last modified: 5 Jul 2010, 21:21:38 UTC Something just doesn't add up right... Results ready to send 183,652 1,640 9m Current result creation rate 27.8027/sec 0.2383/sec 5m 7/5/2010 12:40:16 PM SETI@home Requesting new tasks for GPU 7/5/2010 12:40:20 PM SETI@home Scheduler request completed: got 0 new tasks 7/5/2010 12:40:20 PM SETI@home Message from server: Project has no tasks available 7/5/2010 12:40:36 PM SETI@home Sending scheduler request: To fetch work. 7/5/2010 12:40:36 PM SETI@home Requesting new tasks for GPU 7/5/2010 12:40:43 PM SETI@home Scheduler request completed: got 0 new tasks 7/5/2010 12:40:43 PM SETI@home Message from server: Project has no tasks available 7/5/2010 12:40:58 PM SETI@home Sending scheduler request: To fetch work. 7/5/2010 12:40:58 PM SETI@home Requesting new tasks for GPU 7/5/2010 12:41:07 PM SETI@home Scheduler request completed: got 0 new tasks 7/5/2010 12:41:07 PM SETI@home Message from server: Project has no tasks available Hiamps buddy.....this has been discussed sooooooooo many times here before. Surprised you have never caught it. There are lots of WUs ready to send, but they only get loaded into the feeder one batch at a time. And are being sent out as fast as the bandwidth will allow. Then the feeder has to be refilled. When you connect and the feeder is between cycles, you get the no tasks available message. You just have to happen to connect at the moment that the feeder has just been refilled to get some for your rig. Went in and reset my DCF for the millionth time and started getting some work. Official Abuser of Boinc Buttons... And no good credit hound! ID: 1012392 ·

hiamps Volunteer tester Send message Joined: 23 May 99 Posts: 4292 Credit: 72,971,319 RAC: 0	Message 1012396 - Posted: 5 Jul 2010, 21:27:41 UTC Last modified: 5 Jul 2010, 21:33:56 UTC Nothing even turned in and once again I am at 99 hours for completion....PLEASE FIX THIS. EDIT: This time I changed the DCF in old and previous client states also, wonder how long it will last? Official Abuser of Boinc Buttons... And no good credit hound! ID: 1012396 ·

Claggy Volunteer tester Send message Joined: 5 Jul 99 Posts: 4654 Credit: 47,537,079 RAC: 4	Message 1012411 - Posted: 5 Jul 2010, 22:04:13 UTC Last modified: 5 Jul 2010, 22:11:35 UTC Seems my E8500 host has had a miscommunication and detached itself, after i abort all the now useless tasks, one of the next communications gives me 46 Ghost Wu's, was it really a good idea to limit everyone to 20 tasks at a time over the weekend? now even more hosts need to refill their cache's, making this more likely. Claggy ID: 1012411 ·

Odan Send message Joined: 8 May 03 Posts: 91 Credit: 15,331,177 RAC: 0	Message 1012416 - Posted: 5 Jul 2010, 22:13:30 UTC - in response to Message 1012047. Last modified: 5 Jul 2010, 22:14:15 UTC Anybody have a clue what the bandwidth cycles on the Cricket Graph are all about? I don't think I have ever seen such a well defined pattern before.... There appears to be correlation with when the splitters are boosting the "Results ready to send" and when they're idle. Compare Scarecrow's graphs, though it's hard to really match the time scales. It's a case where sampling the server status once an hour isn't quite enough to pin down the relationship, but my guess is the high rate download bursts are occurring just after the splitters have stopped for awhile. Or it could just be coincidence... Joe And "ready to send" doesn't drop to zero when splitters are idle, but bandwidth load drops hugely nevertheless. My suspicions- The Ready to Send buffer probably drops quite rapidly when bandwidth is maxed out, then continues to drop gradually once the network traffic drops off till such time as the splitters fire up & top up the buffer; the graphs aren't updated frequently enough to see accurately what's happening. Extremely wild supposition- The traffic bursts may be related to odd work request behaviour. I noticed one or 2 threads where people commented about the client not requesting new work, even though they had less than 20 in their cache. After a while, it does request work & that's when you're getting those bursts in network traffic. Lots of clients running down their buffer below 20 Work Units before requesting more, resulting in short bursts of network traffic. EDIT- just had a look at the Astropulse graphs. It shows a full Ready to Send buffer, with ups & downs similar to MB, but the slope of the waveform is different. MB- buffer fills quickly, drains slowly. AP- buffer fills slowly, but drains quickly. Looks like the spikes could be very much AP related. Just odd with their fairly consistent frequency. The spike interval & duration I can't explain any better than above but if you compare the cricket graph with scarecrow's "AP in progress" you can see a beautiful correlation with a ratcheting up of AP in progress every time a data burst occurs. Very pretty! ID: 1012416 ·

soft^spirit Send message Joined: 18 May 99 Posts: 6497 Credit: 34,134,168 RAC: 0	Message 1012418 - Posted: 5 Jul 2010, 22:17:44 UTC to add insult to injury, there seems to be some internet connectivity problems towards the San Francisco bay area in general (not SETI/Berkeley/etc specific). This could be compounding errors while downloading and amplifying spikes. Janice ID: 1012418 ·

Ianab Volunteer tester Send message Joined: 11 Jun 08 Posts: 732 Credit: 20,635,586 RAC: 5	Message 1012419 - Posted: 5 Jul 2010, 22:26:05 UTC - in response to Message 1012411. Well the plan seemed to be to do a "Soft Start". Limit the work units to 20 until all the slower hosts had full caches, then increase the limit in steps over a day or 2 and get the big caches full in stages. But when they opened the tap a bit more, the database crashed. So things were left on trickle feed for the weekend as that was a lot better than being dead in the water with the database down. Now the tap has been turned on, data is flowing as fast as the system is able. But for my slower PCs things are worse as the overload means they can't refill their 10 or 20 WU caches at the moment. And I suspect the guys looking for 500 units are not getting them reliably either. So we are just now seeing the congestion that we expected on Friday. The Gradual Startup was a good plan, hope they can get the bugs worked out of it. Separate quota for GPU and CPU would quiet a lot of the Whining from the power crunchers that have been on short rations, and get some work to everyone. Ian ID: 1012419 ·

Claggy Volunteer tester Send message Joined: 5 Jul 99 Posts: 4654 Credit: 47,537,079 RAC: 4	Message 1012426 - Posted: 5 Jul 2010, 22:32:25 UTC and now another 17 Ghost Wu's, Claggy ID: 1012426 ·

hiamps Volunteer tester Send message Joined: 23 May 99 Posts: 4292 Credit: 72,971,319 RAC: 0	Message 1012476 - Posted: 6 Jul 2010, 0:02:55 UTC - in response to Message 1012396. Nothing even turned in and once again I am at 99 hours for completion....PLEASE FIX THIS. EDIT: This time I changed the DCF in old and previous client states also, wonder how long it will last? Well this is the longest my times have stayed normal since this started. Anyone else have the DCF problem it would be cool to know if this helps. I changed the DCF to .2 in client state, client state_old and client state_previous. Official Abuser of Boinc Buttons... And no good credit hound! ID: 1012476 ·
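For anyone wanting to try the same DCF edit without hand-editing, here is a minimal sketch. It assumes the value lives in a `<duration_correction_factor>` element in the client state file, and that BOINC is fully shut down before the file is touched (the client rewrites it while running). `set_dcf` is a hypothetical helper, not part of BOINC, and it changes the DCF for every project in the text it is given.

```python
import re

# Assumed element name in the BOINC client state file.
DCF_TAG = re.compile(
    r"(<duration_correction_factor>)([^<]*)(</duration_correction_factor>)"
)

def set_dcf(xml_text, new_value):
    """Return (new_text, count) with every DCF element set to new_value.

    Works on the file contents as a string so it can be tried on a copy
    first. Note: this touches the DCF of all projects in the text.
    """
    replacement = r"\g<1>" + f"{new_value:.6f}" + r"\g<3>"
    return DCF_TAG.subn(replacement, xml_text)

if __name__ == "__main__":
    sample = (
        "<project>\n"
        "<duration_correction_factor>14.532100</duration_correction_factor>\n"
        "</project>\n"
    )
    patched, count = set_dcf(sample, 0.2)
    print(count, "element(s) changed")
    print(patched)
```

Keep a backup copy of the original file before writing anything back. As the following posts suggest, the client may well push the value back up on its own, so this is at best a temporary workaround.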

kittyman Volunteer tester Send message Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004	Message 1012482 - Posted: 6 Jul 2010, 0:11:03 UTC - in response to Message 1012476. Nothing even turned in and once again I am at 99 hours for completion....PLEASE FIX THIS. EDIT: This time I changed the DCF in old and previous client states also, wonder how long it will last? Well this is the longest my times have stayed normal since this started. Anyone else have the DCF problem it would be cool to know if this helps. I changed the DCF to .2 in client state, client state_old and client state_previous. I have never edited the prev file...I didn't think Boinc referred to it except in the case of some sort of corruption or crash of the working file. "Freedom is just Chaos, with better lighting." Alan Dean Foster ID: 1012482 ·

hiamps Volunteer tester Send message Joined: 23 May 99 Posts: 4292 Credit: 72,971,319 RAC: 0	Message 1012485 - Posted: 6 Jul 2010, 0:27:35 UTC - in response to Message 1012482. Last modified: 6 Jul 2010, 0:28:02 UTC Nothing even turned in and once again I am at 99 hours for completion....PLEASE FIX THIS. EDIT: This time I changed the DCF in old and previous client states also, wonder how long it will last? Well this is the longest my times have stayed normal since this started. Anyone else have the DCF problem it would be cool to know if this helps. I changed the DCF to .2 in client state, client state_old and client state_previous. I have never edited the prev file...I didn't think Boinc referred to it except in the case of some sort of corruption or crash of the working file. Neither had I, but I was desperate...LOL I just stole 60 non Vlars from my CPU with the rescheduler and my times stayed stable. Not sure which folder did it or if I am just lucky right now, thats why I am hoping someone else will try. Official Abuser of Boinc Buttons... And no good credit hound! ID: 1012485 ·

Claggy Volunteer tester Send message Joined: 5 Jul 99 Posts: 4654 Credit: 47,537,079 RAC: 4	Message 1012488 - Posted: 6 Jul 2010, 0:35:07 UTC and another 56 Ghosts, Claggy ID: 1012488 ·

hiamps Volunteer tester Send message Joined: 23 May 99 Posts: 4292 Credit: 72,971,319 RAC: 0	Message 1012489 - Posted: 6 Jul 2010, 0:36:11 UTC - in response to Message 1012488. and another 56 Ghosts, Claggy How do you tell if they are ghosts? Official Abuser of Boinc Buttons... And no good credit hound! ID: 1012489 ·

hiamps Volunteer tester Send message Joined: 23 May 99 Posts: 4292 Credit: 72,971,319 RAC: 0	Message 1012490 - Posted: 6 Jul 2010, 0:38:26 UTC - in response to Message 1012485. Last modified: 6 Jul 2010, 0:41:19 UTC Nothing even turned in and once again I am at 99 hours for completion....PLEASE FIX THIS. EDIT: This time I changed the DCF in old and previous client states also, wonder how long it will last? Well this is the longest my times have stayed normal since this started. Anyone else have the DCF problem it would be cool to know if this helps. I changed the DCF to .2 in client state, client state_old and client state_previous. I have never edited the prev file...I didn't think Boinc referred to it except in the case of some sort of corruption or crash of the working file. Neither had I, but I was desperate...LOL I just stole 60 non Vlars from my CPU with the rescheduler and my times stayed stable. Not sure which folder did it or if I am just lucky right now, thats why I am hoping someone else will try. Well never mind I was just lucky for a bit....back to 88 hours... EDIT: I think a VLar did me in, as I don't have any errors but noticed the last batch of downloads for GPU was all Vlars that wanted to start NOW! Official Abuser of Boinc Buttons... And no good credit hound! ID: 1012490 ·

Claggy Volunteer tester Send message Joined: 5 Jul 99 Posts: 4654 Credit: 47,537,079 RAC: 4	Message 1012493 - Posted: 6 Jul 2010, 0:43:55 UTC - in response to Message 1012489. Last modified: 6 Jul 2010, 1:21:50 UTC and another 56 Ghosts, Claggy How do you tell if they are ghosts? 06/07/2010 01:26:47 SETI@home update requested by user 06/07/2010 01:26:51 SETI@home [sched_op_debug] Starting scheduler request 06/07/2010 01:26:51 SETI@home Sending scheduler request: Requested by user. 06/07/2010 01:26:51 SETI@home Requesting new tasks for GPU 06/07/2010 01:26:51 SETI@home [sched_op_debug] CPU work request: 0.00 seconds; 0.00 CPUs 06/07/2010 01:26:51 SETI@home [sched_op_debug] NVIDIA GPU work request: 295929.11 seconds; 0.00 GPUs 06/07/2010 01:26:51 SETI@home [sched_op_debug] ATI GPU work request: 309850.73 seconds; 0.00 GPUs 06/07/2010 01:28:35 Project communication failed: attempting access to reference site 06/07/2010 01:28:35 SETI@home Scheduler request failed: Transferred a partial file 06/07/2010 01:28:35 SETI@home [sched_op_debug] Deferring communication for 1 min 0 sec 06/07/2010 01:28:35 SETI@home [sched_op_debug] Reason: Scheduler request failed 06/07/2010 01:28:36 Internet access OK - project servers may be temporarily down. 06/07/2010 01:28:43 SETI@home work fetch suspended by user and i've got fresh tasks listed in my task list at 6 Jul 2010 0:26:51 UTC, but Boinc never got them :-( Claggy ID: 1012493 ·
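The pattern in the log above (a work request followed by a failed scheduler reply) can be picked out of a saved message log with a few lines of Python. This is a rough sketch only: `find_suspect_requests` is a hypothetical helper, it assumes one message per line with a date and a 24-hour time as the first two fields, and a failed reply does not by itself prove ghosts. You still have to compare the task list on the website with what BOINC actually holds.

```python
def find_suspect_requests(log_lines):
    """Return timestamps of work requests whose scheduler reply failed.

    A request is 'suspect' when "Requesting new tasks" is followed by
    "Scheduler request failed" before any "Scheduler request completed".
    """
    suspects = []
    pending = None  # timestamp of the latest unanswered work request
    for line in log_lines:
        parts = line.split()
        if len(parts) < 3:
            continue
        stamp = " ".join(parts[:2])  # assumed "date time" prefix
        if "Requesting new tasks" in line:
            pending = stamp
        elif "Scheduler request completed" in line:
            pending = None
        elif "Scheduler request failed" in line and pending is not None:
            suspects.append(pending)
            pending = None
    return suspects

if __name__ == "__main__":
    log = [
        "06/07/2010 01:26:51 SETI@home Requesting new tasks for GPU",
        "06/07/2010 01:28:35 SETI@home Scheduler request failed: "
        "Transferred a partial file",
    ]
    print(find_suspect_requests(log))
```

Any timestamp it reports is a candidate to check against the "Posted" times of tasks listed for the host on the server side.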

Bill Walker Send message Joined: 4 Sep 99 Posts: 3868 Credit: 2,697,267 RAC: 0	Message 1012496 - Posted: 6 Jul 2010, 0:47:23 UTC Just when you think things are getting better: BOINC asked for about 30 tasks, finally enough to fill my 5 day queue. Two downloaded about 20%, now all say "project back off..." Oh wait, one came back, up to 51% downloaded! ID: 1012496 ·


SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.