Panic Mode On (47) Server problems?

Dave Send message Joined: 29 Mar 02 Posts: 778 Credit: 25,001,396 RAC: 0	Message 1112799 - Posted: 3 Jun 2011, 20:46:58 UTC Floating on a sea of green? Surely floating on the blue line. In a green atmosphere... Maybe we have already found alien life - in the form of a duck that breathes green air!!! ID: 1112799 ·

SciManStev Volunteer tester Send message Joined: 20 Jun 99 Posts: 6656 Credit: 121,090,076 RAC: 0	Message 1112819 - Posted: 3 Jun 2011, 21:44:12 UTC I really like that duck! Steve Warning, addicted to SETI crunching! Crunching as a member of GPU Users Group. GPUUG Website ID: 1112819 ·

Richard Haselgrove Volunteer tester Send message Joined: 4 Jul 99 Posts: 14661 Credit: 200,643,578 RAC: 874	Message 1112840 - Posted: 3 Jun 2011, 23:45:53 UTC - in response to Message 1112637.
I'm surprised about the following message of BOINC which come from time to time since ~ 00:42 UTC: 'Scheduler request failed: Timeout was reached' - after ~ 60 secs, IIRC normal this happened ~ 600 secs in past..
I tested this with my 420M laptop, also running v6.12.28:
04/06/2011 00:33:22 | SETI@home | Sending scheduler request: To fetch work.
04/06/2011 00:33:22 | SETI@home | Reporting 7 completed tasks, requesting new tasks for CPU and NVIDIA GPU
04/06/2011 00:33:22 | SETI@home | [sched_op] CPU work request: 142292.84 seconds; 0.00 CPUs
04/06/2011 00:33:22 | SETI@home | [sched_op] NVIDIA GPU work request: 18929.68 seconds; 0.00 GPUs
04/06/2011 00:35:22 | SETI@home | Scheduler request completed: got 48 new tasks
04/06/2011 00:35:22 | SETI@home | [sched_op] estimated total CPU task duration: 157672 seconds
04/06/2011 00:35:22 | SETI@home | [sched_op] estimated total NVIDIA GPU task duration: 21611 seconds
So exactly two minutes, but no timeout. Of the 48 tasks, 39 were shorties.
ID: 1112840 ·

Grant (SSSF) Volunteer tester Send message Joined: 19 Aug 99 Posts: 13797 Credit: 208,696,464 RAC: 304	Message 1112863 - Posted: 4 Jun 2011, 1:56:21 UTC - in response to Message 1112840. 6 days down & now into the 7th day of full network load. Usually the Results returned per hour are around 50,000 or so. Since the network storage problem outage it hasn't been below 60,000. For the last day it's been above 70,000 & it almost hit 80,000 at one stage. The hardware is certainly taking a beating, and no end in sight. The last batch of work i got still contained a lot of shorties. Grant Darwin NT ID: 1112863 ·

tbret Volunteer tester Send message Joined: 28 May 99 Posts: 3380 Credit: 296,162,071 RAC: 40	Message 1112890 - Posted: 4 Jun 2011, 5:01:52 UTC - in response to Message 1112863. 6 days down & now into the 7th day of full network load. Usually the Results returned per hour are around 50,000 or so. Since the network storage problem outage it hasn't been below 60,000. For the last day it's been above 70,000 & it almost hit 80,000 at one stage. The hardware is certainly taking a beating, and no end in sight. The last batch of work i got still contained a lot of shorties. I wonder how much damage the project would suffer if the air conditioning went away under these circumstances? Currently, "project backoff: 05:16:00" ID: 1112890 ·

tbret Volunteer tester Send message Joined: 28 May 99 Posts: 3380 Credit: 296,162,071 RAC: 40	Message 1112896 - Posted: 4 Jun 2011, 5:42:18 UTC - in response to Message 1112863. The hardware is certainly taking a beating, and no end in sight. The last batch of work i got still contained a lot of shorties. Oooops. Grant, you and I forgot that the hardware is only working about 1/10th as hard as it was probably designed-for. ID: 1112896 ·

Grant (SSSF) Volunteer tester Send message Joined: 19 Aug 99 Posts: 13797 Credit: 208,696,464 RAC: 304	Message 1112922 - Posted: 4 Jun 2011, 7:31:06 UTC - in response to Message 1112890. I wonder how much damage the project would suffer if the air conditioning went away under these circumstances? No damage, but everything would shut down. Grant, you and I forgot that the hardware is only working about 1/10th as hard as it was probably designed-for. Actually, due to the lack of the network storage, parts of it will be running pretty much at their limits. Grant Darwin NT ID: 1112922 ·

Dave Send message Joined: 29 Mar 02 Posts: 778 Credit: 25,001,396 RAC: 0	Message 1112925 - Posted: 4 Jun 2011, 7:47:40 UTC - in response to Message 1112819. Last modified: 4 Jun 2011, 7:50:37 UTC I really like that duck! Steve & for those of you that remember Orville on BBC1: "I 'aaate that duck" ;)! ID: 1112925 ·

Speedy Volunteer tester Send message Joined: 26 Jun 04 Posts: 1643 Credit: 12,921,799 RAC: 89	Message 1112933 - Posted: 4 Jun 2011, 10:07:25 UTC - in response to Message 1112922. Last modified: 4 Jun 2011, 10:08:03 UTC Actually, due to the lack of the network storage, parts of it will be running pretty much at their limits. From my understanding an Overland tech logged into the storage server & got it operational again. (Thanks Overland) To my knowledge all data storage is online. If anything's hampering the server it'll be this, from tech news 1st June: There are some broken astropulse results clogging one of the validators (which is why it shows up on red on the status page). We'll have to figure out an automated way to detect these results and push them through (it's a real pain to do by hand). In the meantime, this is causing our workunit storage server to be quite full, and might hamper other workunit development sooner than later. ID: 1112933 ·

Grant (SSSF) Volunteer tester Send message Joined: 19 Aug 99 Posts: 13797 Credit: 208,696,464 RAC: 304	Message 1112954 - Posted: 4 Jun 2011, 11:50:30 UTC - in response to Message 1112933. From my understanding an Overland tech logged into the storage server & got it operational again. (Thanks Overland) To my knowledge all data storage is online. The last i saw they had got it up & running again, but were in the process of determining what the actual cause of the fault/s was/were. There hasn't been any outage since then to transfer the data back to it, so it's still residing on Thumper. Grant Darwin NT ID: 1112954 ·

Cosmic_Ocean Send message Joined: 23 Dec 00 Posts: 3027 Credit: 13,516,867 RAC: 13	Message 1113120 - Posted: 4 Jun 2011, 17:48:12 UTC The hardware is only working at about 1/10th of what it would be, because it is limited to 100mbit instead of gigabit. Second, best I can tell, the overland storage IS functional and working.. but everything is still on thumper. And the ridiculous amount of shorties probably coincides with the large number of the 100% blanked APs that I've gotten lately. Used to be there were only one or two of those per hundred, but now it seems there's 1-2 per 10, sometimes more than that (two days ago, I ran through eight in a row in my cache.. did some excellent stuff for the DCF..I ended up getting 10 more APs than I usually do). So there were a bunch of tapes recently that were essentially trash, or mostly trash. I still think that after the software blanking runs through and does its thing, it should be able to just mark 100% blanked WUs and keep them from being sent out in the first place. MBs aren't so bad because they are small, but wasting 16MB on APs like that just doesn't make sense. Linux laptop: record uptime: 1511d 20h 19m (ended due to the power brick giving-up) ID: 1113120 ·

Sutaru Tsureku Volunteer tester Send message Joined: 6 Apr 07 Posts: 7105 Credit: 147,663,825 RAC: 5	Message 1113129 - Posted: 4 Jun 2011, 18:27:14 UTC - in response to Message 1112840. Last modified: 4 Jun 2011, 18:34:04 UTC
I'm surprised about the following message of BOINC which come from time to time since ~ 00:42 UTC: 'Scheduler request failed: Timeout was reached' - after ~ 60 secs, IIRC normal this happened ~ 600 secs in past..
I tested this with my 420M laptop, also running v6.12.28:
04/06/2011 00:33:22 | SETI@home | Sending scheduler request: To fetch work.
04/06/2011 00:33:22 | SETI@home | Reporting 7 completed tasks, requesting new tasks for CPU and NVIDIA GPU
04/06/2011 00:33:22 | SETI@home | [sched_op] CPU work request: 142292.84 seconds; 0.00 CPUs
04/06/2011 00:33:22 | SETI@home | [sched_op] NVIDIA GPU work request: 18929.68 seconds; 0.00 GPUs
04/06/2011 00:35:22 | SETI@home | Scheduler request completed: got 48 new tasks
04/06/2011 00:35:22 | SETI@home | [sched_op] estimated total CPU task duration: 157672 seconds
04/06/2011 00:35:22 | SETI@home | [sched_op] estimated total NVIDIA GPU task duration: 21611 seconds
So exactly two minutes, but no timeout. Of the 48 tasks, 39 were shorties.
This is the log of the time in question:
03-Jun-2011 02:41:05 [SETI@home] Reporting 2 completed tasks, requesting new tasks for CPU and NVIDIA GPU
03-Jun-2011 02:42:02 [SETI@home] Scheduler request failed: Timeout was reached
From ~ the time of your contact:
04-Jun-2011 00:25:50 [SETI@home] Reporting 5 completed tasks, requesting new tasks for CPU and NVIDIA GPU
04-Jun-2011 00:26:45 [SETI@home] Scheduler request failed: Timeout was reached
04-Jun-2011 00:42:00 [SETI@home] Reporting 7 completed tasks, requesting new tasks for CPU and NVIDIA GPU
04-Jun-2011 00:42:58 [SETI@home] Scheduler request completed: got 20 new tasks
And from a few minutes ago:
SETI@home 04.06.2011 20:00:43 Reporting 3 completed tasks, requesting new tasks for CPU and NVIDIA GPU
SETI@home 04.06.2011 20:01:43 Scheduler request failed: Timeout was reached
SETI@home 04.06.2011 20:19:11 Reporting 6 completed tasks, requesting new tasks for CPU and NVIDIA GPU
SETI@home 04.06.2011 20:20:03 Scheduler request completed: got 12 new tasks
All ~ 60 secs between request and answer. I have only DSL light (384/64 kbit/s).
ID: 1113129 ·
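
For anyone who wants to check the "~ 60 secs" figure rather than eyeball it, here is a minimal Python sketch that subtracts the request time from the reply time. The timestamps are copied from the log excerpts in the post above; the helper name `elapsed_seconds` and the format strings are my own assumptions for illustration and are not part of BOINC.

```python
from datetime import datetime


def elapsed_seconds(start: str, end: str, fmt: str = "%d-%b-%Y %H:%M:%S") -> int:
    """Return the whole seconds between two log timestamps in the same format.

    Hypothetical helper: 'fmt' defaults to the 03-Jun-2011 style seen in the
    older client log; pass another format for other log styles.
    """
    t0 = datetime.strptime(start, fmt)
    t1 = datetime.strptime(end, fmt)
    return int((t1 - t0).total_seconds())


# (request time, reply time, outcome) taken from the quoted log lines
pairs = [
    ("03-Jun-2011 02:41:05", "03-Jun-2011 02:42:02", "timeout"),
    ("04-Jun-2011 00:25:50", "04-Jun-2011 00:26:45", "timeout"),
    ("04-Jun-2011 00:42:00", "04-Jun-2011 00:42:58", "completed"),
]

for start, end, outcome in pairs:
    print(f"{start}: {outcome} after {elapsed_seconds(start, end)} s")
```

The same helper works on the v6.12.28 log style if you pass `fmt="%d/%m/%Y %H:%M:%S"`, which gives 120 s for the 00:33:22 to 00:35:22 request quoted above.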

Grant (SSSF) Volunteer tester Send message Joined: 19 Aug 99 Posts: 13797 Credit: 208,696,464 RAC: 304	Message 1113263 - Posted: 5 Jun 2011, 0:37:06 UTC Uh oh, uploads have stopped going through. Grant Darwin NT ID: 1113263 ·

arkayn Volunteer tester Send message Joined: 14 May 99 Posts: 4438 Credit: 55,006,323 RAC: 0	Message 1113270 - Posted: 5 Jun 2011, 0:54:59 UTC - in response to Message 1113264. The upload server is just responding real slow right now, it took almost a minute before it showed the test page for me. ID: 1113270 ·

Grant (SSSF) Volunteer tester Send message Joined: 19 Aug 99 Posts: 13797 Credit: 208,696,464 RAC: 304	Message 1113272 - Posted: 5 Jun 2011, 0:59:02 UTC - in response to Message 1113263. Uh oh, uploads have stopped going through. Panic over, must have been a minor glitch. Grant Darwin NT ID: 1113272 ·

Grant (SSSF) Volunteer tester Send message Joined: 19 Aug 99 Posts: 13797 Credit: 208,696,464 RAC: 304	Message 1113275 - Posted: 5 Jun 2011, 1:03:31 UTC - in response to Message 1113272. Uh oh, uploads have stopped going through. Panic over, must have been a minor glitch. Or maybe not. Another bunch have started to pile up. Couple of minutes down & none of them have even started to go through. Grant Darwin NT ID: 1113275 ·

Gatekeeper Send message Joined: 14 Jul 04 Posts: 887 Credit: 176,479,616 RAC: 0	Message 1113292 - Posted: 5 Jun 2011, 1:41:19 UTC Uploads seem to be going away. Haven't had much success in the last hour or so. Cricket graph shows upload declining as well. Wasn't this related to a full storage situation in the past? With all these shorties coming back, that certainly would make sense. ID: 1113292 ·

Sutaru Tsureku Volunteer tester Send message Joined: 6 Apr 07 Posts: 7105 Credit: 147,663,825 RAC: 5	Message 1113343 - Posted: 5 Jun 2011, 4:58:43 UTC Last modified: 5 Jun 2011, 5:01:12 UTC
It looks like the UL server/service is completely down now.
Temporarily failed upload of xxxxxxxxxxxxxxxxxxx: HTTP error
Temporarily failed upload of xxxxxxxxxxxxxxxxxxx: connect() failed
and so on.. At least my BOINC can't UL.
- Best regards! - Sutaru Tsureku, team seti.international founder. - Optimize your PC for higher RAC. - SETI@home needs your help. -
ID: 1113343 ·

W-K 666 Volunteer tester Send message Joined: 18 May 99 Posts: 19227 Credit: 40,757,560 RAC: 67	Message 1113347 - Posted: 5 Jun 2011, 5:11:32 UTC - in response to Message 1113343. Last modified: 5 Jun 2011, 5:12:27 UTC
Uploads are working, maybe a little slowly;
05/06/2011 06:05:46 SETI@home Started upload of 2ap11ac.7671.13564.7.10.33_0_0
05/06/2011 06:07:02 SETI@home Finished upload of 2ap11ac.7671.13564.7.10.33_0_0
Times BST i.e. UTC +1
ID: 1113347 ·

Grant (SSSF) Volunteer tester Send message Joined: 19 Aug 99 Posts: 13797 Credit: 208,696,464 RAC: 304	Message 1113353 - Posted: 5 Jun 2011, 6:05:48 UTC - in response to Message 1113347. Last modified: 5 Jun 2011, 6:06:57 UTC Uploads are working, maybe a little slowly; Try extremely slowly. Anything from 1-12 attempts before it goes through, and at about 2kB/s if i'm lucky. Network traffic & Scarecrow's Graphs show a significant drop off in the amount of work being returned. Something's broken, or seriously clogged. EDIT- and even the forums are behaving sluggishly now & then. Grant Darwin NT ID: 1113353 ·

©2024 University of California

SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.