Panic Mode On (57) Server problems?

Author	Message
Josef W. Segur Volunteer developer Volunteer tester Send message Joined: 30 Oct 99 Posts: 4504 Credit: 1,414,761 RAC: 0	Message 1158874 - Posted: 4 Oct 2011, 22:36:54 UTC - in response to Message 1158846.
The scheduler is really funny. Seeing only 5.4 CPUs.
04.10.2011 23:05:30 SETI@home [sched_op] Starting scheduler request
04.10.2011 23:05:30 SETI@home Sending scheduler request: Requested by user.
04.10.2011 23:05:30 SETI@home Reporting 43 completed tasks, requesting new tasks for CPU and ATI GPU
04.10.2011 23:05:30 SETI@home [sched_op] CPU work request: 8902321.21 seconds; 5.40 CPUs
04.10.2011 23:05:30 SETI@home [sched_op] ATI GPU work request: 9069.50 seconds; 0.00 GPUs
That's 0.6 reserved for the ATI GPU (0.3 for each of 2 tasks it's hoping to get). AFAICT the servers don't do anything with that count to ensure you get at least that many tasks for CPU, maybe sometime in the future. Having a non-zero instance count does condition some logic to at least consider sending some work, though.
Joe
ID: 1158874 ·
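The arithmetic behind the "5.40 CPUs" figure in the work request quoted above can be checked with a small sketch. The 6-CPU total is an assumption inferred from 5.4 + 0.6 (the host's core count is not stated in the thread), and the function name is hypothetical; this is not the BOINC client's actual code.

```python
def cpu_instances_requested(total_cpus, gpu_tasks, cpu_per_gpu_task):
    """CPU instances left to request work for, after reserving a
    fraction of a CPU for each GPU task the client hopes to run."""
    reserved = gpu_tasks * cpu_per_gpu_task
    return total_cpus - reserved


# Assumed host: 6 CPUs (inferred, not stated in the thread).
# From the post: 2 ATI GPU tasks, 0.3 CPU reserved for each.
requested = cpu_instances_requested(total_cpus=6, gpu_tasks=2, cpu_per_gpu_task=0.3)
print(f"{requested:.2f} CPUs")  # same two-decimal format as the log line
```

Under those assumptions the result matches the 5.40 CPUs reported in the `[sched_op]` line.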

Dimly Lit Lightbulb 😀 Volunteer tester Send message Joined: 30 Aug 08 Posts: 15399 Credit: 7,423,413 RAC: 1	Message 1158875 - Posted: 4 Oct 2011, 22:37:27 UTC Well, thanks to something stuffing up my computer for the second time in as many weeks, a bunch of tasks errored out. Couple that with: 04/10/2011 23:16:35 | SETI@home | Project has no tasks available, I am now down to my final four tasks. It's OK though, they should keep me going until I can get my next fix, I mean, some more tasks :) ID: 1158875 ·

zoom3+1=4 Volunteer tester Send message Joined: 30 Nov 03 Posts: 65766 Credit: 55,293,173 RAC: 49	Message 1158881 - Posted: 4 Oct 2011, 23:06:49 UTC I can't even report wu's, http errors at the HE router I guess, I wish the nuts doing this would just get lost. The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's ID: 1158881 ·

OTS Volunteer tester Send message Joined: 6 Jan 08 Posts: 369 Credit: 20,533,537 RAC: 0	Message 1158890 - Posted: 5 Oct 2011, 0:32:03 UTC - in response to Message 1158712.
I'm sure all of you just read the latest news post. http://setiathome.berkeley.edu/forum_thread.php?id=65678 She's down for the night, I'm sure. Hopefully, things will be fixed by the time the usual outage is done tomorrow.
Perhaps it is not only SETI. I came across this on Twitter about HE and a DDoS attack. I do not use Twitter but it looks current. https://twitter.com/#!/henet
From the HE twitter feed
4 hours ago Experiencing large DDos attack on our network. Network admins doing everything possible. No formal ETA at the moment. Will give update ASAP.
3 hours ago Our network admins are still working as fast as possible to restore full connectivity and services. We'll continue to give updates as they come.
2 hours ago Network still experiencing incidences of attack, though network admins are doing their best to keep the attack contained and mitigated.
It's not a Seti server conking out this time, woohoo! :) I wonder if Bank of America uses HE and SETI was caught in the crossfire. http://abcnews.go.com/blogs/business/2011/10/bank-of-america-under-hacking-attack/
ID: 1158890 ·

Cosmic_Ocean Send message Joined: 23 Dec 00 Posts: 3027 Credit: 13,516,867 RAC: 13	Message 1158908 - Posted: 5 Oct 2011, 1:24:31 UTC - in response to Message 1158890. I wonder if Bank of America uses HE and SETI was caught in the crossfire. http://abcnews.go.com/blogs/business/2011/10/bank-of-america-under-hacking-attack/ Just did a traceroute to see. Goes through a handful of hops on Level3 in Dallas and then directly to BoA's own allocated network addresses. Linux laptop: record uptime: 1511d 20h 19m (ended due to the power brick giving-up) ID: 1158908 ·

zoom3+1=4 Volunteer tester Send message Joined: 30 Nov 03 Posts: 65766 Credit: 55,293,173 RAC: 49	Message 1158909 - Posted: 5 Oct 2011, 1:28:16 UTC - in response to Message 1158890.
I'm sure all of you just read the latest news post. http://setiathome.berkeley.edu/forum_thread.php?id=65678 She's down for the night, I'm sure. Hopefully, things will be fixed by the time the usual outage is done tomorrow.
Perhaps it is not only SETI. I came across this on Twitter about HE and a DDoS attack. I do not use Twitter but it looks current. https://twitter.com/#!/henet
From the HE twitter feed
4 hours ago Experiencing large DDos attack on our network. Network admins doing everything possible. No formal ETA at the moment. Will give update ASAP.
3 hours ago Our network admins are still working as fast as possible to restore full connectivity and services. We'll continue to give updates as they come.
2 hours ago Network still experiencing incidences of attack, though network admins are doing their best to keep the attack contained and mitigated.
It's not a Seti server conking out this time, woohoo! :) I wonder if Bank of America uses HE and SETI was caught in the crossfire. http://abcnews.go.com/blogs/business/2011/10/bank-of-america-under-hacking-attack/
Yep, they do as seen here: http://bgp.he.net/AS10794#_whois
The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's
ID: 1158909 ·

Grant (SSSF) Volunteer tester Send message Joined: 19 Aug 99 Posts: 13746 Credit: 208,696,464 RAC: 304	Message 1158947 - Posted: 5 Oct 2011, 4:51:50 UTC - in response to Message 1158909. Has anyone heard anything on the Scheduler issue? I didn't see anything in the News or Tech news threads. Even after the outage I still can't contact the Scheduler. Grant Darwin NT ID: 1158947 ·

W-K 666 Volunteer tester Send message Joined: 18 May 99 Posts: 19080 Credit: 40,757,560 RAC: 67	Message 1158949 - Posted: 5 Oct 2011, 5:24:53 UTC - in response to Message 1158947.
Has anyone heard anything on the Scheduler issue? I didn't see anything in the News or Tech news threads. Even after the outage I still can't contact the Scheduler.
I've had a few contacts over the last few hours, but only about one in five attempts has been successful. That said, I've rec'd no new tasks. Also, I noted about 3 hours ago there were 798,200 tasks available; that has now decreased to 783,867, and according to Scarecrow's graphs the creation rate is as near 0.00 as you can get. Think we need a full database clean up and compaction asap, rather than contact with the servers.
ID: 1158949 ·

Speedy Volunteer tester Send message Joined: 26 Jun 04 Posts: 1643 Credit: 12,921,799 RAC: 89	Message 1158953 - Posted: 5 Oct 2011, 5:33:06 UTC - in response to Message 1158949. Think we need a full database clean up and compaction asap, rather than contact with the servers. I was under the understanding that this was taken care of this morning while the servers were off line. ID: 1158953 ·

Cosmic_Ocean Send message Joined: 23 Dec 00 Posts: 3027 Credit: 13,516,867 RAC: 13	Message 1158955 - Posted: 5 Oct 2011, 6:03:47 UTC Last modified: 5 Oct 2011, 6:05:00 UTC
Well, normally the Tuesday outage does two things: first it defragments and compresses the database, and then a copy of it is made and deployed on the replica (as well as sent to off-site storage). So one would think that after defragging and compressing the database, it should be running smoothly now, right? Well, there's only so much of the database that will fit in RAM, and unfortunately, with 13M and climbing waiting to be purged, I'm guessing that is taking up RAM for task page queries or something else. The issues we are experiencing with contacting the scheduler, and with internal DB operations in the server closet, are likely intermittent/slow because of the bloat. I think turning the scheduler off and turning db_purge on until the backlog is cleared is about the only thing that can rectify this situation.
Linux laptop: record uptime: 1511d 20h 19m (ended due to the power brick giving-up)
ID: 1158955 ·

W-K 666 Volunteer tester Send message Joined: 18 May 99 Posts: 19080 Credit: 40,757,560 RAC: 67	Message 1158957 - Posted: 5 Oct 2011, 6:07:41 UTC Well 40 mins later and the MB's waiting to be sent still remains at 783,867, so definitely something is causing problems at the server end. ID: 1158957 ·

BetelgeuseFive Volunteer tester Send message Joined: 6 Jul 99 Posts: 158 Credit: 17,117,787 RAC: 19	Message 1158960 - Posted: 5 Oct 2011, 6:29:35 UTC - in response to Message 1158955.
I agree that things are getting out of control and something should be done about it. However, I think that turning off the scheduler is a bit drastic. Turning off the splitters seems more appropriate. This way people will still be able to report completed tasks (before the deadline).
Tom
Well, normally the Tuesday outage does two things: first it defragments and compresses the database, and then a copy of it is made and deployed on the replica (as well as sent to off-site storage). So one would think that after defragging and compressing the database, it should be running smoothly now, right? Well, there's only so much of the database that will fit in RAM, and unfortunately, with 13M and climbing waiting to be purged, I'm guessing that is taking up RAM for task page queries or something else. The issues we are experiencing with contacting the scheduler, and with internal DB operations in the server closet, are likely intermittent/slow because of the bloat. I think turning the scheduler off and turning db_purge on until the backlog is cleared is about the only thing that can rectify this situation.
ID: 1158960 ·

kittyman Volunteer tester Send message Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004	Message 1158961 - Posted: 5 Oct 2011, 6:37:13 UTC Last modified: 5 Oct 2011, 6:37:41 UTC This is not pretty. The kitties are getting rather depressed. My top rig has been doing MW since yesterday just to keep it warm because it runs 24/7. Two other rigs are out of Seti work, and can't even connect to report the last of what they've completed. The other 5 slower rigs are still crunching up their last. The main router is on the fritz, and the servers are tied up in knots when you can connect. Even the forums are doggy at times. Come on back, Seti. The kitties will leave the light on for ya. "Freedom is just Chaos, with better lighting." Alan Dean Foster ID: 1158961 ·

Kevin Olley Send message Joined: 3 Aug 99 Posts: 906 Credit: 261,085,289 RAC: 572	Message 1158966 - Posted: 5 Oct 2011, 7:05:56 UTC - in response to Message 1158961. This is not pretty. The kitties are getting rather depressed. My top rig has been doing MW since yesterday just to keep it warm because it runs 24/7. Two other rigs are out of Seti work, and can't even connect to report the last of what they've completed. The other 5 slower rigs are still crunching up their last. The main router is on the fritz, and the servers are tied up in knots when you can connect. Even the forums are doggy at times. Come on back, Seti. The kitties will leave the light on for ya. Set NNT and then update, they will then report. Don't forget to unset NNT after. Kevin ID: 1158966 ·

soft^spirit Send message Joined: 18 May 99 Posts: 6497 Credit: 34,134,168 RAC: 0	Message 1158971 - Posted: 5 Oct 2011, 7:44:16 UTC - in response to Message 1158966. This is not pretty. The kitties are getting rather depressed. My top rig has been doing MW since yesterday just to keep it warm because it runs 24/7. Two other rigs are out of Seti work, and can't even connect to report the last of what they've completed. The other 5 slower rigs are still crunching up their last. The main router is on the fritz, and the servers are tied up in knots when you can connect. Even the forums are doggy at times. Come on back, Seti. The kitties will leave the light on for ya. Set NNT and then update, they will then report. Don't forget to unset NNT after. Is that a bug work-around to be able to contact the scheduler again? Janice ID: 1158971 ·

W-K 666 Volunteer tester Send message Joined: 18 May 99 Posts: 19080 Credit: 40,757,560 RAC: 67	Message 1158972 - Posted: 5 Oct 2011, 7:50:52 UTC Just to let you know I rec'd one task, GPU VHAR, at 07:49:46. ID: 1158972 ·

LadyL Volunteer tester Send message Joined: 14 Sep 11 Posts: 1679 Credit: 5,230,097 RAC: 0	Message 1158973 - Posted: 5 Oct 2011, 7:53:09 UTC - in response to Message 1158971. Last modified: 5 Oct 2011, 7:53:41 UTC Set NNT and then update, they will then report. Don't forget to unset NNT after. Is that a bug work-around to be able to contact the scheduler again? On a 6.12.x client setting NNT forces 'report straight after upload' behaviour. Which would help if the problem was getting the client to report and not getting the report through... ID: 1158973 ·
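The "set NNT, update, unset NNT" sequence discussed above could be scripted roughly as follows. This is only a sketch: it assumes the `boinccmd` command-line tool is installed and on the PATH, that its `--project URL nomorework|update|allowmorework` operations are available in the client version in use, and that the project URL below is the one the host is attached with. The helper names and the wait time are hypothetical.

```python
import subprocess
import time

# Assumed project URL; use the URL the client is actually attached with.
PROJECT_URL = "http://setiathome.berkeley.edu/"


def nnt_report_commands(project_url, boinccmd="boinccmd"):
    """Build the three boinccmd invocations: set No New Tasks,
    request a scheduler contact, then allow new tasks again."""
    return [
        [boinccmd, "--project", project_url, "nomorework"],
        [boinccmd, "--project", project_url, "update"],
        [boinccmd, "--project", project_url, "allowmorework"],
    ]


def run_nnt_report(project_url, wait_seconds=30):
    """Run the sequence, pausing before NNT is cleared so the
    scheduler request has a chance to finish (wait time is a guess)."""
    set_nnt, update, unset_nnt = nnt_report_commands(project_url)
    subprocess.run(set_nnt, check=True)
    subprocess.run(update, check=True)
    time.sleep(wait_seconds)
    subprocess.run(unset_nnt, check=True)


# Print the commands instead of running them, as a dry run.
for cmd in nnt_report_commands(PROJECT_URL):
    print(" ".join(cmd))
```

As noted in the post above, this only helps when the problem is getting the client to report; it does nothing if the report itself cannot get through to the scheduler.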

soft^spirit Send message Joined: 18 May 99 Posts: 6497 Credit: 34,134,168 RAC: 0	Message 1158974 - Posted: 5 Oct 2011, 7:57:22 UTC - in response to Message 1158973. Ahh, gotcha. So no help when you just cannot connect to the scheduler. I have had 251 completed tasks here for hours on 6.10.58, and simply can not get there from here. Janice ID: 1158974 ·

W-K 666 Volunteer tester Send message Joined: 18 May 99 Posts: 19080 Credit: 40,757,560 RAC: 67	Message 1158975 - Posted: 5 Oct 2011, 7:58:00 UTC Time in last post was UTC. But checking back, I had rec'd another about 20 mins earlier. Strange thing is that Results ready to send is still at 783,867; that's over an hour after I last posted that number. So either the page is lying, or "Results ready to send" cannot be greater than that, for some reason. Unpurged WU/tasks maybe? ID: 1158975 ·

S@NL - John van Gorsel Volunteer tester Send message Joined: 5 Jul 99 Posts: 193 Credit: 139,673,078 RAC: 0	Message 1158976 - Posted: 5 Oct 2011, 8:00:59 UTC - in response to Message 1158966.
Set NNT and then update, they will then report. Don't forget to unset NNT after.
It could be a coincidence but it worked for me:
5-10-2011 9:51:41 | SETI@home | work fetch suspended by user
5-10-2011 9:51:43 | SETI@home | update requested by user
5-10-2011 9:51:46 | SETI@home | Sending scheduler request: Requested by user.
5-10-2011 9:51:46 | SETI@home | Reporting 361 completed tasks, not requesting new tasks
5-10-2011 9:51:56 | SETI@home | Scheduler request completed
5-10-2011 9:52:13 | SETI@home | work fetch resumed by user
The scheduler is still unreachable when asking for new work though...
Seti@Netherlands website
ID: 1158976 ·
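For anyone wanting to check their own event log for a successful report like the one above, here is a minimal parsing sketch. It assumes the "timestamp | project | message" layout shown in the post (with or without a backslash before each pipe); the function names are hypothetical and the sample lines are copied from the post.

```python
import re

# Matches the "Reporting N completed tasks" message seen in the log.
REPORT_RE = re.compile(r"Reporting (\d+) completed tasks")


def parse_event_line(line):
    """Split 'timestamp | project | message' into its three fields.
    A backslash before the pipe (as in some copies of the log) is tolerated."""
    parts = [p.strip() for p in re.split(r"\\?\|", line, maxsplit=2)]
    if len(parts) != 3:
        raise ValueError(f"unexpected log line: {line!r}")
    return tuple(parts)


def reported_task_count(lines):
    """Sum the task counts from every 'Reporting N completed tasks' line."""
    total = 0
    for line in lines:
        _, _, message = parse_event_line(line)
        match = REPORT_RE.search(message)
        if match:
            total += int(match.group(1))
    return total


SAMPLE = [
    "5-10-2011 9:51:41 | SETI@home | work fetch suspended by user",
    "5-10-2011 9:51:43 | SETI@home | update requested by user",
    "5-10-2011 9:51:46 | SETI@home | Sending scheduler request: Requested by user.",
    "5-10-2011 9:51:46 | SETI@home | Reporting 361 completed tasks, not requesting new tasks",
    "5-10-2011 9:51:56 | SETI@home | Scheduler request completed",
    "5-10-2011 9:52:13 | SETI@home | work fetch resumed by user",
]

print(reported_task_count(SAMPLE))
```

A "Reporting" line only shows that the request was sent; the "Scheduler request completed" line that follows is what indicates it went through.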

