Panic Mode On (17) Server problems
Author | Message |
---|---|
Jord Send message Joined: 9 Jun 99 Posts: 15184 Credit: 4,362,181 RAC: 3 |
"Anyone noticed the new website code at play?..." Some weird effect of CSS on the main page, all black with blue letters before it loads correctly, you mean? |
Matthew S. McCleary Send message Joined: 9 Sep 99 Posts: 121 Credit: 2,288,242 RAC: 0 |
Yeahhh... that will go over well with enthusiasts like me who have been working hard to throw everything they've got at it. Somehow I don't think people would like SETI@home saying to them: "We want your help, but only so much. We're cutting you off at 10 CPUs." In fact, if SETI@home were to adopt such a policy, I would abandon the project. There are plenty of other worthy DC projects which would never do such a thing. |
FiveHamlet Send message Joined: 5 Oct 99 Posts: 783 Credit: 32,638,578 RAC: 0 |
My rig has been trying to upload 100 tasks most of the day. I also still can't report 111 tasks, and this after an outage of 5 hrs yesterday. I am now crunching Aqua on my GPUs and Docking on my CPU to fill in the missing workload. On my main rig I have no 6.03 to crunch and only about 200 6.08, so it will be out of Seti work in 12 hrs. The second rig has no 6.08 but 200 6.03, so the scheduler is not doing very well. Seems like a real problem with the Seti servers. Dave EDIT: just checked the server status page and its info is 6 hours old. BIG PROBLEM |
Clyde C. Phillips, III Send message Joined: 2 Aug 00 Posts: 1851 Credit: 5,955,047 RAC: 0 |
I see only about three hours of tasks left. It'll be Einstein or else I'll have to turn off my machines tonight. |
FiveHamlet Send message Joined: 5 Oct 99 Posts: 783 Credit: 32,638,578 RAC: 0 |
I have 4 GPUs to feed, so it looks like Aqua for me, just to keep my room warm lol |
UrbanFlux Send message Joined: 21 Sep 06 Posts: 8 Credit: 1,150,626 RAC: 0 |
Fixed! Trust No One. Question Everything. |
FiveHamlet Send message Joined: 5 Oct 99 Posts: 783 Credit: 32,638,578 RAC: 0 |
Don't know what is fixed but server status is now showing 7 hrs. Still no uploads, reporting or downloads. |
UrbanFlux Send message Joined: 21 Sep 06 Posts: 8 Credit: 1,150,626 RAC: 0 |
Yep sorry, was seeing things. Trust No One. Question Everything. |
Cosmic_Ocean Send message Joined: 23 Dec 00 Posts: 3027 Credit: 13,516,867 RAC: 13 |
I still think the math behind the quota needs to be revised. Sure, one bad task can still be -1, but a good task needs to be something like +1 or +2 instead of x2. Takes 100+ bad tasks to get down to a quota of 1, but it only takes 8 good ones to get back to 100. Sounds like it defeats the purpose of the quota altogether. $0.02 [edit: and my panic is gone now. The one Linux box that only had an AP to crunch requested 86400 seconds of work (nice number there..) and got 20 MBs. Good to go now.] Linux laptop: record uptime: 1511d 20h 19m (ended due to the power brick giving-up) |
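The asymmetry described in the post above can be checked with a small sketch. It assumes a quota capped at 100 with a floor of 1, a penalty of 1 per bad task, and doubling per good task, as the post describes. The function names and the exact cap and floor are illustrative assumptions, not the actual BOINC scheduler code.

```python
MAX_QUOTA = 100  # assumed cap, taken from the figures quoted in the post
MIN_QUOTA = 1    # assumed floor


def after_bad(quota: int) -> int:
    """One bad task: subtract 1, never going below the floor."""
    return max(MIN_QUOTA, quota - 1)


def after_good_double(quota: int) -> int:
    """One good task under the current rule: double, capped at the maximum."""
    return min(MAX_QUOTA, quota * 2)


def make_after_good_add(step: int):
    """Build the proposed additive rule (+1 or +2 per good task)."""
    def rule(quota: int) -> int:
        return min(MAX_QUOTA, quota + step)
    return rule


def tasks_needed(start: int, target: int, rule) -> int:
    """Count how many consecutive tasks move the quota from start to target."""
    quota, count = start, 0
    while quota != target:
        quota = rule(quota)
        count += 1
        if count > 10_000:  # guard against a rule that never reaches target
            raise RuntimeError("target not reachable with this rule")
    return count


bad_to_floor = tasks_needed(MAX_QUOTA, MIN_QUOTA, after_bad)
good_double = tasks_needed(MIN_QUOTA, MAX_QUOTA, after_good_double)
good_plus_two = tasks_needed(MIN_QUOTA, MAX_QUOTA, make_after_good_add(2))
good_plus_one = tasks_needed(MIN_QUOTA, MAX_QUOTA, make_after_good_add(1))
print(bad_to_floor, good_double, good_plus_two, good_plus_one)
```

Under these assumptions the drop from 100 to 1 takes 99 bad tasks and the recovery by doubling takes 7 good ones, close to the figures in the post. An additive +2 reward would stretch the recovery to 50 good tasks, and +1 to 99.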
Claggy Send message Joined: 5 Jul 99 Posts: 4654 Credit: 47,537,079 RAC: 4 |
"Don't know what is fixed but server status is now showing 7 hrs." Before they took the website down and took the servers offline for maintenance, the status time was stuck at [As of 17 Jun 2009 12:20:10 UTC]. Now at least it's updating its status every 10 mins, even though the stats haven't been updated (yet). Work is getting through: I've got about 60 WUs downloaded and another 20 downloading. You might have to get some uploaded before you can request more. Claggy |
FiveHamlet Send message Joined: 5 Oct 99 Posts: 783 Credit: 32,638,578 RAC: 0 |
No point in having quotas if the scheduler gets things totally screwed up. I have 1 rig with 8 cores idle because it has no APs or 6.03s but plenty of 6.08s, and another rig with 4 cores and 2 GPUs with plenty of 6.03s and no 6.08s. Still have over 150 trying to upload here; it's been like this all day. |
1mp0£173 Send message Joined: 3 Apr 99 Posts: 8423 Credit: 356,897 RAC: 0 |
"I still think the math behind the quota needs to be revised. Sure, one bad task can still be -1, but a good task needs to be something like +1 or +2 instead of x2. Takes 100+ bad tasks to get down to a quota of 1, but it only takes 8 good ones to get back to 100. Sounds like it defeats the purpose of the quota altogether." It really depends on what you're trying to achieve. If the purpose is to keep a machine that returns consistently bad results from chewing through work, it's good enough. It'll take a while to throttle it down, so an occasional validator error or math bug won't hurt machines that are not broken. It also means a broken machine (or maybe a bug-fix to the application) can recover quite quickly once it's fixed. If you did away with the -1/x2 mechanism entirely, things would probably still be okay; there would just be a lot more reissues. |
Cosmic_Ocean Send message Joined: 23 Dec 00 Posts: 3027 Credit: 13,516,867 RAC: 13 |
I was also thinking of proposing that a new user's quota be at 10, and go from there. That way they don't go and fill up their cache and then abandon ship, leaving all of those tasks to time out and be reissued. Linux laptop: record uptime: 1511d 20h 19m (ended due to the power brick giving-up) |
1mp0£173 Send message Joined: 3 Apr 99 Posts: 8423 Credit: 356,897 RAC: 0 |
"I was also thinking of proposing that a new user's quota be at 10, and go from there. That way they don't go and fill up their cache and then abandon ship, leaving all of those tasks to time out and be reissued." I'd think even lower would be fine, maybe as low as two. Each validated WU would double-up, and they'd get to full quota fairly quickly. |
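A quick sketch of how fast a low starting quota would climb, assuming the doubling rule discussed in this thread and a full quota of 100. Both figures come from the posts; the function name is hypothetical and this is not the project's real scheduler logic.

```python
def validated_results_to_full(start: int, full: int = 100) -> int:
    """Count how many validated results double a starting quota up to full."""
    if start < 1:
        raise ValueError("starting quota must be at least 1")
    quota, count = start, 0
    while quota < full:
        quota = min(full, quota * 2)  # each validated WU doubles the quota
        count += 1
    return count


for start in (2, 10):
    print(start, validated_results_to_full(start))
```

With these assumptions a new host starting at 2 reaches full quota after 6 validated results, and one starting at 10 after 4, so the lower starting point costs only two extra results.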
[AF>france>pas-de-calais]symaski62 Send message Joined: 12 Aug 05 Posts: 258 Credit: 100,548 RAC: 0 |
:) PAUSE !
18/06/2009 00:02:40 Internet access OK - project servers may be temporarily down.
---------------------------------
18/06/2009 00:00:36 Resuming network activity
18/06/2009 00:00:36 SETI@home Started upload of 01mr09ad.27455.4980.11.8.163_1_0
18/06/2009 00:00:36 SETI@home Started upload of 24fe09ac.31767.5389.3.8.130_0_0
18/06/2009 00:00:36 SETI@home Sending scheduler request: To fetch work.
18/06/2009 00:00:36 SETI@home Requesting new tasks
18/06/2009 00:00:41 SETI@home Scheduler request completed: got 0 new tasks
18/06/2009 00:00:41 SETI@home Message from server: (Project has no jobs available)
18/06/2009 00:00:44 SETI@home [error] Error reported by file upload server: can't open file
18/06/2009 00:00:44 SETI@home Temporarily failed upload of 24fe09ac.31767.5389.3.8.130_0_0: transient upload error
18/06/2009 00:00:44 SETI@home Backing off 1 min 0 sec on upload of 24fe09ac.31767.5389.3.8.130_0_0
18/06/2009 00:00:47 SETI@home [error] Error reported by file upload server: can't open file
18/06/2009 00:00:47 SETI@home Temporarily failed upload of 01mr09ad.27455.4980.11.8.163_1_0: transient upload error
18/06/2009 00:00:47 SETI@home Backing off 1 min 0 sec on upload of 01mr09ad.27455.4980.11.8.163_1_0
18/06/2009 00:01:44 SETI@home Started upload of 24fe09ac.31767.5389.3.8.130_0_0
18/06/2009 00:01:47 SETI@home Started upload of 01mr09ad.27455.4980.11.8.163_1_0
18/06/2009 00:01:56 SETI@home Sending scheduler request: To fetch work.
18/06/2009 00:01:56 SETI@home Requesting new tasks
18/06/2009 00:02:01 SETI@home Scheduler request completed: got 1 new tasks
18/06/2009 00:02:03 SETI@home Started download of 12mr09ac.29680.16841.4.8.223
18/06/2009 00:02:09 Project communication failed: attempting access to reference site
18/06/2009 00:02:09 SETI@home Temporarily failed upload of 01mr09ad.27455.4980.11.8.163_1_0: connect() failed
18/06/2009 00:02:09 SETI@home Backing off 1 min 12 sec on upload of 01mr09ad.27455.4980.11.8.163_1_0
18/06/2009 00:02:09 SETI@home Finished download of 12mr09ac.29680.16841.4.8.223
18/06/2009 00:02:11 BOINC can't access Internet - check network connection or proxy configuration.
18/06/2009 00:02:38 Project communication failed: attempting access to reference site
18/06/2009 00:02:38 SETI@home Temporarily failed upload of 24fe09ac.31767.5389.3.8.130_0_0: connect() failed
18/06/2009 00:02:38 SETI@home Backing off 1 min 0 sec on upload of 24fe09ac.31767.5389.3.8.130_0_0
18/06/2009 00:02:40 Internet access OK - project servers may be temporarily down.
18/06/2009 00:02:56 Suspending network activity - user request
SETI@Home Informational message -9 result_overflow
With a general handicap of 80%, he makes a great effort for the community and to express himself; thank you for being understanding. |
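A log like the one above is easier to read once the failed uploads are tallied per file. The following is a minimal sketch that assumes the message wording seen in this post ("Temporarily failed upload of NAME: REASON"); the regular expression, the function name, and the sample list are illustrative and not part of BOINC itself.

```python
import re
from collections import Counter

# Pattern based on the wording of the log lines quoted in the post (an assumption
# about the format, not an official BOINC log specification).
FAILED_UPLOAD = re.compile(
    r"Temporarily failed upload of (?P<name>\S+): (?P<reason>.+)$"
)


def count_failed_uploads(lines):
    """Return a Counter mapping each result file name to its failed uploads."""
    counts = Counter()
    for line in lines:
        match = FAILED_UPLOAD.search(line)
        if match:
            counts[match.group("name")] += 1
    return counts


# A few lines copied from the log in the post above.
SAMPLE = [
    "18/06/2009 00:00:44 SETI@home Temporarily failed upload of 24fe09ac.31767.5389.3.8.130_0_0: transient upload error",
    "18/06/2009 00:00:47 SETI@home Temporarily failed upload of 01mr09ad.27455.4980.11.8.163_1_0: transient upload error",
    "18/06/2009 00:02:01 SETI@home Scheduler request completed: got 1 new tasks",
    "18/06/2009 00:02:09 SETI@home Temporarily failed upload of 01mr09ad.27455.4980.11.8.163_1_0: connect() failed",
    "18/06/2009 00:02:38 SETI@home Temporarily failed upload of 24fe09ac.31767.5389.3.8.130_0_0: connect() failed",
]

failures = count_failed_uploads(SAMPLE)
for name, count in sorted(failures.items()):
    print(name, count)
```

On these sample lines each of the two result files shows two failed attempts, first with a transient upload error and then with a failed connect().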
1mp0£173 Send message Joined: 3 Apr 99 Posts: 8423 Credit: 356,897 RAC: 0 |
"I was also thinking of proposing that a new user's quota be at 10, and go from there. That way they don't go and fill up their cache and then abandon ship, leaving all of those tasks to time out and be reissued." I've also thought that the initial DCF should be pretty high. That would keep inefficient machines from over-requesting work initially, and allow more work once things settle in. The obvious problem is the high projected time, but it should be possible to explain that (or show a "displayed duration" based on a "display-DCF"). |
Fred W Send message Joined: 13 Jun 99 Posts: 2524 Credit: 11,954,210 RAC: 0 |
":) PAUSE !" Looking at what is happening with uploads at the moment, suspending network for a few hours seems like the best option. Those of my uploads that are managing to make contact are getting to 100% but then failing to get the final "ack" from the database, so it looks like a database access issue again. F. |
Cosmic_Ocean Send message Joined: 23 Dec 00 Posts: 3027 Credit: 13,516,867 RAC: 13 |
I was just looking at the cricket graph and there is still about 30 Mbit left over (if you add inbound and outbound together, it shouldn't be more than 100 Mbit, in a theoretical sense), but none of my 40 pending uploads will go through. I don't think it's a lack of bandwidth; I think there are just too many requests happening for Apache to handle all of them. Linux laptop: record uptime: 1511d 20h 19m (ended due to the power brick giving-up) |
Larry256 Send message Joined: 11 Nov 05 Posts: 25 Credit: 5,715,079 RAC: 8 |
"I still think the math behind the quota needs to be revised. Sure, one bad task can still be -1, but a good task needs to be something like +1 or +2 instead of x2. Takes 100+ bad tasks to get down to a quota of 1, but it only takes 8 good ones to get back to 100. Sounds like it defeats the purpose of the quota altogether." Wouldn't it cause less? |
Larry256 Send message Joined: 11 Nov 05 Posts: 25 Credit: 5,715,079 RAC: 8 |
"I was just looking at the cricket graph and there is still about 30mbit leftover (if you add inbound and outbound together, it shouldn't be more than 100mbit..in a theoretical sense), but none of my 40 pending uploads will go through. I don't think it's a lack of bandwidth, I think there's just too many requests happening for apache to handle all of them." That will help them out: download, wait 1 hr, then abort. Repeat until the problem is fixed. LOL |