Panic Mode On (38) Server problems

Message boards : Number crunching : Panic Mode On (38) Server problems
soft^spirit
Joined: 18 May 99
Posts: 6497
Credit: 34,134,168
RAC: 0
United States
Message 1033163 - Posted: 15 Sep 2010, 22:14:53 UTC

Lab to computer mind games?

9/15/2010 2:41:18 PM SETI@home Sending scheduler request: Requested by user.
9/15/2010 2:41:18 PM SETI@home Reporting 115 completed tasks, not requesting new tasks
9/15/2010 2:41:29 PM SETI@home Scheduler request completed

They convinced my computer I do not want more tasks.

"These are not the tasks you are looking for" ;)
Janice
ID: 1033163 · Report as offensive
Tim Norton
Volunteer tester
Joined: 2 Jun 99
Posts: 835
Credit: 33,540,164
RAC: 0
United Kingdom
Message 1033164 - Posted: 15 Sep 2010, 22:20:30 UTC - in response to Message 1033163.  

It appears that if you have outstanding uploads, BM (BOINC Manager) will not request new tasks, even if you have no work left!

5 PCs:

4 with outstanding uploads - will not download WUs - and have no work

1 PC with no uploads is requesting new tasks fine

Who wrote the logic for this?

Anybody know why the upload server is the only one not online?
Tim

ID: 1033164 · Report as offensive
soft^spirit
Joined: 18 May 99
Posts: 6497
Credit: 34,134,168
RAC: 0
United States
Message 1033165 - Posted: 15 Sep 2010, 22:29:05 UTC - in response to Message 1033164.  

It is temporarily up to try to do some database cleanup. The cooling for the server room died. I am guessing they have the doors open and fans on it for the moment. So it is up (kinda) but not fixed. When they go home they will need to shut it off and close the doors. At least we can talk for a bit. The splitters are off, so no NEW work will be generated. The disks were overflowing, so they are probably trying to fix that in the middle of the disaster.

But it would be nice to take some re-issues off of their hands while we are at it.
Janice
ID: 1033165 · Report as offensive
Sutaru Tsureku
Volunteer tester
Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 1033814 - Posted: 18 Sep 2010, 1:15:17 UTC
Last modified: 18 Sep 2010, 1:17:58 UTC

What about your 'WU limit in progress'?

I have more WUs on one machine than on the other.

On the 940 BE ~400 WUs/GPU, on the E7600 ~800 WUs/GPU, and both get the famous message.
Maybe there is something wrong?

When was the limit set?

Maybe the E7600 requested faster and got more before the limit was set?
ID: 1033814 · Report as offensive
[B^S] madmac
Volunteer tester
Joined: 9 Feb 04
Posts: 1175
Credit: 4,754,897
RAC: 0
United Kingdom
Message 1033988 - Posted: 18 Sep 2010, 12:45:14 UTC

I keep getting either time-out messages after 8 minutes or this message:
18/09/2010 13:42:29 Project communication failed: attempting access to reference site
18/09/2010 13:42:30 Internet access OK - project servers may be temporarily down.
Is there something wrong with the scheduler or is it just the amount of work going in and out?
ID: 1033988 · Report as offensive
Donald L. Johnson
Joined: 5 Aug 02
Posts: 8240
Credit: 14,654,533
RAC: 20
United States
Message 1034056 - Posted: 18 Sep 2010, 15:29:33 UTC - in response to Message 1033988.  

Cricket graph says the servers have been maxed out since the project came back up on Thursday. This also explains why all the heavy-duty crunchers are getting ghosts.


Donald
Infernal Optimist / Submariner, retired
ID: 1034056 · Report as offensive
Sutaru Tsureku
Volunteer tester
Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 1034242 - Posted: 18 Sep 2010, 23:36:47 UTC - in response to Message 1033814.  

Do we now have the old limit again, the one from 'server run, August 13-16 2010'?

 40/CPU
320/GPU


ID: 1034242 · Report as offensive
Donald L. Johnson
Joined: 5 Aug 02
Posts: 8240
Credit: 14,654,533
RAC: 20
United States
Message 1034282 - Posted: 19 Sep 2010, 1:07:04 UTC - in response to Message 1034242.  
Last modified: 19 Sep 2010, 1:21:04 UTC

How do you figure that? The last post from S@H staff about limits was 'server run, September 3-6 2010', and Jeff said they would tell us if the limits changed.

Did I miss a post over in Technical News?

Edit: Looking at your computers, I see that over the past two days, on both machines, you have had a number of Tasks reported as "error while computing", and many others that "timed out-no response" - Ghosts?

I suspect those errors and ghosts have reduced your allowable work-fetch to below the server limits.
Donald
Infernal Optimist / Submariner, retired
ID: 1034282 · Report as offensive
kittyman (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1034287 - Posted: 19 Sep 2010, 1:10:48 UTC - in response to Message 1034282.  

I am getting about a 150 WU limit on my quad core CPUs.....
Even if not announced, it would not surprise me that some lower limits were imposed coming back up after almost a week of downtime.

I do seem to notice that work is becoming more available and downloads working much better over the last few hours. I think some of the demand is finally being satisfied.
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 1034287 · Report as offensive
Donald L. Johnson
Joined: 5 Aug 02
Posts: 8240
Credit: 14,654,533
RAC: 20
United States
Message 1034298 - Posted: 19 Sep 2010, 1:44:18 UTC - in response to Message 1034287.  


Mark, I agree this would not be the first time they lowered limits and didn't tell us until later.

But I looked at Sutaru's Application details, and he has Max Tasks per Day of 110(E7600) and 109(940 BE) on his Anon. Plat. CPUs and MTDs of 470(E7600) and 259(940 BE) on his Anon Plat. GPUs. I think all his recent errors and ghosts timing out are what is limiting his work-fetch.

Donald
Infernal Optimist / Submariner, retired
ID: 1034298 · Report as offensive
Josef W. Segur
Volunteer developer
Volunteer tester
Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 1034317 - Posted: 19 Sep 2010, 2:41:09 UTC - in response to Message 1034298.  


For those daily quota figures, the Max is scaled by the number of CPUs, or by the number of GPUs and the project gpu_multiplier setting (believed to be 8). That is, those quotas are significantly above the 40/320 limit which Sutaru believes is in effect. I also think he understands the different messages each limit gives. Add to that Geek@Play's comment that his hosts were being limited below the 80/640 which I thought was indicated by his quad CPU w. dual GPU hosts having around 1600 in progress, and I think Sutaru is probably right. We know the project has used those settings in the past; applying them is probably just a matter of replacing one file with another.
Joe
ID: 1034317 · Report as offensive
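
The in-progress figures in this exchange are consistent with simple per-processor arithmetic. Here is a minimal Python sketch, assuming the cap is the per-CPU limit times the CPU count plus the per-GPU limit times the GPU count, and that the per-GPU limit is the per-CPU limit times the gpu_multiplier of 8 that Joe mentions. The function name and the formula are illustrative assumptions, not taken from the BOINC server code.

```python
def in_progress_cap(n_cpus, n_gpus, per_cpu, gpu_multiplier=8):
    """Hypothetical helper: total tasks a host may hold in progress.

    Assumes per_gpu = per_cpu * gpu_multiplier (8 is the value
    "believed" to be in use, per the thread) and that CPU and GPU
    allowances simply add up.
    """
    per_gpu = per_cpu * gpu_multiplier
    return n_cpus * per_cpu + n_gpus * per_gpu


# Quad-core CPU with two GPUs under the 80/640 limits.
print(in_progress_cap(4, 2, per_cpu=80))  # 1600

# The same host under the 40/320 limits.
print(in_progress_cap(4, 2, per_cpu=40))  # 800
```

Under 80/640 a quad-core, dual-GPU host comes out at 1600, the figure Joe cites; under 40/320 the same host would be held to 800.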
Sutaru Tsureku
Volunteer tester
Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 1034318 - Posted: 19 Sep 2010, 2:45:35 UTC - in response to Message 1034298.  
Last modified: 19 Sep 2010, 2:59:36 UTC

Did you see the -> ? <- at the end of the line?

My machines had a few -12 errors (CUDA application bug) and a lot of ghosts, but that is another topic. Again - 'two different pairs of shoes'.. ;-)

If I had reached the limit of WUs per CPU or GPU per day, I would get 'no new tasks' or something similar from the server.
But my machines get 'This computer has reached a limit on tasks in progress'.

My 940 BE machine started to DL a few WUs, so I guess my last guess is right: we again have a low limit set.
ID: 1034318 · Report as offensive
zoom3+1=4
Volunteer tester
Joined: 30 Nov 03
Posts: 65755
Credit: 55,293,173
RAC: 49
United States
Message 1034319 - Posted: 19 Sep 2010, 2:49:13 UTC

I get this here:

SETI@home 9/18/2010 7:44:13 PM Message from server: This computer has reached a limit on tasks in progress

I've had a couple of errors because of a water pump failure in a nearby swamp cooler. I replaced the pump and there have been no more errors; I think there were only 3 in total. Even though I do need more work to fill my cache up, I just get the above statement and sometimes 1 CUDA WU...
The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's
ID: 1034319 · Report as offensive
Highlander
Joined: 5 Oct 99
Posts: 167
Credit: 37,987,668
RAC: 16
Germany
Message 1034352 - Posted: 19 Sep 2010, 6:44:02 UTC

Yes, limits of 40/320 are in place, but I'm wondering more about this:

SETI@home 19.09.2010 08:38:50 Scheduler request completed: got 58 new tasks

I thought there was also a limit of about 20 per request? It seems this doesn't apply any more.

- Performance is not a simple linear function of the number of CPUs you throw at the problem. -
ID: 1034352 · Report as offensive
Donald L. Johnson
Joined: 5 Aug 02
Posts: 8240
Credit: 14,654,533
RAC: 20
United States
Message 1034367 - Posted: 19 Sep 2010, 7:42:01 UTC - in response to Message 1034352.  

I don't recall any such limit. The only limits I recall are the per processor limits we have been talking about, and a maximum limit of 100 Tasks per request due to the capacity of the Download feeder process (100 Tasks per 6-second reload cycle, a mix of S@H Enhanced CPU, GPU, & Astropulse Tasks).


Donald
Infernal Optimist / Submariner, retired
ID: 1034367 · Report as offensive
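
As a rough sense of scale, the feeder figures Donald quotes (100 tasks per 6-second reload cycle) put a ceiling on how fast work can be handed out. A small Python sketch, assuming every slot is refilled and taken on every cycle, which is an idealization; the constant and function names are hypothetical.

```python
FEEDER_SLOTS = 100    # tasks available per reload cycle (figure from the post)
RELOAD_SECONDS = 6    # length of one reload cycle (figure from the post)


def feeder_tasks_per_hour(slots=FEEDER_SLOTS, reload_seconds=RELOAD_SECONDS):
    """Upper bound on tasks issued per hour if every slot is used each cycle."""
    cycles_per_hour = 3600 // reload_seconds
    return slots * cycles_per_hour


print(feeder_tasks_per_hour())  # 60000
```

That is a best case of 60,000 tasks per hour across all hosts, before bandwidth limits or failed downloads are taken into account.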
kittyman (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1034369 - Posted: 19 Sep 2010, 7:46:10 UTC - in response to Message 1034367.  

Oh, at one time there was for sure a per-scheduler-request limit on the number of tasks that would be issued. I think that it was twenty.
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 1034369 · Report as offensive
Helli_retiered
Volunteer tester
Joined: 15 Dec 99
Posts: 707
Credit: 108,785,585
RAC: 0
Germany
Message 1034372 - Posted: 19 Sep 2010, 7:51:38 UTC - in response to Message 1034242.  


Well, it seems so. Since yesterday I have had exactly 1280 WUs on both of my 4-GPU rigs. When I finish one, I get one new one. No more.

Helli
A loooong time ago: First Credits after SETI@home Restart
ID: 1034372 · Report as offensive
Donald L. Johnson
Joined: 5 Aug 02
Posts: 8240
Credit: 14,654,533
RAC: 20
United States
Message 1034375 - Posted: 19 Sep 2010, 7:57:18 UTC - in response to Message 1034369.  
Last modified: 19 Sep 2010, 8:03:45 UTC

Okay, haven't heard anyone talking about a limit like that. Not something I'm likely to get hit with, since my ancient G4s never have more than 5-6 tasks apiece anyway. (8{)

I wonder, then, if maybe bringing back THAT limit would help prevent ghosts. Smaller message file, less likely to choke the server, or get lost in the chaos due to a time-out.

It would take more scheduler requests to get you mega-crunchers filled up again after an outage, but maybe you'd actually get all the Tasks assigned instead of so many turning into ghosts.
Donald
Infernal Optimist / Submariner, retired
ID: 1034375 · Report as offensive
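
The trade-off Donald describes can be put in numbers. A minimal Python sketch, assuming a host simply asks again until its cache is full and every request returns the full per-request cap (both simplifications; the function name is hypothetical). The 1280-task cache is the figure Helli reports for a 4-GPU rig.

```python
import math


def requests_to_fill(cache_size, per_request_cap):
    """Scheduler requests needed to fill a cache, assuming each one returns the cap."""
    return math.ceil(cache_size / per_request_cap)


# Compare the 100-per-request feeder limit with the older 20-per-request limit.
for cap in (100, 20):
    print(cap, requests_to_fill(1280, cap))
# 100 13
# 20 64
```

So a 20-per-request limit would mean roughly five times as many scheduler contacts to refill the same host, each carrying a smaller reply.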
kittyman (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1034376 - Posted: 19 Sep 2010, 7:58:33 UTC

It would be a good thing, I think, to raise the limits a little bit tomorrow instead of just taking them all off on Monday morning.
The Cricket graphs are just now showing a little sign of relief for the bandwidth, after pounding steadily since the project came back up on Thursday.

Open the spigot a little more tomorrow and then crack it wide open on Monday morning when the boyz are in the lab to monitor things.

I suspect, unless Jeff is monitoring my every post...LOL...that they will not change anything until Monday however.
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 1034376 · Report as offensive
kittyman (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1034379 - Posted: 19 Sep 2010, 8:02:47 UTC - in response to Message 1034375.  
Last modified: 19 Sep 2010, 8:03:08 UTC

Somehow....and believe me, I DON'T know the ins and outs of how BOINC works very well at the code level....I don't think that the amount of work issued at any one time has anything to do with it, but rather the inability of each individual WU to download properly. I am not sure that getting 100 WUs in one request or in 5 requests of 20 each makes any difference.

I could be wrong....I think Joe might know better where the fault lies.
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 1034379 · Report as offensive


 