Panic Mode On (38) Server problems
Author | Message |
---|---|
Joined: 6 Apr 07 Posts: 7105 Credit: 147,663,825 RAC: 5 |
What about your 'WU limit in progress'? One of my machines has more WUs than the other: about 400 WUs/GPU on the 940 BE and about 800 WUs/GPU on the E7600, and both get the famous message. Maybe something is wrong? When was the limit set? Maybe the E7600 requested faster and got more before the limit was set? |
Joined: 9 Feb 04 Posts: 1175 Credit: 4,754,897 RAC: 0 |
I keep getting either time-out messages after 8 minutes or this message: 18/09/2010 13:42:29 Project communication failed: attempting access to reference site / 18/09/2010 13:42:30 Internet access OK - project servers may be temporarily down. Is there something wrong with the scheduler, or is it just the amount of work going in and out? |
Joined: 5 Aug 02 Posts: 8240 Credit: 14,654,533 RAC: 20 |
"Keep on getting either time out messages after 8 minutes or this message" The Cricket graph says the servers have been maxed out since the project came back up on Thursday. This also explains why all the heavy-duty crunchers are getting ghosts. Donald Infernal Optimist / Submariner, retired |
Joined: 6 Apr 07 Posts: 7105 Credit: 147,663,825 RAC: 5 |
"What about your 'WU limit in progress'?" We now have the old limit: 'server run, August 13-16 2010'? 40/CPU, 320/GPU |
Joined: 5 Aug 02 Posts: 8240 Credit: 14,654,533 RAC: 20 |
"What about your 'WU limit in progress'?" How do you figure that? The last post from S@H staff about limits was 'server run, September 3-6 2010', and Jeff said they would tell us if the limits changed. Did I miss a post over in Technical News? Edit: Looking at your computers, I see that over the past two days, on both machines, you have had a number of Tasks reported as "error while computing", and many others that "timed out - no response". Ghosts? I suspect those errors and ghosts have reduced your allowable work-fetch to below the server limits. Donald Infernal Optimist / Submariner, retired |
kittyman Joined: 9 Jul 00 Posts: 51555 Credit: 1,018,363,574 RAC: 1,004 |
"What about your 'WU limit in progress'?" I am getting about a 150 WU limit on my quad-core CPUs... Even if not announced, it would not surprise me if some lower limits were imposed coming back up after almost a week of downtime. I do seem to notice that work is becoming more available and downloads are working much better over the last few hours. I think some of the demand is finally being satisfied. "Time is simply the mechanism that keeps everything from happening all at once." |
Joined: 5 Aug 02 Posts: 8240 Credit: 14,654,533 RAC: 20 |
Mark, I agree this would not be the first time they lowered limits and didn't tell us until later. But I looked at Sutaru's Application details, and he has Max Tasks per Day (MTD) of 110 (E7600) and 109 (940 BE) on his Anon. Plat. CPUs, and MTDs of 470 (E7600) and 259 (940 BE) on his Anon. Plat. GPUs. I think all his recent errors and ghosts timing out are what is limiting his work-fetch. Donald Infernal Optimist / Submariner, retired |
Josef W. Segur Joined: 30 Oct 99 Posts: 4504 Credit: 1,414,761 RAC: 0 |
For those daily quota figures, the Max is scaled by the number of CPUs, or by the number of GPUs and the project gpu_multiplier setting, believed to be 8. That is, those quotas are significantly above the 40/320 limit which Sutaru believes is in effect. I also think he understands the different messages each limit gives. Add to that Geek@Play's comment that his hosts were being limited below the 80/640 which I thought was indicated by his quad CPU w. dual GPU hosts having around 1600 in progress, and I think Sutaru is probably right. We know the project has used those settings in the past; applying them is probably just a matter of replacing one file with another. Joe |
Joined: 6 Apr 07 Posts: 7105 Credit: 147,663,825 RAC: 5 |
Did you see the -> ? <- at the end of the line? My machines had a few -12 errors (CUDA application bug) and a lot of ghosts, but that is another topic. Again, 'two different pairs of shoes'.. ;-) If I had reached the limit of WUs per CPU or GPU per day, I would get 'no new tasks' or something similar from the server. But my machines get 'This computer has reached a limit on tasks in progress'. My 940 BE machine started to download a few WUs, so I guess my last guess is right that we again have a low limit set. |
Highlander Joined: 5 Oct 99 Posts: 167 Credit: 37,987,668 RAC: 16 |
Yes, limits of 40/320 are in place, but I'm wondering more about this: SETI@home 19.09.2010 08:38:50 Scheduler request completed: got 58 new tasks. I thought there was also a limit of about 20 per request? It seems this doesn't apply any more. - Performance is not a simple linear function of the number of CPUs you throw at the problem. - |
Joined: 5 Aug 02 Posts: 8240 Credit: 14,654,533 RAC: 20 |
"Yes, limits of 40/320 are in place, but I'm wondering more about this:" I don't recall any such limit. The only limits I recall are the per-processor limits we have been talking about, and a maximum of 100 Tasks per request due to the capacity of the Download feeder process (100 Tasks per 6-second reload cycle, a mix of S@H Enhanced CPU, GPU, & Astropulse Tasks). Donald Infernal Optimist / Submariner, retired |
kittyman Joined: 9 Jul 00 Posts: 51555 Credit: 1,018,363,574 RAC: 1,004 |
"Yes, limits of 40/320 are in place, but I'm wondering more about this:" Oh, at one time there was for sure a per-scheduler-request limit on the number of tasks that would be issued. I think that it was twenty. "Time is simply the mechanism that keeps everything from happening all at once." |
Joined: 15 Dec 99 Posts: 707 Credit: 108,785,585 RAC: 0 |
Well, it seems so. Since yesterday I have had exactly 1280 WUs on both of my 4-GPU rigs. When I finish one, I get one new one. Not more. Helli A loooong time ago: First Credits after SETI@home Restart |
Joined: 5 Aug 02 Posts: 8240 Credit: 14,654,533 RAC: 20 |
"Yes, limits of 40/320 are in place, but I'm wondering more about this:" Okay, I haven't heard anyone talking about a limit like that. Not something I'm likely to get hit with, since my ancient G4s never have more than 5-6 tasks apiece anyway. (8{) I wonder, then, if maybe bringing back THAT limit would help prevent ghosts. A smaller message file is less likely to choke the server, or to get lost in the chaos due to a time-out. It would take more scheduler requests to get you mega-crunchers filled up again after an outage, but maybe you'd actually get all the Tasks assigned instead of so many turning into ghosts. Donald Infernal Optimist / Submariner, retired |
kittyman Joined: 9 Jul 00 Posts: 51555 Credit: 1,018,363,574 RAC: 1,004 |
It would be a good thing, I think, to raise the limits a little bit tomorrow instead of just taking them all off on Monday morning. The Cricket graphs are only just now showing a little sign of relief for the bandwidth, after it has been pounded steadily since coming back up on Thursday. Open the spigot a little more tomorrow and then crack it wide open on Monday morning when the boyz are in the lab to monitor things. I suspect, unless Jeff is monitoring my every post...LOL...that they will not change anything until Monday, however. "Time is simply the mechanism that keeps everything from happening all at once." |
kittyman Joined: 9 Jul 00 Posts: 51555 Credit: 1,018,363,574 RAC: 1,004 |
"Yes, limits of 40/320 are in place, but I'm wondering more about this:" Somehow...and believe me, I DON'T know the ins and outs of how BOINC works very well at the code level...I don't think that the amount of work issued at any one time has anything to do with it, but rather the inability of each individual WU to download properly. I am not sure that getting 100 WUs in one request or in 5 requests of 20 each makes any difference. I could be wrong...I think Joe might know better where the fault lies. "Time is simply the mechanism that keeps everything from happening all at once." |
Joined: 5 Aug 02 Posts: 8240 Credit: 14,654,533 RAC: 20 |
In an earlier discussion of the causes of ghosts, either Joe or Claggy was talking about corrupted sched_request_ack_xml(?) files. And then today we had the situation where a sched_request_xml file was so big, with 2700 Tasks reporting, that it would not go through to the server, and had to be emailed to Dr. A for manual upload. It just seems to me that the larger these files are, the more likely they are to get "stuck" in the pipe during max traffic, and that reducing their size might be a partial solution. Donald Infernal Optimist / Submariner, retired |
kittyman Joined: 9 Jul 00 Posts: 51555 Credit: 1,018,363,574 RAC: 1,004 |
"In an earlier discussion of the causes of ghosts, either Joe or Claggy was talking about corrupted sched_request_ack_xml(?) files. And then today we had the situation where a sched_request_xml file was so big, with 2700 Tasks reporting, that it would not go through to the server, and had to be emailed to Dr. A for manual upload." You could be right...and so far as I know, Vyper has still not been able to report his results successfully...pending whatever solution DA may or may not be able to come up with. Still awaiting news on that saga. "Time is simply the mechanism that keeps everything from happening all at once." |
Joined: 5 Aug 02 Posts: 8240 Credit: 14,654,533 RAC: 20 |
"In an earlier discussion of the causes of ghosts, either Joe or Claggy was talking about corrupted sched_request_ack_xml(?) files. And then today we had the situation where a sched_request_xml file was so big, with 2700 Tasks reporting, that it would not go through to the server, and had to be emailed to Dr. A for manual upload." Since edits don't carry over into reply quotes, let the record show that it was 8448 Tasks reporting, not 2700, and that the file size was 31 MB. That is unusually large, even after an extended outage, but still: too large a file going in either direction could get caught or be corrupted. Donald Infernal Optimist / Submariner, retired |
Claggy Joined: 5 Jul 99 Posts: 4654 Credit: 47,537,079 RAC: 4 |
"In an earlier discussion of the causes of ghosts, either Joe or Claggy was talking about corrupted sched_request_ack_xml(?) files. And then today we had the situation where a sched_request_xml file was so big, with 2700 Tasks reporting, that it would not go through to the server, and had to be emailed to Dr. A for manual upload." The ghosts I have received since the servers came up this week were from replies with 55, 29, 48 and 27 tasks respectively. I also think limiting the tasks in the sched_reply files might help, though I'm not sure it would be a full fix, as the server would then have to process more requests, and my last set of ghosts from last week was 14 Astropulse tasks. The problem here is that Astropulse tasks are getting sent out in cycles: you can't get any for hours, then you get 20 or more in one request, just after everyone else has got theirs too, at which point download speeds have already dropped. Claggy |
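
The figures quoted in the thread can be sanity-checked with a few lines of arithmetic. The sketch below is only an illustration, not how the scheduler actually computes anything. It assumes the "tasks in progress" limit is simply a per-CPU value times the CPU count plus a per-GPU value times the GPU count, using the 40/320 and 80/640 pairs mentioned above. The function names and dictionaries are hypothetical, and the request caps (100 and 20) are the numbers posters recalled, not confirmed settings.

```python
from math import ceil

# Per-processor "tasks in progress" limits as quoted in the thread (assumed values).
LOW_LIMITS = {"cpu": 40, "gpu": 320}
HIGH_LIMITS = {"cpu": 80, "gpu": 640}


def in_progress_limit(n_cpus, n_gpus, limits):
    """Hypothetical host-wide cap: per-CPU limit * CPUs + per-GPU limit * GPUs."""
    return n_cpus * limits["cpu"] + n_gpus * limits["gpu"]


def requests_to_fill(tasks_needed, per_request_cap):
    """Minimum number of scheduler requests needed to receive tasks_needed tasks."""
    return ceil(tasks_needed / per_request_cap)


if __name__ == "__main__":
    # GPU share of a 4-GPU rig under 40/320, matching the "exactly 1280 WUs" report.
    print(in_progress_limit(0, 4, LOW_LIMITS))   # 1280
    # Quad CPU with dual GPU under 80/640, matching "around 1600 in progress".
    print(in_progress_limit(4, 2, HIGH_LIMITS))  # 1600
    # Requests needed to fetch 1280 tasks at 100 per request versus 20 per request.
    print(requests_to_fill(1280, 100))           # 13
    print(requests_to_fill(1280, 20))            # 64
```

Under these assumptions, a 20-task cap would mean roughly five times as many scheduler requests to refill a large host, which is the trade-off between smaller replies and extra server load raised in the posts above.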
©2025 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.