Panic Mode On (54) Server problems?

Grant (SSSF) Volunteer tester Send message Joined: 19 Aug 99 Posts: 13736 Credit: 208,696,464 RAC: 304	Message 1153409 - Posted: 18 Sep 2011, 0:59:06 UTC - in response to Message 1152990. I'm still scratching my head over the Number of results returned per hour. Even with a shorty storm from hell, i wouldn't have thought it'd be this high for this long. And the fact is most of my CPU work is VLARs, and only 1 in 5 requests for GPU work results in any being sent. Most of the time it's a "No tasks sent" message. Very occasionally i might get a "Project has no tasks available" message (feeder empty at that time). Grant Darwin NT ID: 1153409 ·

W-K 666 Volunteer tester Send message Joined: 18 May 99 Posts: 19059 Credit: 40,757,560 RAC: 67	Message 1153420 - Posted: 18 Sep 2011, 1:34:21 UTC - in response to Message 1153409. Last modified: 18 Sep 2011, 1:35:00 UTC I'm still scratching my head over the Number of results returned per hour. Even with a shorty storm from hell, i wouldn't have thought it'd be this high for this long. And the fact is most of my CPU work is VLARs, and only 1 in 5 requests for GPU work results in any being sent. Most of the time it's a "No tasks sent" message. Very occasionally i might get a "Project has no tasks available" message (feeder empty at that time). Take a look at the curves for AP, they are going in the opposite direction. Which sort of confirms that AP d/loads have been postponed for now. So with no AP tasks to fill in their time, computers would need about 8 MB tasks on average. And the turnaround time is probably because the estimates are so high that, instead of d/loading 50 tasks/request, they are getting just a few. [edit] beware of splinters LOL ID: 1153420 ·
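
The point above about high estimates shrinking each download can be illustrated with a rough sketch. It assumes a simplified model in which a work request asks for a fixed number of seconds of work and the server fills it with whole tasks based on the estimated runtime per task. The function name `tasks_per_request` and all the numbers are hypothetical, chosen only to match the "50 tasks/request" figure in the post; this is not how the BOINC scheduler is actually implemented.

```python
def tasks_per_request(requested_seconds, estimated_seconds_per_task):
    """Whole tasks that fit into the requested amount of work (simplified model)."""
    if estimated_seconds_per_task <= 0:
        raise ValueError("estimate must be positive")
    return int(requested_seconds // estimated_seconds_per_task)


# Hypothetical figures: a request sized for 50 tasks of 20 minutes each.
per_task = 20 * 60
requested = 50 * per_task

normal = tasks_per_request(requested, per_task)
# Same request, but the runtime estimate is bloated tenfold (assumed factor).
inflated = tasks_per_request(requested, per_task * 10)

print(normal, inflated)
```

Under these assumed numbers the same request that would normally bring in 50 tasks brings in only 5 once the estimate is ten times too high.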

Grant (SSSF) Volunteer tester Send message Joined: 19 Aug 99 Posts: 13736 Credit: 208,696,464 RAC: 304	Message 1153424 - Posted: 18 Sep 2011, 1:49:42 UTC - in response to Message 1153420. Take a look at the curves for AP, they are going in the opposite direction. Which sort of confirms that AP d/loads have been postponed for now. Postponed, or just blocked by the off the chart runtime estimates? Even if people are stopping new AP work, i wouldn't have expected the effect to be as large as it has been, nor last as long as it has. Grant Darwin NT ID: 1153424 ·

Gatekeeper Send message Joined: 14 Jul 04 Posts: 887 Credit: 176,479,616 RAC: 0	Message 1153445 - Posted: 18 Sep 2011, 3:55:33 UTC - in response to Message 1153424. Last modified: 18 Sep 2011, 3:58:02 UTC Take a look at the curves for AP, they are going in the opposite direction. Which sort of confirms that AP d/loads have been postponed for now. Postponed, or just blocked by the off the chart runtime estimates? Even if people are stopping new AP work, i wouldn't have expected the effect to be as large as it has been, nor last as long as it has. I have to agree with WinterKnight. My 980 rig has been running through a cache of over 100 AP's for the past week. It's now down to its last three in queue. I think BOINC is somehow bypassing AP units in its processes. When BOINC asks for new work, the return message has been "no work available" for AP since last Tuesday. The same holds true for my twin 580 rig, which also can only score 4 or 5 GPU WU's every third or fourth attempt. ID: 1153445 ·

W-K 666 Volunteer tester Send message Joined: 18 May 99 Posts: 19059 Credit: 40,757,560 RAC: 67	Message 1153448 - Posted: 18 Sep 2011, 4:47:52 UTC Another pointer that indicates AP tasks are not being sent is that for the last 2 or 3 days the pipeline has been relatively free of blockages. I've seen relatively few tasks that have backed off and have not seen the dreaded Project backoff at all in that time. ID: 1153448 ·

Grant (SSSF) Volunteer tester Send message Joined: 19 Aug 99 Posts: 13736 Credit: 208,696,464 RAC: 304	Message 1153450 - Posted: 18 Sep 2011, 4:56:20 UTC - in response to Message 1153445. Last modified: 18 Sep 2011, 4:56:49 UTC My 980 rig has been running through a cache of over 100 AP's for the past week. It's now down to its last three in queue. What are its estimated completion times? Has anyone got an AP WU recently- what was its estimated completion time? When BOINC asks for new work, the return message has been "no work available" for AP since last Tuesday. Either the estimated completion times for new AP work are so huge you can only get a couple at a time, or something got borked with the Scheduler when they put in the patch. The AP Ready to Send buffer just continues to grow, as the Work in Progress gets less & less. The same holds true for my twin 580 rig, which also can only score 4 or 5 GPU WU's every third or fourth attempt. I've got that problem too. There just isn't enough GPU work about- it generally takes at least 5 attempts before any gets downloaded, and then it gets done in no time at all. Every now & then you get a couple of requests in a row that result in work. The end result is that my GPU cache is pretty much stagnant. A couple of download bursts & it tops up slightly, then several non-results & it shrinks down again. Grant Darwin NT ID: 1153450 ·

W-K 666 Volunteer tester Send message Joined: 18 May 99 Posts: 19059 Credit: 40,757,560 RAC: 67	Message 1153454 - Posted: 18 Sep 2011, 5:22:41 UTC The last AP tasks I received were at "13 Sep 2011 | 8:52:02 UTC"; that's early Tuesday, about 8 hours before the maintenance period. ID: 1153454 ·

Dave Stegner Volunteer tester Send message Joined: 20 Oct 04 Posts: 540 Credit: 65,583,328 RAC: 27	Message 1153457 - Posted: 18 Sep 2011, 6:08:03 UTC I have 12 AP only machines. None has received a new task since the Tuesday outage. Dave ID: 1153457 ·

Dad Volunteer tester Send message Joined: 21 May 99 Posts: 44 Credit: 35,266,844 RAC: 10	Message 1153461 - Posted: 18 Sep 2011, 6:59:22 UTC YAY, finally got some GPU WU's! ID: 1153461 ·

Gatekeeper Send message Joined: 14 Jul 04 Posts: 887 Credit: 176,479,616 RAC: 0	Message 1153462 - Posted: 18 Sep 2011, 7:04:43 UTC - in response to Message 1153450. My 980 rig has been running through a cache of over 100 AP's for the past week. It's now down to its last three in queue. What are its estimated completion times? Has anyone got an AP WU recently- what was its estimated completion time? The last three, all sent on 9/13, are showing completion estimates of 16:27:31 before they start. Since I'm running 12 at once, actual times tend to be closer to 17:30/18:00. This machine hasn't had any problems getting GPU work, though. Right now, my cache is 2700 GPU units, 2100 of which are VHAR. That's up from 2000 total last night at this time, and includes all my processing (3 295's 24/7). The twin 580 rig is still FUBAR workwise, and is still lurching into EDF mode every time an AP unit finishes. With no GPU work of consequence to balance things out, I don't see a solution until DA fixes things. FWIW, the difference in performance of workfetch on these two rigs is, IMO, the fact that I never took the flops entries out of the 980 rig, but did take them out when I upgraded to the twin 580's on the other box. ID: 1153462 ·

Grant (SSSF) Volunteer tester Send message Joined: 19 Aug 99 Posts: 13736 Credit: 208,696,464 RAC: 304	Message 1153467 - Posted: 18 Sep 2011, 7:34:48 UTC - in response to Message 1153462. My 980 rig has been running through a cache of over 100 AP's for the past week. It's now down to its last three in queue. What are its estimated completion times? Has anyone got an AP WU recently- what was its estimated completion time? The last three, all sent on 9/13, are showing completion estimates of 16:27:31 before they start. Since I'm running 12 at once, actual times tend to be closer to 17:30/18:00. So you've got nothing new since the outage either. Grant Darwin NT ID: 1153467 ·

kittyman Volunteer tester Send message Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004	Message 1153469 - Posted: 18 Sep 2011, 7:44:37 UTC Last modified: 18 Sep 2011, 7:45:39 UTC Don't know if it's intentional or not, but AP work in the field has been decreasing since the outage. See the Scarecrow graphs. And MB in the field has been steadily increasing, which is amazing given the shorty storm and the number of results being returned per hour. I have never seen such a high level of returned work sustained this long. "Freedom is just Chaos, with better lighting." Alan Dean Foster ID: 1153469 ·

Grant (SSSF) Volunteer tester Send message Joined: 19 Aug 99 Posts: 13736 Credit: 208,696,464 RAC: 304	Message 1153472 - Posted: 18 Sep 2011, 7:50:44 UTC - in response to Message 1153469. And MB in the field has been steadily increasing, which is amazing given the shorty storm and the number of results being returned per hour. It's also amazing in that i'm unable to build up the caches of GPU work. Most requests for work result in none, then i get a couple of good requests that bump the cache (as piddling as it is) back to where it was. I just can't get it to grow. Grant Darwin NT ID: 1153472 ·

kittyman Volunteer tester Send message Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004	Message 1153475 - Posted: 18 Sep 2011, 7:58:39 UTC - in response to Message 1153472. And MB in the field has been steadily increasing, which is amazing given the shorty storm and the number of results being returned per hour. It's also amazing in that i'm unable to build up the caches of GPU work. Most requests for work result in none, then i get a couple of good requests that bump the cache (as piddling as it is) back to where it was. I just can't get it to grow. Same here...most rigs are just maintaining about where they are. The problem has been compounded by the Boinc code revision that has totally skewed DCFs and work requests because of bloated time to completion estimates. That might get fixed with a code update during the next outage. But, unless the work mix coming from Arecibo changes, this is not gonna get better anytime soon. And, in fact, may get worse if AP starts being sent out again, monopolizing the available bandwidth. "Freedom is just Chaos, with better lighting." Alan Dean Foster ID: 1153475 ·

soft^spirit Send message Joined: 18 May 99 Posts: 6497 Credit: 34,134,168 RAC: 0	Message 1153546 - Posted: 18 Sep 2011, 11:46:38 UTC GPU Still sucking air. Get a handful, 0 available repeatedly, finish 'em up, a while later another handful. Filling not even an issue. Getting ANY is. Janice ID: 1153546 ·

W-K 666 Volunteer tester Send message Joined: 18 May 99 Posts: 19059 Credit: 40,757,560 RAC: 67	Message 1153547 - Posted: 18 Sep 2011, 11:52:57 UTC - in response to Message 1153546. GPU Still sucking air. Get a handful, 0 available repeatedly, finish 'em up, a while later another handful. Filling not even an issue. Getting ANY is. What's your DCF? ID: 1153547 ·

soft^spirit Send message Joined: 18 May 99 Posts: 6497 Credit: 34,134,168 RAC: 0	Message 1153551 - Posted: 18 Sep 2011, 12:00:15 UTC - in response to Message 1153547. It is on automatic, I have not messed with it. Machine is not having "too full" issues at all. Just getting measly amounts of work irregularly. I keep asking, server keeps saying 0. or 1. Occasional 30-40.. after it has gone completely dry again. Janice ID: 1153551 ·

W-K 666 Volunteer tester Send message Joined: 18 May 99 Posts: 19059 Credit: 40,757,560 RAC: 67	Message 1153553 - Posted: 18 Sep 2011, 12:09:06 UTC - in response to Message 1153551. It is on automatic, I have not messed with it. Machine is not having "too full" issues at all. Just getting measly amounts of work irregularly. I keep asking, server keeps saying 0. or 1. Occasional 30-40.. after it has gone completely dry again. I wasn't suggesting you had fiddled with it, I was just asking what it is. It could be that stopping BOINC, editing DCF to a realistic value, then re-starting BOINC might fix it. I'm not having a problem d/loading, not as many as normal agreed, but getting more than you. But that is due to the, now, publicised problem I am having with the AP APR. Because as soon as an AP task completes it punches my DCF up to 1.5 min. ID: 1153553 ·
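
Why a single completed AP task can "punch up" DCF and inflate every remaining estimate can be sketched as follows. This is a simplified model, not the BOINC client's code: it assumes DCF jumps straight up to the actual/estimated ratio when a task runs longer than estimated, and drifts down only gradually otherwise. The function names and the 0.9/0.1 weights are illustrative assumptions.

```python
def update_dcf(dcf, actual_seconds, raw_estimate_seconds):
    """Return the new duration correction factor after one task completes.

    Assumed behaviour: raise DCF immediately if the task overran its raw
    estimate relative to the current DCF, otherwise ease it down slowly.
    """
    ratio = actual_seconds / raw_estimate_seconds
    if ratio > dcf:
        return ratio
    return 0.9 * dcf + 0.1 * ratio  # assumed smoothing weights


def displayed_estimate(raw_estimate_seconds, dcf):
    """Estimated runtime shown for a queued task: raw estimate scaled by DCF."""
    return raw_estimate_seconds * dcf


dcf = 1.0
# Hypothetical AP task that took 1.5x its raw estimate: DCF jumps at once.
dcf_after_ap = update_dcf(dcf, actual_seconds=1500, raw_estimate_seconds=1000)
# A well-estimated task afterwards only nudges DCF back down.
dcf_after_mb = update_dcf(dcf_after_ap, actual_seconds=1000, raw_estimate_seconds=1000)

print(dcf_after_ap, dcf_after_mb, displayed_estimate(1000, dcf_after_ap))
```

In this model one overrunning task is enough to scale up the estimate of everything in the cache, while many accurate tasks are needed to bring it back down, which is consistent with manually resetting DCF to a realistic value being suggested as a workaround.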

IFRS Volunteer tester Send message Joined: 21 May 99 Posts: 1736 Credit: 259,180,282 RAC: 0	Message 1153589 - Posted: 18 Sep 2011, 14:06:44 UTC It's not just the client problem. Even if you request work and it's available, it's not being assigned. It keeps just sending 0 or 1 even if you are drained. ID: 1153589 ·

Khangollo Send message Joined: 1 Aug 00 Posts: 245 Credit: 36,410,524 RAC: 0	Message 1153592 - Posted: 18 Sep 2011, 14:21:48 UTC Last modified: 18 Sep 2011, 14:22:58 UTC I wouldn't be surprised if a lot of feeder slots are permanently occupied by APs no one wants to have (because no doubt AP task duration is badly F-ed up, too). ID: 1153592 ·

