Panic Mode On (54) Server problems?
Author | Message |
---|---|
Grant (SSSF) Joined: 19 Aug 99 Posts: 13736 Credit: 208,696,464 RAC: 304 | I'm still scratching my head over the Number of results returned per hour. Even with a shorty storm from hell, i wouldn't have thought it'd be this high for this long. And the fact is most of my CPU work is VLARs, and only 1 in 5 requests for GPU work results in any being sent. Most of the time it's a "No tasks sent" message. Very occasionally i might get a "Project has no tasks available" message (feeder empty at that time). Grant Darwin NT |
W-K 666 Joined: 18 May 99 Posts: 19059 Credit: 40,757,560 RAC: 67 | Take a look at the curves for AP, they are going in the opposite direction. Which sort of confirms that AP d/loads have been postponed for now. So with no AP tasks to fill in their time, computers would need about 8 MB tasks on average. And the turnaround time is probably down to the estimates being so high that, instead of d/loading 50 tasks/request, they are getting just a few. [edit] beware of splinters LOL |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13736 Credit: 208,696,464 RAC: 304 | "Take a look at the curves for AP, they are going in the opposite direction. Which sort of confirms that AP d/loads have been postponed for now." Postponed, or just blocked by the off the chart runtime estimates? Even if people are stopping new AP work, i wouldn't have expected the effect to be as large as it has been, nor last as long as it has. Grant Darwin NT |
Gatekeeper Joined: 14 Jul 04 Posts: 887 Credit: 176,479,616 RAC: 0 | "Take a look at the curves for AP, they are going in the opposite direction. Which sort of confirms that AP d/loads have been postponed for now." I have to agree with WinterKnight. My 980 rig has been running through a cache of over 100 AP's for the past week. It's now down to its last three in queue. I think BOINC is somehow bypassing AP units in its processes. When BOINC asks for new work, the return message has been "no work available" for AP since last Tuesday. The same holds true for my twin 580 rig, which also can only score 4 or 5 GPU WU's every third or fourth attempt. |
W-K 666 Joined: 18 May 99 Posts: 19059 Credit: 40,757,560 RAC: 67 | Another pointer that indicates AP tasks are not being sent is that for the last 2 or 3 days the pipeline has been relatively free of blockages. I've seen relatively few tasks that have backed off and have not seen the dreaded Project backoff at all in that time. |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13736 Credit: 208,696,464 RAC: 304 | "My 980 rig has been running through a cache of over 100 AP's for the past week. It's now down to its last three in queue." What are its estimated completion times? Has anyone got an AP WU recently- what was its estimated completion time? "When BOINC asks for new work, the return message has been 'no work available' for AP since last Tuesday." Either the estimated completion times for new AP work are so huge you can only get a couple at a time, or something got borked with the Scheduler when they put in the patch. The AP Ready to Send buffer just continues to grow, as the Work in Progress gets less & less. "The same holds true for my twin 580 rig, which also can only score 4 or 5 GPU WU's every third or fourth attempt." I've got that problem too. There just isn't enough GPU work about- it generally takes at least 5 attempts before any gets downloaded, and then it gets done in no time at all. Every now & then you get a couple of requests in a row that result in work. The end result is that my GPU cache is pretty much stagnant. A couple of download bursts & it tops up slightly, then several non-results & it shrinks down again. Grant Darwin NT |
W-K 666 Joined: 18 May 99 Posts: 19059 Credit: 40,757,560 RAC: 67 | The last AP tasks I received were at "13 Sep 2011, 8:52:02 UTC", that's early Tuesday, about 8 hours before the maintenance period. |
Dave Stegner Joined: 20 Oct 04 Posts: 540 Credit: 65,583,328 RAC: 27 | I have 12 AP only machines. None has received a new task since the Tuesday outage. Dave |
Dad Joined: 21 May 99 Posts: 44 Credit: 35,266,844 RAC: 10 | YAY, finally got some GPU WU's! |
Gatekeeper Joined: 14 Jul 04 Posts: 887 Credit: 176,479,616 RAC: 0 | "My 980 rig has been running through a cache of over 100 AP's for the past week. It's now down to its last three in queue." The last three, all sent on 9/13, are showing completion estimates of 16:27:31 before they start. Since I'm running 12 at once, actual times tend to be closer to 17:30/18:00. This machine hasn't had any problems getting GPU work, though. Right now, my cache is 2700 GPU units, 2100 of which are VHAR. That's up from 2000 total last night at this time, and includes all my processing (3 295's 24/7). The twin 580 rig is still FUBAR workwise, and is still lurching into EDF mode every time an AP unit finishes. With no GPU work of consequence to balance things out I don't see a solution until DA fixes things. FWIW, the difference in performance of workfetch on these two rigs is, IMO, the fact that I never took the flops entries out of the 980 rig, but did take them out when I upgraded to the twin 580's on the other box. |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13736 Credit: 208,696,464 RAC: 304 | "My 980 rig has been running through a cache of over 100 AP's for the past week. It's now down to its last three in queue." So you've got nothing new since the outage either. Grant Darwin NT |
kittyman Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004 | Don't know if it's intentional or not, but AP work in the field has been decreasing since the outage. See the Scarecrow graphs. And MB in the field has been steadily increasing, which is amazing given the shorty storm and the number of results being returned per hour. I have never seen such a high level of returned work sustained this long. "Freedom is just Chaos, with better lighting." Alan Dean Foster |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13736 Credit: 208,696,464 RAC: 304 | "And MB in the field has been steadily increasing, which is amazing given the shorty storm and the number of results being returned per hour." It's also amazing in that i'm unable to build up the caches of GPU work. Most requests for work result in none, then i get a couple of good requests that bump the cache (as piddling as it is) back to where it was. I just can't get it to grow. Grant Darwin NT |
kittyman Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004 | "And MB in the field has been steadily increasing, which is amazing given the shorty storm and the number of results being returned per hour." Same here...most rigs are just maintaining about where they are. The problem has been compounded by the Boinc code revision that has totally skewed DCFs and work requests because of bloated time to completion estimates. That might get fixed with a code update during the next outage. But, unless the work mix coming from Arecibo changes, this is not gonna get better anytime soon. And, in fact, may get worse if AP starts being sent out again, monopolizing the available bandwidth. "Freedom is just Chaos, with better lighting." Alan Dean Foster |
soft^spirit Joined: 18 May 99 Posts: 6497 Credit: 34,134,168 RAC: 0 | GPU Still sucking air. Get a handful, 0 available repeatedly, finish 'em up, a while later another handful. Filling not even an issue. Getting ANY is. Janice |
W-K 666 Joined: 18 May 99 Posts: 19059 Credit: 40,757,560 RAC: 67 | "GPU Still sucking air. Get a handful, 0 available repeatedly, finish 'em up, a while later another handful. Filling not even an issue. Getting ANY is." What's your DCF? |
soft^spirit Joined: 18 May 99 Posts: 6497 Credit: 34,134,168 RAC: 0 | It is on automatic, I have not messed with it. Machine is not having "too full" issues at all. Just getting measly amounts of work irregularly. I keep asking, server keeps saying 0. or 1. Occasional 30-40.. after it has gone completely dry again. Janice |
W-K 666 Joined: 18 May 99 Posts: 19059 Credit: 40,757,560 RAC: 67 | "It is on automatic, I have not messed with it. Machine is not having 'too full' issues at all. Just getting measly amounts of work irregularly. I keep asking, server keeps saying 0. or 1. Occasional 30-40.. after it has gone completely dry again." I wasn't suggesting you had fiddled with it, I was just asking what it is. It could be that stopping BOINC, editing DCF to a realistic value, then re-starting BOINC might fix it. I'm not having a problem d/loading, not as many as normal agreed, but getting more than you. But that is due to the, now, publicised problem I am having with the AP APR. Because as soon as an AP task completes it punches my DCF up to 1.5 min. |
IFRS Joined: 21 May 99 Posts: 1736 Credit: 259,180,282 RAC: 0 | It's not just the client problem. Even if you request work and it's available, it's not being assigned. It just keeps sending 0 or 1 even if you are drained. |
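
Several posts above blame inflated runtime estimates for the small batches of work being handed out. The sketch below illustrates that reasoning with a deliberately simplified model, not the actual BOINC scheduler code. It assumes a task's estimated runtime is its estimated operation count divided by the device speed, scaled by the DCF, and that the server keeps adding tasks until the requested number of seconds is covered. The function names, the sample numbers, and the 50-task cap (borrowed from the "50 tasks/request" figure mentioned in the thread) are all hypothetical.

```python
def estimated_runtime(fpops_est, flops, dcf=1.0):
    """Simplified runtime estimate in seconds: work size / speed, scaled by DCF."""
    return fpops_est / flops * dcf


def tasks_sent(request_seconds, per_task_estimate, available, max_per_request=50):
    """Count tasks a simplified scheduler hands out for one work request.

    Tasks are added until the requested seconds are covered, the supply
    runs out, or the per-request cap (an assumed value) is reached.
    """
    sent = 0
    remaining = request_seconds
    while remaining > 0 and sent < min(available, max_per_request):
        sent += 1
        remaining -= per_task_estimate
    return sent


if __name__ == "__main__":
    request = 6 * 3600  # host asks for six hours of work (hypothetical)
    for dcf in (1.0, 20.0):
        est = estimated_runtime(3.0e13, 1.0e11, dcf)
        print(f"DCF {dcf}: {est:.0f} s per task, "
              f"{tasks_sent(request, est, available=1000)} tasks sent")
```

Under these assumptions, the same six-hour request is filled by far fewer tasks once the DCF inflates each estimate, which would match the "just a few per request" pattern described above even when plenty of work is ready to send.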