The 400 and 50 WU limits are way too small

Message boards : Number crunching : The 400 and 50 WU limits are way too small

Cosmic_Ocean
Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 1198170 - Posted: 21 Feb 2012, 10:38:32 UTC

Actually, the AP creation rate is about what it has been for many months. Anywhere between 0.25-1.00 is "normal." It has spiked over 1.00 a few times, but not very often.
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving-up)
ID: 1198170 · Report as offensive
red-ray
Joined: 24 Jun 99
Posts: 308
Credit: 9,029,848
RAC: 0
United Kingdom
Message 1198192 - Posted: 21 Feb 2012, 11:57:19 UTC - in response to Message 1198170.  
Last modified: 21 Feb 2012, 12:51:52 UTC

How can a system reach the limit when it's just returned 3 tasks?

21/02/2012 11:47:52 | SETI@home | Reporting 3 completed tasks, requesting new tasks for CPU
21/02/2012 11:47:58 | SETI@home | Scheduler request completed: got 0 new tasks
21/02/2012 11:47:58 | SETI@home | No tasks sent
21/02/2012 11:47:58 | SETI@home | This computer has reached a limit on tasks in progress

I assume it's a race condition and the test is being done before the 3 just-returned tasks have been processed. The effect of this is to cause the CPU work fetch to back off and starve the system of WUs. This is my slow (18K per day) system, so it's not a big issue. It just amused me.
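A minimal sketch of the ordering red-ray suspects. This is my own hypothetical model, not actual BOINC scheduler code: if the in-progress limit is checked before the just-reported tasks are deducted from the host's count, a host sitting at the limit gets "No tasks sent" in the very RPC that frees up slots.

```python
TASK_LIMIT = 50  # the per-host CPU limit discussed in this thread

def tasks_granted(in_progress, just_reported, requested):
    """Suspected buggy order: limit checked before reports are credited.

    just_reported is deliberately ignored here, modelling the race."""
    if in_progress >= TASK_LIMIT:
        return 0  # -> "This computer has reached a limit on tasks in progress"
    return min(requested, TASK_LIMIT - in_progress)

def tasks_granted_fixed(in_progress, just_reported, requested):
    """Expected order: credit the reports first, then apply the limit."""
    in_progress -= just_reported
    if in_progress >= TASK_LIMIT:
        return 0
    return min(requested, TASK_LIMIT - in_progress)

# red-ray's case: host at the limit reports 3 tasks in the same RPC
print(tasks_granted(50, 3, 3))        # 0 -> "No tasks sent"
print(tasks_granted_fixed(50, 3, 3))  # 3
```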
ID: 1198192 · Report as offensive
LadyL
Volunteer tester
Joined: 14 Sep 11
Posts: 1679
Credit: 5,230,097
RAC: 0
Message 1198196 - Posted: 21 Feb 2012, 12:09:56 UTC - in response to Message 1198192.  

How can a system reach the limit when it's just returned 3 tasks?

21/02/2012 11:47:52 | SETI@home | Reporting 3 completed tasks, requesting new tasks for CPU
21/02/2012 11:47:58 | SETI@home | Scheduler request completed: got 0 new tasks
21/02/2012 11:47:58 | SETI@home | No tasks sent
21/02/2012 11:47:58 | SETI@home | This computer has reached a limit on tasks in progress



Good one :D

Maybe it reported GPU tasks - it was asking for CPU.
ID: 1198196 · Report as offensive
Belthazor
Volunteer tester
Joined: 6 Apr 00
Posts: 219
Credit: 10,373,795
RAC: 13
Russia
Message 1198198 - Posted: 21 Feb 2012, 12:11:56 UTC - in response to Message 1198192.  

How can a system reach the limit when it's just returned 3 tasks?


It's easy: the scheduler just recalculates the time needed to crunch the WUs already received, and for some reason that time has increased, for example if you normally run BOINC 24/7 but it has been shut down for a while.
ID: 1198198 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1198200 - Posted: 21 Feb 2012, 12:25:55 UTC - in response to Message 1198198.  

How can a system reach the limit when it's just returned 3 tasks?


It's easy: the scheduler just recalculates the time needed to crunch the WUs already received, and for some reason that time has increased, for example if you normally run BOINC 24/7 but it has been shut down for a while.

No, the limit is by quantity, not by time.
ID: 1198200 · Report as offensive
W-K 666 Project Donor
Volunteer tester

Joined: 18 May 99
Posts: 19064
Credit: 40,757,560
RAC: 67
United Kingdom
Message 1198216 - Posted: 21 Feb 2012, 13:07:56 UTC - in response to Message 1197937.  

"The 400 and 50 WU limits are way too small" for some computers, way too big for most. We need BOINC to provide a time-based limit option; restricting each host to about 2 days of work "in progress" would be appropriate for this project currently.
                                                                  Joe


Provided that the project servers - which is where such a limit would have to be applied - have an accurate estimate of the runtime of the tasks cached. At the moment, during work allocation the servers ignore DCF entirely when calculating their own idea of how long newly-allocated work will run, and the well-rehearsed APR capping problem means that hosts are running some extreme DCFs, far from the 1.0000 taken for granted by CreditNew. (I've got a 0.0691)

But DCF is controlling the host's requests for work. MB CPU tasks on my computers reset the DCF to 1.00 +/- 0.1, but the MB GPU or AP tasks drive it low.

If I load up the CPU with AP work then the GPU tasks drive the DCF down to below 0.2 regularly. Of course, when the AP tasks complete and an MB task completes, the computer enters panic mode.
ID: 1198216 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1198224 - Posted: 21 Feb 2012, 13:30:16 UTC - in response to Message 1198216.  

But DCF is controlling the host's requests for work.

It may be controlling the request, but it isn't controlling the reply any more.

With my 0.0691, it's a case of "request 15 hours work, allocated 1 hour" - even if there's an infinite amount of work ready to send in the feeder queue.
ID: 1198224 · Report as offensive
W-K 666 Project Donor
Volunteer tester

Joined: 18 May 99
Posts: 19064
Credit: 40,757,560
RAC: 67
United Kingdom
Message 1198226 - Posted: 21 Feb 2012, 13:39:37 UTC - in response to Message 1198224.  

But DCF is controlling the host's requests for work.

It may be controlling the request, but it isn't controlling the reply any more.

With my 0.0691, it's a case of "request 15 hours work, allocated 1 hour" - even if there's an infinite amount of work ready to send in the feeder queue.

Then how is it I have to drive the DCF down to get even 1 day of work for the GPU?

At one stage I had about 8 days of work for the cpu thanks to ~20 ap's and only about 50 tasks total for the gpu.
ID: 1198226 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1198229 - Posted: 21 Feb 2012, 13:50:17 UTC - in response to Message 1198226.  
Last modified: 21 Feb 2012, 13:51:31 UTC

But DCF is controlling the host's requests for work.

It may be controlling the request, but it isn't controlling the reply any more.

With my 0.0691, it's a case of "request 15 hours work, allocated 1 hour" - even if there's an infinite amount of work ready to send in the feeder queue.

Then how is it I have to drive the DCF down to get even 1 day of work for the GPU?

At one stage I had about 8 days of work for the cpu thanks to ~20 ap's and only about 50 tasks total for the gpu.

I can't answer that, but try enabling <sched_op_debug> logging so you get the 'estimated total task duration' after a work allocation. If DCF is low, the (client estimate of the) work allocation will be low in the same proportion, at best.
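Richard's suggestion refers to the standard BOINC client log flag. A minimal cc_config.xml to enable it (standard BOINC client configuration; goes in the BOINC data directory and takes effect after re-reading the config file or restarting the client):

```xml
<cc_config>
  <log_flags>
    <sched_op_debug>1</sched_op_debug>
  </log_flags>
</cc_config>
```

With this flag set, the client logs per-request details such as the estimated total duration of tasks received from the scheduler.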

Edit - and if DCF is high, you probably won't have a resource shortfall to trigger the work request in the first place.
ID: 1198229 · Report as offensive
W-K 666 Project Donor
Volunteer tester

Joined: 18 May 99
Posts: 19064
Credit: 40,757,560
RAC: 67
United Kingdom
Message 1198237 - Posted: 21 Feb 2012, 14:13:39 UTC - in response to Message 1198229.  

Edit - and if DCF is high, you probably won't have a resource shortfall to trigger the work request in the first place.


That's the problem: with only a single per-project DCF, the GPU tasks are estimated, when DCF = 1, at 4 or 5 times their actual runtime.
ID: 1198237 · Report as offensive
red-ray
Joined: 24 Jun 99
Posts: 308
Credit: 9,029,848
RAC: 0
United Kingdom
Message 1198244 - Posted: 21 Feb 2012, 14:53:56 UTC - in response to Message 1198237.  
Last modified: 21 Feb 2012, 14:54:49 UTC

I keep wondering why the average turnaround time can't be used to decide how many WUs you are allowed. Currently for my 980X it is 0.38 days and it has 731 pending, so it would need to be allowed about 6000 rather than 2200 to last 3 days. If each host had a quota that got adjusted to meet a target turnaround time, all the hosts would get a fair share of WUs and overall there would be far fewer results in progress. Ideally you would have separate GPU and CPU quotas and turnaround times.
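red-ray's figures can be checked directly: a host turning tasks round in 0.38 days with 731 in progress is completing roughly 731 / 0.38 tasks per day, so a 3-day quota works out close to the ~6000 he suggests.

```python
turnaround_days = 0.38   # red-ray's average turnaround
in_progress = 731        # tasks currently pending on the host
target_days = 3          # desired buffer

# Steady state: completions per day = tasks in flight / turnaround time
throughput_per_day = in_progress / turnaround_days
quota = throughput_per_day * target_days
print(round(throughput_per_day), round(quota))  # 1924 5771, i.e. ~6000
```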
ID: 1198244 · Report as offensive
HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1198302 - Posted: 21 Feb 2012, 22:22:59 UTC - in response to Message 1198244.  

I keep wondering why the average turnaround time can't be used to decide how many WUs you are allowed. Currently for my 980X it is 0.38 days and it has 731 pending, so it would need to be allowed about 6000 rather than 2200 to last 3 days. If each host had a quota that got adjusted to meet a target turnaround time, all the hosts would get a fair share of WUs and overall there would be far fewer results in progress. Ideally you would have separate GPU and CPU quotas and turnaround times.

It might be possible, or they could already be doing it, to include that in the DCF calculations. Not that DCF isn't already messed up enough as it is.
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the [url=http://tinyurl.com/8y46zvu]BP6/VP6 User Group[/url]
ID: 1198302 · Report as offensive
W-K 666 Project Donor
Volunteer tester

Joined: 18 May 99
Posts: 19064
Credit: 40,757,560
RAC: 67
United Kingdom
Message 1198414 - Posted: 22 Feb 2012, 5:54:08 UTC - in response to Message 1198302.  
Last modified: 22 Feb 2012, 5:58:10 UTC

I keep wondering why the average turnaround time can't be used to decide how many WUs you are allowed. Currently for my 980X it is 0.38 days and it has 731 pending, so it would need to be allowed about 6000 rather than 2200 to last 3 days. If each host had a quota that got adjusted to meet a target turnaround time, all the hosts would get a fair share of WUs and overall there would be far fewer results in progress. Ideally you would have separate GPU and CPU quotas and turnaround times.

It might be possible, or they could already be doing it, to include that in the DCF calculations. Not that DCF isn't already messed up enough as it is.

Obviously it doesn't work for new or re-attached computers, but why not work out downloads based on RAC?
By definition that says everything about what the computer can crunch in an average day.

[edit] Silly me, I just remembered why it cannot happen: although Eric K did a pretty good job at it, the BOINC people cannot work out what the credits/time are.
ID: 1198414 · Report as offensive
tbret
Volunteer tester
Joined: 28 May 99
Posts: 3380
Credit: 296,162,071
RAC: 40
United States
Message 1198417 - Posted: 22 Feb 2012, 5:59:28 UTC - in response to Message 1198137.  
Last modified: 22 Feb 2012, 5:59:45 UTC


Yeah, I wish we could get things sorted and get back to BOINC calling the shots.
So depressing when I can't have the rigs weather a day or two of outage.


Just get them to send you a thumb-drive full of WUs like I did.
ID: 1198417 · Report as offensive
red-ray
Joined: 24 Jun 99
Posts: 308
Credit: 9,029,848
RAC: 0
United Kingdom
Message 1198448 - Posted: 22 Feb 2012, 9:35:46 UTC - in response to Message 1198414.  

Obviously it doesn't work for new or re-attached computers, but why not work out downloads based on RAC?
By definition that says everything about what the computer can crunch in an average day.


There would be a minimum quota to deal with new and returning systems.

No, using the RAC is not ideal. You could have a high RAC and be turning WUs round in 10 days. The regime should favour systems that return WUs before the turnaround target time, to try and keep the number of WUs in progress to a minimum.
ID: 1198448 · Report as offensive
W-K 666 Project Donor
Volunteer tester

Joined: 18 May 99
Posts: 19064
Credit: 40,757,560
RAC: 67
United Kingdom
Message 1198455 - Posted: 22 Feb 2012, 10:11:00 UTC - in response to Message 1198448.  

Obviously it doesn't work for new or re-attached computers, but why not work out downloads based on RAC?
By definition that says everything about what the computer can crunch in an average day.


There would be a minimum quota to deal with new and returning systems.

No, using the RAC is not ideal. You could have a high RAC and be turning WUs round in 10 days. The regime should favour systems that return WUs before the turnaround target time, to try and keep the number of WUs in progress to a minimum.

But they could still use RAC * "turnaround target", where RAC would be translated into flops/time in the formula to calculate the maximum work downloaded.
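A sketch of that RAC-to-flops conversion. It assumes the usual cobblestone definition (a host sustaining 1 GFLOPS earns about 200 credits per day); both the 200 constant and the example numbers are assumptions for illustration, not from the thread.

```python
CREDITS_PER_GFLOPS_DAY = 200.0  # canonical cobblestone rate (assumed here)
SECONDS_PER_DAY = 86400

def work_quota_gflop(rac, target_days):
    """Convert RAC to sustained GFLOPS, then to a quota of GFLOP of work
    the host may hold 'in progress' over the turnaround target."""
    sustained_gflops = rac / CREDITS_PER_GFLOPS_DAY
    return sustained_gflops * target_days * SECONDS_PER_DAY

# e.g. a hypothetical 20,000-RAC host with a 2-day turnaround target:
print(work_quota_gflop(20000, 2))  # 17280000.0 GFLOP of work in progress
```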
ID: 1198455 · Report as offensive
red-ray
Joined: 24 Jun 99
Posts: 308
Credit: 9,029,848
RAC: 0
United Kingdom
Message 1200511 - Posted: 28 Feb 2012, 1:35:40 UTC

I just got:

28/02/2012 01:27:34 | SETI@home | Reporting 2 completed tasks, requesting new tasks for NVIDIA GPU
28/02/2012 01:27:46 | SETI@home | Scheduler request completed: got 0 new tasks
28/02/2012 01:27:46 | SETI@home | No tasks sent
28/02/2012 01:27:46 | SETI@home | No tasks are available for Astropulse v5
28/02/2012 01:27:46 | SETI@home | No tasks are available for Astropulse v505
28/02/2012 01:27:46 | SETI@home | This computer has reached a limit on tasks in progress

But I only have 1,437 WUs. Given I have 4 GPUs I would be allowed 1,600 for the GPUs alone. Is there yet another limit?
ID: 1200511 · Report as offensive
HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1200512 - Posted: 28 Feb 2012, 1:41:53 UTC - in response to Message 1200511.  

I just got:

28/02/2012 01:27:34 | SETI@home | Reporting 2 completed tasks, requesting new tasks for NVIDIA GPU
28/02/2012 01:27:46 | SETI@home | Scheduler request completed: got 0 new tasks
28/02/2012 01:27:46 | SETI@home | No tasks sent
28/02/2012 01:27:46 | SETI@home | No tasks are available for Astropulse v5
28/02/2012 01:27:46 | SETI@home | No tasks are available for Astropulse v505
28/02/2012 01:27:46 | SETI@home | This computer has reached a limit on tasks in progress

But I only have 1,437 WUs. Given I have 4 GPUs I would be allowed 1,600 for the GPUs alone. Is there yet another limit?

Probably ghost tasks, but that can't be checked right now with the results pages disabled.
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the [url=http://tinyurl.com/8y46zvu]BP6/VP6 User Group[/url]
ID: 1200512 · Report as offensive
red-ray
Joined: 24 Jun 99
Posts: 308
Credit: 9,029,848
RAC: 0
United Kingdom
Message 1200561 - Posted: 28 Feb 2012, 8:42:12 UTC - in response to Message 1200512.  
Last modified: 28 Feb 2012, 8:42:34 UTC

OK. Is there a way to expunge any ghost tasks that do exist, please?
ID: 1200561 · Report as offensive
Lionel

Joined: 25 Mar 00
Posts: 680
Credit: 563,640,304
RAC: 597
Australia
Message 1200566 - Posted: 28 Feb 2012, 9:10:17 UTC - in response to Message 1198161.  

I hope Eric can get that assimilator sorted out for when the new download server arrives - after all, we'll want massive download congestion so we can see what it can do under real load (and give them a chance to fine tune the software).

Well at least the number to be assimilated has dropped but I wonder why the AP creation rate is so low.

Cheers.


I think you should do the honourable thing ... and move a bit further away ... :)


ID: 1200566 · Report as offensive



 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.