BOINC 4.40 thinks it's overcommitted after WU download

guidoprost
Volunteer tester

Joined: 3 Apr 99
Posts: 7
Credit: 19,732
RAC: 0
Germany
Message 110842 - Posted: 13 May 2005, 16:38:11 UTC

Hi,

There seems to be a problem with the way the client decides whether it is overcommitted or not.
I've seen this twice, once with 4.39 and now with 4.40.
It goes like this:

1) Finish WU
5/13/2005 5:24:18 PM|SETI@home|Computation for result 19dc04aa.17564.28466.523582.31_0 finished
5/13/2005 5:24:18 PM||schedule_cpus: must schedule

2) Start new one, download another
5/13/2005 5:24:18 PM|SETI@home|Starting result 28ja05ab.23571.8594.579808.105_2 using setiathome version 4.11
5/13/2005 5:24:19 PM|SETI@home|Requesting 8640.00 seconds of work
5/13/2005 5:24:19 PM|SETI@home|Sending scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi
5/13/2005 5:24:19 PM|SETI@home|Started upload of 19dc04aa.17564.28466.523582.31_0_0
5/13/2005 5:24:23 PM|SETI@home|Scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi succeeded
5/13/2005 5:24:24 PM|SETI@home|Finished upload of 19dc04aa.17564.28466.523582.31_0_0
5/13/2005 5:24:24 PM|SETI@home|Throughput 261973 bytes/sec
5/13/2005 5:24:24 PM|SETI@home|Started download of 30dc04ab.22830.3376.165912.12
5/13/2005 5:24:27 PM|SETI@home|Finished download of 30dc04ab.22830.3376.165912.12
5/13/2005 5:24:27 PM|SETI@home|Throughput 175501 bytes/sec
5/13/2005 5:24:27 PM||request_reschedule_cpus: files downloaded
5/13/2005 5:24:27 PM||schedule_cpus: must schedule

3) Suddenly overcommitted??
5/13/2005 5:24:37 PM||Computer is overcommitted
5/13/2005 5:24:37 PM||Nearly overcommitted.
5/13/2005 5:24:37 PM||New work fetch policy: no work fetch allowed.
5/13/2005 5:24:37 PM||New CPU scheduler policy: earliest deadline first.

4) Manual update, everything's fine again
5/13/2005 5:44:51 PM||request_reschedule_cpus: project op
5/13/2005 5:44:51 PM||schedule_cpus: must schedule
5/13/2005 5:44:51 PM||New work fetch policy: work fetch allowed.
5/13/2005 5:44:51 PM||New CPU scheduler policy: highest debt first.
5/13/2005 5:44:51 PM|SETI@home|Sending scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi
5/13/2005 5:44:52 PM|SETI@home|Scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi succeeded

The client is clearly not overcommitted: the cache is set to 0.3 days, it holds at most 3 WUs, and I usually return WUs after 2 days. Resource shares are S@H: 100, PP: 20. This is standard BOINC 4.40 with an optimized Seti 4.11 application.

My theory: Just seconds after starting a WU the estimated to-completion time could be WAY off. Several DAYS off. I don't know which part of the client is responsible for these estimations, but it should probably stay at the standard to-completion time for the first minute or so. But why the 10 second delay after the "schedule_cpus: must schedule" message?
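To make the theory concrete, here is a minimal sketch of how a to-completion estimate extrapolated from very early progress could come out days too high, and how holding the standard estimate for the first minute would avoid that. This is not taken from the BOINC source; the function names, the 60-second warm-up, the 3-hour standard estimate, and the progress figure are all assumptions chosen for illustration.

```python
def naive_remaining(elapsed_s, fraction_done):
    """Extrapolate remaining seconds purely from progress reported so far."""
    if fraction_done <= 0:
        return float("inf")
    return elapsed_s / fraction_done - elapsed_s


def guarded_remaining(elapsed_s, fraction_done, standard_estimate_s, warmup_s=60.0):
    """Hypothetical guard: trust the standard estimate during a warm-up window."""
    if elapsed_s < warmup_s or fraction_done <= 0:
        return max(standard_estimate_s - elapsed_s, 0.0)
    return naive_remaining(elapsed_s, fraction_done)


# Assumed situation: 10 seconds after start, the app has reported 0.001% progress.
elapsed = 10.0
fraction = 0.00001
standard = 3 * 3600.0  # assumed standard estimate of 3 hours

naive = naive_remaining(elapsed, fraction)
guarded = guarded_remaining(elapsed, fraction, standard)

print(f"naive:   {naive / 86400:.1f} days")
print(f"guarded: {guarded / 3600:.1f} hours")
```

With these made-up numbers the naive extrapolation lands above eleven days, which would exceed a deadline roughly ten days away, while the guarded version stays near the standard three hours.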

This is not a real bug because the next time the client schedules it realizes there's nothing wrong, but this might really confuse users. I especially like the "computer is overcommitted" - "Nearly overcommitted" message combination during the same second :-)

Speaking of confusing users: This version is a DEVELOPMENT version. NOT a stable recommended version.
ID: 110842
Keck_Komputers
Volunteer tester
Joined: 4 Jul 99
Posts: 1575
Credit: 4,152,111
RAC: 1
United States
Message 110910 - Posted: 13 May 2005, 22:15:17 UTC

I notice PP@H has a fairly small resource share. It may be right at the edge of being overcommitted because of that.
BOINC WIKI

BOINCing since 2002/12/8
ID: 110910
guidoprost
Volunteer tester

Joined: 3 Apr 99
Posts: 7
Credit: 19,732
RAC: 0
Germany
Message 110965 - Posted: 14 May 2005, 0:42:56 UTC - in response to Message 110910.  

<blockquote>I notice PP@H has fairly small resource share. It may be right at the edge of being overcommitted because of that.</blockquote>

That can't be it because at the time there were no PP@H WUs on my computer. There was the one Seti WU that had just started computing (more than 10 days left until the deadline) and another one that had just finished downloading.
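For reference, a deadline-based overcommit test of the kind I am assuming could be sketched as follows. Whether BOINC 4.40 actually works this way is exactly what is unclear in this thread; the function name, the single-CPU assumption, the 3-hour run times, and the second WU's 14-day deadline are hypothetical.

```python
def misses_deadline(results):
    """results: list of (remaining_cpu_seconds, seconds_until_deadline).

    Run the results back to back in earliest-deadline-first order on one CPU
    and report whether any of them would finish after its deadline.
    """
    busy = 0.0
    for remaining, until_deadline in sorted(results, key=lambda r: r[1]):
        busy += remaining
        if busy > until_deadline:
            return True
    return False


DAY = 86400.0
HOUR = 3600.0

# Two Seti WUs with sane estimates: nowhere near overcommitted.
sane = [(3 * HOUR, 10 * DAY), (3 * HOUR, 14 * DAY)]

# Same queue, but the WU that just started has an inflated estimate of ~11.6 days.
inflated = [(999990.0, 10 * DAY), (3 * HOUR, 14 * DAY)]

print(misses_deadline(sane))      # False
print(misses_deadline(inflated))  # True
```

Under this assumed test, two normal WUs cannot trip the check; only a badly inflated estimate for one of them can.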
ID: 110965
guidoprost
Volunteer tester

Joined: 3 Apr 99
Posts: 7
Credit: 19,732
RAC: 0
Germany
Message 110971 - Posted: 14 May 2005, 1:03:32 UTC

The exact same thing happened again just now:

5/14/2005 2:37:23 AM|SETI@home|Computation for result 28ja05ab.23571.9792.665884.188_3 finished
5/14/2005 2:37:23 AM||schedule_cpus: must schedule
5/14/2005 2:37:23 AM|SETI@home|Starting result 31dc04aa.14083.2961.934666.75_0 using setiathome version 4.11
5/14/2005 2:37:25 AM|SETI@home|Requesting 8640.00 seconds of work
5/14/2005 2:37:25 AM|SETI@home|Sending scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi
5/14/2005 2:37:25 AM|SETI@home|Started upload of 28ja05ab.23571.9792.665884.188_3_0
5/14/2005 2:37:26 AM|SETI@home|Scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi succeeded
5/14/2005 2:37:27 AM|SETI@home|Started download of 28ja05ab.13836.12370.511086.15
5/14/2005 2:37:30 AM|SETI@home|Finished download of 28ja05ab.13836.12370.511086.15
5/14/2005 2:37:30 AM|SETI@home|Throughput 172883 bytes/sec
5/14/2005 2:37:30 AM||request_reschedule_cpus: files downloaded
5/14/2005 2:37:30 AM||schedule_cpus: must schedule
5/14/2005 2:37:33 AM|SETI@home|Finished upload of 28ja05ab.23571.9792.665884.188_3_0
5/14/2005 2:37:33 AM|SETI@home|Throughput 153408 bytes/sec
5/14/2005 2:37:43 AM||Computer is overcommitted
5/14/2005 2:37:43 AM||Nearly overcommitted.
5/14/2005 2:37:43 AM||New work fetch policy: no work fetch allowed.
5/14/2005 2:37:43 AM||New CPU scheduler policy: earliest deadline first.
5/14/2005 2:51:27 AM||request_reschedule_cpus: project op
5/14/2005 2:51:27 AM||schedule_cpus: must schedule
5/14/2005 2:51:27 AM||New work fetch policy: work fetch allowed.
5/14/2005 2:51:27 AM||New CPU scheduler policy: highest debt first.
5/14/2005 2:51:27 AM|SETI@home|Sending scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi
5/14/2005 2:51:28 AM|SETI@home|Scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi succeeded

Same sequence as before: WU finished, new one started, another one downloaded, then a 10-second delay after the download and upload finished, and the client reports overcommitted.
The only logical explanation I can think of is still that the estimate for to-completion time is way too high right at the beginning of computation, hence the panic mode.
Still only S@H WUs on my computer.
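The 10-second gap can be read straight off the timestamps. Here is a small sketch that parses lines in the `date time|project|message` layout shown above and measures the delay before a given message; the helper names are my own, and the three log lines are copied from the excerpt above.

```python
from datetime import datetime

LOG = """\
5/14/2005 2:37:30 AM||schedule_cpus: must schedule
5/14/2005 2:37:33 AM|SETI@home|Finished upload of 28ja05ab.23571.9792.665884.188_3_0
5/14/2005 2:37:43 AM||Computer is overcommitted
"""


def parse_line(line):
    """Split one client log line into (timestamp, project, message)."""
    stamp, project, message = line.split("|", 2)
    return datetime.strptime(stamp, "%m/%d/%Y %I:%M:%S %p"), project, message


def delay_before(log_text, marker):
    """Seconds between the first line containing marker and the line before it."""
    entries = [parse_line(l) for l in log_text.splitlines() if l.strip()]
    for prev, cur in zip(entries, entries[1:]):
        if marker in cur[2]:
            return (cur[0] - prev[0]).total_seconds()
    return None


print(delay_before(LOG, "Computer is overcommitted"))  # 10.0
```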
ID: 110971
Astro
Volunteer tester
Joined: 16 Apr 02
Posts: 8026
Credit: 600,015
RAC: 0
Message 110972 - Posted: 14 May 2005, 1:05:46 UTC

guido, I'm under the impression (I have NO proof) that the "nearly overcommitted" and "overcommitted" messages do NOT apply to the deadlines, but rather to filling your queue to the "connect to" setting.
ID: 110972
Keck_Komputers
Volunteer tester
Joined: 4 Jul 99
Posts: 1575
Credit: 4,152,111
RAC: 1
United States
Message 111025 - Posted: 14 May 2005, 9:12:31 UTC

<blockquote>The only logical explanation I can think of is still that the estimate for to-completion time is way too high right at the beginning of computation, hence the panic mode.</blockquote>
Close. The estimate goes extremely low in the first half of the workunit. That would explain why it goes away quickly.

BOINC WIKI

BOINCing since 2002/12/8
ID: 111025



 