Advance (Jun 25 2009)

EPG

Joined: 3 Apr 99
Posts: 110
Credit: 10,416,543
RAC: 0
Hungary
Message 912425 - Posted: 28 Jun 2009, 19:16:57 UTC - in response to Message 912390.  


We're going to keep coming back to multiple projects, because while you may not care, a lot of people do, and it is one of the requirements that BOINC must meet.

I do care. If you check my sig, you can see that I run (or have run) multiple projects.



If we did strict EDF, most long work units (CPDN units) would be in big trouble before they even start processing. Any work downloaded from any other project would have a shorter deadline, and stop CPDN.
The normal scheduler mode is "round robin" -- work inside a project is done in "downloaded" order, and the projects get time based on resource share (and managed through short-term debt).

But I am suggesting EDF only for choosing the next task inside a project. All I am asking is why BOINC uses the "downloaded" order inside a project; I have no problem with the round-robin project selection.

Example: 1 CPU, 1 project with WUs that take 2 days each to compute, and a 5-day cache.
We get WU A with a deadline 10 days from now and WU B with a deadline 5 days from now. WU A arrived first, and now we have to choose the next WU. Current BOINC would choose A then B, and the computer could finish both normally.
But if there were a 3-day power outage after the first day, B would miss its deadline: A would finish on day 5 and B on day 7. Using EDF in this scenario would solve it, because B would finish on day 5 and A on day 7.
If there are multiple projects on this computer, then a "FIFO that switches to EDF" approach can maybe solve it with debt, but EDF would solve it without debt.
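
The example above can be checked with a small Python sketch. Everything in it is illustrative rather than BOINC code: the Task class and the finish_times and missed_deadlines helpers are hypothetical names, and it assumes one CPU, a single outage, tasks run back to back in the given order, and a deadline that counts as met if the task finishes exactly on it.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    runtime: float   # days of CPU time needed
    deadline: float  # days from now


def finish_times(tasks, outage_start, outage_length):
    """Run tasks back to back on one CPU, pausing for one outage."""
    outage_end = outage_start + outage_length
    clock = 0.0
    finished = {}
    for task in tasks:
        remaining = task.runtime
        # If the outage cuts into this task, do what fits before it,
        # then wait for the power to come back.
        if clock < outage_end and clock + remaining > outage_start:
            remaining -= max(0.0, outage_start - clock)
            clock = outage_end
        clock += remaining
        finished[task.name] = clock
    return finished


def missed_deadlines(tasks, outage_start, outage_length):
    """Names of tasks that would finish after their deadline."""
    finished = finish_times(tasks, outage_start, outage_length)
    return [t.name for t in tasks if finished[t.name] > t.deadline]


# WU A arrived first; both need 2 days of computing.
download_order = [Task("A", 2, 10), Task("B", 2, 5)]
edf_order = sorted(download_order, key=lambda t: t.deadline)

# 3-day outage starting after the first day.
print(missed_deadlines(download_order, 1, 3))  # ['B']
print(missed_deadlines(edf_order, 1, 3))       # []
```

Under EDF, B finishes exactly on its deadline in this model, so there is no slack left for a longer outage.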

ID: 912425
John McLeod VII
Volunteer developer
Volunteer tester
Joined: 15 Jul 99
Posts: 24806
Credit: 790,712
RAC: 0
United States
Message 912436 - Posted: 28 Jun 2009, 19:58:41 UTC - in response to Message 912425.  


We're going to keep coming back to multiple projects, because while you may not care, a lot of people do, and it is one of the requirements that BOINC must meet.

I do care. If you check my sig, you can see that I run (or have run) multiple projects.



If we did strict EDF, most long work units (CPDN units) would be in big trouble before they even start processing. Any work downloaded from any other project would have a shorter deadline, and stop CPDN.
The normal scheduler mode is "round robin" -- work inside a project is done in "downloaded" order, and the projects get time based on resource share (and managed through short-term debt).

But I am suggesting EDF only for choosing the next task inside a project. All I am asking is why BOINC uses the "downloaded" order inside a project; I have no problem with the round-robin project selection.

Example: 1 CPU, 1 project with WUs that take 2 days each to compute, and a 5-day cache.
We get WU A with a deadline 10 days from now and WU B with a deadline 5 days from now. WU A arrived first, and now we have to choose the next WU. Current BOINC would choose A then B, and the computer could finish both normally.
But if there were a 3-day power outage after the first day, B would miss its deadline: A would finish on day 5 and B on day 7. Using EDF in this scenario would solve it, because B would finish on day 5 and A on day 7.
If there are multiple projects on this computer, then a "FIFO that switches to EDF" approach can maybe solve it with debt, but EDF would solve it without debt.

EDF only within a project does not work well either, as some projects, including SETI, have tasks with varying computation requirements and deadlines. If you keep the queue topped off, and the queue is longer than the time for a single one of the long-duration tasks, and you are unlucky enough to get a bunch of the short-duration, short-deadline tasks, then it is entirely possible that computation will not start on the long-duration task until after it is too late for it to complete. Using FIFO when possible tends to clear out the long-duration tasks when EDF is not absolutely required in order to meet deadlines.

There is a case for GPU tasks to run to completion, if possible, before a task switch as the cost of the task switch is very high. This does have a detrimental effect on the interesting assortment of work done on the computer, but in the case of GPU tasks, there are likely to be CPU tasks running as well, so this argument is less valid. If using FIFO only, even between projects, then the work fetch needs to be modified such that it does not fetch work from anywhere if there is enough work in the queue to fill it. The work would not be started for a long time anyway, so there is no need to download it.


BOINC WIKI
ID: 912436
1mp0£173
Volunteer tester

Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 912457 - Posted: 28 Jun 2009, 20:44:34 UTC - in response to Message 912425.  


But I am suggesting EDF only for choosing the next task inside a project. All I am asking is why BOINC uses the "downloaded" order inside a project; I have no problem with the round-robin project selection.

If you really feel strongly about this, you can download the BOINC source code, compile it, and test it out.

I went through the same process you're doing now.

As a general rule, round-robin works best. The oldest work units make progress toward completion, and even if you download work with shorter deadlines, BOINC continues to work on tasks that are started.

If BOINC finds that round-robin scheduling won't complete work on time, then throwing everything at the shortest deadlines will generally avoid trouble.

This works because it handles the normal cases normally, and the exceptional cases exceptionally.
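
The rule described above (run normally, and switch to earliest deadline first only when a projection shows trouble) could look roughly like this in Python. This is a toy model, not the BOINC client's scheduler: Task, round_robin_finish_times, and next_task are hypothetical names, and it assumes one CPU, equal one-day time slices, no resource shares or debt, and run-time estimates that are exact.

```python
from collections import namedtuple

# runtime and deadline are both in days; deadline is measured from now.
Task = namedtuple("Task", "name runtime deadline")


def round_robin_finish_times(tasks, slice_len=1.0):
    """Project when each task finishes if they take turns on one CPU."""
    remaining = {t.name: float(t.runtime) for t in tasks}
    finished = {}
    clock = 0.0
    while remaining:
        for name in list(remaining):
            run = min(slice_len, remaining[name])
            clock += run
            remaining[name] -= run
            if remaining[name] <= 0:
                finished[name] = clock
                del remaining[name]
    return finished


def next_task(tasks):
    """Return (mode, task name): normal order unless a deadline is at risk."""
    finished = round_robin_finish_times(tasks)
    at_risk = [t for t in tasks if finished[t.name] > t.deadline]
    if at_risk:
        # Exceptional case: throw everything at the earliest deadline.
        return "edf", min(tasks, key=lambda t: t.deadline).name
    # Normal case: keep going in download order.
    return "round_robin", tasks[0].name


relaxed = [Task("long", 6, 30), Task("short", 2, 10)]
tight = [Task("long", 6, 30), Task("short", 2, 3)]
print(next_task(relaxed))  # ('round_robin', 'long')
print(next_task(tight))    # ('edf', 'short')
```

In the second case the projection has the short task finishing on day 4 against a 3-day deadline, which is what triggers the switch.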
ID: 912457
Chelski
Joined: 3 Jan 00
Posts: 121
Credit: 8,979,050
RAC: 0
Malaysia
Message 912550 - Posted: 29 Jun 2009, 7:18:58 UTC - in response to Message 912193.  

To get around the BOINC limit on the number of WUs, one workaround is to let your PC live a double life with 2 separate BOINC instances, each linked to a unique computer name. It takes some babysitting to maintain, but it can be done. Unfortunately, it will effectively halve the RAC. I did it for a while to feed BOINC work to a PC that has a dead LAN connection, but in the end I gave up because of the trouble required to change the PC name, reboot, only then let BOINC connect, etc...
ID: 912550
1mp0£173
Volunteer tester

Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 912643 - Posted: 29 Jun 2009, 15:57:30 UTC - in response to Message 912550.  

To get around the BOINC limit on the number of WUs, one workaround is to let your PC live a double life with 2 separate BOINC instances, each linked to a unique computer name. It takes some babysitting to maintain, but it can be done. Unfortunately, it will effectively halve the RAC. I did it for a while to feed BOINC work to a PC that has a dead LAN connection, but in the end I gave up because of the trouble required to change the PC name, reboot, only then let BOINC connect, etc...

If you run a computer half-time, BOINC will figure out how much time it actually runs (remember, BOINC works on computers that are used a few hours per day and then turned off when they aren't being used) and reduce the queue appropriately.

So, if you take one computer and make it look like two computers running a half-day, the cache on each will be reduced by half.
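
As a rough illustration of that arithmetic (a sketch only; effective_cache_days is a made-up helper, the 10-day setting is an example value, and the real client's work-fetch calculation involves more than this one factor), scaling the requested cache by the fraction of time the computer runs gives:

```python
def effective_cache_days(requested_days, on_fraction):
    """Days of computing to queue for a host that runs part of the time."""
    if not 0.0 <= on_fraction <= 1.0:
        raise ValueError("on_fraction must be between 0 and 1")
    return requested_days * on_fraction


requested = 10.0  # cache setting in days (example value)

one_full_time = effective_cache_days(requested, 1.0)
two_half_time = 2 * effective_cache_days(requested, 0.5)

print(one_full_time)  # 10.0
print(two_half_time)  # 10.0
```

Under this assumption, the two half-time instances together hold about as much work as the single full-time one.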
ID: 912643
Sutaru Tsureku
Volunteer tester

Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 912764 - Posted: 30 Jun 2009, 0:53:59 UTC - in response to Message 912261.  

...
I first realised this policy wasn't working in 6.6.31, but had seen it in earlier versions as well.
Anyway, I posted to the BOINC Alpha list about this and the result is:

Changeset 18503

Author: davea
Message: - client: when suspending a GPU job,
always remove it from memory, even if it hasn't checkpointed.
Otherwise we'll typically run another GPU job right away,
and it will bomb out or revert to CPU mode because it
can't allocate video RAM

I think this will be a partial fix. It won't stop the switching of WUs, but it should stop CUDA apps being left in memory and causing CPU fallback mode on later WUs, and then all hell breaking loose.

Claggy


Thanks a lot! :-)

I have no knowledge of the 'Changeset area'.
Does this mean the devs are now informed and will change BOINC?

ID: 912764
John McLeod VII
Volunteer developer
Volunteer tester
Joined: 15 Jul 99
Posts: 24806
Credit: 790,712
RAC: 0
United States
Message 913043 - Posted: 1 Jul 2009, 21:55:25 UTC - in response to Message 912764.  

...
I first realised this policy wasn't working in 6.6.31, but had seen it in earlier versions as well.
Anyway, I posted to the BOINC Alpha list about this and the result is:

Changeset 18503

Author: davea
Message: - client: when suspending a GPU job,
always remove it from memory, even if it hasn't checkpointed.
Otherwise we'll typically run another GPU job right away,
and it will bomb out or revert to CPU mode because it
can't allocate video RAM

I think this will be a partial fix. It won't stop the switching of WUs, but it should stop CUDA apps being left in memory and causing CPU fallback mode on later WUs, and then all hell breaking loose.

Claggy


Thanks a lot! :-)

I have no knowledge of the 'Changeset area'.
Does this mean the devs are now informed and will change BOINC?

A changeset is a change to the program. In other words, the code has been changed; now it has to be tested and released.

DaveA is the lead developer.
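
As a rough sketch of what the changeset message quoted above describes (this is not the actual BOINC source; the function name, its parameters, and the CPU-job branch are assumptions made for illustration), the decision on suspend might be written like this in Python:

```python
def remove_from_memory_on_suspend(uses_gpu, has_checkpointed,
                                  leave_in_memory_pref=False):
    """Decide whether a suspended job should be unloaded from memory."""
    if uses_gpu:
        # Per the changeset message: always unload a GPU job, even if it
        # has not checkpointed, so the next GPU job can allocate video RAM.
        return True
    # Assumed behaviour for CPU jobs (not taken from the changeset):
    # keep the job loaded if the user asked for that, or if unloading
    # would throw away work done since the last checkpoint.
    if leave_in_memory_pref:
        return False
    return has_checkpointed


print(remove_from_memory_on_suspend(uses_gpu=True, has_checkpointed=False))   # True
print(remove_from_memory_on_suspend(uses_gpu=False, has_checkpointed=False))  # False
```

The trade-off is that a GPU job unloaded before it has checkpointed loses whatever work it did since its last checkpoint.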


BOINC WIKI
ID: 913043
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.