Question about Duration correction Factor

Message boards : Number crunching : Question about Duration correction Factor

HFB1217
Joined: 25 Dec 05
Posts: 102
Credit: 9,424,572
RAC: 0
United States
Message 453069 - Posted: 8 Nov 2006, 0:13:33 UTC

How is it computed, and where is it computed: at Berkeley or in BOINC on your system?

Why does it go wacky and give weird DCF numbers like 654.xxxx? It had work units showing 2000-plus hours. I manually changed it back to .67xxx and it's running fine for now.

Every once in a while my WU times go from about 2 hours to 7 or 8 hours. I reset it and all is well for, who knows, weeks or so.
ID: 453069
1mp0£173
Volunteer tester

Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 453074 - Posted: 8 Nov 2006, 0:22:52 UTC - in response to Message 453069.  

How is it computed, and where is it computed: at Berkeley or in BOINC on your system?

It is calculated on your system.

In theory, the benchmark * the expected time factor from the project is a fair predictor of performance.

In practice, the benchmark may be fairly inaccurate for a number of reasons.

If you just leave it alone when it goes to 7 or 8 hours, you'll notice that BOINC will adjust it down and everything will return to normal by itself.
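
To make the prediction mentioned above a bit more concrete, here is a minimal sketch. It assumes the project supplies an estimated operation count per work unit and the benchmark gives operations per second; the names `fpops_est` and `benchmark_flops` and all the numbers are made up for illustration and do not come from this thread.

```python
def estimated_seconds(fpops_est, benchmark_flops, dcf=1.0):
    """Predict run time: project's operation estimate divided by the
    benchmarked speed, scaled by the duration correction factor."""
    if benchmark_flops <= 0:
        raise ValueError("benchmark_flops must be positive")
    return fpops_est / benchmark_flops * dcf


# Hypothetical host: 2e9 operations/second, work unit estimated at 1.44e13 operations.
raw = estimated_seconds(1.44e13, 2e9)                  # uncorrected estimate
corrected = estimated_seconds(1.44e13, 2e9, dcf=0.67)  # scaled by a DCF of 0.67
print(f"raw: {raw / 3600:.2f} h, corrected: {corrected / 3600:.2f} h")
```

If the benchmark is off, the raw estimate is off by the same proportion, and the DCF is what absorbs the difference.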

ID: 453074
OzzFan
Volunteer tester
Joined: 9 Apr 02
Posts: 15691
Credit: 84,761,841
RAC: 28
United States
Message 453075 - Posted: 8 Nov 2006, 0:23:00 UTC

As ROM explained DCF and CPU Efficiency to me:

CPU efficiency is the ratio of the CPU time a process received to the wall-clock time that has passed. It is the answer to the question of "In the last ten minutes or so, how much CPU did BOINC-based science applications receive?" The thing to remember here is that the OS is constantly doing things in the background and each of those things eats a little bit of the CPU.
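
As a small sketch of that definition (the function name and the sample numbers are hypothetical, chosen only to illustrate the ratio):

```python
def cpu_efficiency(cpu_seconds, wall_seconds):
    """Fraction of elapsed wall-clock time that was actually spent on the CPU."""
    if wall_seconds <= 0:
        raise ValueError("wall_seconds must be positive")
    return cpu_seconds / wall_seconds


# Hypothetical sample: 540 s of CPU time during a 600 s (ten minute) window.
print(cpu_efficiency(540, 600))
```

A value near 1.0 would mean the science application had the CPU almost to itself; background activity pushes it lower.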

Duration Correction Factor is a per-project value that measures the difference between the expected time to process a result, based on the benchmark, versus what it actually took. A score of 1.0 means that the benchmark and the application processing time are in sync. The lower the score, the greater the variance between what the benchmarks predict versus what it actually took to complete the result.

BOINC tries very hard not to ask for more work than it can actually process in a given period of time, so it tries to keep track of the machine overhead by the CPU efficiency score and Duration Correction Factor. Another thing to keep in mind is that memory speed plays a big part in the Duration Correction Factor. When you see similar processing times for a result on a 3.0 GHz processor and a 2.0 GHz processor, it normally means that the 3.0 GHz processor is running with memory that cannot keep up with the processor, or that both processors are bottlenecked by memory speed.

We haven't come up with a good solution for measuring the memory bandwidth problem yet. However, we are working on it.

ID: 453075
John McLeod VII
Volunteer developer
Volunteer tester
Joined: 15 Jul 99
Posts: 24806
Credit: 790,712
RAC: 0
United States
Message 453167 - Posted: 8 Nov 2006, 2:36:42 UTC - in response to Message 453075.  

Actually, ROM missed a little. A score less than 1 means that your computer finishes work faster than expected for its benchmarks. A score higher than 1 indicates that it finishes slower than expected for its benchmarks. It is calculated as a safety measure for the CPU scheduler (finish on time) and work fetch (not too much); therefore underestimates are corrected very quickly (in a single step) and overestimates are corrected much more cautiously.

The algorithm:

if (original corrected estimate < actual CPU time)
    DCF *= actual CPU time / original corrected estimate
else if (original corrected estimate > 100 * actual CPU time)
    // this is probably the equivalent of an S@H noisy result - caution
    DCF = 0.99 * DCF + 0.01 * actual CPU time / original corrected estimate
else
    // this may or may not be a short result of some sort.
    DCF = 0.9 * DCF + 0.1 * actual CPU time / original corrected estimate

This is from memory, and is not an exact copy of the relevant code.
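
For anyone who wants to play with the numbers, here is a runnable Python sketch of the three branches above. It is not the BOINC client code. The function and parameter names are invented for illustration, and it makes one assumption that differs from the pseudocode as written: in the two cautious branches it blends toward the ratio of actual time to the uncorrected estimate, because blending toward the ratio against the corrected estimate would not settle at the true ratio.

```python
def update_dcf(dcf, actual, raw_estimate):
    """Return the new DCF after one finished result.

    actual       -- CPU seconds the result really took
    raw_estimate -- predicted CPU seconds before the DCF is applied
    """
    corrected_estimate = raw_estimate * dcf
    raw_ratio = actual / raw_estimate  # assumption: the value to blend toward

    if corrected_estimate < actual:
        # Underestimate: jump to the observed ratio in a single step.
        return dcf * actual / corrected_estimate
    if corrected_estimate > 100 * actual:
        # Far shorter than expected (possibly a noisy result): move only 1%.
        return 0.99 * dcf + 0.01 * raw_ratio
    # Ordinary overestimate: move 10% of the way.
    return 0.9 * dcf + 0.1 * raw_ratio


dcf = 1.0
dcf = update_dcf(dcf, actual=14400, raw_estimate=7200)  # took twice as long
print(dcf)
dcf = update_dcf(dcf, actual=7200, raw_estimate=7200)   # back to normal length
print(dcf)
```

The two calls show the asymmetry: one slow result raises the DCF all the way at once, while a normal result afterwards only pulls it back a little.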


BOINC WIKI
ID: 453167
Pepo
Volunteer tester
Joined: 5 Aug 99
Posts: 308
Credit: 418,019
RAC: 0
Slovakia
Message 454235 - Posted: 9 Nov 2006, 22:48:41 UTC - in response to Message 453069.  
Last modified: 9 Nov 2006, 23:22:23 UTC

Why does it go wacky and give weird DCF numbers like 654.xxxx? It had work units showing 2000-plus hours. I manually changed it back to .67xxx and it's running fine for now.

Every once in a while my WU times go from about 2 hours to 7 or 8 hours. I reset it and all is well for, who knows, weeks or so.

It happened to me once, maybe half a year ago; I reported it on the SETI Beta pages.

If you just leave it alone when it goes to 7 or 8 hours, you'll notice that BOINC will adjust it down and everything will return to normal by itself.

:-( I observed it for a few (2-3) days. The DCFs were changing as defined (%-wise), and it would have taken weeks or months to stabilize :-)
Thus I had to correct it manually too.

[edit]Actually, I reported it on the BOINC pages, never mind. My DCFs also jumped into the range of hundreds; it happened with BOINC 5.3.31 for Windows.[/edit]
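
A rough back-of-the-envelope sketch of why the recovery is so slow. It uses the 654 and 0.67 figures from the original question and the 1% / 10% weights and the 100x threshold from the algorithm posted earlier in the thread. It assumes each finished result moves the DCF that fraction of the way toward the true value, and it treats "within 10% of the true value" as recovered, which is an arbitrary choice; the function name is made up.

```python
import math


def results_needed(start, target, stop, weight):
    """Count results until a DCF decaying toward `target` falls from `start` to `stop`.

    Each result closes a fraction `weight` of the remaining gap,
    so the gap shrinks by a factor of (1 - weight) per result.
    """
    if not (target < stop < start) or not (0 < weight < 1):
        raise ValueError("need target < stop < start and 0 < weight < 1")
    gap_ratio = (stop - target) / (start - target)
    return math.ceil(math.log(gap_ratio) / math.log(1 - weight))


true_dcf = 0.67
# Phase 1: estimate is more than 100x the actual time, so only 1% steps.
slow = results_needed(654, true_dcf, 100 * true_dcf, 0.01)
# Phase 2: ordinary overestimate, 10% steps, until within 10% of the true value.
fast = results_needed(100 * true_dcf, true_dcf, 1.1 * true_dcf, 0.10)
print(slow, fast, slow + fast)
```

Under these assumptions it takes a few hundred finished results, which at a couple of hours each is consistent with the weeks or months estimated above.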

Peter
ID: 454235
John McLeod VII
Volunteer developer
Volunteer tester
Joined: 15 Jul 99
Posts: 24806
Credit: 790,712
RAC: 0
United States
Message 454263 - Posted: 10 Nov 2006, 0:24:05 UTC

DCF values can be strange if you have a bad benchmark, as the DCF is counting on the benchmark numbers as part of the original estimate.


BOINC WIKI
ID: 454263
Pepo
Volunteer tester
Joined: 5 Aug 99
Posts: 308
Credit: 418,019
RAC: 0
Slovakia
Message 454582 - Posted: 10 Nov 2006, 14:38:33 UTC

Sure. But in my case, the benchmark values seemed plausible: the last benchmark was run approx. 2 days prior to the DCFs jumping high, and the estimated run times were also fine previously. I was not able to find an explanation.

Peter
ID: 454582
Richard Haselgrove
Volunteer tester

Joined: 4 Jul 99
Posts: 14649
Credit: 200,643,578
RAC: 874
United Kingdom
Message 454594 - Posted: 10 Nov 2006, 14:53:03 UTC - in response to Message 454582.  

Sure. But in my case, the benchmark values seemed plausible: the last benchmark was run approx. 2 days prior to the DCFs jumping high, and the estimated run times were also fine previously. I was not able to find an explanation.

Peter

I know I got a bad DCF when I let a machine reboot while it was running benchmarks.

Presumably that benchmark was extremely slow, and affected the DCF. Remember that the DCF is on a sort of ratchet: it increases immediately, but only decreases gradually. Even if you run a new benchmark, I don't think that will put things back to normal immediately.
ID: 454594
1mp0£173
Volunteer tester

Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 454649 - Posted: 10 Nov 2006, 17:02:14 UTC - in response to Message 454263.  

DCF values can be strange if you have a bad benchmark, as the DCF is counting on the benchmark numbers as part of the original estimate.

... and if I understand it correctly, the DCF depends on reasonable estimates from the project. If the project suggests that their work takes twice as long as it actually does, you will get a DCF somewhere around 0.5.
ID: 454649

©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.