Making initial 'completion time' make sense again

SlimDiesel

Joined: 2 Sep 99
Posts: 10
Credit: 15,496,641
RAC: 8
Canada
Message 61361 - Posted: 6 Jan 2005, 20:38:02 UTC

Back in the old days, the initial value of 'completion time' assigned as work units were downloaded seemed to be fairly close to the actual time recent WUs were taking. Then came the 'anomaly', when WUs took ~35% longer. The initial 'completion time' seems to have adjusted itself UP to that number, but now that things are back to near-normal it has not adjusted itself back DOWN. I've been waiting, but it just hasn't happened. The result is that when BOINC thinks it's downloading 1 day's work, it is actually downloading only 2/3 of a day's work.

I know this is trivial, but how do I get it back into agreement with reality? If it's something that would require resetting the project (and abandoning a bunch of WUs) I won't do that, but I'd like to hear if there is a simple fix.
ID: 61361
mikey
Volunteer tester
Joined: 17 Dec 99
Posts: 4215
Credit: 3,474,603
RAC: 0
United States
Message 61376 - Posted: 6 Jan 2005, 21:47:46 UTC

Actually, since you are running BOTH a Windows and a Linux machine, you should notice that the estimate is more accurate on the Linux machine and therefore less of a problem. The next couple of releases of the BOINC software should address the problem of Windows-based machines being "off" in their calculations.
Not sure exactly which release will fix it, but the fix "is coming".

ID: 61376
Keck_Komputers
Volunteer tester
Joined: 4 Jul 99
Posts: 1575
Credit: 4,152,111
RAC: 1
United States
Message 61377 - Posted: 6 Jan 2005, 21:48:07 UTC

You can try running your benchmarks manually, but most likely it is a server-side setting that needs tweaking. It will probably never be perfect for all computers, since cache size and memory bandwidth play a role in crunching but are not currently included in the benchmarks.
BOINC WIKI

BOINCing since 2002/12/8
ID: 61377
SlimDiesel

Joined: 2 Sep 99
Posts: 10
Credit: 15,496,641
RAC: 8
Canada
Message 61395 - Posted: 6 Jan 2005, 22:51:39 UTC - in response to Message 61376.  

I can't seem to find the number that the Linux box thinks is the 'completion time' for WUs it hasn't started yet... I only run the old text boincstat there and it doesn't report it. None of the xml files have a number that jumps out at me as being it, either. It seems to be calculated by dividing the expected fpops (which is in the client_state.xml <rsc_fpops_est> element) by the benchmarked fpops from the <p_fpops> element. When I do this calculation manually for the Windows box, it gives the value I see in the GUI (but that is still ~35% higher than the time WUs really take to complete).
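
Here's a rough Python sketch of that calculation (my own illustration, not anything from BOINC; it assumes client_state.xml parses as well-formed XML, with the host benchmark in <host_info>/<p_fpops> and the per-workunit size in <rsc_fpops_est>):

import xml.etree.ElementTree as ET

# Reproduce BOINC's apparent estimate: rsc_fpops_est / p_fpops.
# Run against a copy of client_state.xml while BOINC is stopped.
tree = ET.parse("client_state.xml")
root = tree.getroot()
p_fpops = float(root.findtext("host_info/p_fpops"))

for wu in root.iter("workunit"):
    name = wu.findtext("name")
    fpops_est = float(wu.findtext("rsc_fpops_est"))
    print(f"{name}: estimated {fpops_est / p_fpops / 3600:.2f} hours")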

The Linux box, despite having a slower CPU, completes WUs in slightly less time than the Windows box. When I do the above calculation on the Linux box, which has significantly lower benchmark numbers recorded, the result is almost 100% too high, which explains why its queue is always shorter than I would expect.

It looks like the benchmark numbers are not very accurate. Maybe BOINC should adjust the benchmark values after it has some history of how actual WUs have performed.

Not much of a problem, since I can always change my preferences to get a somewhat longer queue.
ID: 61395
SlimDiesel

Joined: 2 Sep 99
Posts: 10
Credit: 15,496,641
RAC: 8
Canada
Message 61712 - Posted: 7 Jan 2005, 16:02:03 UTC
Last modified: 7 Jan 2005, 16:07:33 UTC

Through experimentation, I find that if I edit my client_state.xml and increase the p_fpops and p_iops values by 40% (while BOINC is down - I don't know if this mattered), the 'completion time' of unstarted WUs is reported much closer to reality. Not coincidentally, just after restarting, BOINC realized I no longer had a 2-day queue of work and downloaded the correct number of WUs to top it up.
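
In case anyone wants to script that edit, here's a minimal Python sketch (my own, not an official tool; back up client_state.xml first, and only run it while BOINC is stopped):

import re

FACTOR = 1.4  # the ~40% correction found by trial and error above

with open("client_state.xml") as f:
    text = f.read()

# Scale both benchmark fields in place, as described above.
def scale(m):
    tag, value = m.group(1), float(m.group(2))
    return f"<{tag}>{value * FACTOR:.6e}</{tag}>"

text = re.sub(r"<(p_fpops|p_iops)>\s*([-+0-9.eE]+)\s*</\1>", scale, text)

with open("client_state.xml", "w") as f:
    f.write(text)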

Unfortunately, if you run the benchmarks manually (and probably when BOINC reruns them periodically too), the p_*ops numbers revert to their original state and the 'completion time' gains those 2 hours back.

Therefore, the problem seems to be that at some point (maybe with BOINC 4.13) the benchmarking algorithm changed from 'just right' to '40% too pessimistic' (at least for this Windows machine - I haven't experimented with the Linux box, but there the error is even greater).

Is this sufficient evidence to revisit benchmarking for a future BOINC update?
ID: 61712
Walt Gribben
Volunteer tester

Joined: 16 May 99
Posts: 353
Credit: 304,016
RAC: 0
United States
Message 61740 - Posted: 7 Jan 2005, 17:56:20 UTC - in response to Message 61712.  

> Is this sufficient evidence to revisit benchmarking for a future BOINC update?
>

The newer version in the works now is supposed to do it differently. Better? Well, we'll just have to wait and see.

You might try running with the -skip_cpu_benchmarks option. That's supposed to tell BOINC not to run benchmarks.
ID: 61740
SlimDiesel

Joined: 2 Sep 99
Posts: 10
Credit: 15,496,641
RAC: 8
Canada
Message 61762 - Posted: 7 Jan 2005, 19:03:00 UTC - in response to Message 61740.  
Last modified: 7 Jan 2005, 19:03:30 UTC

> You might try running with the -skip_cpu_benchmarks option. That's supposed
> to tell BOINC not to run benchmarks.

Thanks for that suggestion. I have scaled the numbers in client_state.xml on both boxes appropriately and restarted BOINC with that option. The Linux box was on the verge of requesting new work anyway, and when it restarted it immediately topped up its queue correctly.
ID: 61762
Benher
Volunteer developer
Volunteer tester

Joined: 25 Jul 99
Posts: 517
Credit: 465,152
RAC: 0
United States
Message 61764 - Posted: 7 Jan 2005, 19:06:37 UTC

Dave,

I'm curious what the "claimed" credit is for that host that you scaled the numbers on. I have scaled the numbers for my systems also, but the goal was to have the average WU time claim 30-35 credits on each system.
ID: 61764
SlimDiesel

Joined: 2 Sep 99
Posts: 10
Credit: 15,496,641
RAC: 8
Canada
Message 61772 - Posted: 7 Jan 2005, 19:31:44 UTC - in response to Message 61764.  

> I'm curious what the "claimed" credit is for that host that you scaled the
> numbers on. I have scaled the numbers for my systems also, but the goal was
> to have the average WU time claim 30-35 credits on each system.

In general, the claimed credit for the Windows box has recently been in the low 40s and, for the Linux box, in the mid 20s. I do notice that the claimed credit is now 63 and 46 respectively for the single WU returned by each box since the scaling. I also noticed that around the time the issue started (when benchmarks/completion time became overly pessimistic), my RAC took a dive relative to someone on my team I used to match. I'm going to assume for now that the Validator will figure it out and award the appropriate credit.

I do notice that the host information on the site seems to have recorded the newly set benchmark values when each machine reported its first WUs. I hope I have not just discovered a way for the unscrupulous to claim (and be granted) huge credit just by fudging their benchmark numbers upwards and having the Validator believe them.
ID: 61772
Walt Gribben
Volunteer tester

Joined: 16 May 99
Posts: 353
Credit: 304,016
RAC: 0
United States
Message 61910 - Posted: 8 Jan 2005, 0:07:02 UTC - in response to Message 61772.  


> I do notice that the host information on the site seems to have recorded the
> newly set benchmark values when each machine reported its first WUs. I hope
> I have not just discovered a way for the unscrupulous to claim (and be
> granted) huge credit just by fudging their benchmark numbers upwards and
> having the Validator believe them.

Of course you have. But if the numbers are set so they represent the longest time it takes to process a workunit, they may be higher, but they'll be accurate.

And if a claim is the highest, it'll get tossed out along with the lowest when credit is granted.

Right now requested credit is low for most users because the estimated times are too high. If the estimate says 10 hours and the WU is processed in 5 hours, it's figured to be a short WU with only half the normal processing. So only half the requested credit.

I used your method on some of my systems and it works just fine. Estimates are right on, it downloads enough work to match my preferences, and requested credit is back to what it was before they messed up the benchmarks in the first place.
ID: 61910
Ingleside
Volunteer developer

Joined: 4 Feb 03
Posts: 1546
Credit: 15,832,022
RAC: 13
Norway
Message 61965 - Posted: 8 Jan 2005, 2:11:12 UTC - in response to Message 61910.  

> Right now requested credit is low for most users because the estimated times
> are too high. If the estimate says 10 hours and the WU is processed in 5
> hours, it's figured to be a short WU with only half the normal processing.
> So only half the requested credit.

The estimated crunch-time depends on two things: the benchmark, and the WU parameters set at the time of splitting.

This means that changing the WU parameters so that, for example, the estimated crunch-time is 1 hour while the WU actually takes 5 hours gives exactly the same claimed credit as WU parameters that give an estimated crunch-time of 10 hours with the same 5-hour crunch. Claimed credit depends on the actual CPU time and the benchmarks, not on the estimate.

> I used your method on some of my systems and it works just fine. Estimates
> are right on, it downloads enough work to match my preferences, and requested
> credit is back to what it was before they messed up the benchmarks in the
> first place.

Uhm, the point of the changes to the benchmark is that different OSes should claim roughly the same credit for the same WU, unlike today, where most Windows computers claim much more than the same WU would get if crunched on anything else...

BTW, because of the big difference in benchmarks between OSes: at the same time as the estimated crunch-time is too high under Windows, isn't it too low on other OSes? If so, making changes to get a more accurate estimate under Windows before the benchmark is "fixed" will also increase the number of users downloading too much work...
ID: 61965
SlimDiesel

Joined: 2 Sep 99
Posts: 10
Credit: 15,496,641
RAC: 8
Canada
Message 62200 - Posted: 8 Jan 2005, 14:38:27 UTC - in response to Message 61965.  

> BTW, because of the big difference in benchmarks between OSes: at the same
> time as the estimated crunch-time is too high under Windows, isn't it too low
> on other OSes? If so, making changes to get a more accurate estimate under
> Windows before the benchmark is "fixed" will also increase the number of
> users downloading too much work...

Actually, while the estimated time was too high by 40% on the Windows box, it was 80% too high on the Linux box. Now that I've scaled the benchmarks by 1.4 and 1.8 respectively, the estimated times are within a few minutes of actual, and the size of the queue is exactly what I'd expect for my desired WU queue of 1 to 2 days. It works for me, but the project may have other goals.
ID: 62200
STE\/E
Volunteer tester

Joined: 29 Mar 03
Posts: 1137
Credit: 5,334,063
RAC: 0
United States
Message 62204 - Posted: 8 Jan 2005, 14:52:02 UTC
Last modified: 8 Jan 2005, 14:55:49 UTC

All this discussion about whose OS is claiming higher credit reminds me of the NASCAR racing circuit, where each car manufacturer claims the other manufacturer is getting an unfair advantage.

The same thing is going on here: each OS camp claims the other OS has an unfair advantage when it comes to claiming credit, and neither side is going to be happy unless their OS comes out on top ... IMO
ID: 62204
Ingleside
Volunteer developer

Joined: 4 Feb 03
Posts: 1546
Credit: 15,832,022
RAC: 13
Norway
Message 62207 - Posted: 8 Jan 2005, 15:49:57 UTC - in response to Message 62200.  

> Actually, while the estimated time was too high by 40% on the Windows box, it
> was 80% too high on the Linux box. Now that I've scaled the benchmarks by 1.4
> and 1.8 respectively, the estimated times are within a few minutes of actual,
> and the size of the queue is exactly what I'd expect for my desired WU queue
> of 1 to 2 days. It works for me, but the project may have other goals.

Ah, if Linux is off by even more, decreasing the WU estimates wouldn't be such a big problem, but it will also probably mean more users downloading too much work to return before the limit, so I'm not sure it's a good idea to fix this at the moment.

But by artificially increasing your benchmark, you're also claiming 40% more credit than you should. Increasing the Linux benchmark to be comparable with Windows shouldn't be a problem, but there's no reason to increase the Windows benchmark back to the inflated scores it had when much of the benchmark wasn't even executed under Windows.
ID: 62207
Allan Taylor

Joined: 31 Jan 00
Posts: 32
Credit: 270,259
RAC: 0
Canada
Message 62316 - Posted: 8 Jan 2005, 20:43:06 UTC

I have to wonder if the blame is all on the benchmark scores. While it seems to be well documented that the scores are not correct, I have noticed that on other projects the estimated completion times are much closer to the actual times.

For me, SETI estimates the time at almost double the actual time. On Predictor, it is only about an extra hour for every 5 hours. I remember LHC as being out by about the same as Predictor (I might be wrong on that one, as I have no current WU to compare). The one that really brought all this into focus was Einstein, which seems to be almost perfect: if it estimates 9 hours, it takes about 9 hours.

All this makes me wonder if the scheduler (or whichever piece of software does the original estimates) has a problem coming up with a good estimate of the operations to be performed. It seems strange that SETI should be so out of whack when the other projects are using the same benchmarks to calculate the time estimates.

Just my 2 cents.
ID: 62316
Keck_Komputers
Volunteer tester
Joined: 4 Jul 99
Posts: 1575
Credit: 4,152,111
RAC: 1
United States
Message 62348 - Posted: 8 Jan 2005, 22:19:54 UTC

@Allan

Cache size can have a major impact on SETI processing speed, because the working set is often small enough to stay in cache. That seems to be where the big differences come from in a lot of cases. Cache size and speed are not currently included in the figures. The other projects have larger working sets, so even with the largest caches there are still lots of calls to system memory.
BOINC WIKI

BOINCing since 2002/12/8
ID: 62348
Ingleside
Volunteer developer

Joined: 4 Feb 03
Posts: 1546
Credit: 15,832,022
RAC: 13
Norway
Message 62361 - Posted: 8 Jan 2005, 23:00:16 UTC - in response to Message 62316.  
Last modified: 9 Jan 2005, 15:33:42 UTC

estimate_cpu_duration = wu.rsc_fpops_est / host.p_fpops

host.p_fpops is the host's floating-point benchmark score (Whetstone).
wu.rsc_fpops_est is one of the many parameters set on a WU when it is generated; in SETI this is done by the Splitter.


When SETI@home was launched back in June, the Windows benchmark wasn't executed correctly, so everyone got an inflated benchmark score. wu.rsc_fpops_est was very likely set so that most computers would get a more or less "correct" estimated crunch-time based on this incorrect benchmark.
Later bug fixes to the benchmark have decreased the benchmark score under Windows, but SETI@home hasn't done anything to wu.rsc_fpops_est yet, except for a short time when it was 6x. A lower benchmark means a higher estimated crunch-time.

Since Predictor@home and LHC were also started many months ago, they very likely also used the old inflated benchmark score when estimating their crunch-times. Einstein@home, on the other hand, isn't released yet, so they probably used the "fixed" benchmark score when deciding how to set their wu.rsc_fpops_est.

So getting a more accurate estimated crunch-time is just a matter of the different projects getting around to changing this parameter. But since setting it a little too high only means some users must connect a little more often to keep crunching, it's probably a very low priority. Installing hardware and fixing bugs is probably much more important. ;)


BTW, since claimed_credit = cpu_time * (host.p_iops + host.p_fpops) / 1.728e12, and the estimated crunch-time depends only on p_fpops, anyone manually increasing their p_fpops score to get a "correct" estimated crunch-time should at the same time decrease p_iops by the same amount, so as not to inflate their claimed credit.
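
To illustrate that point with made-up numbers (and noting, if I have the constant right, that 1.728e12 is just 86400 sec/day * 2 * 1e9 / 100, i.e. 100 credits per day per gigaflop of the averaged benchmark scores):

# Hypothetical host and WU, purely for illustration.
cpu_time = 5 * 3600.0                 # actual crunch time, in seconds
rsc_fpops_est = 2.7e13                # WU size set by the Splitter

p_fpops, p_iops = 1.0e9, 1.5e9        # benchmark scores as measured
print(rsc_fpops_est / p_fpops / 3600)            # estimate: 7.5 h (too high)
print(cpu_time * (p_fpops + p_iops) / 1.728e12)  # claimed credit: ~26.0

delta = 0.4 * p_fpops                 # raise p_fpops 40%, lower p_iops as much
p_fpops, p_iops = p_fpops + delta, p_iops - delta
print(rsc_fpops_est / p_fpops / 3600)            # estimate: ~5.4 h (corrected)
print(cpu_time * (p_fpops + p_iops) / 1.728e12)  # credit unchanged: ~26.0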
ID: 62361
Hans Dorn
Volunteer developer
Volunteer tester
Joined: 3 Apr 99
Posts: 2262
Credit: 26,448,570
RAC: 0
Germany
Message 62365 - Posted: 8 Jan 2005, 23:16:52 UTC - in response to Message 62361.  

Hi Ingleside!
(sorry, don't know your first name...)

> estimate_cpu_duration = wu.rsc_fpops_est / host.p_fpops
>
> host.p_fpops is the host's floating-point benchmark score (Whetstone).
> wu.rsc_fpops_est is one of the many parameters set on a WU when it is
> generated; in SETI this is done by the Splitter.

If you fix wu.rsc_fpops_est, you can throw away the benchmarks altogether!

Just make the seti client claim wu.rsc_fpops_est * X.

Regards Hans

ID: 62365
Ingleside
Volunteer developer

Joined: 4 Feb 03
Posts: 1546
Credit: 15,832,022
RAC: 13
Norway
Message 62394 - Posted: 9 Jan 2005, 0:41:09 UTC - in response to Message 62365.  

> If you fix wu.rsc_fpops_est, you can throw away the benchmarks altogether!
>
> Just make the seti client claim wu.rsc_fpops_est * X.
>

wu.rsc_fpops_est is just a guess of how many floating-point operations a CPU needs to perform to crunch a WU, and this can't be known before the WU has been crunched: for example, it can be a SETI WU with too much noise that terminates after one minute, or an LHC WU where all the particles are lost after 1000 rounds, or the time can differ due to differences in angle range, or because only one of the particles got lost, or something.


Using an application-specific benchmark could make things better, and actually counting all the CPU operations performed would probably be better still, but that would also waste some time keeping track of the counts, and this wasted time can be a bigger problem than the small gains in crediting are worth.


BTW, some months ago some users calculated their granted average, and the difference seems to have been less than +-10%. In SETI "classic", even the difference due to VLAR/VHAR is larger than +-10%, not to mention all the 1-minute WUs...
ID: 62394
7822531

Joined: 3 Apr 99
Posts: 820
Credit: 692
RAC: 0
Message 62561 - Posted: 9 Jan 2005, 12:20:14 UTC - in response to Message 62361.  

> claimed_credit = cpu_time * (host.p_iops + host.p_fpops) / 1.728e12

I know I'm going to get flak for this, but where are you getting 1.728 trillion from?
ID: 62561