again less credits?

Larry256
Volunteer tester

Joined: 11 Nov 05
Posts: 25
Credit: 5,715,079
RAC: 8
United States
Message 902503 - Posted: 1 Jun 2009, 15:58:28 UTC - in response to Message 902499.  

And in this thread they didn't think the benchmarks made a difference. LOL

ID: 902503 · Report as offensive
skildude
Joined: 4 Oct 00
Posts: 9541
Credit: 50,759,529
RAC: 60
Yemen
Message 902515 - Posted: 1 Jun 2009, 16:30:06 UTC

I've been running AP for a while and finally started running the MBs that are now coming out. I noticed that the credits have dropped significantly. WUs that I run in 50 minutes are only worth ~35-40 credits, where they used to come in at about 1 credit per minute of run time per core.


In a rich man's house there is no place to spit but his face.
Diogenes Of Sinope
ID: 902515 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14649
Credit: 200,643,578
RAC: 874
United Kingdom
Message 902517 - Posted: 1 Jun 2009, 16:32:58 UTC - in response to Message 902499.  

I don't know why the S@H Enhanced multiplier has decreased that much. By design, CUDA shouldn't affect it unless a majority of the sampled hosts are using CUDA.

Looking back to Eric's original announcement of the variable multiplier on boinc_dev, the exact wording is:

The script looks at a day's worth of returned results for each app (up to 10 000).

If that wording holds, he's sampling results, not sampling hosts.

My single 9800GTX+ card turns in about 100 results a day; each of my two 9800GT cards turns in about 80 results a day. With those sorts of numbers, it wouldn't surprise me if a CUDA effect were at work, especially since BoincView tells me that a mid-AR CUDA task claims about 0.28 credits per task (yes, that's right: less than a third of a cobblestone) on the old benchmark * CPU time method.
ID: 902517 · Report as offensive
Josef W. Segur
Volunteer developer
Volunteer tester

Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 902525 - Posted: 1 Jun 2009, 17:11:52 UTC - in response to Message 902517.  

I don't know why the S@H Enhanced multiplier has decreased that much. By design, CUDA shouldn't affect it unless a majority of the sampled hosts are using CUDA.

Looking back to Eric's original announcement of the variable multiplier on boinc_dev, the exact wording is:

The script looks at a day's worth of returned results for each app (up to 10 000).

If that wording holds, he's sampling results, not sampling hosts.
...


The script gathers 10000 results from those which have been granted credit and not yet purged from the database (also ignoring those which were sent over 30 days earlier). Then it reduces that list, combining all results per host, so a host which has reported 500 of the results has just one place as does a host which only reported one of the results. The median host in that reduced list is the one which determines the multiplier.
                                                           Joe
ID: 902525 · Report as offensive
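
Editorial sketch of the reduction Joe describes: it collapses a list of results to one entry per host and then picks the median host. This is not Eric's script. The tuple layout, the function name `median_host_rate`, and the sample numbers are assumptions made for illustration; the per-host figure (total granted credit over total CPU time) is the expression quoted later in this thread, and how the real script turns the median into a multiplier is not shown here.

```python
from collections import defaultdict
from statistics import median


def median_host_rate(results, limit=10_000):
    """Return the median per-host credit rate.

    results: iterable of (host_id, granted_credit, cpu_time_seconds).
    Only the first `limit` results are considered, mirroring the
    10000-result cap mentioned in the post.
    """
    credit = defaultdict(float)
    cpu = defaultdict(float)
    for host_id, granted, cpu_time in list(results)[:limit]:
        credit[host_id] += granted   # combine all results of one host
        cpu[host_id] += cpu_time
    # Each host now occupies exactly one place in the reduced list.
    rates = [credit[h] / cpu[h] for h in credit if cpu[h] > 0]
    return median(rates)


# Hypothetical sample: one host reporting 500 results, two hosts reporting one each.
sample = (
    [("cuda_host", 40.0, 100.0)] * 500
    + [("cpu_a", 40.0, 3000.0)]
    + [("cpu_b", 40.0, 4000.0)]
)
rate = median_host_rate(sample)
print(rate)
```

In this made-up sample the host with 500 results still counts once, so the median lands on `cpu_a` rather than on the prolific host.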
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14649
Credit: 200,643,578
RAC: 874
United Kingdom
Message 902532 - Posted: 1 Jun 2009, 17:54:42 UTC - in response to Message 902525.  

The script gathers 10000 results from those which have been granted credit and not yet purged from the database (also ignoring those which were sent over 30 days earlier). Then it reduces that list, combining all results per host, so a host which has reported 500 of the results has just one place as does a host which only reported one of the results. The median host in that reduced list is the one which determines the multiplier.
                                                           Joe

Thanks for the script link. I've had a read of it - haven't teased out all the details, but the basic SQL stuff is standard enough.

It does look as if the fast CUDA hosts get 80, or 100, or 500 'lottery tickets' for entry into the long list of recent results, and hence stand a far better chance of being included in the reduced list. Every time a host is double-entered in the long list, it reduces the diversity and representativeness of the short list.

And averaging CPU and CUDA tasks by "sum(granted_credit)/sum(cpu_time)" is just plain wrong.

Do you know if anything representing plan_class is stored in the result table? I think it would be fair for the results my CUDA hosts return from the 603 CPU app to be included in the 10000 lottery, but results from the 608 CUDA app should be excluded from the average. That should still be OK for the VLARs which are sent out as 608 CUDA but I return as 603 CPU (not so sure about Raistmer's Perl script, which can do the reverse transition).

The only other alternative would be to exclude CUDA in the hosts table, on the grounds that there should still be enough MB work to get a reasonable average from 10000 results from non-CUDA hosts.

But then we'd be excluding CUDA hosts from the AP calculation too - needlessly, and the smaller number of results means that there probably would be distortion.

Hmmmm. I don't think Eric has thought all this through - and now that I've tried, I can see why not!
ID: 902532 · Report as offensive
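
Editorial sketch of the 'lottery ticket' effect described above. It assumes the long list behaves like a random draw without replacement from a pool of eligible results, which the thread does not confirm. The pool and sample sizes are hypothetical and scaled down (1000 and 100 instead of a real database and 10000), and `inclusion_probability` is an illustrative name.

```python
from math import comb


def inclusion_probability(host_results, pool_size, sample_size):
    """Chance that at least one of a host's results is drawn when
    sample_size results are picked at random, without replacement,
    from a pool of pool_size results."""
    if host_results <= 0:
        return 0.0
    others = max(pool_size - host_results, 0)
    # Probability that every drawn result belongs to some other host.
    miss = comb(others, sample_size) / comb(pool_size, sample_size)
    return 1.0 - miss


# Hypothetical pool of 1000 results, 100 of which are sampled.
p_one = inclusion_probability(1, 1000, 100)     # host with a single result
p_many = inclusion_probability(80, 1000, 100)   # host with 80 results
print(p_one, p_many)
```

Under these assumptions the one-result host is picked about one time in ten, while the 80-result host is almost certain to appear in the reduced list.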
OzzFan Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 9 Apr 02
Posts: 15691
Credit: 84,761,841
RAC: 28
United States
Message 902552 - Posted: 1 Jun 2009, 19:09:03 UTC - in response to Message 902503.  
Last modified: 1 Jun 2009, 19:10:46 UTC

And in this thread they didn't think the benchmarks made a difference. LOL


LOL That's because they don't make a difference anymore! LOL


Credits for SETI@Home MultiBeam and AstroPulse are based upon an approximate Floating Point Operations Per Second, giving a best guess as to how many FLOPs each particular operation will take. The benchmarks are no longer used for credit calculations.
ID: 902552 · Report as offensive
Larry256
Volunteer tester

Joined: 11 Nov 05
Posts: 25
Credit: 5,715,079
RAC: 8
United States
Message 902560 - Posted: 1 Jun 2009, 19:35:06 UTC - in response to Message 902552.  

LOL Please read how they get that: based upon an approximate Floating Point Operations Per Second, giving a best guess as to how many FLOPs each particular operation will take.

Eric's automatic credit adjustment relies on the original benchmark * time = credit calculation.


If you change the benchmarks to read 1 in BOINC ver. 9.9.9 and everyone installed the new ver., what will the credit be in 60 days?
LOL
ID: 902560 · Report as offensive
OzzFan Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 9 Apr 02
Posts: 15691
Credit: 84,761,841
RAC: 28
United States
Message 902563 - Posted: 1 Jun 2009, 19:42:50 UTC - in response to Message 902560.  
Last modified: 1 Jun 2009, 19:50:57 UTC

LOL Please read how they get that: based upon an approximate Floating Point Operations Per Second, giving a best guess as to how many FLOPs each particular operation will take.

Eric's automatic credit adjustment relies on the original benchmark * time = credit calculation.


LOL That's incorrect. The guesstimate is built into the application as a hard-coded routine. You've confused the routine for claiming credit within the app with Eric's automatic credit adjustment, which is applied at validation time. LOL
ID: 902563 · Report as offensive
Alinator
Volunteer tester

Joined: 19 Apr 05
Posts: 4178
Credit: 4,647,982
RAC: 0
United States
Message 902565 - Posted: 1 Jun 2009, 19:56:00 UTC

Hmmm...

Well it seems to me you're both right, and you're both wrong.

IIRC, the server side part of the equation calculates what the basis rate should be based on the performance of the 'median host' as determined from a 'qualified' sampling of the most recently returned tasks. This number is what gets set as the 'Credit Multiplier'.

That value is passed along in WU's and is read in by hosts running an app which was built to be compliant with 'on the fly' rate adjustment. This would include all Berkeley provided apps, and IIRC all recent apps from the Coop.

At that point, the app will adjust what it will claim based on the value for the multiplier passed in with the task, and then granting proceeds as it always did using the standard BOINC scoring rules.

Alinator

ID: 902565 · Report as offensive
Larry256
Volunteer tester

Joined: 11 Nov 05
Posts: 25
Credit: 5,715,079
RAC: 8
United States
Message 902569 - Posted: 1 Jun 2009, 20:10:14 UTC - in response to Message 902552.  
Last modified: 1 Jun 2009, 20:14:14 UTC

And in this thread they didn't think the benchmarks made a difference. LOL


LOL That's because they don't make a difference anymore! LOL


Credits for SETI@Home MultiBeam and AstroPulse are based upon an approximate Floating Point Operations Per Second, giving a best guess as to how many FLOPs each particular operation will take. The benchmarks are no longer used for credit calculations.


OK, the benchmarks still make a difference; now it's just 10000 at a time, to get rid of the so-called "cheats". That's why they did it, not for new apps. People had optimized BOINC itself to get better benchmarks to go along with the optimized science apps.
ID: 902569 · Report as offensive
Westsail and *Pyxey*
Volunteer tester
Joined: 26 Jul 99
Posts: 338
Credit: 20,544,999
RAC: 0
United States
Message 902573 - Posted: 1 Jun 2009, 20:15:50 UTC

Ok, now I am just confused... Fiat credits? *calling CAMPAIGN FOR LIBERTY*

Would it be better if a certain number of "estimated flops" equalled a cobblestone and this value didn't change with time?

I don't understand this self-deflationary scripted variable credit. That doesn't make any sense. Shouldn't you be able to buy a new host and compare it to the host you had last year, etc.? Now I don't even know if my machine is doing the same thing it was doing a few weeks ago.

So credits have really and truly been made useless? Why? Credits give you some measure of your contribution over time. If they are always changing relative to a variable median host that will get faster as computing advances, how can any meaningful data be obtained?

"The most exciting phrase to hear in science, the one that heralds new discoveries, is not Eureka! (I found it!) but rather, 'hmm... that's funny...'" -- Isaac Asimov
ID: 902573 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14649
Credit: 200,643,578
RAC: 874
United Kingdom
Message 902579 - Posted: 1 Jun 2009, 20:30:28 UTC - in response to Message 902565.  

Hmmm...

Well it seems to me you're both right, and you're both wrong.

IIRC, the server side part of the equation calculates what the basis rate should be based on the performance of the 'median host' as determined from a 'qualified' sampling of the most recently returned tasks. This number is what gets set as the 'Credit Multiplier'.

That value is passed along in WU's and is read in by hosts running an app which was built to be compliant with 'on the fly' rate adjustment. This would include all Berkeley provided apps, and IIRC all recent apps from the Coop.

At that point, the app will adjust what it will claim based on the value for the multiplier passed in with the task, and then granting proceeds as it always did using the standard BOINC scoring rules.

Alinator

Sorry Alinator, I think you've fallen into a trap there.

See my Beta message 34420. My parity (1) task multiplier is what is passed out to clients: what should have been a parity (4) application multiplier (I got that bit wrong) is what is calculated by Eric's script, and it stays on the server. For confirmation, look at any recent optimised MB result, and read: "Credit multiplier is : 2.85" - same as it always has been.

As I wrote last July, "it's perhaps a shame that it should have been implemented via a multiplier" - the use of the same word for these two separate and completely distinct values causes confusion.
ID: 902579 · Report as offensive
OzzFan Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 9 Apr 02
Posts: 15691
Credit: 84,761,841
RAC: 28
United States
Message 902582 - Posted: 1 Jun 2009, 20:36:52 UTC - in response to Message 902569.  
Last modified: 1 Jun 2009, 20:37:30 UTC

And in this thread they didn't think the benchmarks made a difference. LOL


LOL That's because they don't make a difference anymore! LOL


Credits for SETI@Home MultiBeam and AstroPulse are based upon an approximate Floating Point Operations Per Second, giving a best guess as to how many FLOPs each particular operation will take. The benchmarks are no longer used for credit calculations.


OK, the benchmarks still make a difference; now it's just 10000 at a time, to get rid of the so-called "cheats". That's why they did it, not for new apps. People had optimized BOINC itself to get better benchmarks to go along with the optimized science apps.


If you were to use one of those optimized BOINC programs now, it wouldn't make a difference because the benchmark is no longer used on individual machines - not necessarily because of cheaters, but because the Linux benchmark was considerably slower than the Windows, meaning that Linux users were being cheated out of credit. That's why they did it this way, to make it fair for everyone.

The benchmarks are only left in for projects that wish to use them; SETI does not.
ID: 902582 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14649
Credit: 200,643,578
RAC: 874
United Kingdom
Message 902589 - Posted: 1 Jun 2009, 20:56:04 UTC - in response to Message 902563.  

LOL Please read how they get that: based upon an approximate Floating Point Operations Per Second, giving a best guess as to how many FLOPs each particular operation will take.

Eric's automatic credit adjustment relies on the original benchmark * time = credit calculation.


LOL That's incorrect. The guesstimate is built into the application as a hard-coded routine. You've confused the routine for claiming credit within the app with Eric's automatic credit adjustment, which is applied at validation time. LOL

And in turn, I think you've got it confused.

The application claims nothing. The application (well, to be pedantic - again - BOINC on the application's behalf) merely passes on certain values to the server - notably in this context, the host benchmarks, the CPU time (but not GPU time), and the FLOP count/estimate (see below). No credit claim number is passed back in the sched_request_setiathome.berkeley.edu.xml file.

The credit 'claim' is worked out by the server, and incorporates Eric's script multiplier - that's why the claim value for MB tasks has been falling recently, which is what started this discussion in the first place.

Eric's script does refer back to the old benchmark * time credit calculation (as following the link in Joe's post confirms), but only as a comparison point to 'refine' the newer FLOP-based calculation. [You can read Eric's full description quoted in the Beta post from last July I just linked].

The other multiplier, that 2.85 which has remained unchanged since it was first introduced, is indeed used in the "guesstimate [is] built into the application as a hard-coded routine" - but only to calculate the number of FLOPs reported back to the server.

For confirmation, here is the entire report for task 1245248102, completed and reported while I've been composing this post. No credit claim.

<result>
    <name>02mr09ab.12439.20522.14.8.111_0</name>
    <final_cpu_time>6591.432000</final_cpu_time>
    <exit_status>0</exit_status>
    <state>5</state>
    <platform>windows_intelx86</platform>
    <version_num>603</version_num>
    <fpops_cumulative>46430980000000.000000</fpops_cumulative>
    <app_version_num>603</app_version_num>
<stderr_out>
<core_client_version>5.10.13</core_client_version>
<![CDATA[
<stderr_txt>
Windows optimized S@H Enhanced application by Alex Kan
Version info: SSSE3x (Intel, Core 2-optimized v8-nographics) V5.13 by Alex Kan
SSSE3x Win32 Build 76 , Ported by : Jason G, Raistmer, JDWhale

     CPUID: Intel(R) Xeon(R) CPU           E5320  @ 1.86GHz 
     Speed: 4 x 1862 MHz 
     Cache: L1=64K L2=4096K
  Features: MMX SSE SSE2 SSE3 SSSE3 
 
Work Unit Info:
...............
Credit multiplier is :  2.85
WU true angle range is :  0.411547

Flopcounter: 16286591127437.027000

Spike count:    9
Pulse count:    0
Triplet count:  1
Gaussian count: 0
called boinc_finish

</stderr_txt>
]]>
</stderr_out>
<file_info>
    <name>02mr09ab.12439.20522.14.8.111_0_0</name>
    <nbytes>28782.000000</nbytes>
    <max_nbytes>65536.000000</max_nbytes>
    <md5_cksum>2555c8fad95bd620da1c87fe9ba91f84</md5_cksum>
    <url>http://setiboincdata.ssl.berkeley.edu/sah_cgi/file_upload_handler</url>
</file_info>
</result>
ID: 902589 · Report as offensive
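
Editorial sketch checking two points against the report above: that no element in it carries a credit claim, and that the reported `fpops_cumulative` is roughly 2.85 times the `Flopcounter` value printed in stderr. The XML below is a trimmed copy of the report holding only fields that appear in the post. Reading the ratio as "the 2.85 multiplier at work" is an inference from the post, not something the report states.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the task report quoted in the post (stderr section omitted).
REPORT = """<result>
    <name>02mr09ab.12439.20522.14.8.111_0</name>
    <final_cpu_time>6591.432000</final_cpu_time>
    <exit_status>0</exit_status>
    <state>5</state>
    <platform>windows_intelx86</platform>
    <version_num>603</version_num>
    <fpops_cumulative>46430980000000.000000</fpops_cumulative>
    <app_version_num>603</app_version_num>
</result>"""

# "Flopcounter" line from the stderr text in the same report.
FLOPCOUNTER = 16286591127437.027

result = ET.fromstring(REPORT)
tags = [child.tag for child in result]

# No element name mentions credit, so nothing here is a claim.
has_claim = any("credit" in tag for tag in tags)

fpops = float(result.findtext("fpops_cumulative"))
ratio = fpops / FLOPCOUNTER

print(has_claim, round(ratio, 3))
```

The ratio comes out close to, but not exactly, 2.85; the small gap is left unexplained here.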
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14649
Credit: 200,643,578
RAC: 874
United Kingdom
Message 902592 - Posted: 1 Jun 2009, 21:03:29 UTC - in response to Message 902582.  

If you were to use one of those optimized BOINC programs now, it wouldn't make a difference because the benchmark is no longer used on individual machines - not necessarily because of cheaters, but because the Linux benchmark was considerably slower than the Windows, meaning that Linux users were being cheated out of credit. That's why they did it this way, to make it fair for everyone.

Agreed - absolutely right. The benchmark is no longer used by SETI for individual tasks.

The benchmarks are only left in for projects that wish to use them; SETI does not.

Not quite. The benchmarks are still used by SETI to adjust the overall global values - through Eric's script. So if everybody suddenly started using a BOINC client with a falsified benchmark, the overall effect would become noticeable over 30 days.

That's effectively what has happened with CUDA, except it's the time which has been falsified, rather than the benchmarks.
ID: 902592 · Report as offensive
1mp0£173
Volunteer tester

Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 902600 - Posted: 1 Jun 2009, 21:25:25 UTC

A wee bit of a history lesson might be in order at this point.

Once upon a time, some guys at Berkeley came up with this BOINC thingy, and they thought it'd be good if they could bring in all the other projects and issue credit for work done, and have the credit be comparable between projects.

So if you got 50 credits on SETI and 50 credits on CPDN, it is because you had done equal work.

And Jeff said "how about defining a credit as 1/100th of the work a specific machine can do?" and they named it Cobblestone.

... and that is when all the problems started.

Initially, BOINC granted credit based on Benchmarks and Time because the Cobblestone is defined in terms of Benchmarks and Time.

I'd even suggest that this original scheme was the most accurate, since it did come right from the definition -- if you averaged it across a bunch of work.

It just wasn't very repeatable, and all we talked about back then was how one could claim "20" when the next cruncher claimed "50" and sometimes you got paid too well, and other times got cheated -- but it averaged out.

Now, we count FLOPs, which have no connection to the Cobblestone definition at all, and a scaling factor (2.85) is applied on Multibeam to try to bring the two into line.

Eric's script is trying to look at work, find a median, compare Benchmark * Time vs. FLOPs and slowly refine the scaling so it tracks, on average, back to the original "Gold Standard" cobblestone.

The big problem is that we (in the U.S. and probably most countries) don't really remember when most currencies were "hard currencies" and a dollar literally represented a specific amount of gold in a vault somewhere.
ID: 902600 · Report as offensive
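
Editorial sketch contrasting the two accounting methods described above. It assumes the commonly cited definition of that era: 100 cobblestones for one day of CPU time on a reference machine benchmarking at 1,000 MFLOPS (Whetstone) and 1,000 MIPS (Dhrystone). Averaging the two benchmarks, the function names, and the `scale` parameter (a stand-in for whatever server-side adjustment applies) are assumptions for illustration, not BOINC's actual code.

```python
SECONDS_PER_DAY = 86_400
CREDITS_PER_REFERENCE_DAY = 100      # assumed cobblestone definition
REFERENCE_OPS_PER_SEC = 1e9          # assumed reference benchmark speed


def benchmark_time_credit(cpu_seconds, whetstone_ops, dhrystone_ops):
    """Old method: credit from host benchmarks multiplied by CPU time."""
    avg_ops = (whetstone_ops + dhrystone_ops) / 2   # assumed averaging
    reference_seconds = cpu_seconds * avg_ops / REFERENCE_OPS_PER_SEC
    return reference_seconds * CREDITS_PER_REFERENCE_DAY / SECONDS_PER_DAY


def flop_count_credit(fpops, scale=1.0):
    """Newer method: credit from a reported operation count.
    `scale` is a hypothetical adjustment factor."""
    reference_seconds = fpops / REFERENCE_OPS_PER_SEC
    return scale * reference_seconds * CREDITS_PER_REFERENCE_DAY / SECONDS_PER_DAY


# A reference machine running for a full day earns 100 credits by definition.
full_day = benchmark_time_credit(SECONDS_PER_DAY, 1e9, 1e9)

# The fpops_cumulative value from the task report quoted earlier in the thread.
task_credit = flop_count_credit(46430980000000.0)

print(full_day, round(task_credit, 2))
```

With `scale` left at 1.0 the quoted task works out to between 53 and 54 credits under these assumptions; any further server-side adjustment would move that figure.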
Larry256
Volunteer tester

Joined: 11 Nov 05
Posts: 25
Credit: 5,715,079
RAC: 8
United States
Message 902605 - Posted: 1 Jun 2009, 21:53:17 UTC - in response to Message 902592.  

If you were to use one of those optimized BOINC programs now, it wouldn't make a difference because the benchmark is no longer used on individual machines - not necessarily because of cheaters, but because the Linux benchmark was considerably slower than the Windows, meaning that Linux users were being cheated out of credit. That's why they did it this way, to make it fair for everyone.

Agreed - absolutely right. The benchmark is no longer used by SETI for individual tasks.

The benchmarks are only left in for projects that wish to use them; SETI does not.

Not quite. The benchmarks are still used by SETI to adjust the overall global values - through Eric's script. So if everybody suddenly started using a BOINC client with a falsified benchmark, the overall effect would become noticeable over 30 days.

That's effectively what has happened with CUDA, except it's the time which has been falsified, rather than the benchmarks.


Go back to my post from yesterday

Install BOINC ver. 6.6.28 and note the benchmark numbers. Then install ver. 6.4.7 and take a look at the numbers.

They are playing around with the benchmarks also.

I'm done and will go back to the corner again and watch.
In six months when this thread is open for the same question, I'll come out.
ID: 902605 · Report as offensive
1mp0£173
Volunteer tester

Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 902607 - Posted: 1 Jun 2009, 21:59:06 UTC - in response to Message 902605.  

Go back to my post from yesterday

Install BOINC ver. 6.6.28 and note the benchmark numbers. Then install ver. 6.4.7 and take a look at the numbers.

They are playing around with the benchmarks also.

Either that, or they've changed compilers or compiler switches.

A perfect, optimizing compiler should optimize-out most of the code in a benchmark loop.

Good "benchmark" code in a high-level language is always trying to code around optimizers to get them to leave it alone.
ID: 902607 · Report as offensive
Alinator
Volunteer tester

Joined: 19 Apr 05
Posts: 4178
Credit: 4,647,982
RAC: 0
United States
Message 902613 - Posted: 1 Jun 2009, 22:06:55 UTC

LOL...

Well, credit system discussion is always interesting, but rarely results in consensus. :-D

I guess the easiest way to sum it up, as far as BOINC is concerned, is this: scoring was a mess right from the start, it's still a mess today, and I don't see any real signs that it won't still be a mess in the foreseeable future!

The one thing I know for sure is that my G3 iMac, which doesn't do any 'tricky' operations (floating point or otherwise, mostly because it can't), has gotten paid progressively less for doing the exact same work it always has for the last year and a half. Same thing is true for my MMX only hosts. :-(

How is that keeping within the definition of the Cobblestone?

Alinator
ID: 902613 · Report as offensive
Larry256
Volunteer tester

Joined: 11 Nov 05
Posts: 25
Credit: 5,715,079
RAC: 8
United States
Message 902619 - Posted: 1 Jun 2009, 22:13:20 UTC - in response to Message 902600.  

A wee bit of a history lesson might be in order at this point.

Once upon a time, some guys at Berkeley came up with this BOINC thingy, and they thought it'd be good if they could bring in all the other projects and issue credit for work done, and have the credit be comparable between projects.

So if you got 50 credits on SETI and 50 credits on CPDN, it is because you had done equal work.

And Jeff said "how about defining a credit as 1/100th of the work a specific machine can do?" and they named it Cobblestone.

... and that is when all the problems started.

Initially, BOINC granted credit based on Benchmarks and Time because the Cobblestone is defined in terms of Benchmarks and Time.

I'd even suggest that this original scheme was the most accurate, since it did come right from the definition -- if you averaged it across a bunch of work.

It just wasn't very repeatable, and all we talked about back then was how one could claim "20" when the next cruncher claimed "50" and sometimes you got paid too well, and other times got cheated -- but it averaged out.

Now, we count FLOPs, which have no connection to the Cobblestone definition at all, and a scaling factor (2.85) is applied on Multibeam to try to bring the two into line.

Eric's script is trying to look at work, find a median, compare Benchmark * Time vs. FLOPs and slowly refine the scaling so it tracks, on average, back to the original "Gold Standard" cobblestone.

The big problem is that we (in the U.S. and probably most countries) don't really remember when most currencies were "hard currencies" and a dollar literally represented a specific amount of gold in a vault somewhere.



Say once upon a time I had a computer that met the "original Gold Standard". It used to get x cobblestones a day. Now that they are finding the median, and my old computer is way behind the curve, it will get less. One way to get rid of old computers, I guess. :)
The computer is doing the same amount of work today that it did before the new median went into effect. Computers will have to be replaced to keep getting the same amount of credit. The more that are replaced, the more that will need to be replaced to stay still on the uphill credit road.
ID: 902619 · Report as offensive


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.