Average Credit Decreasing?

KLiK
Volunteer tester

Joined: 31 Mar 14
Posts: 1304
Credit: 22,994,597
RAC: 60
Croatia
Message 1821660 - Posted: 4 Oct 2016, 6:54:55 UTC - in response to Message 1821596.  

You have some -v N switch which "deletes" too much info from stderr:
http://setiathome.berkeley.edu/result.php?resultid=5129640069

Do you need that "Verbose" switch?

That is a WU for an Intel HD2500 GPU on the CPU... I'm using that also! ;)

Why do you think it is for iGPU ?
It is marked for NVIDIA:
SETI@home v8 v8.12 (opencl_nvidia_sah)

Anyway, I'm running that NVIDIA Quadro 2000 with this command line:
<cmdline>-use_sleep_ex 1 -sbs 256 -v 6 -period_iterations_num 100</cmdline>

Would you (or anyone else) suggest some other configuration for that card?

P.S. Even though it goes against my "non-programming idea" of BOINC with SETI@home! ;)


non-profit org. Play4Life in Zagreb, Croatia, EU
ID: 1821660 · Report as offensive
Profile BilBg
Volunteer tester
Joined: 27 May 07
Posts: 3720
Credit: 9,385,827
RAC: 0
Bulgaria
Message 1821803 - Posted: 5 Oct 2016, 2:11:19 UTC - in response to Message 1821660.  

Anyway, I'm running that NVIDIA Quadro 2000 with this command line:
<cmdline>-use_sleep_ex 1 -sbs 256 -v 6 -period_iterations_num 100</cmdline>

Would you (or anyone else) suggest some other configuration for that card?

Not suggesting, just commenting:

-use_sleep_ex 1 : I think this is the same as just -use_sleep

-v 6 : If you don't know why you put this Verbose level (for stderr) - remove it
(From ReadMe_MultiBeam_OpenCL_NV_SoG.txt
    -v 6 enables delays printing where sleep loops used.
)

-period_iterations_num 100 : The default is 50 or 500 depending on the GPU (number of CUs). Because of -v 6 your stderr doesn't show the number of CUs, so I can't tell which default applies; I think the app sets 500 if CU < 4 (?).
 


- ALF - "Find out what you don't do well ..... then don't do it!" :)
 
ID: 1821803 · Report as offensive
AMDave
Volunteer tester

Joined: 9 Mar 01
Posts: 234
Credit: 11,671,730
RAC: 0
United States
Message 1821975 - Posted: 5 Oct 2016, 14:40:43 UTC - in response to Message 1821803.  
Last modified: 5 Oct 2016, 14:45:27 UTC

Anyway, I'm running that NVIDIA Quadro 2000 with this command line:
<cmdline>-use_sleep_ex 1 -sbs 256 -v 6 -period_iterations_num 100</cmdline>

Would you (or anyone else) suggest some other configuration for that card?

Not suggesting, just commenting:

-use_sleep_ex 1 : I think this is the same as just -use_sleep

"It enables 1ms Sleep() call ( for Windows 1 ms is the lowest possible sleep duration besides of just yeld[sic] with value of 0) in cycle while awaiting completion of event marker specially inserted into GPU processing queue." (*source)  So, its usage here is redundant.

-v 6 : If you don't know why you put this Verbose level (for stderr) - remove it
(From ReadMe_MultiBeam_OpenCL_NV_SoG.txt
    -v 6 enables delays printing where sleep loops used.
)

-period_iterations_num 100 : The default is 50 or 500 depending on the GPU (number of CUs). Because of -v 6 your stderr doesn't show the number of CUs, so I can't tell which default applies; I think the app sets 500 if CU < 4 (?).

"Current (as of 6 Juny 2016) defaults are 50 for most of GPUs and 500 for so called "low performance path" GPUs with only 3 or less CUs per device." (*source)

Given this info, here are two possibilities for a command line config:
    ►  -low_perf
    ►  -sbs 128 -spike_fft_thresh 2048 -tune 1 2 1 16 (from the entry level NV cards listing of the ReadMe file)

ID: 1821975 · Report as offensive
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6324
Credit: 106,370,077
RAC: 121
Russia
Message 1825042 - Posted: 17 Oct 2016, 22:55:10 UTC

http://clip2net.com/s/3DoidFJ
SETI apps news
We're not gonna fight them. We're gonna transcend them.
ID: 1825042 · Report as offensive
Profile Jimbocous Project Donor
Volunteer tester
Joined: 1 Apr 13
Posts: 1849
Credit: 268,616,081
RAC: 1,349
United States
Message 1825080 - Posted: 18 Oct 2016, 1:01:08 UTC - in response to Message 1825042.  

http://clip2net.com/s/3DoidFJ

So were the averages from Nov 2012 through Nov 2014 really that anomalous, or was there perhaps a problem with the data?
Sure is an extreme swing ...
ID: 1825080 · Report as offensive
Profile Wiggo
Joined: 24 Jan 00
Posts: 28517
Credit: 261,360,520
RAC: 489
Australia
Message 1825098 - Posted: 18 Oct 2016, 1:59:10 UTC - in response to Message 1825080.  

http://clip2net.com/s/3DoidFJ

So were the averages from Nov 2012 through Nov 2014 really that anomalous, or was there perhaps a problem with the data?
Sure is an extreme swing ...

Just swap my 3570K for a Q6600, and the rest of my hardware under MB V6 used to get an average RAC of 92-94K; the same hardware under MB V7 dropped the RAC down to 62-63K. Replacing the Q6600 with the 3570K made the RAC jump up to 66-67K.

<- Now under MB V8 you can see where that same hardware sits these days. ;-)

Cheers.
ID: 1825098 · Report as offensive
Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1825136 - Posted: 18 Oct 2016, 5:42:02 UTC - in response to Message 1825098.  
Last modified: 18 Oct 2016, 5:43:04 UTC

http://clip2net.com/s/3DoidFJ

So were the averages from Nov 2012 through Nov 2014 really that anomalous, or was there perhaps a problem with the data?
Sure is an extreme swing ...

Just swap my 3570K for a Q6600, and the rest of my hardware under MB V6 used to get an average RAC of 92-94K; the same hardware under MB V7 dropped the RAC down to 62-63K. Replacing the Q6600 with the 3570K made the RAC jump up to 66-67K.

<- Now under MB V8 you can see where that same hardware sits these days. ;-)

Cheers.


What are the rough ratios per machine:
--> Boinc_Whetstone, over Sisoft-Sandra-Lite Whetstone, Single threaded single precision SSE+ (nonAVX)
--> Boinc_Whetstone, over Sisoft-Sandra-Lite Whetstone, Single threaded single precision AVX (Where available)

Are they anything in the ballpark of these, respectively?
--> v7Rac/v6Rac
--> v8Rac/v6Rac
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1825136 · Report as offensive
Profile Wiggo
Joined: 24 Jan 00
Posts: 28517
Credit: 261,360,520
RAC: 489
Australia
Message 1825139 - Posted: 18 Oct 2016, 5:55:07 UTC - in response to Message 1825136.  

http://clip2net.com/s/3DoidFJ

So were the averages from Nov 2012 through Nov 2014 really that anomalous, or was there perhaps a problem with the data?
Sure is an extreme swing ...

Just swap my 3570K for a Q6600, and the rest of my hardware under MB V6 used to get an average RAC of 92-94K; the same hardware under MB V7 dropped the RAC down to 62-63K. Replacing the Q6600 with the 3570K made the RAC jump up to 66-67K.

<- Now under MB V8 you can see where that same hardware sits these days. ;-)

Cheers.


What are the rough ratios per machine:
--> Boinc_Whetstone, over Sisoft-Sandra-Lite Whetstone, Single threaded single precision SSE+ (nonAVX)
--> Boinc_Whetstone, over Sisoft-Sandra-Lite Whetstone, Single threaded single precision AVX (Where available)

Are they anything in the ballpark of these, respectively?
--> v7Rac/v6Rac
--> v8Rac/v6Rac

If you're trying to ask me a question there, Jason, I don't understand it.

Mind you, that could have a lot to do with mowing here today and a few drinks to ease the pain. ;-)

Cheers.
ID: 1825139 · Report as offensive
Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1825145 - Posted: 18 Oct 2016, 6:14:26 UTC - in response to Message 1825139.  

If you're trying to ask me a question there, Jason, I don't understand it.

Mind you, that could have a lot to do with mowing here today and a few drinks to ease the pain. ;-)

Cheers.


Haha, yeah, having my first after-work beer myself ;).

It's simply that past team effort digging into the innards of CreditNew found Boinc Whetstone lurking deep within.

Given that stock MB CPU transitioned to full AVX support, and credit has been observed to have further dropped, I'm looking for rough:
Boinc_Whetstone / Actual_Whetstone , the first number coming from the boinc client's CPU benchmark, the second coming from a 'proper' Whetstone benchmark tool (such as Sisoft Sandra Lite, set to appropriate SSE/AVX level)

If the resulting fraction resembles the fraction of NewRac/OldRac, even loosely, then it just adds weight to potential future modelling and fixes.

...
Put these complexities aside for the moment, to avoid confusion:
The stock applications used for CPU are used to scale the credit (down), and these successively gained SIMD vectorisation (i.e. SSE through AVX). However, Boinc Whetstone never received similar SSE+ variants.

Old hosts (e.g. without AVX) drop off, and new hosts (usually with AVX these days) come online and dominate the numbers by throughput.

The inappropriate (low) Boinc Whetstone then renders an effective underclaim, driving down granted credit, and so the reference for every other application.
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1825145 · Report as offensive
Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1825148 - Posted: 18 Oct 2016, 6:37:14 UTC - in response to Message 1825139.  
Last modified: 18 Oct 2016, 6:39:54 UTC

Quick example from my Win7, Core2Duo (max level SSSE3)
Apollo

54 18/10/2016 4:52:23 PM Suspending computation - running CPU benchmarks
55 18/10/2016 4:52:54 PM Benchmark results:
56 18/10/2016 4:52:54 PM Number of CPUs: 1
57 18/10/2016 4:52:54 PM 3327 floating point MIPS (Whetstone) per CPU
58 18/10/2016 4:52:54 PM 9477 integer MIPS (Dhrystone) per CPU


Sisoft Sandra Lite (making sure boinc isn't crunching, and single threaded bench selected):
Whetstone Float Native SSE3: 11 GFlops


Boinc_whetstone / SisoftSSE3_Whetstone = ~3.3/11 --> 30% claim (app is well and truly using SSE3)
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1825148 · Report as offensive
KLiK
Volunteer tester

Joined: 31 Mar 14
Posts: 1304
Credit: 22,994,597
RAC: 60
Croatia
Message 1825151 - Posted: 18 Oct 2016, 6:50:30 UTC

SoG files with data from Arecibo still have problems in "building up CL base":
http://i.imgur.com/hfY0ivE.png
The same thing happens in SETI@home and Beta.
:/


non-profit org. Play4Life in Zagreb, Croatia, EU
ID: 1825151 · Report as offensive
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6324
Credit: 106,370,077
RAC: 121
Russia
Message 1825159 - Posted: 18 Oct 2016, 7:03:45 UTC - in response to Message 1825080.  

http://clip2net.com/s/3DoidFJ

So were the averages from Nov 2012 through Nov 2014 really that anomalous, or was there perhaps a problem with the data?
Sure is an extreme swing ...

It was "almost AP only" time.
SETI apps news
We're not gonna fight them. We're gonna transcend them.
ID: 1825159 · Report as offensive
Profile Jimbocous Project Donor
Volunteer tester
Joined: 1 Apr 13
Posts: 1849
Credit: 268,616,081
RAC: 1,349
United States
Message 1825161 - Posted: 18 Oct 2016, 7:05:21 UTC - in response to Message 1825159.  

http://clip2net.com/s/3DoidFJ

So were the averages from Nov 2012 through Nov 2014 really that anomalous, or was there perhaps a problem with the data?
Sure is an extreme swing ...

It was "almost AP only" time.

Ah yes, the "Golden Days". How could I have forgotten! lol
ID: 1825161 · Report as offensive
Profile Wiggo
Joined: 24 Jan 00
Posts: 28517
Credit: 261,360,520
RAC: 489
Australia
Message 1825162 - Posted: 18 Oct 2016, 7:08:25 UTC - in response to Message 1825148.  

Quick example from my Win7, Core2Duo (max level SSSE3)
Apollo

54 18/10/2016 4:52:23 PM Suspending computation - running CPU benchmarks
55 18/10/2016 4:52:54 PM Benchmark results:
56 18/10/2016 4:52:54 PM Number of CPUs: 1
57 18/10/2016 4:52:54 PM 3327 floating point MIPS (Whetstone) per CPU
58 18/10/2016 4:52:54 PM 9477 integer MIPS (Dhrystone) per CPU


Sisoft Sandra Lite (making sure boinc isn't crunching, and single threaded bench selected):
Whetstone Float Native SSE3: 11 GFlops


Boinc_whetstone / SisoftSSE3_Whetstone = ~3.3/11 --> 30% claim (app is well and truly using SSE3)

BOINC shows my,
3570K @ Stock is 3937 floating point MIPS (Whetstone) per CPU,
2500K @ 3.4GHz is 3585 floating point MIPS (Whetstone) per CPU,
Athlon II X4 630 @ stock is 2288 floating point MIPS (Whetstone) per CPU.

I've just downloaded the latest SiSoft Sandra Lite version so with some luck I'll run it on all 3 rigs in the morning and report those numbers after tomorrow's outrage.

Cheers.
ID: 1825162 · Report as offensive
Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1825166 - Posted: 18 Oct 2016, 7:17:25 UTC - in response to Message 1825162.  

Cheers! (time for second beer :) )

Yeah, at least my old version of Sisoft Sandra Lite is a bit fiddly when it comes to making sure the right numbers come out (single threaded, SSE level matching app + CPU capability). At least having the Athlon in there will give some comparison to my Core2Duo, and the others will give some idea of AVX impact.
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1825166 · Report as offensive
Profile Wiggo
Joined: 24 Jan 00
Posts: 28517
Credit: 261,360,520
RAC: 489
Australia
Message 1825170 - Posted: 18 Oct 2016, 7:51:42 UTC - in response to Message 1825166.  

Cheers! (time for second beer :) )

Yeah, at least my old version of Sisoft Sandra Lite is a bit fiddly when it comes to making sure the right numbers come out (single threaded, SSE level matching app + CPU capability). At least having the Athlon in there will give some comparison to my Core2Duo, and the others will give some idea of AVX impact.

The version I had here was from 2006 so I decided that a later version was well needed.

That old Athlon @ stock used to keep my Q6600 @ 3GHz very honest under MB V6 while using a little less power (those were the last of AMD's Intel-competitive days).

Anyhow I had a few beers, then tried a couple of rums, but I've settled down now on some sipping whiskey (I'm flexible with whatever comes out of my brewroom) so I'm just going to relax for the rest of the evening. ;-)

Cheers.
ID: 1825170 · Report as offensive
Profile Wiggo
Joined: 24 Jan 00
Posts: 28517
Credit: 261,360,520
RAC: 489
Australia
Message 1825258 - Posted: 18 Oct 2016, 23:35:46 UTC

BOINC shows my,
3570K @ Stock is 3937 floating point MIPS (Whetstone) per CPU,
2500K @ 3.4GHz is 3585 floating point MIPS (Whetstone) per CPU,
Athlon II X4 630 @ stock is 2288 floating point MIPS (Whetstone) per CPU.

Ok, after working out the 2010 & 2016 versions of Sandra, it seems that the minimum test is SSE2 (I can't find an SSE-only test), but I did get these results:
3570K Whetstone single point 15.41 GFLOPS,
2500K Whetstone single point 13.23 GFLOPS,
Athlon II X4 630 Whetstone single point 7.2 GFLOPS.

Cheers.
ID: 1825258 · Report as offensive
Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1825278 - Posted: 19 Oct 2016, 1:10:41 UTC - in response to Message 1825258.  
Last modified: 19 Oct 2016, 1:21:33 UTC

BOINC shows my,
3570K @ Stock is 3937 floating point MIPS (Whetstone) per CPU,
2500K @ 3.4GHz is 3585 floating point MIPS (Whetstone) per CPU,
Athlon II X4 630 @ stock is 2288 floating point MIPS (Whetstone) per CPU.

Ok, after working out the 2010 & 2016 versions of Sandra, it seems that the minimum test is SSE2 (I can't find an SSE-only test), but I did get these results:
3570K Whetstone single point 15.41 GFLOPS,
2500K Whetstone single point 13.23 GFLOPS,
Athlon II X4 630 Whetstone single point 7.2 GFLOPS.

Cheers.


Yeah, if they moved to a 64-bit executable, then there won't be an FPU example, and SSE2 is presumed the minimum. Fortunately the applications will be using the highest available. SSE2 is predominantly about double precision and some minor caching, and SSE3 adds just a few horizontal math operations, so we're talking maybe a 10 percent or so difference anyway, and the numbers will be representative.

3.94/15.41 ~ 26%
3.59/13.23 ~ 27%
2.29/7.2 ~ 32%
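
For anyone who wants to redo this arithmetic, here is a minimal Python sketch of the ratios above, with the Core2Duo example from earlier in the thread added. It assumes, as the posts here do, that BOINC's "floating point MIPS (Whetstone)" figure can be compared directly with Sandra's GFLOPS figure after dividing by 1000. The `hosts` table and the `claim_fraction` name are illustrative only and are not part of BOINC or Sandra.

```python
# Whetstone figures quoted in this thread: (BOINC MIPS, Sandra GFLOPS).
# Assumption: BOINC MIPS / 1000 is comparable with Sandra GFLOPS.
hosts = {
    "Core2Duo": (3327, 11.0),
    "3570K": (3937, 15.41),
    "2500K": (3585, 13.23),
    "Athlon II X4 630": (2288, 7.2),
}


def claim_fraction(boinc_mips, sandra_gflops):
    """Return the BOINC Whetstone figure as a fraction of the Sandra figure."""
    return (boinc_mips / 1000.0) / sandra_gflops


# One fraction per host, keyed by the host name used in the thread.
fractions = {name: claim_fraction(b, s) for name, (b, s) in hosts.items()}

for name, frac in fractions.items():
    print(f"{name}: {frac:.0%}")
```

All four hosts land between roughly a quarter and a third, which is the range discussed in the rest of this post.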

Presuming they were AVX-enabled SiSoft figures for the Intel processor cases, they seem to be in the 30% ballpark of my Core2 (but correspondingly faster due to AVX and architecture refinement).

-- In pre-v6 and pre-CreditNew days, the credit multiplier was 2.85, and it probably created close to cobblestone-scale credits, due to fpop 'counting' within the app.

-- During v6 (pre-AVX) days, logically there still might have been more FPU-only hosts around returning results, and the architectures with SSE+ were using less efficient code than now, so credit probably wouldn't have dropped the full two thirds. A credit halving seems to gel with my memory.

-- During v7 not a lot would have changed, though more pre-SSE and early SSE machines were dropping off their perches, probably driving credit down to a third.

-- Looks like AVX will be impacting the numbers now (with v8), and likely dominating them, so settling around a quarter of original cobblestone credit seems likely (until stock CPU gets more optimisation... which would drive it down further).

Seems to match up with history, and the realities of having optimised stock applications without the credit mechanism being aware. Thanks for the numbers. Though they will vary, they will come in handy as gotchas should someone start developing simulations for CreditNew refinement. Discrepancies on the order of low 25-33% claims driving the whole thing will show when they do (if they are doing it right).
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1825278 · Report as offensive
Profile Wiggo
Joined: 24 Jan 00
Posts: 28517
Credit: 261,360,520
RAC: 489
Australia
Message 1825360 - Posted: 19 Oct 2016, 8:52:54 UTC
Last modified: 19 Oct 2016, 8:54:37 UTC

Presuming they were AVX-enabled SiSoft figures for the Intel processor cases, they seem to be in the 30% ballpark of my Core2 (but correspondingly faster due to AVX and architecture refinement).

The test figures are SSE2 results, Jason, and many kilometres behind the AVX results (the first standard Sandra test, before selecting the SSE2 tests), which were above the standard AVX figures for both normal 2500Ks (which I expected due to the slight o/c) and 3570Ks.

Cheers.
ID: 1825360 · Report as offensive
Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1825368 - Posted: 19 Oct 2016, 10:05:31 UTC - in response to Message 1825360.  
Last modified: 19 Oct 2016, 10:19:09 UTC

Presuming they were AVX-enabled SiSoft figures for the Intel processor cases, they seem to be in the 30% ballpark of my Core2 (but correspondingly faster due to AVX and architecture refinement).

The test figures are SSE2 results, Jason, and many kilometres behind the AVX results (the first standard Sandra test, before selecting the SSE2 tests), which were above the standard AVX figures for both normal 2500Ks (which I expected due to the slight o/c) and 3570Ks.

Cheers.


OK, that just places the drops even further than the half to 2/3 drop, and the 3/4 drop, from cobblestone scale.

It will be interesting down the road when someone gets around to computing the expected credit, to see just how far down the normalisation step has pushed it.

[Edit:] For info, the downscaling comes from this:
The claimed credit of a job (in Cobblestones) is
C = F * cobblestone_scale

if app.min_avg_pfc is defined
    C = app.min_avg_pfc * wu.fpops_est * cobblestone_scale
else
    C = wu.fpops_est * cobblestone_scale

with cobblestone_scale = 200/86400e9,
and app.min_avg_pfc derived from Boinc Whetstone, as opposed to a true Whetstone representing the app and host.

Pretty big drop :)
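
To put rough numbers on the rule quoted above, here is a small Python sketch. It is not BOINC server code: `claimed_credit` is a hypothetical helper that simply mirrors the if/else in this post, and the task size and the 0.3 value for `min_avg_pfc` are made-up inputs chosen to resemble the ~30% Whetstone ratios seen earlier in the thread.

```python
# cobblestone_scale as given in the post: 200 credits per day at 1 GFLOPS.
COBBLESTONE_SCALE = 200 / 86400e9


def claimed_credit(fpops_est, min_avg_pfc=None):
    """Mirror the quoted rule: scale by min_avg_pfc only when it is defined."""
    if min_avg_pfc is not None:
        return min_avg_pfc * fpops_est * COBBLESTONE_SCALE
    return fpops_est * COBBLESTONE_SCALE


# Hypothetical task: one day of work at 1 GFLOPS (86400e9 operations).
fpops_est = 86400e9

unscaled = claimed_credit(fpops_est)        # about 200 credits
scaled = claimed_credit(fpops_est, 0.3)     # about 60 credits

print(f"unscaled claim: {unscaled:.1f}")
print(f"scaled claim:   {scaled:.1f}")
```

With these illustrative inputs the claim falls from about 200 to about 60 credits, the same order of drop discussed in this thread.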
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1825368 · Report as offensive
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.