Collected data of various Cuda s@h executables public/nonpublic (Alpha/Beta)
Author | Message |
---|---|
-= Vyper =- Joined: 5 Sep 99 Posts: 1652 Credit: 1,065,191,981 RAC: 2,537 |
Hi. While the servers were down I got bored, so I ran a benchmark with various full-length WUs and compiled some statistics from it. The last two pages contain a summary for one executable whose results shouldn't be considered accurate; other than that, no problems at all. You can even see, in numbers, the benefit of running more than one WU at a time on that particular card. Consider this a baseline that I can compare against in the future when running newer Cuda 3.2 beta builds and other performance enhancements to the Cuda executables. Head over to my blog to download the file! http://vyper.kafit.se Kind regards, Vyper _________________________________________________________________________ Addicted to SETI crunching! Founder of GPU Users Group |
ML1 Joined: 25 Nov 01 Posts: 20331 Credit: 7,508,002 RAC: 20 |
"... benchmark with various full length wu's and made some statistics over that benchmark." Very interesting, but some explanation is needed... How do you calculate the efficiency for 2 WUs vs 1 WU? Is that using hyperthreading with one WU per virtual core? Or 2 WUs on one core? There's one very obvious 'interesting' result for your 'fermipfForceSerial'... Can you explain further? I've seen comments about running multiple WUs in parallel on nVidia GPUs... Good idea or bad? BTW: Looks like I'll be jumping in to get a Palit 1GB GeForce GTS 450 Sonic (40nm, 1GB GDDR5 at 3900MHz, GPU 880MHz, Shader 1760MHz, 192 Cores, 2x DVI-I / HDMI) to add to my crunching count... Thanks all for the comments in another thread. Happy crunchin', Martin See new freedom: Mageia Linux Take a look for yourself: Linux Format The Future is what We all make IT (GPLv3) |
-= Vyper =- Joined: 5 Sep 99 Posts: 1652 Credit: 1,065,191,981 RAC: 2,537 |
"Very interesting, but some explanation needed..." Hi. The efficiency is measured by running two identical benchmarks on one GPU at the same time. Each WU then takes longer, in seconds, than a single WU running alone, but two of them finish in that time, so what you compare is results per day. The formula is as follows: 86400 / 600 seconds = 144 results / day in a 1 WU per GPU config. 86400 / 900 seconds = 96 results / day per app instance when two run on a single GPU. In this testbed I had two WUs in parallel, so 2 x 96 = 192 results / day. The numbers are only an example of the measurement, and in that example the config shows a 33.3% increase in throughput (192 vs 144), RAC-wise. The interesting thing about fermipfforceserial is that it exited early, before the test result was done, so that test finished faster than the others; the fermipfforceserial results in the 2 WU configuration should not be considered valid at all. So the point is that the newer Fermi architecture leaves some headroom in each GPU, which makes it possible to increase the output per day. In those terms I would consider it a good thing. Kind regards, Vyper |
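
The throughput arithmetic in the post above can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example, and the 600 s and 900 s run times are the example figures from the post, not measured benchmark results.

```python
SECONDS_PER_DAY = 86_400


def results_per_day(seconds_per_wu: float, concurrent_wus: int = 1) -> float:
    """Results per day on one GPU when `concurrent_wus` WUs run side by side.

    `seconds_per_wu` is the wall-clock time of each WU in that configuration.
    """
    if seconds_per_wu <= 0 or concurrent_wus < 1:
        raise ValueError("need a positive run time and at least one WU")
    return concurrent_wus * SECONDS_PER_DAY / seconds_per_wu


def throughput_gain_percent(single_seconds: float,
                            concurrent_seconds: float,
                            concurrent_wus: int) -> float:
    """Percentage change in results per day versus running one WU at a time."""
    baseline = results_per_day(single_seconds)
    parallel = results_per_day(concurrent_seconds, concurrent_wus)
    return (parallel / baseline - 1.0) * 100.0


if __name__ == "__main__":
    # Example figures from the post: 600 s alone, 900 s each when two run together.
    print(results_per_day(600))                       # 1 WU per GPU
    print(results_per_day(900, concurrent_wus=2))     # 2 WUs per GPU
    print(round(throughput_gain_percent(600, 900, 2), 1))
```

A negative gain would mean the concurrent run times grew so much that running one WU at a time is the faster option on that card.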