CUDA MB benchmarking


Profile Raistmer
Volunteer tester
Message 35697 - Posted: 13 Dec 2008, 14:13:55 UTC
Last modified: 13 Dec 2008, 14:15:10 UTC

Benchmark conditions:

All apps ran in a 3+1 configuration, i.e. 3 cores of the quad were busy with BOINC (AKv8 SSE4.1) and the 4th core (without process affinity) ran the app under test.
No priority boost was added for the CUDA app (but it had a whole free core for feeding the GPU).

In short:
CPU opt app wall time: ~120 seconds for this test WU (testWU-1).
GPU CUDA beta app wall time: ~240 seconds.

Full benchmark with host configuration:

============
AK_v8_win_SSE41.exe -verb -st / testWU-1.wu :
Started at : 16:33:47.267
Ended at : 16:35:43.862
116.454 secs Elapsed
114.427 secs CPU time

[ stderr ]
Can't set up shared mem: -1
Will run in standalone mode.
Windows optimized S@H Enhanced application by Alex Kan
Version info: SSE4.1 (Intel, Core 2-optimized v8-nographics) V5.13 by Alex Kan
SSE4.1 Win32 Build 41 , Ported by : Jason G, Raistmer, JDWhale

CPUID: Intel(R) Core(TM)2 Quad CPU Q9450 @ 2.66GHz
Speed: 4 x 2676 MHz
Cache: L1=64K L2=6144K
Features: MMX SSE SSE2 SSE3 SSSE3 SSE4.1

Work Unit Info:
...............
Credit multiplier is : 2.85
WU true angle range is : 0.604884

Flopcounter: 636762899949.714840

Spike count: 0
Pulse count: 0
Triplet count: 0
Gaussian count: 0
called boinc_finish
[ /stderr ]
------------

setiathome_6.04_windows_intelx86_cuda_MyBuild.exe -verb -st / testWU-1.wu :
Started at : 16:43:34.061
Ended at : 16:47:35.034
240.926 secs Elapsed
19.204 secs CPU time
Speedup : 83.22%
Ratio : 5.96 x

Result : Strongly similar, Q= 99.90%
[ stderr ]
cudaAcc_initializeDevice: Found 1 CUDA device(s):
Device 1 : GeForce 9400 GT
cudaAcc_initializeDevice is determiming what CUDA device to use...
determined to use CUDA device 1: GeForce 9400 GT
SETI@home using CUDA accelerated device GeForce 9400 GT
setiathome_enhanced 6.02 Visual Studio/Microsoft C++
libboinc: 6.3.22

Work Unit Info:
...............
WU true angle range is : 0.604884
Optimal function choices:
-----------------------------------------------------
name
-----------------------------------------------------
v_BaseLineSmooth (no other)

v_GetPowerSpectrum 0.00020 0.00000 test
v_GetPowerSpectrum 0.00020 0.00000 choice

v_ChirpData 0.02000 0.00000 test
v_ChirpData 0.02000 0.00000 choice

v_Transpose 0.00662 0.00000 test
v_Transpose2 0.00614 0.00000 test
v_Transpose4 0.00397 0.00000 test
v_Transpose8 0.00844 0.00000 test
v_Transpose4 0.00397 0.00000 choice

FPU opt folding 0.00285 0.00000 test
FPU opt folding 0.00285 0.00000 choice


Flopcounter: 745980599434.345090

Spike count: 0
Pulse count: 0
Triplet count: 0
Gaussian count: 0
called boinc_finish
[ /stderr ]


------------

setiathome_6.05_windows_intelx86__cuda.exe -verb -st / testWU-1.wu :
Started at : 16:47:35.175
Ended at : 16:51:37.770
242.549 secs Elapsed
20.857 secs CPU time
Speedup : 81.77%
Ratio : 5.49 x

Result : Strongly similar, Q= 99.90%
[ stderr ]
cudaAcc_initializeDevice: Found 1 CUDA device(s):
Device 1 : GeForce 9400 GT
cudaAcc_initializeDevice is determiming what CUDA device to use...
determined to use CUDA device 1: GeForce 9400 GT
SETI@home using CUDA accelerated device GeForce 9400 GT
setiathome_enhanced 6.02 Visual Studio/Microsoft C++
libboinc: 6.3.22

Work Unit Info:
...............
WU true angle range is : 0.604884
Optimal function choices:
-----------------------------------------------------
name
-----------------------------------------------------
v_BaseLineSmooth (no other)

v_GetPowerSpectrum 0.00020 0.00000 test
v_GetPowerSpectrum 0.00020 0.00000 choice

v_ChirpData 0.01933 0.00000 test
v_ChirpData 0.01933 0.00000 choice

v_Transpose 0.00660 0.00000 test
v_Transpose2 0.00786 0.00000 test
v_Transpose4 0.00567 0.00000 test
v_Transpose8 0.00868 0.00000 test
v_Transpose4 0.00567 0.00000 choice

FPU opt folding 0.00282 0.00000 test
FPU opt folding 0.00282 0.00000 choice


Flopcounter: 745980599434.345090

Spike count: 0
Pulse count: 0
Triplet count: 0
Gaussian count: 0
called boinc_finish
[ /stderr ]


------------

Quick timetable

WU : testWU-1.wu
AK_v8_win_SSE41.exe : 114.427 secs CPU
setiathome_6.04_windows_intelx86_cuda_MyBuild.exe : 19.204 secs CPU
Speedup : 83.22%
Ratio : 5.96 x
setiathome_6.05_windows_intelx86__cuda.exe : 20.857 secs CPU
Speedup : 81.77%
Ratio : 5.49 x

------------
CPU:
Number of processors 1
Number of cores 4 (max 4)
Specification Intel(R) Core(TM)2 Quad CPU Q9450 @ 2.66GHz
Codename Yorkfield
Core Speed 2660.1 MHz (8.0 x 332.5 MHz)
Core Stepping C1
Technology 45 nm
Stock frequency 2666 MHz
------------
Chipset:
Northbridge Intel Q35 rev. A2
Southbridge Intel ID2914 rev. 02
------------
RAM:
Memory Type DDR2
Memory Size 4096 MBytes
Memory Frequency 399.0 MHz (5:6)
Max bandwidth PC2-6400 (400 MHz)
CAS# 5.0
RAS# to CAS# 5
RAS# Precharge 5
Cycle Time (tRAS) 15
------------
OS:
Windows Version Microsoft Windows Vista (6.0) Business Edition (Build 6000)
============
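
The Speedup and Ratio figures in the log above are based on CPU time, not wall-clock time, which is why they look favourable even though the CUDA run took about twice as long to finish. A minimal Python sketch of that arithmetic, assuming the test harness computes Speedup as 1 - test/ref and Ratio as ref/test (this reproduces the printed numbers to within rounding, but the harness source is not shown in this thread; the function names are made up for illustration):

```python
def speedup_percent(ref_secs: float, test_secs: float) -> float:
    """Percentage of the reference time saved by the tested app."""
    return (1.0 - test_secs / ref_secs) * 100.0


def ratio(ref_secs: float, test_secs: float) -> float:
    """How many times smaller the tested time is than the reference time."""
    return ref_secs / test_secs


# testWU-1.wu, numbers copied from the log above (seconds).
AK_V8 = {"elapsed": 116.454, "cpu": 114.427}     # CPU reference app
CUDA_604 = {"elapsed": 240.926, "cpu": 19.204}   # 6.04 CUDA build

# CPU-time view: the figures the harness prints.
print(f"CPU time  : speedup {speedup_percent(AK_V8['cpu'], CUDA_604['cpu']):.2f}%, "
      f"ratio {ratio(AK_V8['cpu'], CUDA_604['cpu']):.2f} x")

# Wall-clock view: a ratio below 1 means the tested app finished later.
print(f"Wall clock: ratio {ratio(AK_V8['elapsed'], CUDA_604['elapsed']):.2f} x")
```

When reading the timetables in this thread, the CPU-time ratio says how much CPU the CUDA app frees up, while the wall-clock ratio says how soon a result comes back.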
News about SETI opt app releases: https://twitter.com/Raistmer
Profile Raistmer
Volunteer tester
Message 35702 - Posted: 13 Dec 2008, 16:11:33 UTC - in response to Message 35697.  
Last modified: 13 Dec 2008, 16:11:57 UTC

The same host, PG1327 test WU:

AK_v8_win_SSE41.exe -verb -st / PG1327.wu :
Started at : 18:09:35.789
Ended at : 18:13:18.230
222.409 secs Elapsed
220.305 secs CPU time

[ stderr ]
Can't set up shared mem: -1
Will run in standalone mode.
Windows optimized S@H Enhanced application by Alex Kan
Version info: SSE4.1 (Intel, Core 2-optimized v8-nographics) V5.13 by Alex Kan
SSE4.1 Win32 Build 41 , Ported by : Jason G, Raistmer, JDWhale

CPUID: Intel(R) Core(TM)2 Quad CPU Q9450 @ 2.66GHz
Speed: 4 x 2655 MHz
Cache: L1=64K L2=6144K
Features: MMX SSE SSE2 SSE3 SSSE3 SSE4.1

Work Unit Info:
...............
Credit multiplier is : 2.85
WU true angle range is : 1.326684

Flopcounter: 1483630339588.494400

Spike count: 0
Pulse count: 0
Triplet count: 6
Gaussian count: 0
called boinc_finish
[ /stderr ]

------------

setiathome_6.05_windows_intelx86__cuda.exe -verb -st / PG1327.wu :
Started at : 18:35:50.921
Ended at : 18:43:28.828
457.876 secs Elapsed
30.514 secs CPU time
Speedup : 86.15%
Ratio : 7.22 x

Result : Strongly similar, Q= 97.70%
[ stderr ]
Can't set up shared mem: -1
Will run in standalone mode.
cudaAcc_initializeDevice: Found 1 CUDA device(s):
Device 1 : GeForce 9400 GT
cudaAcc_initializeDevice is determiming what CUDA device to use...
determined to use CUDA device 1: GeForce 9400 GT
SETI@home using CUDA accelerated device GeForce 9400 GT
setiathome_enhanced 6.02 Visual Studio/Microsoft C++
libboinc: 6.3.22

Work Unit Info:
...............
WU true angle range is : 1.326684
Optimal function choices:
-----------------------------------------------------
name
-----------------------------------------------------
v_BaseLineSmooth (no other)

v_GetPowerSpectrum 0.00023 0.00000 test
v_GetPowerSpectrum 0.00023 0.00000 choice

v_ChirpData 0.02242 0.00000 test
v_ChirpData 0.02242 0.00000 choice

v_Transpose 0.00672 0.00000 test
v_Transpose2 0.00722 0.00000 test
v_Transpose4 0.00539 0.00000 test
v_Transpose8 0.00729 0.00000 test
v_Transpose4 0.00539 0.00000 choice

FPU opt folding 0.00226 0.00000 test
FPU opt folding 0.00226 0.00000 choice


Flopcounter: 1460648473956.838400

Spike count: 0
Pulse count: 0
Triplet count: 6
Gaussian count: 0
called boinc_finish
[ /stderr ]


------------

Quick timetable

WU : PG1327.wu
AK_v8_win_SSE41.exe : 220.305 secs CPU
setiathome_6.05_windows_intelx86__cuda.exe : 30.514 secs CPU
Speedup : 86.15%
Ratio : 7.22 x
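
To compare runs without retyping numbers, the timing lines can be pulled out of a pasted log like the one above. A sketch assuming the line shapes seen in this post (`<app>.exe -verb -st / <wu> :`, `N secs Elapsed`, `N secs CPU time`); `parse_runs` is a hypothetical helper, not part of the test harness:

```python
import re

# Patterns match the line shapes seen in this thread's logs; that the
# harness always prints them this way is an assumption.
HEADER = re.compile(r"^(?P<app>\S+\.exe) -verb -st / (?P<wu>\S+) :\s*$")
TIMING = re.compile(r"^(?P<secs>\d+(?:\.\d+)?) secs (?P<kind>Elapsed|CPU time)\s*$")


def parse_runs(log_text: str) -> list[dict]:
    """Return one dict per run with app, wu, elapsed and cpu seconds."""
    runs = []
    for line in log_text.splitlines():
        line = line.strip()
        header = HEADER.match(line)
        if header:
            # A header line starts a new run record.
            runs.append({"app": header["app"], "wu": header["wu"]})
            continue
        timing = TIMING.match(line)
        if timing and runs:
            # Timing lines belong to the most recent header.
            key = "elapsed" if timing["kind"] == "Elapsed" else "cpu"
            runs[-1][key] = float(timing["secs"])
    return runs


SAMPLE = """\
AK_v8_win_SSE41.exe -verb -st / PG1327.wu :
222.409 secs Elapsed
220.305 secs CPU time
setiathome_6.05_windows_intelx86__cuda.exe -verb -st / PG1327.wu :
457.876 secs Elapsed
30.514 secs CPU time
"""

for run in parse_runs(SAMPLE):
    print(run["app"], run["wu"], run["elapsed"], run["cpu"])
```

Logs that use a different header layout would need another pattern for the header line.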
Profile Raistmer
Volunteer tester
Message 35706 - Posted: 13 Dec 2008, 17:34:10 UTC - in response to Message 35702.  

Same conditions, PG0444 test WU.

AK_v8_win_SSE41.exe -verb -st / PG0444.wu :
Started at : 19:13:01.815
Ended at : 19:18:09.712
307.866 secs Elapsed
305.809 secs CPU time

[ stderr ]
Can't set up shared mem: -1
Will run in standalone mode.
Windows optimized S@H Enhanced application by Alex Kan
Version info: SSE4.1 (Intel, Core 2-optimized v8-nographics) V5.13 by Alex Kan
SSE4.1 Win32 Build 41 , Ported by : Jason G, Raistmer, JDWhale

CPUID: Intel(R) Core(TM)2 Quad CPU Q9450 @ 2.66GHz
Speed: 4 x 2676 MHz
Cache: L1=64K L2=6144K
Features: MMX SSE SSE2 SSE3 SSSE3 SSE4.1

Work Unit Info:
...............
Credit multiplier is : 2.85
WU true angle range is : 0.444184

Flopcounter: 1543175889339.756800

Spike count: 2
Pulse count: 0
Triplet count: 3
Gaussian count: 1
called boinc_finish
[ /stderr ]


------------

setiathome_6.05_windows_intelx86__cuda.exe -verb -st / PG0444.wu :
Started at : 19:49:08.124
Ended at : 19:59:43.091
634.920 secs Elapsed
32.589 secs CPU time
Speedup : 89.34%
Ratio : 9.38 x

Result : Strongly similar, Q= 97.44%
[ stderr ]
cudaAcc_initializeDevice: Found 1 CUDA device(s):
Device 1 : GeForce 9400 GT
cudaAcc_initializeDevice is determiming what CUDA device to use...
determined to use CUDA device 1: GeForce 9400 GT
SETI@home using CUDA accelerated device GeForce 9400 GT
setiathome_enhanced 6.02 Visual Studio/Microsoft C++
libboinc: 6.3.22

Work Unit Info:
...............
WU true angle range is : 0.444184
Optimal function choices:
-----------------------------------------------------
name
-----------------------------------------------------
v_BaseLineSmooth (no other)

v_GetPowerSpectrum 0.00022 0.00000 test
v_GetPowerSpectrum 0.00022 0.00000 choice

v_ChirpData 0.01908 0.00000 test
v_ChirpData 0.01908 0.00000 choice

v_Transpose 0.00601 0.00000 test
v_Transpose2 0.00632 0.00000 test
v_Transpose4 0.00447 0.00000 test
v_Transpose8 0.01049 0.00000 test
v_Transpose4 0.00447 0.00000 choice

FPU opt folding 0.00473 0.00000 test
FPU opt folding 0.00473 0.00000 choice


Flopcounter: 2125673090512.236800

Spike count: 2
Pulse count: 0
Triplet count: 3
Gaussian count: 1
called boinc_finish
[ /stderr ]


------------

Quick timetable

WU : PG0444.wu
AK_v8_win_SSE41.exe : 305.809 secs CPU
setiathome_6.05_windows_intelx86__cuda.exe : 32.589 secs CPU
Speedup : 89.34%
Ratio : 9.38 x



Profile Raistmer
Volunteer tester
Message 35708 - Posted: 13 Dec 2008, 19:04:47 UTC - in response to Message 35706.  

Last of the PG* WUs (as already mentioned, PG0009.wu fails).
The test WU is PG0395.

============
AK_v8_win_SSE41.exe -verb -st / PG0395.wu :
Started at : 20:22:56.826
Ended at : 20:28:56.344
359.486 secs Elapsed
357.164 secs CPU time

[ stderr ]
Can't set up shared mem: -1
Will run in standalone mode.
Windows optimized S@H Enhanced application by Alex Kan
Version info: SSE4.1 (Intel, Core 2-optimized v8-nographics) V5.13 by Alex Kan
SSE4.1 Win32 Build 41 , Ported by : Jason G, Raistmer, JDWhale

CPUID: Intel(R) Core(TM)2 Quad CPU Q9450 @ 2.66GHz
Speed: 4 x 2655 MHz
Cache: L1=64K L2=6144K
Features: MMX SSE SSE2 SSE3 SSSE3 SSE4.1

Work Unit Info:
...............
Credit multiplier is : 2.85
WU true angle range is : 0.394768

Flopcounter: 2204837259790.656200

Spike count: 0
Pulse count: 0
Triplet count: 0
Gaussian count: 1
called boinc_finish
[ /stderr ]

------------

setiathome_6.05_windows_intelx86__cuda.exe -verb -st / PG0395.wu :
Started at : 21:05:47.394
Ended at : 21:17:50.657
723.216 secs Elapsed
34.710 secs CPU time
Speedup : 90.28%
Ratio : 10.29 x

Result : Strongly similar, Q= 97.48%
[ stderr ]
cudaAcc_initializeDevice: Found 1 CUDA device(s):
Device 1 : GeForce 9400 GT
cudaAcc_initializeDevice is determiming what CUDA device to use...
determined to use CUDA device 1: GeForce 9400 GT
SETI@home using CUDA accelerated device GeForce 9400 GT
setiathome_enhanced 6.02 Visual Studio/Microsoft C++
libboinc: 6.3.22

Work Unit Info:
...............
WU true angle range is : 0.394768
Optimal function choices:
-----------------------------------------------------
name
-----------------------------------------------------
v_BaseLineSmooth (no other)

v_GetPowerSpectrum 0.00024 0.00000 test
v_GetPowerSpectrum 0.00024 0.00000 choice

v_ChirpData 0.01704 0.00000 test
v_ChirpData 0.01704 0.00000 choice

v_Transpose 0.00654 0.00000 test
v_Transpose2 0.00573 0.00000 test
v_Transpose4 0.00361 0.00000 test
v_Transpose8 0.00851 0.00000 test
v_Transpose4 0.00361 0.00000 choice

FPU opt folding 0.00322 0.00000 test
FPU opt folding 0.00322 0.00000 choice


Flopcounter: 2738901260640.970200

Spike count: 0
Pulse count: 0
Triplet count: 0
Gaussian count: 1
called boinc_finish
[ /stderr ]


------------

Quick timetable

WU : PG0395.wu
AK_v8_win_SSE41.exe : 357.164 secs CPU
setiathome_6.05_windows_intelx86__cuda.exe : 34.710 secs CPU
Speedup : 90.28%
Ratio : 10.29 x
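
Putting the four 9400 GT runs side by side makes the pattern easier to see. A short Python sketch that uses only the elapsed and CPU times quoted in the posts above (for testWU-1 the 6.05 build is used); the names `RUNS` and `summarize` are illustrative:

```python
# (WU, true angle range, CPU app elapsed, CPU app CPU time,
#  CUDA 6.05 elapsed, CUDA 6.05 CPU time); all times in seconds.
RUNS = [
    ("testWU-1", 0.604884, 116.454, 114.427, 242.549, 20.857),
    ("PG1327",   1.326684, 222.409, 220.305, 457.876, 30.514),
    ("PG0444",   0.444184, 307.866, 305.809, 634.920, 32.589),
    ("PG0395",   0.394768, 359.486, 357.164, 723.216, 34.710),
]


def summarize(runs):
    """Return (wu, angle_range, cpu_time_ratio, wall_clock_ratio) rows."""
    rows = []
    for wu, ar, ref_wall, ref_cpu, gpu_wall, gpu_cpu in runs:
        rows.append((wu, ar, ref_cpu / gpu_cpu, ref_wall / gpu_wall))
    return rows


print(f"{'WU':<10}{'AR':>10}{'CPU-time ratio':>16}{'wall ratio':>12}")
for wu, ar, cpu_ratio, wall_ratio in summarize(RUNS):
    print(f"{wu:<10}{ar:>10.6f}{cpu_ratio:>16.2f}{wall_ratio:>12.2f}")
```

Across these WUs the CPU-time ratio grows from about 5.5 to about 10.3, while the wall-clock ratio stays close to 0.5, i.e. on this card the CUDA app takes roughly twice as long as the CPU app in every case.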

Profile Raistmer
Volunteer tester
Message 35722 - Posted: 13 Dec 2008, 23:38:15 UTC - in response to Message 35708.  

The same host, but with another GPU:

Running app : AK_v8_win_SSE41.exe with -verb -st
with WU : PG0395.wu
Started at : 02:24:52.878
Ended at : 02:31:10.601
377.629 secs Elapsed
375.416 secs CPU time
Result : stored as ref for validation.
------------
Running app : setiathome_6.04_windows_intelx86_cuda_MyBuild.exe with -verb -st
with WU : PG0395.wu
Started at : 02:31:10.694
Ended at : 02:33:57.989
167.248 secs Elapsed
31.325 secs CPU time
Speedup : 91.66%
Ratio : 11.98 x
Result : Strongly similar, Q= 97.48%

It's an Asus 9600GSO without additional overclocking.
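
For a rough card-to-card comparison on PG0395.wu, the wall-clock times from the two posts can be divided directly. A small Python sketch; note that the two GPU runs used different builds (6.05 on the 9400 GT, 6.04 MyBuild on the 9600GSO), so the figure mixes card and build differences. The helper name `times_faster` is illustrative:

```python
# PG0395.wu elapsed (wall-clock) times in seconds, copied from the posts.
ELAPSED = {
    "AK_v8 SSE4.1 (CPU)": 377.629,          # reference run from this post
    "9400 GT, CUDA 6.05": 723.216,          # from the previous post
    "9600GSO, CUDA 6.04 MyBuild": 167.248,  # from this post
}


def times_faster(slow_secs: float, fast_secs: float) -> float:
    """Return how many times shorter fast_secs is than slow_secs."""
    return slow_secs / fast_secs


gso = ELAPSED["9600GSO, CUDA 6.04 MyBuild"]
print(f"9600GSO vs 9400 GT: {times_faster(ELAPSED['9400 GT, CUDA 6.05'], gso):.2f} x")
print(f"9600GSO vs CPU app: {times_faster(ELAPSED['AK_v8 SSE4.1 (CPU)'], gso):.2f} x")
```

Unlike the 9400 GT runs, here the CUDA app finishes ahead of the CPU app in wall-clock terms as well as in CPU time.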
