CPU vs GPU


Message boards : Number crunching : CPU vs GPU

Author Message
ahj
Joined: 24 Sep 02
Posts: 11
Credit: 101,600
RAC: 0
Australia
Message 1184730 - Posted: 12 Jan 2012, 10:40:54 UTC

Hello

I've currently got a GTX470 running seti with seemingly good performance. However, I'd like to know how much more productive it is versus a core i5 2500, for example. 2x faster? x5, x10?

Has running seti on CPUs become redundant, or can they still provide useful work?

Michel448a
Volunteer tester
Joined: 27 Oct 00
Posts: 1201
Credit: 2,891,635
RAC: 0
Canada
Message 1184743 - Posted: 12 Jan 2012, 12:40:06 UTC - in response to Message 1184730.

yes if you have a multi core cpu.

but ya, the gpu is so fast ... but like my gtx-480, it can get extremely hot (with my gpu fan at 85% capacity, my gpu is still at 75 C (167 F) :P )

Link
Joined: 18 Sep 03
Posts: 824
Credit: 1,545,480
RAC: 290
Germany
Message 1184748 - Posted: 12 Jan 2012, 13:49:54 UTC - in response to Message 1184730.

Has running seti on CPUs become redundant, or can they still provide useful work?

Besides the fact that any returned (valid) result is useful for the project, CPUs are the only ones that get VLAR tasks assigned, so no, I don't think that using them for SETI is redundant.
____________
.

tullio
Joined: 9 Apr 04
Posts: 3590
Credit: 362,934
RAC: 201
Italy
Message 1184752 - Posted: 12 Jan 2012, 14:06:07 UTC

I have only a CPU, running two OSes: one real (Linux) and one virtual (Solaris). I have 4 WUs in a pending state; in two of them the wingman uses cuda_fermi.
Tullio
____________

Grant (SSSF)
Joined: 19 Aug 99
Posts: 5698
Credit: 56,457,389
RAC: 49,094
Australia
Message 1184789 - Posted: 12 Jan 2012, 18:42:45 UTC - in response to Message 1184730.

I've currently got a GTX470 running seti with seemingly good performance. However, I'd like to know how much more productive it is versus a core i5 2500, for example. 2x faster? x5, x10?

My GTX560Ti does 1 shorty every 1.5 minutes. My i7 2600 does 1 shorty every 4.75 minutes.


Has running seti on CPUs become redundant, or can they still provide useful work?

They still provide useful work.

____________
Grant
Darwin NT.

Josef W. Segur
Volunteer developer
Volunteer tester
Joined: 30 Oct 99
Posts: 4206
Credit: 1,030,844
RAC: 270
United States
Message 1184797 - Posted: 12 Jan 2012, 19:12:15 UTC - in response to Message 1184730.

Hello

I've currently got a GTX470 running seti with seemingly good performance. However, I'd like to know how much more productive it is versus a core i5 2500, for example. 2x faster? x5, x10?

Has running seti on CPUs become redundant, or can they still provide useful work?

From the project statistics at BoincStats, there are 238518 active hosts with a total RAC of 107435347, so the mean RAC per host is about 450.43. Also, there are 52082 hosts with a RAC that rounds to 451 or higher, so about 186436 active hosts are below the mean RAC. I think it's fair to say most of those are not doing GPU crunching, though exceptions for only part-time SETI crunching could be found.
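The arithmetic behind these figures can be checked with a short sketch. The three input numbers are the ones quoted from BoincStats above; the variable names are only illustrative, and the hosts-below-the-mean count assumes that "rounds to 451 or higher" is a fair stand-in for "at or above the mean".

```python
# Figures quoted in the post above (BoincStats, January 2012).
active_hosts = 238_518
total_rac = 107_435_347
hosts_at_or_above = 52_082  # hosts whose RAC rounds to 451 or higher

# Mean RAC per active host.
mean_rac = total_rac / active_hosts

# Hosts below the mean, and their share of all active hosts.
hosts_below = active_hosts - hosts_at_or_above
share_below = hosts_below / active_hosts

print(f"mean RAC per host: {mean_rac:.2f}")
print(f"hosts below the mean: {hosts_below} ({share_below:.1%})")
```

Note that this only says how many hosts sit below the mean, not how much of the total work they contribute; that would need the per-host RAC distribution, which is not given here.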

One GTX470 will outproduce all cores of an i5 2500, but the factor depends on what applications you run and various other system specifics. x5 may be about right for the stock applications delivered by the project.
Joe

Wedge009
Volunteer tester
Joined: 3 Apr 99
Posts: 309
Credit: 130,705,284
RAC: 248,686
Australia
Message 1184815 - Posted: 12 Jan 2012, 20:49:59 UTC - in response to Message 1184748.

...CPUs are the only ones that get VLAR tasks assigned, so no, I don't think that using them for SETI is redundant.

Maybe it's because of the anonymous platform set-up, but I often get VLARs assigned to my Radeon GPU. While it doesn't suffer as badly as CUDA does in terms of run times, it does cause severe unresponsiveness in the GUI, so I often have to manually push the VLAR tasks back to the CPU. It's a bit frustrating.

But I concur with the point that CPUs are not redundant for S@h processing.
____________
Soli Deo Gloria

Mike
Volunteer tester
Joined: 17 Feb 01
Posts: 23392
Credit: 31,845,746
RAC: 23,958
Germany
Message 1184825 - Posted: 12 Jan 2012, 21:24:55 UTC
Last modified: 12 Jan 2012, 21:25:24 UTC

Each work unit processed is science.
So nothing is redundant in my opinion.

ATI/AMD GPUs can handle VLARs very well.
____________

AllenIN
Joined: 5 Dec 00
Posts: 159
Credit: 12,602,539
RAC: 14,272
United States
Message 1184884 - Posted: 13 Jan 2012, 2:59:27 UTC

I have just built a system around the AMD A8-3870 APU.

I was wondering if anyone knows if the internal GPU (Radeon 6550D) is available for seti. So far I haven't been able to get any WU's for it.

Thanks, Allen
____________

arkayn
Volunteer tester
Joined: 14 May 99
Posts: 3595
Credit: 47,380,702
RAC: 4,180
United States
Message 1184894 - Posted: 13 Jan 2012, 4:42:48 UTC - in response to Message 1184884.

I have just built a system around the AMD A8-3870 APU.

I was wondering if anyone knows if the internal GPU (Radeon 6550D) is available for seti. So far I haven't been able to get any WU's for it.

Thanks, Allen


There is no stock app for AMD GPU's, you would need to install the optimized apps to take advantage of the AMD GPU.
http://lunatics.kwsn.net/index.php?module=Downloads;catd=9
____________

Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 3368
Credit: 46,108,420
RAC: 20,493
Russia
Message 1184915 - Posted: 13 Jan 2012, 8:59:23 UTC - in response to Message 1184815.

While it doesn't suffer as badly as CUDA does in terms of run times, it does cause severe unresponsiveness in the GUI, so I often have to manually push the VLAR tasks back to the CPU. It's a bit frustrating.

ATi GPUs can handle VLARs much better than NV ones (comparing on the same code base, so it looks like it's more a function of the hardware than the software).
To reduce lag on VLARs, increase the -period_iterations_num value.

Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 3368
Credit: 46,108,420
RAC: 20,493
Russia
Message 1184917 - Posted: 13 Jan 2012, 9:00:45 UTC - in response to Message 1184894.

I have just built a system around the AMD A8-3870 APU.

I was wondering if anyone knows if the internal GPU (Radeon 6550D) is available for seti. So far I haven't been able to get any WU's for it.

Thanks, Allen


There is no stock app for AMD GPU's, you would need to install the optimized apps to take advantage of the AMD GPU.
http://lunatics.kwsn.net/index.php?module=Downloads;catd=9

It would be interesting to see the CLinfo output for this GPU to say whether it is capable or not. Perhaps yes, it's capable of running the OpenCL apps too. And surely the hybrid AP.

AllenIN
Joined: 5 Dec 00
Posts: 159
Credit: 12,602,539
RAC: 14,272
United States
Message 1184930 - Posted: 13 Jan 2012, 11:09:52 UTC

Thanks to both of you. I will bone up on the info and see what I get.
Stay tuned!

Allen
____________

hbomber
Volunteer tester
Joined: 2 May 01
Posts: 437
Credit: 50,852,768
RAC: 216
Bulgaria
Message 1184940 - Posted: 13 Jan 2012, 12:11:17 UTC - in response to Message 1184730.
Last modified: 13 Jan 2012, 12:13:25 UTC

Hello

I've currently got a GTX470 running seti with seemingly good performance. However, I'd like to know how much more productive it is versus a core i5 2500, for example. 2x faster? x5, x10?

Has running seti on CPUs become redundant, or can they still provide useful work?



The 470 would be about 1.6x faster than all 4 cores combined. With my setup, a GTX 460 at 810 MHz core is just about as productive as a 2500K at 4.65 GHz. So, bearing in mind that your 2500 is at 3.3 GHz and the 470 is maybe a bit faster than my overclocked GPUs, the above number seems just about right to me. But the CPU is far more efficient. I think it uses around 50-60 watts, while the 470 will use something like 150-160.
My comparison was done using optimized applications. With stock apps the GPU will perform even better relative to the CPU.
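As a rough illustration of the efficiency point, the estimate can be turned into work per watt. This is only a sketch: the 1.6x throughput figure and the power ranges are the ones given in this post, the midpoints (55 W and 155 W) are an assumption, and the clock-scaling line assumes throughput scales linearly with CPU clock, which is at best an approximation.

```python
# Relative throughput: the whole CPU (all 4 cores) is the baseline at 1.0,
# and the GTX 470 is taken as 1.6x that, per the estimate in the post.
CPU_THROUGHPUT = 1.0
GPU_THROUGHPUT = 1.6

# Assumed midpoints of the quoted power ranges (50-60 W and 150-160 W).
CPU_WATTS = 55.0
GPU_WATTS = 155.0


def work_per_watt(relative_throughput: float, watts: float) -> float:
    """Relative work delivered per watt of power drawn."""
    return relative_throughput / watts


cpu_efficiency = work_per_watt(CPU_THROUGHPUT, CPU_WATTS)
gpu_efficiency = work_per_watt(GPU_THROUGHPUT, GPU_WATTS)

# How many times more work per watt the CPU delivers than the GPU.
cpu_advantage = cpu_efficiency / gpu_efficiency

# Naive linear clock scaling from an overclocked 2500K to a stock i5 2500.
clock_scaling = 4.65 / 3.3

print(f"CPU work per watt vs GPU: {cpu_advantage:.2f}x")
print(f"2500K @ 4.65 GHz vs 2500 @ 3.3 GHz: {clock_scaling:.2f}x")
```

Under these assumptions the GPU finishes more work in total, but the CPU does more work for each watt it draws.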
____________

AllenIN
Joined: 5 Dec 00
Posts: 159
Credit: 12,602,539
RAC: 14,272
United States
Message 1185125 - Posted: 14 Jan 2012, 2:27:11 UTC - in response to Message 1184894.

I have just built a system around the AMD A8-3870 APU.

I was wondering if anyone knows if the internal GPU (Radeon 6550D) is available for seti. So far I haven't been able to get any WU's for it.

Thanks, Allen


There is no stock app for AMD GPU's, you would need to install the optimized apps to take advantage of the AMD GPU.
http://lunatics.kwsn.net/index.php?module=Downloads;catd=9


Wow, I think that made a difference. I just completed my first wu with the gpu and it was a 4hr 26 min unit that completed in 16 minutes or so.

Seems that the four cpu units are completing much faster now too. It appears that they will complete 4hr plus units in about 1 hour. I don't get it, but I like it.

Thanks again!!

____________

Grant (SSSF)
Joined: 19 Aug 99
Posts: 5698
Credit: 56,457,389
RAC: 49,094
Australia
Message 1185147 - Posted: 14 Jan 2012, 5:06:10 UTC - in response to Message 1185125.
Last modified: 14 Jan 2012, 5:06:44 UTC

Seems that the four cpu units are completing much faster now too. It appears that they will complete 4hr plus units in about 1 hour. I don't get it, but I like it.

First, the estimated completion times for GPU work tend to be way out. Way, way, way out. This is a result of a bug fix that has yet to be fixed.

But most importantly, the optimised applications give much improved crunching times.
For my i7 and GTX460, shorty WUs take about 40 min on the CPU (8 at a time) and just over 3 min on the GPU (2 at a time). VLARs on the CPU take 2-2.5 hours. The longest-running WUs on the GPU take about 22 min.
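Because the CPU and the GPU each run several tasks at once, per-task run times are not directly comparable. A minimal sketch of the conversion to effective minutes per finished result, using the shorty figures from this post ("just over 3 min" is rounded down to 3.0 here, which is an assumption):

```python
def minutes_per_result(task_minutes: float, concurrent_tasks: int) -> float:
    """Average wall-clock minutes between finished results when
    `concurrent_tasks` tasks of `task_minutes` each run side by side."""
    return task_minutes / concurrent_tasks


# Shorty WUs: about 40 min each on the CPU with 8 running at a time,
# about 3 min each on the GPU with 2 running at a time.
cpu_minutes = minutes_per_result(40.0, 8)
gpu_minutes = minutes_per_result(3.0, 2)

# How many shorties the GPU returns for each one the whole CPU returns.
gpu_speedup = cpu_minutes / gpu_minutes

print(f"CPU: one shorty every {cpu_minutes:.2f} min")
print(f"GPU: one shorty every {gpu_minutes:.2f} min")
print(f"GPU vs whole CPU: {gpu_speedup:.2f}x")
```

This compares the GPU against all CPU threads together, not against a single core.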
____________
Grant
Darwin NT.

AllenIN
Joined: 5 Dec 00
Posts: 159
Credit: 12,602,539
RAC: 14,272
United States
Message 1185301 - Posted: 14 Jan 2012, 20:34:11 UTC - in response to Message 1185147.

Thanks for the input Grant.

Those dedicated graphics cards are definitely a boost!!!!!!!!!

However, for the cost of this APU, I believe it's a real value.
Things are still ramping up, so I'm not yet sure just how things will shape up down the road.

You're definitely correct about the optimized apps though, they make a great deal of difference in compute time.

Allen
____________

ignorance is no excuse
Joined: 4 Oct 00
Posts: 9529
Credit: 44,433,274
RAC: 0
Korea, North
Message 1185386 - Posted: 15 Jan 2012, 2:19:24 UTC - in response to Message 1185301.

Raistmer, I just bought a laptop that has an A4-3300 APU. GPU calls the gpu portion of the chip a SUMO series HD 6480. It seems to work just fine with the HD5 version of the Lunatics app.
____________
In a rich man's house there is no place to spit but his face.
Diogenes Of Sinope

End terrorism by building a school

doug
Volunteer tester
Joined: 10 Jul 09
Posts: 199
Credit: 9,397,616
RAC: 1,359
United States
Message 1185400 - Posted: 15 Jan 2012, 5:30:55 UTC - in response to Message 1185386.

I have a 4-core i7 that says it's a Q840 running at 1.87 GHz in my laptop. It has an ATI FirePro M7820 card, which I believe is something in the Radeon 5800 series. I can't tell anymore with all these numbers and names. I'm running the Lunatics apps. VLARs are about the same between CPU and GPU, with the GPU being maybe slightly faster. Shorties are around 6-10 min wall time on the GPU vs. 30-50 on the CPU. AP WUs are maybe 2-3 hours on the GPU vs. 12 on the CPU. So I run 3 cores at 100% and 1 at 90% to feed the GPU. I have 4 CPU WUs running at a time and 2 GPU. I reschedule all AP WUs as GPU WUs. I get about an even mix of VLAR WUs for CPU and GPU delivered, and since they run about the same I don't mess with it. I seem to get about 7500 RAC out of it, but that is quite variable for a lot of reasons. All in all I consider that quite good for a laptop. The ATI GPU has had no heat problems whatsoever; it's running at about 59 C as we speak. I have burned my leg on the i7 heatsink, though, when I was wearing shorts in the summer.

Just my input.

AllenIN
Joined: 5 Dec 00
Posts: 159
Credit: 12,602,539
RAC: 14,272
United States
Message 1185402 - Posted: 15 Jan 2012, 5:39:49 UTC - in response to Message 1184917.

I have just built a system around the AMD A8-3870 APU.

I was wondering if anyone knows if the internal GPU (Radeon 6550D) is available for seti. So far I haven't been able to get any WU's for it.

Thanks, Allen


There is no stock app for AMD GPU's, you would need to install the optimized apps to take advantage of the AMD GPU.
http://lunatics.kwsn.net/index.php?module=Downloads;catd=9

It would be interesting to see the CLinfo output for this GPU to say whether it is capable or not. Perhaps yes, it's capable of running the OpenCL apps too. And surely the hybrid AP.


Here is the info from clinfo. Hope it's as interesting as you thought it would be. Maybe you can see something that would be helpful to me in implementing it fully.


Number of platforms: 1
Platform Profile: FULL_PROFILE
Platform Version: OpenCL 1.1 AMD-APP (831.4)
Platform Name: AMD Accelerated Parallel Processing
Platform Vendor: Advanced Micro Devices, Inc.
Platform Extensions: cl_khr_icd cl_amd_event_callback cl_amd_offline_devices cl_khr_d3d10_sharing


Platform Name: AMD Accelerated Parallel Processing
Number of devices: 2
Device Type: CL_DEVICE_TYPE_GPU
Device ID: 4098
Max compute units: 5
Max work items dimensions: 3
Max work items[0]: 256
Max work items[1]: 256
Max work items[2]: 256
Max work group size: 256
Preferred vector width char: 16
Preferred vector width short: 8
Preferred vector width int: 4
Preferred vector width long: 2
Preferred vector width float: 4
Preferred vector width double: 0
Max clock frequency: 600Mhz
Address bits: 32
Max memory allocation: 199753728
Image support: Yes
Max number of images read arguments: 128
Max number of images write arguments: 8
Max image 2D width: 8192
Max image 2D height: 8192
Max image 3D width: 2048
Max image 3D height: 2048
Max image 3D depth: 2048
Max samplers within kernel: 16
Max size of kernel argument: 1024
Alignment (bits) of base address: 2048
Minimum alignment (bytes) for any datatype: 128
Single precision floating point capability
Denorms: No
Quiet NaNs: Yes
Round to nearest even: Yes
Round to zero: Yes
Round to +ve and infinity: Yes
IEEE754-2008 fused multiply-add: Yes
Cache type: None
Cache line size: 0
Cache size: 0
Global memory size: 536870912
Constant buffer size: 65536
Max number of constant args: 8
Local memory type: Scratchpad
Local memory size: 32768
Error correction support: 0
Profiling timer resolution: 1
Device endianess: Little
Available: Yes
Compiler available: Yes
Execution capabilities:
Execute OpenCL kernels: Yes
Execute native function: No
Queue properties:
Out-of-Order: No
Profiling : Yes
Platform ID: 6B06C4F4
Name: BeaverCreek
Vendor: Advanced Micro Devices, Inc.
Driver version: CAL 1.4.1646 (VM)
Profile: FULL_PROFILE
Version: OpenCL 1.1 AMD-APP (831.4)
Extensions: cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_gl_sharing cl_ext_atomic_counters_32 cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_popcnt cl_khr_d3d10_sharing


Device Type: CL_DEVICE_TYPE_CPU
Device ID: 4098
Max compute units: 4
Max work items dimensions: 3
Max work items[0]: 1024
Max work items[1]: 1024
Max work items[2]: 1024
Max work group size: 1024
Preferred vector width char: 16
Preferred vector width short: 8
Preferred vector width int: 4
Preferred vector width long: 2
Preferred vector width float: 4
Preferred vector width double: 0
Max clock frequency: 3000Mhz
Address bits: 32
Max memory allocation: 1073741824
Image support: Yes
Max number of images read arguments: 128
Max number of images write arguments: 8
Max image 2D width: 8192
Max image 2D height: 8192
Max image 3D width: 2048
Max image 3D height: 2048
Max image 3D depth: 2048
Max samplers within kernel: 16
Max size of kernel argument: 4096
Alignment (bits) of base address: 1024
Minimum alignment (bytes) for any datatype: 128
Single precision floating point capability
Denorms: Yes
Quiet NaNs: Yes
Round to nearest even: Yes
Round to zero: Yes
Round to +ve and infinity: Yes
IEEE754-2008 fused multiply-add: Yes
Cache type: Read/Write
Cache line size: 64
Cache size: 65536
Global memory size: 2147483648
Constant buffer size: 65536
Max number of constant args: 8
Local memory type: Global
Local memory size: 32768
Error correction support: 0
Profiling timer resolution: 341
Device endianess: Little
Available: Yes
Compiler available: Yes
Execution capabilities:
Execute OpenCL kernels: Yes
Execute native function: Yes
Queue properties:
Out-of-Order: No
Profiling : Yes
Platform ID: 6B06C4F4
Name: AMD A8-3870 APU with Radeon(tm) HD Graphics
Vendor: AuthenticAMD
Driver version: 2.0
Profile: FULL_PROFILE
Version: OpenCL 1.1 AMD-APP (831.4)
Extensions: cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_khr_gl_sharing cl_ext_device_fission cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_popcnt cl_khr_d3d10_sharing
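For anyone who wants to pull the key facts out of a clinfo dump like this one, here is a minimal sketch of a parser. It assumes the "Key: value" layout shown above with wrapped lines already rejoined; the sample text is an abridged copy of this output (extension lists shortened), and the function names are hypothetical, not part of clinfo or any SETI tool.

```python
# Abridged excerpt of the clinfo output above, one "Key: value" per line.
SAMPLE = """\
Device Type: CL_DEVICE_TYPE_GPU
Max compute units: 5
Max clock frequency: 600Mhz
Global memory size: 536870912
Name: BeaverCreek
Extensions: cl_khr_global_int32_base_atomics cl_khr_byte_addressable_store

Device Type: CL_DEVICE_TYPE_CPU
Max compute units: 4
Max clock frequency: 3000Mhz
Global memory size: 2147483648
Name: AMD A8-3870 APU with Radeon(tm) HD Graphics
Extensions: cl_khr_fp64 cl_amd_fp64 cl_khr_byte_addressable_store
"""


def parse_devices(text: str) -> list[dict[str, str]]:
    """Split clinfo-style text into one dict per device.

    A new device starts at each "Device Type" line; lines without a
    colon, and lines before the first device, are ignored.
    """
    devices: list[dict[str, str]] = []
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue
        key, value = key.strip(), value.strip()
        if key == "Device Type":
            devices.append({})
        if devices:
            devices[-1][key] = value
    return devices


def summarize(device: dict[str, str]) -> dict[str, object]:
    """Reduce one parsed device to a few fields of interest."""
    return {
        "name": device["Name"],
        "type": device["Device Type"].replace("CL_DEVICE_TYPE_", ""),
        "compute_units": int(device["Max compute units"]),
        "global_mem_mib": int(device["Global memory size"]) // (1024 * 1024),
        # Double precision is advertised through the cl_khr_fp64 extension.
        "fp64": "cl_khr_fp64" in device.get("Extensions", "").split(),
    }


summaries = [summarize(d) for d in parse_devices(SAMPLE)]
for summary in summaries:
    print(summary)
```

On this output the GPU device (BeaverCreek) reports 5 compute units and 512 MiB of global memory and does not list cl_khr_fp64, which matches its "Preferred vector width double: 0" line.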


____________


Copyright © 2014 University of California