Bulldozer vs Vishera vs Sandy Bridge (small comparison)


Message boards : Number crunching : Bulldozer vs Vishera vs Sandy Bridge(small comparison)

hbomber
Volunteer tester
Joined: 2 May 01
Posts: 437
Credit: 50,852,854
RAC: 31
Bulgaria
Message 1328228 - Posted: 17 Jan 2013, 11:16:50 UTC
Last modified: 17 Jan 2013, 11:20:49 UTC

I've seen many times people buy "8-core" AMD CPUs, misled by the number of cores and absolutely convinced it's the best bang for the money, instead of a 2500K or even a 3570K, which have only 4 cores. I know which is better for crunching, but recently a friend of mine bought an 8320 and put it under high-end water cooling, so I decided to run a comparison between this 8320, my old 2500K and an 8120, which he borrowed temporarily to compare the Bulldozer and Vishera core architectures.
We settled on 4.6 GHz for testing, as his MSI mobo has some issues with Visheras and the 8120 is barely able to reach a higher frequency. My 2500K is perfectly OK with an old Noctua NH-U12 cooler, running 4.6 GHz 24/7, crunching our favorite SETI.
Besides completion times, we measured power consumption. The values given here are measured at the wall socket, with the consumption of every component except motherboard, memory and CPU excluded. So the numbers below are only for those three. Also, the values are raw: PSU efficiency is not applied, but both systems are equipped with high-end PSUs at a similar percentage load, so we can assume efficiency is close enough for both systems. Energy used per WU is based only on the difference between load and idle consumption.
Also, for comparison's sake, I included my shiny new 3930K, also running at 4.6 GHz. But it's not part of the general comparison, because its price is way higher. Anyway, it's interesting to see how energy efficient it is, as it is a rarely seen CPU, especially here and even more especially running at such a high frequency.
The test is based on running 24 MB WU tasks with the most common AR lately, 0.385, in different scenarios; the target is finding the average time per task, energy used per task and total number of tasks per day. There are some runs for the AMD CPU with different memory settings to see how they affect crunching speed. I also decided it needs to be seen how good the AMDs are when running only 4 tasks simultaneously, as they have 4 physical FPUs. This is where "MT efficiency" comes from: it's the actual gain from running 8 tasks compared with 4 tasks. For the 3930K there is "HT efficiency", standing for the difference between running 12 and 6 simultaneous threads (MB tasks in our case). "Average CPU time" is the time for a single task, as written in the client_state.xml file after completion.
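As a rough illustration of how the "MT efficiency" and "HT efficiency" percentages can be reproduced, here is a minimal Python sketch. It assumes the gain is simply the ratio of package throughput (WU/day) with every thread loaded to the throughput with one task per physical FPU or core. The function name is made up for this example; the inputs are the WU/day per CPU package values from the results in this post.

```python
def threading_gain_pct(all_threads_wu_per_day: float,
                       physical_only_wu_per_day: float) -> float:
    """Percent throughput gained by loading every hardware thread.

    Assumption: gain = (throughput with all threads) / (throughput with
    one task per physical FPU or core) - 1, expressed in percent.
    """
    return (all_threads_wu_per_day / physical_only_wu_per_day - 1.0) * 100.0


# WU/day per CPU package, taken from the results in this post.
mt_8120 = threading_gain_pct(97.75, 60.75)     # 8 tasks vs 4 tasks
mt_8320 = threading_gain_pct(100.72, 62.07)    # 8 tasks vs 4 tasks
ht_3930k = threading_gain_pct(187.86, 161.14)  # 12 tasks vs 6 tasks

for name, gain in [("8120 MT", mt_8120), ("8320 MT", mt_8320), ("3930K HT", ht_3930k)]:
    print(f"{name} efficiency: {gain:.2f} %")
```

With these inputs the sketch lands on the 60.91 %, 62.27 % and 16.58 % figures quoted in the results.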
Actually, more tasks were run, 36, so that cores would not sit idle when they completed a task, but only 24, the same ones for all runs, were taken into account. A note on the Intel CPUs: they use the fastest client available, SSE3 Intel. I ran tests with all available clients from the Lunatics installer and picked the fastest. I also included the 3930K with only 4 cores and hyper-threading enabled, to simulate an 8-threaded Sandy Bridge (well, can't ignore the 4 MB more cache, but life is a bitch). For the AMD CPUs the best client is known, not many choices to make :) I also ran the AMD SSE3 client on the Intel CPUs, and the 3930K at 4.7 GHz (just to prove it can do it and to see scaling. It can actually do 4.9 GHz with this cooling, but VRM temperature reaches 120 degrees Celsius, which is not healthy for the system). If someone needs the full data, let me know, I'll post it.
Important: the most likely scenarios are made bold for easy viewing.

Here go the numbers:
AMD CPUs:
----- 8120, 4.6 GHz, 8 WUs - 1600/8, idle 93 W, load 360 W
Average CPU time: 7071.01 s, 1:57:51, 0.51 WU/h, 12.22 WU/day per core, 97.75 WU/day per CPU package, 65.56 W/WU, MT efficiency 60.91 %


----- 8120, 4.6 GHz, 4 WUs - 1600/8, idle 93 W, load 310 W
Average CPU time: 5689.96 s, 1:34:50 , 0.63 WU/h, 15.19 WU/day per core, 60.75 WU/day per CPU package, 85.73 W/WU

----- 8320, 4.6 GHz, 8 WUs - 1600/8, idle 80 W, load 310 W
Average CPU time: 6862.40 s, 1:54:22, 0.53 WU/h, 12.60 WU/day per core, 100.72 WU/day per CPU package, 53.74 W/WU, MT efficiency 62.27 %


----- 8320, 4.6 GHz, 4 WUs - idle 80 W, load 260 W
Average CPU time: 5568.81 s, 1:32:49 , 0.65 WU/h, 15.52 WU/day per core, 62.07 WU/day per CPU package, 69.60 W/WU

Memory frequency and HyperTransport speed changed (default memory speed is 1600 MHz 8-8-8-28, 2200 MHz HT):
----- 8320, 4.6 GHz, 8 WUs - Memory 1866 MHz, 8-8-8-24
Average CPU time: 6844.64 s, 1:54:05, 0.53 WU/h, 12.62 WU/day per core, 100.98 WU/day per CPU package

----- 8320, 4.6 GHz, 8 WUs - Memory 1866 MHz, 8-8-8-24, 2600 MHz Hyper Transport
Average CPU time: 6845.57 s, 1:54:06

Intel CPUs:

---- 2500K, 4.6 GHz, SSE3 Intel - idle 59 W, load 164 W
Average CPU time: 3297.42 s, 0:54:57, 1.09 WU/h, 26.21 WU/day per core, 104.82 WU/day per CPU package, 24.04 W/WU


---- 3930K, 4.6 GHz, no HT, SSE3 Intel - idle 79 W, load 260 W
Average CPU time: 3217.18 s, 0:53:37, 1.12 WU/h, 26.86 WU/day per core, 161.14 WU/day per CPU package, 26.96 W/WU


---- 3930K, 4.6 GHz, SSE3 Intel - idle 79 W, load 300 W
Average CPU time: 5519.78 s, 1:32:00, 0.65 WU/h, 15.65 WU/day per core, 187.86 WU/day per CPU package, 29.51 W/WU, HT efficiency 16.58 %


---- 3930K, 4.6 GHz, SSE3 Intel, 4 cores + HT - idle 77 W, load 221 W
Average CPU time: 5237.97 s, 1:27:18, 0.69 WU/h, 16.50 WU/day per core, 131.98 WU/day per CPU package, 26.19 W/WU, HT efficiency (none, because the system refuses to boot with 4 CPU cores and Hyper-Threading disabled)

Well, the conclusion is obvious. The 2500K is faster than Vishera (which is faster than Bulldozer) and it is twice as energy efficient. Based on electricity prices here in Bulgaria, where electricity is relatively cheap, and the prices of a CPU + mobo able to achieve such frequencies (not counting the price of water cooling for Vishera), it will take three months of 24/7 crunching to make the cost of ownership of the Intel system the same as that of the Vishera system. In areas with more expensive energy, it will take less. Lately, things look even better for the 2500K, as Microcenter in the USA was selling them for 99 US bucks apiece with store pick-up, and second-hand 2500Ks are more widely available (btw, Microcenter Minneapolis was the messiest computer parts store I've ever seen, it looked like post-nuclear scenery :). Unusual for the US, but there it was. And maybe still is, I'll see again this summer). There are more good things about having a 2500K instead of an 8320. The 2500K overclocks better with the box cooler: 4.2-4.3 GHz is the usual range. Vishera can reach 4 GHz, but you won't like the temperatures. They are very close to the temperatures where most AMD CPUs fail, and I wouldn't leave it running 24/7 like this. And one more thing: because of the low completion times per WU on the 2500K, I find it more forgiving of errors. Errors are caught faster and less work is lost if the CPU trashes units and gives errors.
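The break-even reasoning above can be sketched in a few lines of Python. Only the wattages come from this post (load draw of the 8320 with 8 tasks and of the 2500K); the purchase-price gap and the electricity price are placeholders chosen purely for illustration, not the prices behind the three-month estimate, and the function name is hypothetical.

```python
def breakeven_days(extra_purchase_cost: float,
                   watts_saved: float,
                   price_per_kwh: float) -> float:
    """Days of 24/7 crunching until lower power draw repays a higher price."""
    kwh_saved_per_day = watts_saved * 24.0 / 1000.0
    return extra_purchase_cost / (kwh_saved_per_day * price_per_kwh)


# From the post: 8320 with 8 tasks draws 310 W under load, 2500K draws 164 W.
watts_saved = 310 - 164

# HYPOTHETICAL placeholders, not values from the post: substitute your own.
extra_purchase_cost = 30.0   # how much more the Intel CPU + mobo costs
price_per_kwh = 0.10         # electricity price in the same currency

days = breakeven_days(extra_purchase_cost, watts_saved, price_per_kwh)
print(f"Break-even after about {days:.0f} days of 24/7 crunching")
```

Because the daily saving grows with the electricity price, a higher price per kWh shortens the break-even time, which is the point made about areas with more expensive energy.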
I've been a happy 2500K owner for two years, and the above is the proof that it is a good piece of hardware, worth the money. A 12K RAC average (100 units * 120 credits each) from a CPU is a decent value.
I'm even happier with the 3930K, but as we know, it is in another category. It can go beyond 20K RAC, which I find a very impressive value for a CPU. The most impressive part of the tests with it was that it is very close in energy efficiency to its low-end counterparts, contrary to my expectations.
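For anyone who wants to redo the RAC estimates, here is a small Python sketch. It assumes, as the rough calculation above does, that each WU earns about 120 credits and that RAC settles near the steady daily credit; the function name is made up for the example, and the WU/day figures are the package totals from the results.

```python
CREDITS_PER_WU = 120  # rough per-WU credit used in the post's own estimate


def estimated_daily_credit(wu_per_day: float) -> float:
    """Approximate steady-state credit per day for a given throughput."""
    return wu_per_day * CREDITS_PER_WU


rac_2500k = estimated_daily_credit(104.82)   # 2500K, 4 tasks
rac_3930k = estimated_daily_credit(187.86)   # 3930K, 12 tasks with HT

print(f"2500K: about {rac_2500k:.0f} credits/day")
print(f"3930K: about {rac_3930k:.0f} credits/day")
```

Under those assumptions the 2500K comes out a bit above 12K and the 3930K above 20K.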
I'm looking forward to replacing the 2500K with a 3570K, which is even faster (around 15% expected) and more energy efficient, but only if a good deal comes around.

I hope you enjoyed it :)
____________

hbomber
Volunteer tester
Joined: 2 May 01
Posts: 437
Credit: 50,852,854
RAC: 31
Bulgaria
Message 1328251 - Posted: 17 Jan 2013, 12:55:52 UTC

It's not very clear now that I read it again.
This:

Average CPU time: 3297.42 s, 0:54:57, 1.09 WU/h, 26.21 WU/day per core, 104.82 WU/day per CPU package, 24.04 W/WU

means:

Average CPU time for one WU - 3297.42 s
Same, but in h:m:s format - 0:54:57
1.09 WU/h - number of WUs one core completes per hour
26.21 WU/day per core - self-explanatory: the above, multiplied by 24 hours
104.82 WU/day per CPU package - the above number, multiplied by the number of active cores
24.04 W/WU - energy per WU from load minus idle consumption (in effect watt-hours per WU)
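The breakdown above can be reproduced with a short Python sketch. The energy formula (load minus idle watts, times 24 hours, divided by WU/day per package) is inferred from the posted numbers rather than stated anywhere in the thread, and the function names are made up for the example. Results can differ from the posted ones in the last digit, presumably because the posted figures were averaged per task.

```python
def format_hms(seconds: float) -> str:
    """Format a duration as h:mm:ss, rounding to whole seconds first."""
    total = round(seconds)
    hours, rem = divmod(total, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"


def task_metrics(avg_cpu_time_s: float, active_cores: int,
                 idle_w: float, load_w: float) -> dict:
    """Derive throughput and energy figures from the average task time."""
    wu_per_hour = 3600.0 / avg_cpu_time_s
    wu_per_day_core = wu_per_hour * 24.0
    wu_per_day_package = wu_per_day_core * active_cores
    # Assumed formula: extra watts under load, over 24 h, per finished WU.
    wh_per_wu = (load_w - idle_w) * 24.0 / wu_per_day_package
    return {
        "hms": format_hms(avg_cpu_time_s),
        "wu_per_hour": wu_per_hour,
        "wu_per_day_core": wu_per_day_core,
        "wu_per_day_package": wu_per_day_package,
        "wh_per_wu": wh_per_wu,
    }


# 2500K run from the first post: 3297.42 s, 4 cores, idle 59 W, load 164 W.
m = task_metrics(3297.42, 4, 59, 164)
for key, value in m.items():
    print(key, value if isinstance(value, str) else round(value, 2))
```

Rounding to whole seconds before splitting into hours, minutes and seconds avoids impossible values such as 60 in the seconds field.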
____________

Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 3386
Credit: 46,235,010
RAC: 7,416
Russia
Message 1328324 - Posted: 17 Jan 2013, 18:03:07 UTC

Valuable info, thanks!

____________

TBar
Volunteer tester
Joined: 22 May 99
Posts: 1198
Credit: 44,227,883
RAC: 114,625
United States
Message 1328357 - Posted: 17 Jan 2013, 19:20:34 UTC

You should run it again using an AVX-optimized App on the Sandy Bridge. AVX is new with Sandy Bridge and appears to be killer: Intel® AVX is a new 256-bit instruction set extension to Intel® SSE and is designed for applications that are Floating Point (FP) intensive. Look at the AstroPulse times from this client: All AstroPulse v6 tasks for computer 5643864. He appears to be running the Linux AVX App from here: AstroPulse for Linux. That's about 3.5 hours for a CPU AP. That is fast, about 3x as fast as the last Windows SSE2 CPU AP App from Lunatics, r557. I had never run r557, so I gave it a try after seeing those numbers, and I'm disappointed it's so much slower than the AVX one. It almost makes me want to build a new machine that can use AVX. Or at least buy another AMD 6850 that my present machines can use.

Anyone know of a Windows SSE4 CPU AstroPulse App? I can at least run that on a Xeon. I'm envious of those AstroPulse CPU times.

hbomber
Volunteer tester
Joined: 2 May 01
Posts: 437
Credit: 50,852,854
RAC: 31
Bulgaria
Message 1328373 - Posted: 17 Jan 2013, 19:47:55 UTC
Last modified: 17 Jan 2013, 19:54:27 UTC

Actually, I've run this r557 AP application, but for Windows, on the 3930K (4.6 GHz), with and without AVX (Win XP x64 and Win7 x64). The difference is not that big, although it is significant: 4:20 hours without AVX (can be seen on my computers page even now) vs a little less than 4 hours with AVX.
AFAIK, there isn't an MB application for Windows that uses AVX. Only AstroPulse. And switching to Linux is not an option :)
____________

TBar
Volunteer tester
Joined: 22 May 99
Posts: 1198
Credit: 44,227,883
RAC: 114,625
United States
Message 1328376 - Posted: 17 Jan 2013, 19:54:35 UTC - in response to Message 1328373.

What are your times without the overclock? I'm really not into overclocking...

hbomber
Volunteer tester
Joined: 2 May 01
Posts: 437
Credit: 50,852,854
RAC: 31
Bulgaria
Message 1328380 - Posted: 17 Jan 2013, 19:58:12 UTC
Last modified: 17 Jan 2013, 20:00:59 UTC

I don't know exactly; this CPU has never run at its default frequency. But you can calculate it: completion times scale the same way as frequency. Add some 30% to the times and you get approximate values for the 3930K's default 3.2 GHz (non-turbo frequency, assuming all cores are loaded).
Perhaps Linux apps are just better optimized.
____________

TBar
Volunteer tester
Joined: 22 May 99
Posts: 1198
Credit: 44,227,883
RAC: 114,625
United States
Message 1328385 - Posted: 17 Jan 2013, 20:15:03 UTC - in response to Message 1328380.

I don't know exactly; this CPU has never run at its default frequency. But you can calculate it: completion times scale the same way as frequency. Add some 30% to the times and you get approximate values for the 3930K's default 3.2 GHz (non-turbo frequency, assuming all cores are loaded).
Perhaps Linux apps are just better optimized.

So, without the OC and without AVX, at 3.2 GHz, the time would be a little less than 6 hours. 3.5 hours sounds so much better. Still, less than 6 hours is much better than with my 2.8 GHz Xeon running r557.
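The estimate can be redone with a small Python sketch. It starts from the 4:20 non-AVX AstroPulse time at 4.6 GHz reported earlier in the thread and compares the "add some 30%" rule of thumb with strictly inverse frequency scaling. Treating run time as exactly inversely proportional to clock speed is an assumption (memory speed does not rise with the core clock), and the function name is made up for the example.

```python
def scale_time(hours_at_source: float, source_ghz: float, target_ghz: float) -> float:
    """Scale a run time assuming it is inversely proportional to clock speed."""
    return hours_at_source * source_ghz / target_ghz


non_avx_oc_hours = 4 + 20 / 60          # 4:20 at 4.6 GHz, from the thread

rule_of_thumb = non_avx_oc_hours * 1.30                  # "add some 30%"
strict_scaling = scale_time(non_avx_oc_hours, 4.6, 3.2)  # factor 4.6 / 3.2

print(f"30% rule:       {rule_of_thumb:.2f} h")
print(f"strict scaling: {strict_scaling:.2f} h")
```

The 30% rule gives roughly 5.6 hours, in line with "a little less than 6 hours"; strictly inverse scaling uses a factor of 4.6 / 3.2, about 1.44, and lands slightly above 6 hours.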

Wiggo
Joined: 24 Jan 00
Posts: 6695
Credit: 92,242,139
RAC: 73,254
Australia
Message 1328400 - Posted: 17 Jan 2013, 20:41:26 UTC - in response to Message 1328385.

On my 2500K the old non-AVX AP app took almost 6hrs to complete @ 3.4GHz but the AVX AP app dropped that down to just over 5hrs.

Back on topic: I've commented before in threads where people have asked "AMD or Intel?", and I've always stated that the small amount saved by buying an AMD CPU setup will soon be cancelled out by the power consumption, and it still won't produce the results that Intel produces, which those figures just back up.

No, I'm not an Intel fan, even though all 3 of my rigs are currently Intel, and I've had plenty of AMD setups in the past. But in my book AMD has not produced a properly balanced CPU since the Athlon X2/X4 range. They just use more power without producing an equivalently higher output; in fact several models since have just gone backwards IMO.

Cheers.
____________

Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 3386
Credit: 46,235,010
RAC: 7,416
Russia
Message 1328459 - Posted: 18 Jan 2013, 0:18:05 UTC

Actually "pure CPUs" are in the past.
We need to compare APUs now: the Intel "APU" (Sandy Bridge is an APU in AMD terms, though Intel probably never called it that) and AMD APUs. That is, the GPU component too.
____________

hbomber
Volunteer tester
Joined: 2 May 01
Posts: 437
Credit: 50,852,854
RAC: 31
Bulgaria
Message 1328461 - Posted: 18 Jan 2013, 0:22:01 UTC
Last modified: 18 Jan 2013, 0:22:21 UTC

In that spirit, any chance of having Intel built-in GPUs to work for SETI? You are our only hope :)
____________

tullio
Project donor
Joined: 9 Apr 04
Posts: 3639
Credit: 366,799
RAC: 209
Italy
Message 1328533 - Posted: 18 Jan 2013, 6:15:54 UTC

I am using an Opteron 1210 at 1.8 GHz burning 75 W, bought in 2008, and an APU E350 at 1.6 GHz burning 18 W bought last year. Their performance is about equal, no GPU. The Opteron powers a SUN WS, the APU a HP laptop where I installed a SSD by OCZ. Both use SuSE Linux.
Tullio
____________

Mark Lybeck
Joined: 9 Aug 99
Posts: 209
Credit: 97,455,624
RAC: 78,550
Finland
Message 1328815 - Posted: 18 Jan 2013, 21:16:07 UTC - in response to Message 1328324.

Hoohoo,

When is the new lunatics cpu app available with avx extensions for sandy / ivy bridge?



____________

Lionel
Joined: 25 Mar 00
Posts: 544
Credit: 221,884,340
RAC: 207,051
Australia
Message 1330099 - Posted: 22 Jan 2013, 9:14:01 UTC - in response to Message 1328815.

good question
____________

Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 3386
Credit: 46,235,010
RAC: 7,416
Russia
Message 1330176 - Posted: 22 Jan 2013, 16:35:38 UTC - in response to Message 1328461.

In that spirit, any chance of having Intel built-in GPUs to work for SETI? You are our only hope :)


Yes, already. But I need testers. So far it works only on my own Ivy Bridge :)
____________

Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 3386
Credit: 46,235,010
RAC: 7,416
Russia
Message 1330177 - Posted: 22 Jan 2013, 16:38:57 UTC - in response to Message 1328815.

Hoohoo,

When is the new lunatics cpu app available with avx extensions for sandy / ivy bridge?




All apps built with the latest FFTW library (and the Lunatics ones are) have some AVX support, at least in the FFT part. Some additional AVX code was added by Joe Segur to stock MB, but it has still not been adopted for the opt builds. Maybe the new stock MB will have AVX support then.
____________

SupeRNovA
Volunteer tester
Joined: 25 Oct 04
Posts: 131
Credit: 9,238,042
RAC: 0
Bulgaria
Message 1330203 - Posted: 22 Jan 2013, 20:22:38 UTC - in response to Message 1330176.

In that spirit, any chance of having Intel built-in GPUs to work for SETI? You are our only hope :)


Yes, already. But I need testers. So far it works only on my own Ivy Bridge :)


:) Nice! When the new Haswell processors come, I think the GPU inside the CPU can crunch something additional, like a bonus :D
____________

Mark Lybeck
Joined: 9 Aug 99
Posts: 209
Credit: 97,455,624
RAC: 78,550
Finland
Message 1330215 - Posted: 22 Jan 2013, 21:07:09 UTC - in response to Message 1330176.
Last modified: 22 Jan 2013, 21:09:56 UTC

In that spirit, any chance of having Intel built-in GPUs to work for SETI? You are our only hope :)


Yes, already. But I need testers. So far it works only on my own Ivy Bridge :)


Just tell me how to download a beta for the CPU and I'll give it a try on my new i7-3770K. I hope hyper-threading is supported (maybe not for the on-chip GPU) so I can try to stuff in 8 WUs concurrently.

I can then post results. If you prefer to have the development feedback on Arkayn's forum, that's fine too.
____________

petri33
Project donor
Volunteer tester
Joined: 6 Jun 02
Posts: 375
Credit: 66,751,042
RAC: 8,432
Finland
Message 1330268 - Posted: 23 Jan 2013, 0:19:15 UTC - in response to Message 1330176.

Hi,

My i7-3930K is running boinc on linux 64 bit (FC14).
I could test a new MB AVX version if you or anybody else have a version for linux.

(I'm currently running the almost year-old Lunatics MB CPU & GPU apps and AstroPulse AVX for linux64.)

The system is set up to use HT (6->12 cores), but BOINC's CPU usage is limited to 50% of processors. This way I have 6 HT cores for GPU apps whenever they need CPU for memory transfers or occasional arithmetic.

--
petri33
____________

hbomber
Volunteer tester
Joined: 2 May 01
Posts: 437
Credit: 50,852,854
RAC: 31
Bulgaria
Message 1330270 - Posted: 23 Jan 2013, 0:39:10 UTC

It would be great if you tested with 6 and 12 enabled threads on the same tasks. Interestingly, on this processor HT efficiency is almost non-existent with AP; even worse, 12 simultaneous tasks finish just as slowly as two groups of 6 tasks. (I turn off HT completely in the BIOS, because the XP scheduler gets confused with more than 8 cores: if 6 tasks are run with 12 logical cores enabled, it does not put one task per physical core, but very often uses two logical cores of the same physical core, even when other physical cores are free. With 4 tasks on 8 cores it gets it right.) For example, the run from the first post with 12 simultaneous tasks, where HT efficiency is 16.5%, gave me an even worse result than 2 sets of 6 tasks when I ran it a second time, something like "negative HT efficiency".
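The "12 at once versus two groups of 6" comparison can be laid out as a small Python sketch. It uses the MB averages from the first post (5519.78 s per task with 12 threads, 3217.18 s with 6) and assumes tasks are fully CPU-bound, so CPU time stands in for wall-clock time; the function name is made up for the example.

```python
import math


def batch_wall_time(n_tasks: int, slots: int, seconds_per_task: float) -> float:
    """Wall time to finish n_tasks when `slots` tasks run side by side."""
    return math.ceil(n_tasks / slots) * seconds_per_task


# MB averages from the first post, 3930K at 4.6 GHz.
ht_on = batch_wall_time(12, 12, 5519.78)   # one batch of 12 with HT
ht_off = batch_wall_time(12, 6, 3217.18)   # two batches of 6 without HT

gain_pct = (ht_off / ht_on - 1.0) * 100.0
print(f"HT on:  {ht_on:.2f} s for 12 tasks")
print(f"HT off: {ht_off:.2f} s for 12 tasks")
print(f"HT gain: {gain_pct:.2f} %")
```

A gain at or below zero from this calculation is what "negative HT efficiency" would look like: the single batch of 12 takes as long as, or longer than, two batches of 6.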

Raistmer, if your Intel OpenCL application can run on Sandy Bitch too, I'm ready to test it. You may contact me on PM with details any time.
____________


Copyright © 2014 University of California