To Hyperthread or not to Hyperthread
Author | Message |
---|---|
Zalster Joined: 27 May 99 Posts: 5517 Credit: 528,817,460 RAC: 242 |
I know we have been through this before, but I never really paid much attention because at the time I didn't have a chip where I could do this. So the question: is there any advantage to hyperthreading the chip and using all the virtual cores, or is it better to turn off hyperthreading and work only with the physical cores? Does Hyperthreading impede GPU crunching in any way? Thanks, Zalster |
OTS Joined: 6 Jan 08 Posts: 369 Credit: 20,533,537 RAC: 0 |
I am sure there are lots of opinions, and even valid reasons why one way is better than the other, but in my limited experience I found that if I ran 8 virtual cores the tasks took twice as long as with 4 cores, so it was a wash. The only advantage I could see was that when I left 1 core free to support the GPU, I still had 7/8 of the CPU available for crunching with hyperthreading instead of only 3/4 without it, so I run with hyperthreading. YMMV. |
Darth Beaver Joined: 20 Aug 99 Posts: 6728 Credit: 21,443,075 RAC: 3 |
Zalster, it will depend on the chip you use. The newer the chip, the less it affects the times. With an i5, i7 or above I would use it; with something like a Core 2 Duo or a Pentium 4, no, I would not. The speed of the chip also makes a difference. If it is about 3.2 GHz or faster then I would use it; at 2.3 GHz or slower, not so much. A simple version of what happens is that when you have HT on, it cuts the speed in half and uses half the speed for one core and half for the virtual core. There's more to it, but that's a simplistic way to understand it. The latest chips are much better at this, though, and don't slow times down as much. |
The Jedi Alliance - Ranger Joined: 27 Dec 00 Posts: 72 Credit: 60,982,863 RAC: 0 |
I set up hyper threading for another reason...a larger work queue. With an i7 and a GTX 970 GPU I got 200 WU in my queue. It didn't matter how fast I was burning through WUs. With hyper threading I run 3 virtual Ubuntu machines, 1 virtual Server 2008 machine and 1 virtual Win7 machine, 1 cpu on the physical i7 and the GPU. I now get 700 WU in my queue. I have noticed no BOINC performance degradation using the virtual machines on my i7 4770 running at 3.4 GHz using Hyper-V. I have also tried hyper threading on my Core 2 Quad 9550 running at 2.83 GHz, also running virtual Ubuntu but using Virtual Box. The jury is still out on those machines because they were set up just before a widespread outage and have suffered through two smaller outages since. Regarding the GPU, with or without hyper threading, as long as I leave 1 physical cpu for the gpu work I have been able to run 6 simultaneous gpu tasks. I leave 1 physical cpu for myself. I'm a fan of hyper threading. |
OzzFan Joined: 9 Apr 02 Posts: 15691 Credit: 84,761,841 RAC: 28 |
The Core 2 series lacked HyperThreading. Only the Pentium 4/D and Core i5/i7 series have SMT. As to the topic at hand, I do much more on my machines than straight-out SETI, so I use HyperThreading for the additional threading power. |
HAL9000 Joined: 11 Sep 99 Posts: 6534 Credit: 196,805,888 RAC: 57 |
With my i3-390 & six i7-860s I run HT on & use all cores for SETI@home, as it results in more work done per day. However, I don't know about the latest generation, as all of my new CPUs are sans HT. I have been wanting to get a new i3 to run tests with. If there turned out to be no advantage to using HT on the current-gen CPUs, I would leave HT on & limit BOINC to only the physical cores, then let the HT cores run the OS & support any GPUs. SETI@home classic workunits: 93,865 CPU time: 863,447 hours Join the [url=http://tinyurl.com/8y46zvu]BP6/VP6 User Group[ |
Zalster Joined: 27 May 99 Posts: 5517 Credit: 528,817,460 RAC: 242 |
I don't have HT on right now for the i7, but even with all the work units I'm doing I don't exceed 6 cores, so it leaves 2 cores free as is. My thought was: is it worth turning it on and then running some work on the HT cores? Might not be worth it, but I thought I'd ask before I tried it. Zalster |
The Jedi Alliance - Ranger Joined: 27 Dec 00 Posts: 72 Credit: 60,982,863 RAC: 0 |
I've just analyzed the WU completed by the physical i7 and the virtual i7 machines. It appears as though the virtual machines are completing work slightly faster, although I'm not sure if this is due to the prowess of Hyper-V or if it is just the mix of WU received. Where I do see a significant benefit is in the total number of WU in my queue. Let's say you're running an i7 with 8 cores and all 8 cores are running s@h at an average of 3 hours per WU. The maximum number of WU you will get in your queue is 100. That is a total of 300 hours of potential work, which divided by the 8 cores will be completed in 37.5 hours. Let's say that s@h goes dry or down for 2-3 days. You're running on empty in about 1.5 days. Now, let's say you're running the same i7 but with 8 Hyper-V virtual machines, each one allocated 1 physical core. My results suggest you will still complete the WU at an average of 3 hours each, but now you will have 800 WU in your queue. Each virtual machine will have 300 hours of work, but for only 1 cpu. You will be able to withstand a 12.5 day drought. While it is true that a 12+ day drought of WUs is rare, a 2-3 day drought is fairly common. For me, using Hyper-V was just a way around the 100 WU limit. |
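The queue arithmetic in the post above can be checked with a short sketch. The function name `hours_until_empty` is made up for illustration, and it assumes what the post assumes: a 100 WU limit per host, 3 hours per WU on average, and every core crunching one WU at a time.

```python
# Illustrative sketch only: hours_until_empty is a hypothetical helper,
# and the inputs (100 WU limit, 3 h per WU) are the figures quoted in the post.

def hours_until_empty(queue_limit_wu, hours_per_wu, cores):
    """Wall-clock hours a full queue lasts when `cores` cores each
    crunch one WU at a time at the same average speed."""
    return queue_limit_wu * hours_per_wu / cores

# One physical host: 100 WU shared by 8 cores.
single_host_hours = hours_until_empty(100, 3.0, 8)

# One virtual machine: its own 100 WU queue, but only 1 core to crunch it.
per_vm_hours = hours_until_empty(100, 3.0, 1)

print(f"single host: {single_host_hours} h = {single_host_hours / 24:.2f} days")
print(f"each VM:     {per_vm_hours} h = {per_vm_hours / 24:.2f} days")
```

Total throughput is the same in both cases; only the size of the buffer changes, which is why the VM setup rides out a longer outage.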
ML1 Joined: 25 Nov 01 Posts: 20147 Credit: 7,508,002 RAC: 20 |
The Intel HT gives a BIG boost in that you can claim to have x2 the number of real physical CPU cores available (and call them virtual cores). The physical reality is that each physical core is 'multiplexed' to alternately (simplistically) give one CPU cycle to one virtual core and then the next CPU cycle is given to the second virtual core, and so on. Hence, your virtual CPU cores each run at half speed if both are being used. You suffer a slight slowdown in the extra overhead for the OS to manage the Intel HT (or even to badly ignore it altogether). You gain a small boost from being able to opportunistically execute more threads if you have software that takes advantage of that. So... My brief summary from my experience is: If you are looking for performance, then you need to optimise for the HT just as you need to optimise for using a GPU for example. In very brief summary, you get somewhere between a x0.9 and a x1.3 speedup in THROUGHPUT if you utilise all available threads. For single threaded tasks, switching HT off can give you a better/faster response. (Note that for the recent AMD CPUs, you have real physical cores for each thread and there is an FPU shared between two cores. Very good structure for general use and for server use. For s@h, you only need half the cores computing to max out the FPUs. The other cores should then be used to support GPUs or for general running. There's some good opportunity there for performance comparisons ;-) ) Happy fast crunchin', Martin See new freedom: Mageia Linux Take a look for yourself: Linux Format The Future is what We all make IT (GPLv3) |
Marco Franceschini Joined: 4 Jul 01 Posts: 54 Credit: 69,877,354 RAC: 135 |
|
Marco Franceschini Joined: 4 Jul 01 Posts: 54 Credit: 69,877,354 RAC: 135 |
|
ML1 Joined: 25 Nov 01 Posts: 20147 Credit: 7,508,002 RAC: 20 |
http://www.agner.org/optimize/blog/read.php?i=6 Excellent summary and two good examples, thanks. The very brief summary for consumers is one of befuddlement by the claimed 'x2' in the Marketing numbers. For the developers, there is a lot of hard work needed to optimise before being able to gain any useful performance boost. Whatever number of 'virtual' CPUs you might have, they still rely on the reality of the real-world physical CPU cores! All good useful stuff. Shame about what I see as the Marketing corruption behind it all. Anyone for the higher (Marketing?) GHz of Intel's Hyper Pipelined Technology and their "NetBurst" Marketing?... Happy faster crunchin', Martin |
Marco Franceschini Joined: 4 Jul 01 Posts: 54 Credit: 69,877,354 RAC: 135 |
Many, many thanks Martin. NetBurst was total nonsense... with its ridiculously double-clocked (x2 core frequency) ALUs and its lack of a fundamental piece of hardware: a barrel shifter. The faster PIIIs (e.g. Tualatin) did have this integrated (the barrel shifter was introduced back in the 80386 era). Marco. |
ML1 Joined: 25 Nov 01 Posts: 20147 Credit: 7,508,002 RAC: 20 |
"... NetBurst was total nonsense... with its ridiculously double-clocked (x2 core frequency) ALUs and its lack of a fundamental piece of hardware: a barrel shifter." The barrel shifter omission will have been one of the design compromises for the higher GHz requiring deep ("Hyper!") pipelining. Barrel shift logic is quite a long logic path, and so to pipeline that up 'NetBurst'-style would likely have been too much of a design compromise and cost. We really do need three or more good-sized, equally matched competitors in this game. Monopolies are always overly costly... Here's hoping AMD and ARM (or some others) can yet even up the CPU/GPU games! Happy cool fast crunchin', Martin ps: Instead of 'NetBurst'-style, perhaps we should dance to "Gangnam Style" Yeah!!! ;-) (There's just got to be some good viral Marketing in that... :-( ) |
James Sotherden Joined: 16 May 99 Posts: 10436 Credit: 110,373,059 RAC: 54 |
"I know we have been thru this before but I never really paid much attention as at the time I didn't have a chip where I could do this." I did a test back in 2014 running stock apps on my i7 920 Vista machine. I ran 4 cores, and then with HT on. Then ran 4 cores with my GTS 250. Then with HT on with my GTS 250. And then ran 7 cores with my GTS 250. Here is my summary from the end of the test. Running with HT on with a GPU does help RAC, with stock apps anyway. I haven't had the guts to do it with Lunatics yet. Maybe after the WOW event I might try it. Of course your mileage may vary. Message 1565148, 30 Aug. 2014: "This stock MB-only test is now done. NNT is set, and I will go back to running Lunatics. I started this test with an average RAC of 5,450 running Lunatics doing MB and AP work. I went to stock MB only with HT on, no GPU, using 8 cores: RAC ended up at 2,017 average. Turned off HT, ran four cores, no GPU: average RAC at the end was 1,834. HT off and running GPU: average RAC 5,043. Freed one core, HT off and running GPU: average RAC 5,083. HT on, 8 cores and GPU running: average RAC 5,326. Freed one core and running GPU: average RAC 5,442. End of test. I came into this test with a lot of preconceived notions about what would happen. The only one I had right was that not running a GPU will hurt RAC a lot. For my machine, the i7 920 with my GTS 250, it's debatable whether freeing up a core with HT on or off makes a difference. Higher-end cards, that's something else." Old James |
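To put the RAC figures from that test side by side, here is a rough sketch. The dictionary keys and the `ht_gain_pct` helper are labels invented for illustration; the numbers are the averages quoted in the post. The post does not say how long each configuration ran, so treat the percentages as indicative only.

```python
# Illustrative sketch: the labels and ht_gain_pct are hypothetical;
# the RAC values are the averages reported in the post.
# Each entry is (RAC with HT off, RAC with HT on).
average_rac = {
    "no GPU, all cores": (1834, 2017),
    "GPU, all cores": (5043, 5326),
    "GPU, one core freed": (5083, 5442),
}

def ht_gain_pct(rac_ht_off, rac_ht_on):
    """Percentage change in average RAC from turning HT on."""
    return (rac_ht_on / rac_ht_off - 1) * 100

for setup, (ht_off, ht_on) in average_rac.items():
    print(f"{setup}: HT on gives {ht_gain_pct(ht_off, ht_on):+.1f}% RAC")
```

On these figures HT helps in every configuration, by somewhere around 5 to 10 percent, while adding the GPU at all matters far more than either HT or freeing a core.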
JBird Joined: 3 Sep 02 Posts: 297 Credit: 325,260,309 RAC: 549 |
Based on my read of the NVidia Maxwell FAQ, Hyperthreading ON activates Maxwell's Unified Memory feature, which provides better I/O throughput between GPU/CPU/RAM. Suggestion: turn Hyperthreading ON in BIOS (I turn SpeedStep OFF, it's in the Power area, and TurboBoost ON too). But still set the 8 core config in BOINC and SETI Computer Prefs, et voila! 8 free cores (threads) for your GPUs to feast on (without "freeing a core" the other way). Just sayin.... ;) |
Rasputin42 Joined: 25 Jul 08 Posts: 412 Credit: 5,834,661 RAC: 0 |
It is like cutting a cake into more and more pieces and expecting the sum of the pieces to be more than one cake. Under very special circumstances there might be a marginal improvement when using hyper-threading, but in general it does not make things better. |
Zalster Joined: 27 May 99 Posts: 5517 Credit: 528,817,460 RAC: 242 |
Yeah, I'm not planning on using the Hyperthread. It's doing just fine without it. Zalster |
JBird Joined: 3 Sep 02 Posts: 297 Credit: 325,260,309 RAC: 549 |
PS - Could be doing AVX 2.0 and Intel OpenCL 2.0 APs with that bad boy i7-5960X CPU |
HAL9000 Joined: 11 Sep 99 Posts: 6534 Credit: 196,805,888 RAC: 57 |
"Yeah, I'm not planning on using the Hyperthread. It's doing just fine without it" Another consideration is power usage when using HT. My i7-860 showed an 11.1% increase in power consumption & a 27.7% increase in work output when running HT on & 8 tasks. |
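Those two percentages can be combined into a single work-per-watt figure. A minimal sketch, assuming both increases were measured against the same HT-off baseline on the i7-860; `relative_work_per_watt` is a name made up for illustration.

```python
# Illustrative sketch: relative_work_per_watt is a hypothetical helper.
# It assumes both percentages are relative to the same HT-off baseline.

def relative_work_per_watt(work_gain_pct, power_gain_pct):
    """Ratio of work per watt with HT on to work per watt with HT off."""
    return (1 + work_gain_pct / 100) / (1 + power_gain_pct / 100)

ratio = relative_work_per_watt(27.7, 11.1)
print(f"HT on: {(ratio - 1) * 100:.1f}% more work per watt")
```

So on that machine HT costs more power in absolute terms but comes out roughly 15% ahead in work done per watt.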