Stock vs Lunatics: CPU throughput
Author | Message |
---|---|
Shaggie76 Joined: 9 Oct 09 Posts: 282 Credit: 271,858,118 RAC: 196 |
I switched my workstation back to stock and noticed the most amazing thing: for the first time in recent memory my CPU temps are actually lower than my GPU temps! 5 degrees and declining (which is shocking given the GPU temps are actually higher). I know that S@H uses libfftw, which should use AVX, and I assume that's where the heavy SIMD lifting is happening (maybe I'm wrong?). Based on my experiences with SSSE3-SSE4.2 I can't imagine any other processor feature making as big a difference for this as AVX. The only other theory I can think of is that stock is being built with an older version of VS -- I recall x64 codegen being improved in more recent builds (I think it was in VS2012?). Or maybe the Lunatics version is being built as a VEX app so it can use AVX throughout without transition penalties? I don't mean to take a dig at Lunatics but I was also fascinated to see the default CUDA load making extremely good use of my GPU -- I used to run 3 apps at a time (on the left) but the right is default 1 task OpenCL. From what I can tell the OpenCL app is keeping the GPU busier even with one task at a time (I haven't tuned anything at all). That, or OpenHWM is only measuring the first SM cluster or something wacky like that? The other worthless anecdote after switching is that my power at the wall is a bit lower -- I assume this is consistent with the CPU running cooler. This begs the question: can you mix and match? Lunatics CPU + OpenCL SoG/SaH GPU? And what's stopping the CPU optimizations from getting into the main build? |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13727 Credit: 208,696,464 RAC: 304 |
"This begs the question: can you mix and match? Lunatics CPU + OpenCL SoG/SaH GPU? And what's stopping the CPU optimizations from getting into the main build?" That's what Richard's Beta Lunatics installer allows you to do: install the CPU and CUDA or OpenCL applications without losing your cache or in-progress work. SoG performs much better than SaH. "And what's stopping the CPU optimizations from getting into the main build?" No idea; however, the more optimised the stock application becomes, the lower the Credit payout becomes. Grant Darwin NT |
The_Matrix Joined: 17 Nov 03 Posts: 414 Credit: 5,827,850 RAC: 0 |
"I can't imagine any other processor feature making as big a difference for this as AVX." I would be interested in the impact on AMD CPUs that support AVX etc. Any workunit samples? I am asking because I am about to install Lunatics 64-bit again, this time on an AMD system, to see for myself. SSE3 is recommended over AVX, I see... |
Raistmer Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
"And what's stopping the CPU optimizations from getting into the main build?" They do get in, but slowly. Also, AKv8-based CPU Lunatics apps take quite a different approach to host diversity than stock, so it would be inappropriate to replace stock with them. They could be distributed alongside the current stock app, as the AstroPulse opt apps are. Maybe at some point it will be done. SETI apps news We're not gonna fight them. We're gonna transcend them. |
Mike Joined: 17 Feb 01 Posts: 34255 Credit: 79,922,639 RAC: 80 |
"I can't imagine any other processor feature making as big a difference for this as AVX." You can always check my results. I'm running the AVX version on my FX CPU. It's slightly faster than the other Lunatics versions, but it depends on the CPU/memory/host combination. Needless to say, it's much faster than stock. With each crime and every kindness we birth our future. |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874 |
As the installer says, SSE4.2 was reported to be preferable over AVX on AMD CPUs - some years ago, when we had an SSE4.2 build available, courtesy of Joe Segur. It is certainly possible that AVX may be better than SSE3, which is the next-best I've been provided with for v8. Try them both, and see what you think. |
Al Joined: 3 Apr 99 Posts: 1682 Credit: 477,343,364 RAC: 482 |
Richard, I've seen Joe's name mentioned sometimes when the CPU app is brought up. Is there a thread about what happened and why he decided to move on? From everything I've gathered, he was a great asset to the project, and it appears that his leaving has left a huge hole in that side of development. Or do you have anything you could share about it? Thanks! |
Raistmer Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
"Richard, I've seen Joe's name mentioned sometimes when the CPU app is brought up. Is there a thread about what happened and why he decided to move on? From everything I've gathered, he was a great asset to the project, and it appears that his leaving has left a huge hole in that side of development. Or do you have anything you could share about it? Thanks!" I'm afraid it was not his decision... He was fairly old, and suddenly all his activity on both the SETI boards and the Lunatics site stopped. A month or two could be internet difficulties (he is in a "hard-internetizable" area), but such a long time... I'm afraid the worst has happened :( P.S. Indeed, he was a very valuable and knowledgeable member of the team. His experience (and the manner of communication he always showed) is missed a lot... SETI apps news We're not gonna fight them. We're gonna transcend them. |
Al Joined: 3 Apr 99 Posts: 1682 Credit: 477,343,364 RAC: 482 |
Aww crap, that stinks if that is true. Anyone here know him personally, who might be able to share with us a more definitive answer? He sounds like he was a great guy, I wonder if I had ever talked to him back when I was more active here around 2009-2011? :-( |
Zalster Joined: 27 May 99 Posts: 5517 Credit: 528,817,460 RAC: 242 |
Did anyone know him in RL and could try to contact him? |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874 |
"Did anyone know him in RL and could try to contact him?" He keeps his private life very private, and only really communicated on these boards about technical matters - a true number cruncher. I did find a physical address and telephone number for the house he shares with his sister, and passed them on to the project team (who also miss him) - I don't know whether or how hard they tried to follow it up. |
Raistmer Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
AFAIK Mike tried to call him by phone and I sent E-mail... all w/o any success :( SETI apps news We're not gonna fight them. We're gonna transcend them. |
Mike Joined: 17 Feb 01 Posts: 34255 Credit: 79,922,639 RAC: 80 |
"AFAIK Mike tried to call him by phone and I sent E-mail... all w/o any success :(" Yes, I tried a few times. With each crime and every kindness we birth our future. |
_heinz Joined: 25 Feb 05 Posts: 744 Credit: 5,539,270 RAC: 0 |
Hmm, talking about Josef W. Segur, alias Joe ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ It looks the same as when we lost the Lunatics board moderator, Simon from Salzburg, Austria, whose last activity was on 26 Aug 2007, 09:53:05 am. I called his father, who works in Switzerland as a doctor, but did not get any information from him. He said, "I don't know anything about him." Nobody knows what happened... still sadness and helplessness. |
_heinz Joined: 25 Feb 05 Posts: 744 Credit: 5,539,270 RAC: 0 |
I looked up Joe's last activity on Lunatics. Date Registered: 05 Jul 2006, 06:52:16 pm. Last Active: 06 Sep 2015, 08:11:33 pm ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Joe is a generous, helpful gentleman with excellent knowledge. We miss him. I wish him all the best. Respectfully, _heinz |
Marco Franceschini Joined: 4 Jul 01 Posts: 54 Credit: 69,877,354 RAC: 135 |
Hi Shaggie76, your assumptions are correct. But AVX SIMD can be slower than SSE2 on some systems (e.g. 256-bit transfers can impair the memory subsystem), as Matteo Frigo notes on the FFTW pages. There's a new FFTW release, 3.3.5, with SIMD support extended beyond SSE, SSE2 and AVX (e.g. AVX2 with FMA, found in Haswell and later microarchitectures). http://www.fftw.org/ http://www.fftw.org/release-notes.html Thanks. Marco71. |
Raistmer Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
"But AVX SIMD can be slower than SSE2 on some systems (e.g. 256-bit transfers can impair the memory subsystem), as Matteo Frigo notes on the FFTW pages." I haven't done AVX research, but some time ago there were AMD CPUs with SSE3 support that ran SSE2-only code faster than SSE3-enabled code. SSE3 was implemented, but with so many cycles per instruction that using it for speed was not possible. Supporting an instruction set does not by itself mean that using it will speed things up. SETI apps news We're not gonna fight them. We're gonna transcend them. |
Shaggie76 Joined: 9 Oct 09 Posts: 282 Credit: 271,858,118 RAC: 196 |
I'm well aware that AVX2 may not be ideal when used in isolation -- I did a bunch of experiments for Xbox One and PlayStation 4 and left the code #ifdef'd out :) I also ran the same tests on a Haswell (mobile) chip and saw improvements in specific cases (and even when it was only about as fast, the instruction cache was under much less pressure), so my gut says that for an app like S@H an extra build target with VEX encoding might be a big enough win [for my Haswell-E desktop] to explain the improved throughput. |
Marco Franceschini Joined: 4 Jul 01 Posts: 54 Credit: 69,877,354 RAC: 135 |
I'm in the process of cross-compiling again under Linux with gcc 5.4 (with -O3 and -march/-mtune settings for haswell, ivybridge, core2 with SSE4.1, etc.). So far only AVX2/FMA, AVX and SSE4.1 SIMD builds. The library is named libfftw3f-3-3-4_x64.dll to avoid changes in configuration files. FFTW 3.3.5 supports the following SIMD ISAs: https://github.com/FFTW/fftw3/tree/master/simd-support https://drive.google.com/folderview?id=0B9iU4E_jpim0Y3ExUm9lZk5zN0E&usp=sharing Marco71. |
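
Editor's sketch: two points from the thread (a CPU supporting an instruction set does not guarantee that code using it runs faster, and the advice to "try them both") suggest a two-step selection: first keep only the builds the CPU can run, then time them on the same workload. The Python below is a rough illustration, not how Lunatics or FFTW actually choose a code path. The build names, the `BUILD_REQUIREMENTS` table, and both functions are hypothetical. The flag spellings are the ones Linux reports in /proc/cpuinfo.

```python
import time

# Hypothetical build names mapped to the CPU flags each one needs.
# Flag spellings follow Linux /proc/cpuinfo ("pni" is SSE3, "sse4_1" is SSE4.1).
BUILD_REQUIREMENTS = {
    "avx2_fma": {"avx2", "fma"},
    "avx": {"avx"},
    "sse4.1": {"sse4_1"},
    "sse3": {"pni"},
    "sse2": {"sse2"},
}


def runnable_builds(cpu_flags):
    """Return the builds whose required flags are all present on this CPU."""
    available = set(cpu_flags)
    return [name for name, needed in BUILD_REQUIREMENTS.items()
            if needed <= available]


def pick_fastest(kernels, workload, repeats=3, timer=time.perf_counter):
    """Time each callable on the same workload; return (name, best_seconds).

    `kernels` maps a build name to a callable standing in for that build.
    The best of `repeats` runs is kept to damp scheduling noise.
    """
    if not kernels:
        raise ValueError("no kernels to compare")
    best_name, best_time = None, float("inf")
    for name, kernel in kernels.items():
        for _ in range(repeats):
            start = timer()
            kernel(workload)
            elapsed = timer() - start
            if elapsed < best_time:
                best_name, best_time = name, elapsed
    return best_name, best_time
```

Support is only used as a filter here, and the measurement makes the final choice, so a build for a newer instruction set can still lose to an older one on a given host. In practice the stand-in callables would be replaced by runs of the real applications on identical workunits.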