Open Beta test: SoG for NVidia, Lunatics v0.45 - Beta6 (RC again)
Author | Message |
---|---|
Richard Haselgrove Joined: 4 Jul 99 Posts: 14672 Credit: 200,643,578 RAC: 874 |
"AFAIK all modern AMD CPUs and APUs are AVX enabled." But that's what I'm not familiar with. What is 'all', and what is 'modern'? As usual, most of my detail comes from Wikipedia. So, 'Comparison of AMD processors' is incomplete. 'List of AMD microprocessors' doesn't mention supported features. But putting the two together, Bulldozer, Bobcat and Jaguar - 2011 and later - should all be AVX enabled? Now to turn that into model numbers identified by BOINC in host listings... |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13832 Credit: 208,696,464 RAC: 304 |
"AFAIK all modern AMD CPUs and APUs are AVX enabled." Problem is defining modern. Not so long ago someone was calling their video card Mid-Range. In a list of all hardware from 18 years ago to now, it was. As far as current hardware went (even without including Pascal), it was on the bottom rung. Grant Darwin NT |
BilBg Joined: 27 May 07 Posts: 3720 Credit: 9,385,827 RAC: 0 |
". . I think you made the same mistake I did, ....." Off-topic: If you use these dots only to make indented text, why not use Alt+255? Like this:       I think you made the same mistake I did, .....      Another brilliant "D'oh!" moment. P.S. Alt+255 = press Alt, type 255 (use the numeric keypad on the right, not the top row), release Alt. Then Copy/Paste that ->               <- space (nbsp) as many times as you like. - ALF - "Find out what you don't do well ..... then don't do it!" :) |
Raistmer Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
AFAIK Joe never did AP builds. All of those are solely from me (for Windows). And there are no AVX-specific optimizations in AP, as far as I can recall now. The advantage of an AVX build could come only from auto-optimization by the compiler. Also, FFTW has separate AVX paths, so this speedup will be enabled automatically (even with the SSE3 main app binary) on an AVX-compatible host. SETI apps news We're not gonna fight them. We're gonna transcend them. |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14672 Credit: 200,643,578 RAC: 874 |
The references to AP in recent posts to this thread all refer back to Keith Myers's post 1796270: "There really isn't an AP CPU SSE4.1 app in this new beta Lunatics installer, correct? The installer defaults to old SSE4.1 app choice for AMD CPU as being preferred over AVX. However there is no SSE4.1 AP app now that Joe Segur left ... correct?? The SSE4.1 choice falls back on the older SSE3 AP app. We AMD users really should choose the AVX option as that one actually gets installed and is faster than SSE3. Once someone confirms my observation, I will rerun the installer and go back to the AVX app." Let's look at the page: (image taken on a pre-AVX Intel, so enabled options and pre-selects are different) I think it's clear that Keith's question relates to MB choices, not AP. Let's read it as if it was written that way, please, and drop AP from the conversation for now? |
Keith Myers Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
Yes, I goofed in my mention of AP tasks. I MEANT MB tasks. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Keith Myers Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
I know my processors are AVX enabled because SIV64 tells me so in its main page. At least I hope it is correct. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Keith Myers Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
Somewhere along the way I snagged onto some optimized FFTW libraries. They were supposed to be better than the ones shipped with the stock libraries: libfftw3f-3-3-4_x86.dll (2.24 MB), libfftw3f-3-3-4_x86sse41.dll (2.24 MB), libfftw3f-3avx.dll (2.45 MB), libfftw3f-3ssse3.dll (1.95 MB). Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Raistmer Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
And they are indeed better, but by quite a small margin. One needs to understand that big benefits usually come from sources hand-optimized for SIMD. That is what I mean by the AVX path in FFTW. The compiler's vectoriser can rarely reach a similar speedup. SETI apps news We're not gonna fight them. We're gonna transcend them. |
Zalster Joined: 27 May 99 Posts: 5517 Credit: 528,817,460 RAC: 242 |
Preliminary testing suggests there is very little increase in run time for the blc guppi work units. Maybe 20-30 secs, which is negligible. Non-guppi MB is hard to tell, as these work units are not from the same tape as before the change in version. Most of these MB tasks are coming from 06my10 and 07my10. Since I don't have any earlier results from those tapes, it's hard to compare. I looked at my other machine and the times are pretty close to each other, even though these are not cloned machines. So I'm going to say very little difference in time. However, I did notice that the % of CPU has dropped noticeably. You didn't slip in a use sleep command, did you? |
Keith Myers Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
"And they are indeed better, but by quite a small margin." So my decrease in task completion time is solely due to using the optimized FFTW library libfftw3f-3avx.dll, renamed to the stock FFTW library name libfftw3f-3-3-4_x64.dll? And nothing to do with using the r3430 or r3472 app? Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
zoom3+1=4 Joined: 30 Nov 03 Posts: 66194 Credit: 55,293,173 RAC: 49 |
Well, I'm as happy as a clam with SoG in Lunatics 0.45 Beta3. Savoir-Faire is everywhere! The T1 Trust, T1 Class 4-4-4-4 #5550, America's First HST |
Marco Franceschini Joined: 4 Jul 01 Posts: 54 Credit: 69,877,354 RAC: 135 |
Hi Keith, those are my own FFTW 3.3.4 libraries, recompiled from Matteo Frigo's original code. They were cross-compiled under Ubuntu 15.10 (with the GNU gcc C/C++ compiler 5.2 and maximum optimization settings for various architectures, e.g. Core 2, Sandy Bridge, Ivy Bridge, Haswell etc.) with FMA enabled for Haswell and later CPUs. The AVX SIMD instruction set can be slower than others like SSE2 due to the memory bandwidth it requires (mostly on notebook systems). Some other improvements may be achieved when FFTW 3.3.5 is released. Marco. |
Raistmer Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
There will be a new build quite soon, with many more changes - it would be interesting to see how it reacts. SETI apps news We're not gonna fight them. We're gonna transcend them. |
Stephen "Heretic" Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
". . I think you made the same mistake I did, ....." Do you mean like this? Thank you for that advice, I have found the other method tedious :) This is so much better :) [edit] A pity it didn't work though. I wonder why it worked for you and not for me. It shows as spaces in my edit window but disappears when posted, same as using the spacebar. |
BilBg Joined: 27 May 07 Posts: 3720 Credit: 9,385,827 RAC: 0 |
"[edit] a pity it didn't work though." Use the [Quote] button and Copy/Paste the following line to Notepad: ->               <- space (nbsp) Then use (Copy/Paste from Notepad in future) the "spaces" that are between the arrows. P.S. - Do not Copy the above line from the page; Copy it from the typing box after pressing the [Quote] button. - Use the [Preview] button before posting to see if "it did work". Alt+255 (and similar) have existed since MS-DOS times. "Your spaces" look like normal spaces (Hex: 20 20 20 ...) and "My spaces" look like A0 A0 A0 ...: [20 20 20 20 20 20 54 68 69 73 20 69 73 20 73 6F 20 6D 75 63 68 20 62 65 74 74 65 72] = "      This is so much better"; [2D 3E A0 A0 A0 A0 A0 A0 A0 A0 A0 A0 A0 A0 A0 A0 A0 3C 2D] = "->               <-". - ALF - "Find out what you don't do well ..... then don't do it!" :) |
Jimbocous Joined: 1 Apr 13 Posts: 1856 Credit: 268,616,081 RAC: 1,349 |
"[edit] a pity it didn't work though. Use the [Quote] button and Copy/Paste the following line to Notepad: ->               <- space (nbsp) Then use (Copy/Paste from Notepad in future) the "spaces" that are between the arrows. P.S. - Do not Copy the above line from the page; Copy it from the typing box after pressing the [Quote] button. - Use the [Preview] button before posting to see if "it did work". Alt+255 (and similar) have existed since MS-DOS times. "Your spaces" look like normal spaces (Hex: 20 20 20 ...) and "My spaces" look like A0 A0 A0 ...: [20 20 20 20 20 20 54 68 69 73 20 69 73 20 73 6F 20 6D 75 63 68 20 62 65 74 74 65 72] = "      This is so much better"; [2D 3E A0 A0 A0 A0 A0 A0 A0 A0 A0 A0 A0 A0 A0 A0 A0 3C 2D] = "->               <-"."   test ^ worked for me ... though I've found the pre /pre functions to be more useful ... |
Zalster Joined: 27 May 99 Posts: 5517 Credit: 528,817,460 RAC: 242 |
Ok, I was trying to find the post where Keith talked about the time to complete under the new app. I'm sure it's here but I forgot where it was. Now that I've had 24 hours with it running, I think he is right. For non-guppi work units, it looks like a 200-240 sec increase for the work, not the 100 sec I originally thought. (These are running multiple work units at once, not single work units per card.) It took a while to get enough work units to look at all of the different types. I've switched back to r3430 and the times have come back down on new work from the same tape. So for now, I'm sticking with that. No real change to guppi work units. |
zoom3+1=4 Joined: 30 Nov 03 Posts: 66194 Credit: 55,293,173 RAC: 49 |
For guppis 45-50 mins, non-guppis 10-25 mins; mind you, I'm doing 3 WUs at a time on r3472/SoG, on a PNY LC 580 card @ 857 MHz, plus there are 4 CPUs being done on AVX. Savoir-Faire is everywhere! The T1 Trust, T1 Class 4-4-4-4 #5550, America's First HST |
Keith Myers Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
Zalster, to throw more wood on the fire .... I just changed my computers to remove the -use_sleep option in the MB txt file. !WOW! That really goosed the engines. I am now doing GUPPI VLAR in 12-15 minutes vice 22-24 minutes with sleep. Now doing non-VLAR Arecibo tasks in 5-9 minutes vice 12-15 minutes. CPU is at 90-100% usage, with MB tasks basically taking a full CPU core each. I think that is normal for the SoG app from what I've read about its history. Funny thing is that the system lags are actually reduced compared with using the sleep function, and I haven't done anything to change my very aggressive MB txt file parameters. I wonder if the sleeping actually impacted the video graphics engine, causing more lag than normal. I did see only about 50% CPU usage and NO red in the SIV64 CPU utilization traces when using sleep. There's RED all over the place now, but the systems are stable as always and the text entry and mouse cursor movement are acceptable. I still think the r3430 app is faster than the newer r3472 app. I am trying the newer version out now, since it was supposed to play nice with the sleep parameter. Not sure I need it now. I will have to revert to the older r3430 app and compare throughput, but I need a few days of data collection to form a baseline. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
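
The byte-level difference BilBg describes between spacebar spaces (hex 20) and Alt+255 spaces (hex A0) can be checked with a few lines of Python. This is only a minimal sketch: it assumes the dumps are Windows-1252 bytes (the thread does not say which encoding the hex viewer showed), and `decode_dump` and `describe_spaces` are hypothetical helper names, not anything from the forum software.

```python
# Hex dumps copied from BilBg's post in the thread.
SPACEBAR_DUMP = (
    "20 20 20 20 20 20 54 68 69 73 20 69 73 20 73 6F "
    "20 6D 75 63 68 20 62 65 74 74 65 72"
)
ALT255_DUMP = "2D 3E A0 A0 A0 A0 A0 A0 A0 A0 A0 A0 A0 A0 A0 A0 A0 3C 2D"

NBSP = "\u00a0"  # non-breaking space; byte A0 in Windows-1252


def decode_dump(hex_dump: str) -> str:
    """Decode a space-separated hex dump, ASSUMING Windows-1252 bytes."""
    return bytes.fromhex(hex_dump).decode("cp1252")


def describe_spaces(text: str) -> dict:
    """Count ordinary spaces (U+0020) and non-breaking spaces (U+00A0)."""
    return {"space": text.count(" "), "nbsp": text.count(NBSP)}


if __name__ == "__main__":
    for label, dump in (("spacebar", SPACEBAR_DUMP), ("Alt+255", ALT255_DUMP)):
        text = decode_dump(dump)
        print(label, repr(text), describe_spaces(text))
```

HTML rendering generally collapses runs of ordinary spaces, which would be consistent with Stephen's spacebar indentation disappearing once posted, while non-breaking spaces are kept.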
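
The run times quoted in the last few posts can be turned into rough ratios. The sketch below takes the midpoint of each quoted range, which is an assumption of this example (the posts give only ranges, and Keith himself says a few days of data are needed for a real baseline). The function names are hypothetical, and the extra CPU core used per task without sleep is not accounted for.

```python
def midpoint(low: float, high: float) -> float:
    """Midpoint of a quoted time range, used as a crude point estimate."""
    return (low + high) / 2.0


def speedup(before: tuple, after: tuple) -> float:
    """Ratio of midpoint run times: values above 1 mean 'after' is faster."""
    return midpoint(*before) / midpoint(*after)


def tasks_per_hour(concurrent: int, minutes_per_task: float) -> float:
    """Throughput when 'concurrent' tasks each take 'minutes_per_task'."""
    return concurrent * 60.0 / minutes_per_task


# Ranges in minutes, as quoted in Keith Myers's post.
GUPPI_WITH_SLEEP, GUPPI_NO_SLEEP = (22, 24), (12, 15)
ARECIBO_WITH_SLEEP, ARECIBO_NO_SLEEP = (12, 15), (5, 9)

if __name__ == "__main__":
    print("GUPPI VLAR speedup:", round(speedup(GUPPI_WITH_SLEEP, GUPPI_NO_SLEEP), 2))
    print("Arecibo speedup:", round(speedup(ARECIBO_WITH_SLEEP, ARECIBO_NO_SLEEP), 2))
    # zoom3+1=4 reports 3 tasks at a time, 45-50 minutes per guppi task.
    print("guppi tasks/hour:", round(tasks_per_hour(3, midpoint(45, 50)), 2))
```

Because run time also depends on the tape the work units come from, as Zalster points out, ratios like these are only meaningful when both sets of times are for comparable work.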
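
On the earlier question of how to tell whether a given CPU is AVX enabled: the participants here are on Windows and rely on tools such as SIV64, but on an x86 Linux host the same information is listed on the `flags` line of `/proc/cpuinfo`. The following is a sketch under that assumption only. `cpu_flags` and `supports` are hypothetical helper names, the sample text is made up for illustration, and none of this maps directly onto the model names BOINC shows in host listings.

```python
from pathlib import Path


def cpu_flags(cpuinfo_text: str) -> set:
    """Collect feature flags from text in Linux /proc/cpuinfo format (x86)."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            flags.update(value.split())
    return flags


def supports(cpuinfo_text: str, feature: str) -> bool:
    """True if the named feature (e.g. 'avx', 'sse4_1') is listed."""
    return feature.lower() in cpu_flags(cpuinfo_text)


# Made-up sample for illustration; real files list many more flags.
SAMPLE = (
    "processor\t: 0\n"
    "model name\t: Hypothetical CPU\n"
    "flags\t\t: fpu sse sse2 ssse3 sse4_1 avx\n"
)

if __name__ == "__main__":
    path = Path("/proc/cpuinfo")
    text = path.read_text() if path.exists() else SAMPLE
    for feature in ("ssse3", "sse4_1", "avx"):
        print(feature, supports(text, feature))
```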