D500 MacPro Twice as Fast with New AMD/ATI App

TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1832229 - Posted: 24 Nov 2016, 19:03:24 UTC
Last modified: 24 Nov 2016, 19:21:38 UTC

Hey, someone said they wanted new Topics....

But....it's True
SETI@home v8 v8.10 (opencl_ati_mac) x86_64-apple-darwin VLAR RunTime = 2,071.08
SETI@home v8 v8.19 (opencl_ati5_mac) x86_64-apple-darwin VLAR RunTime = 1,199.51
https://setiweb.ssl.berkeley.edu/beta/results.php?hostid=74755&offset=80
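For reference, the ratio between the two VLAR run times above can be worked out directly. This is a minimal sketch, assuming the RunTime figures are in seconds (the post does not state the unit); the helper name `speedup` is made up for illustration.

```python
def speedup(old_seconds: float, new_seconds: float) -> float:
    """Return how many times faster the new run is than the old one."""
    if new_seconds <= 0:
        raise ValueError("new_seconds must be positive")
    return old_seconds / new_seconds


# Run times quoted above for the D500 (unit assumed to be seconds).
old_810 = 2071.08  # v8.10 opencl_ati_mac, VLAR
new_819 = 1199.51  # v8.19 opencl_ati5_mac, VLAR

ratio = speedup(old_810, new_819)
print(f"{ratio:.2f}x faster")  # prints "1.73x faster"
```

So for this particular pair of tasks the new app is roughly 1.7 times as fast.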

Now if only the app were on Main.

But wait. The same App is over here, ATI5_r3552&CPUr3344.zip
ID: 1832229
Shaggie76

Joined: 9 Oct 09
Posts: 282
Credit: 271,858,118
RAC: 196
Canada
Message 1832297 - Posted: 25 Nov 2016, 1:32:04 UTC

Impressive!
ID: 1832297
Chris Adamek
Volunteer tester

Joined: 15 May 99
Posts: 251
Credit: 434,772,072
RAC: 236
United States
Message 1832738 - Posted: 27 Nov 2016, 0:06:45 UTC - in response to Message 1832229.  

I'll give those a shot on my D500 & D700 when I get back in town!

Chris
ID: 1832738
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1832921 - Posted: 27 Nov 2016, 22:31:42 UTC - in response to Message 1832738.  
Last modified: 27 Nov 2016, 22:35:33 UTC

I found a D700 at Beta running the New Apps and it also appears to be running around twice as Fast as the old 8.10 Apps:
SETI@home v8 v8.10 (opencl_ati_mac) x86_64-apple-darwin BLC7 = 22 min 2 sec
SETI@home v8 v8.19 (opencl_ati5_mac) x86_64-apple-darwin BLC7 = 11 min 8 sec
https://setiweb.ssl.berkeley.edu/beta/results.php?hostid=80918
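The two BLC7 times above can be compared the same way once they are converted to seconds. A small sketch, assuming the times always appear in the "N min M sec" form used here; `to_seconds` is a hypothetical helper, not part of any SETI@home or BOINC tool.

```python
import re


def to_seconds(text: str) -> int:
    """Convert a string like '22 min 2 sec' into a number of seconds."""
    match = re.fullmatch(r"\s*(\d+)\s*min\s*(\d+)\s*sec\s*", text)
    if match is None:
        raise ValueError(f"unrecognised time format: {text!r}")
    minutes, seconds = int(match.group(1)), int(match.group(2))
    return minutes * 60 + seconds


old = to_seconds("22 min 2 sec")  # v8.10 opencl_ati_mac, BLC7 -> 1322
new = to_seconds("11 min 8 sec")  # v8.19 opencl_ati5_mac, BLC7 -> 668

print(f"{old / new:.2f}x faster")  # prints "1.98x faster"
```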
I went to check your D700 and it appears something Bad is going on with the GPUs, you need to get home soon ;-)
Since you're already running Apps newer than 8.10, you probably won't see much difference.

Checking the New OpenCL nVidia App and AMD Apps at Beta suggests they are Ready for Main...soon.
Unfortunately the Intel iGPUs are still suffering the same problems they've always had; looks like a lost cause.
ID: 1832921
Chris Adamek
Volunteer tester

Joined: 15 May 99
Posts: 251
Credit: 434,772,072
RAC: 236
United States
Message 1832972 - Posted: 28 Nov 2016, 5:54:37 UTC - in response to Message 1832921.  

Ugh, I'll be back tomorrow evening. Thanks for the heads up, it definitely needs a reboot!

It would be nice if I didn't have to run 3 at a time though to maximize RAC...

Chris
ID: 1832972
Chris Adamek
Volunteer tester

Joined: 15 May 99
Posts: 251
Credit: 434,772,072
RAC: 236
United States
Message 1833166 - Posted: 29 Nov 2016, 15:48:53 UTC - in response to Message 1832972.  

Ugh, I don't know what happened. I can't get any anonymous platform apps to run. Stock will run if I reset the project, but as soon as I try anon I get "Your app_info.xml doesn't have a usable version of seti@home v8."

I've double checked that all the files are in the folder. I've probably been on the road too long and there's something stupid I'm not seeing... Here's my file:

<app_info>
  <app>
    <name>setiathome_v8</name>
  </app>
  <file_info>
    <name>MBv8_8.19r3552_ati5_ssse3_x86_64-apple-darwin</name>
    <executable/>
  </file_info>
  <file_info>
    <name>MultiBeam_Kernels_r3552.cl</name>
  </file_info>
  <file_info>
    <name>mb_cmdline_mac_OpenCL_ati5_sah.txt</name>
  </file_info>
  <app_version>
    <app_name>setiathome_v8</app_name>
    <platform>x86_64-apple-darwin</platform>
    <version_num>800</version_num>
    <plan_class>opencl_ati5_sah</plan_class>
    <avg_ncpus>.3</avg_ncpus>
    <max_ncpus>1</max_ncpus>
    <coproc>
      <type>ATI</type>
      <count>1</count>
    </coproc>
    <file_ref>
      <file_name>MBv8_8.19r3552_ati5_ssse3_x86_64-apple-darwin</file_name>
      <main_program/>
    </file_ref>
    <file_ref>
      <file_name>MultiBeam_Kernels_r3552.cl</file_name>
    </file_ref>
    <file_ref>
      <file_name>mb_cmdline_mac_OpenCL_ati5_sah.txt</file_name>
      <open_name>mb_cmdline.txt</open_name>
    </file_ref>
  </app_version>

  <app>
    <name>setiathome_v8</name>
  </app>
  <file_info>
    <name>MBv8_8.06r3366_avx_x86_64-apple-darwin</name>
    <executable/>
  </file_info>
  <app_version>
    <app_name>setiathome_v8</app_name>
    <platform>x86_64-apple-darwin</platform>
    <version_num>803</version_num>
    <file_ref>
      <file_name>MBv8_8.06r3366_avx_x86_64-apple-darwin</file_name>
      <main_program/>
    </file_ref>
  </app_version>

  <app>
    <name>astropulse_v7</name>
  </app>
  <file_info>
    <name>ap_7.01r2559_sse3_OSX64</name>
    <executable/>
  </file_info>
  <app_version>
    <app_name>astropulse_v7</app_name>
    <version_num>701</version_num>
    <file_ref>
      <file_name>ap_7.01r2559_sse3_OSX64</file_name>
      <main_program/>
    </file_ref>
  </app_version>

  <app>
    <name>astropulse_v7</name>
  </app>
  <file_info>
    <name>ap_7.07r2750_sse3_clATI_OSX64</name>
    <executable/>
  </file_info>
  <file_info>
    <name>AstroPulse_Kernels_r2750.cl</name>
  </file_info>
  <file_info>
    <name>ap_cmdline_7.07_x86_64_ati_mac.txt</name>
  </file_info>
  <app_version>
    <app_name>astropulse_v7</app_name>
    <platform>x86_64-apple-darwin</platform>
    <version_num>707</version_num>
    <plan_class>opencl_ati_mac</plan_class>
    <avg_ncpus>0.5</avg_ncpus>
    <max_ncpus>1</max_ncpus>
    <coproc>
      <type>ATI</type>
      <count>1</count>
    </coproc>
    <file_ref>
      <file_name>ap_7.07r2750_sse3_clATI_OSX64</file_name>
      <main_program/>
    </file_ref>
    <file_ref>
      <file_name>AstroPulse_Kernels_r2750.cl</file_name>
    </file_ref>
    <file_ref>
      <file_name>ap_cmdline_7.07_x86_64_ati_mac.txt</file_name>
      <open_name>ap_cmdline.txt</open_name>
    </file_ref>
  </app_version>
</app_info>
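One way to rule out the contents of a file like this is to check that every `<file_name>` referenced in an `<app_version>` has a matching `<file_info>` entry. The following is a rough sketch using Python's standard XML parser. It assumes the file is well-formed XML, and it only checks internal consistency: it does not check that the files exist on disk, and BOINC's own parser may behave differently from Python's. The function name and the deliberately incomplete sample are made up for illustration.

```python
import xml.etree.ElementTree as ET


def missing_file_infos(app_info_xml: str) -> list:
    """Return file names used in <file_ref> that have no <file_info> entry."""
    root = ET.fromstring(app_info_xml)
    declared = {
        (info.findtext("name") or "").strip() for info in root.iter("file_info")
    }
    referenced = [
        (ref.findtext("file_name") or "").strip() for ref in root.iter("file_ref")
    ]
    return [name for name in referenced if name not in declared]


# Cut-down sample that deliberately omits the <file_info> for the kernel file.
SAMPLE = """<app_info>
  <app><name>setiathome_v8</name></app>
  <file_info>
    <name>MBv8_8.19r3552_ati5_ssse3_x86_64-apple-darwin</name>
    <executable/>
  </file_info>
  <app_version>
    <app_name>setiathome_v8</app_name>
    <file_ref>
      <file_name>MBv8_8.19r3552_ati5_ssse3_x86_64-apple-darwin</file_name>
      <main_program/>
    </file_ref>
    <file_ref>
      <file_name>MultiBeam_Kernels_r3552.cl</file_name>
    </file_ref>
  </app_version>
</app_info>"""

print(missing_file_infos(SAMPLE))  # prints ['MultiBeam_Kernels_r3552.cl']
```

An empty list means every referenced file is declared.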
ID: 1833166
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1833197 - Posted: 29 Nov 2016, 19:45:35 UTC - in response to Message 1833166.  
Last modified: 29 Nov 2016, 20:24:06 UTC

It could be a file type problem. The file needs to be a BOINC-compatible .xml. You can't create a BOINC .xml on a Mac very easily, so the best way is to start from an existing BOINC-generated .xml.
Pick a file generated by BOINC, say global_prefs.xml, and duplicate it using TextEdit.
Copy the contents of your existing app_info.xml (which looks fine) and paste them over the contents of the duplicated global_prefs.xml, then save the file.
Then rename the existing app_info.xml to, say, app_info1.xml, rename the New file to app_info.xml, and move the New app_info.xml into the setiathome.berkeley.edu folder.
That should result in a BOINC-compatible app_info.xml file.
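The thread does not say exactly what makes a file "not BOINC compatible", so the following is only a guess at the usual suspects when a text editor saves something other than plain text: a rich-text (RTF) header, a byte order mark, carriage returns, or non-ASCII characters such as curly quotes. It is a minimal sketch that inspects the raw bytes of a file; which of these conditions BOINC actually objects to is an assumption, and `describe_file_bytes` is a hypothetical helper.

```python
def describe_file_bytes(data: bytes) -> list:
    """List byte-level oddities that suggest a file is not plain-text XML."""
    problems = []
    if data.startswith(b"{\\rtf"):
        problems.append("looks like rich text (RTF), not plain text")
    if data.startswith((b"\xff\xfe", b"\xfe\xff")):
        problems.append("starts with a UTF-16 byte order mark")
    if data.startswith(b"\xef\xbb\xbf"):
        problems.append("starts with a UTF-8 byte order mark")
    if b"\r" in data:
        problems.append("contains carriage returns (non-Unix line endings)")
    try:
        data.decode("ascii")
    except UnicodeDecodeError:
        problems.append("contains non-ASCII bytes (for example curly quotes)")
    return problems


print(describe_file_bytes(b"<app_info>\n</app_info>\n"))  # prints []
print(describe_file_bytes(b"{\\rtf1\\ansi <app_info>"))
# prints ['looks like rich text (RTF), not plain text']
```

To check a real file, pass it the result of `open("app_info.xml", "rb").read()`.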
ID: 1833197
Chris Adamek
Volunteer tester

Joined: 15 May 99
Posts: 251
Credit: 434,772,072
RAC: 236
United States
Message 1833203 - Posted: 29 Nov 2016, 20:39:17 UTC - in response to Message 1833197.  

Great! That seemed to fix everything. Now I just need to wait to get some data to crunch. Thanks!

Chris
ID: 1833203
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1833211 - Posted: 29 Nov 2016, 21:39:09 UTC - in response to Message 1833203.  
Last modified: 29 Nov 2016, 21:49:05 UTC

Yes, seems to be working now. Are you still running 3 at a time?
That looks kinda fast to be 3 at a time.

I suppose it could be running One BLC5 and 2 Shorties and give those times.
Interesting....
ID: 1833211
Chris Adamek
Volunteer tester

Joined: 15 May 99
Posts: 251
Credit: 434,772,072
RAC: 236
United States
Message 1833247 - Posted: 30 Nov 2016, 0:24:48 UTC - in response to Message 1833211.  

No, just one at a time at the moment. I like to get a good single-task baseline so I know how running multiples compares. I need to fiddle with my settings too: the wu's in general are running about 100s faster on beta with default settings, so I'm pushing too hard somewhere.

Chris
ID: 1833247
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1833255 - Posted: 30 Nov 2016, 1:13:30 UTC - in response to Message 1833247.  

Well, the Hosts at Beta are running mostly BLC7s. The BLC5s take a bit longer; they are close to the same as an Arecibo VLAR. The D500 at Beta runs an Arecibo VLAR in 1100s (https://setiweb.ssl.berkeley.edu/beta/result.php?resultid=25411753) and a BLC7 in ~740s. So, running a BLC5 (Arecibo VLAR) in 760s isn't that bad when it takes 1100s on a D500. If you ran a BLC7 on your machine it would probably run in around 630s ;-)
Try running an Arecibo VLAR at Beta, and it will probably take about the same time as a BLC5 on Main.
ID: 1833255
Chris Adamek
Volunteer tester

Joined: 15 May 99
Posts: 251
Credit: 434,772,072
RAC: 236
United States
Message 1833256 - Posted: 30 Nov 2016, 1:18:59 UTC - in response to Message 1833255.  

You're right, I misread the beta wu's; I thought they were 5's as well. Oh well, looks like multiple wu's are probably still going to be best for max throughput.

Thanks,

Chris
ID: 1833256
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1833258 - Posted: 30 Nov 2016, 1:33:16 UTC - in response to Message 1833256.  
Last modified: 30 Nov 2016, 1:48:51 UTC

Also, looking at the results at Beta, it seems the ATI5 SoG App is a little faster on your machine. It would probably be best to use the SoG App from Beta, or dig it out of the OSX_OpenCL_for_Beta.zip package at C.A. About the only difference is the SoG App still has strangeness with the progress bar.

Oh, the ATI5 App should work with Lion (Darwin 11.4.2) and above. The ATI5 SoG App needs Mavericks (Darwin 13.x) or above.
ID: 1833258
Chris Adamek
Volunteer tester

Joined: 15 May 99
Posts: 251
Credit: 434,772,072
RAC: 236
United States
Message 1833262 - Posted: 30 Nov 2016, 2:20:34 UTC - in response to Message 1833258.  

Yeah, I just switched over to it about a half hour ago. Getting around 700s with fairly aggressive settings. Next up is running multiples. The only thing I hate about running multiples is when you end up with Guppi's and Arecibo mid-range wu's together. The mid-range wu's end up taking almost twice as long. I was hoping to get a little better performance running singles, but such is life. :)

Chris
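Whether running multiples pays off comes down to tasks finished per hour. A rough sketch of that comparison follows. The 700 s single-task time is the figure quoted above; the 1800 s time for three tasks at once is purely hypothetical, picked only to show the calculation, and is not a measurement from this thread. `tasks_per_hour` is a made-up helper name.

```python
def tasks_per_hour(concurrent: int, seconds_per_task: float) -> float:
    """Throughput when `concurrent` tasks each take `seconds_per_task`."""
    if concurrent < 1 or seconds_per_task <= 0:
        raise ValueError("need at least one task and a positive run time")
    return concurrent * 3600.0 / seconds_per_task


single = tasks_per_hour(1, 700)   # one at a time, ~700 s each (quoted above)
triple = tasks_per_hour(3, 1800)  # three at a time, 1800 s each (HYPOTHETICAL)

print(f"single: {single:.2f} tasks/hour")  # prints "single: 5.14 tasks/hour"
print(f"triple: {triple:.2f} tasks/hour")  # prints "triple: 6.00 tasks/hour"
```

With these numbers three at a time would come out ahead, but only because each task takes less than three times as long as a single one; a mix of slow and fast tasks can shift that balance.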
ID: 1833262
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1833368 - Posted: 30 Nov 2016, 19:04:10 UTC - in response to Message 1833262.  

Now your D500 machine has developed the same problem as the D700. It appears the task isn't starting and is timing out. When I was running the OpenCL App I had a similar problem with the last CMDline setting in the ReadMe file, -oclfft_tune_cw 64. That CMD caused the task to fail to start; it worked fine without that setting. In fact, it worked fine without both of the last two CMDs, -oclfft_tune_bn 64 -oclfft_tune_cw 64. You might try removing one or both of those settings and see how it runs.
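Removing those two settings from a command-line string can be sketched as below. The two option names come from the post above; the `-hypothetical_opt 1` entry in the sample line is a placeholder and not a real option. The sketch assumes every option in the line takes exactly one value, and `drop_options` is a made-up helper name.

```python
def drop_options(cmdline: str, names: set) -> str:
    """Remove each named option, plus the single value after it, from a line."""
    tokens = cmdline.split()
    kept = []
    i = 0
    while i < len(tokens):
        if tokens[i] in names:
            i += 2  # skip the option and its value
        else:
            kept.append(tokens[i])
            i += 1
    return " ".join(kept)


# Sample line: the first option is a placeholder, the last two are from the post.
line = "-hypothetical_opt 1 -oclfft_tune_bn 64 -oclfft_tune_cw 64"
print(drop_options(line, {"-oclfft_tune_bn", "-oclfft_tune_cw"}))
# prints "-hypothetical_opt 1"
```

The cleaned line would then go back into the command-line text file the app reads.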
ID: 1833368



 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.