ROCm 1.8

Message boards : Number crunching : ROCm 1.8
Profile Mike Special Project $75 donor
Volunteer tester
Joined: 17 Feb 01
Posts: 34253
Credit: 79,922,639
RAC: 80
Germany
Message 1936694 - Posted: 22 May 2018, 20:50:11 UTC - in response to Message 1936673.  

Don't really have an idea if this problem can be tweaked away - I have opened a ticket at ROCm though, maybe there is a bug in their OpenCL implementation?
https://github.com/RadeonOpenCompute/ROCm/issues/423

Might be worth keeping an eye on that in case they need some more diagnostic info


Thanks for posting. I will keep an eye on the thread. I was thinking of giving the non-SoG version of the app a try.


Can someone point me to the latest Linux non-SoG AMD MB app?
Thanks!


Sent via mail.


With each crime and every kindness we birth our future.
ID: 1936694 · Report as offensive
Profile RueiKe Special Project $250 donor
Volunteer tester
Joined: 14 Feb 16
Posts: 492
Credit: 378,512,430
RAC: 785
Taiwan
Message 1936776 - Posted: 23 May 2018, 10:53:04 UTC - in response to Message 1936694.  

Don't really have an idea if this problem can be tweaked away - I have opened a ticket at ROCm though, maybe there is a bug in their OpenCL implementation?
https://github.com/RadeonOpenCompute/ROCm/issues/423

Might be worth keeping an eye on that in case they need some more diagnostic info


Thanks for posting. I will keep an eye on the thread. I was thinking of giving the non-SoG version of the app a try.


Can someone point me to the latest Linux non-SoG AMD MB app?
Thanks!


Sent via mail.


Thanks Mike! The system is now running with the non-SoG MB app. Will have to monitor it for a while.
GitHub: Ricks-Lab
Instagram: ricks_labs
ID: 1936776 · Report as offensive
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1936798 - Posted: 23 May 2018, 13:43:01 UTC - in response to Message 1936776.  

I'm not sure the Linux build implements the -tt option.
Time targeting uses the profiling abilities of the OpenCL runtime, so it is worth checking whether Urs ported that block of code to Linux.
Look at what stderr reports.
SETI apps news
We're not gonna fight them. We're gonna transcend them.
ID: 1936798 · Report as offensive
Profile RueiKe Special Project $250 donor
Volunteer tester
Joined: 14 Feb 16
Posts: 492
Credit: 378,512,430
RAC: 785
Taiwan
Message 1936800 - Posted: 23 May 2018, 14:00:40 UTC - in response to Message 1936798.  

I'm not sure the Linux build implements the -tt option.
Time targeting uses the profiling abilities of the OpenCL runtime, so it is worth checking whether Urs ported that block of code to Linux.
Look at what stderr reports.


I was using -tt on this system when I had the ProDuo cards with AMD standard drivers and it didn't cause a problem. Now with ROCm, I kept the args the same as before and then simplified to what I use in Windows. I could try to remove it to see if it makes a difference, but I want it to run for a while to see if switching to non-SoG makes a difference.
ID: 1936800 · Report as offensive
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1936801 - Posted: 23 May 2018, 14:05:46 UTC - in response to Message 1936800.  
Last modified: 23 May 2018, 14:06:09 UTC

I'm not sure the Linux build implements the -tt option.
Time targeting uses the profiling abilities of the OpenCL runtime, so it is worth checking whether Urs ported that block of code to Linux.
Look at what stderr reports.


I was using -tt on this system when I had the ProDuo cards with AMD standard drivers and it didn't cause a problem. Now with ROCm, I kept the args the same as before and then simplified to what I use in Windows. I could try to remove it to see if it makes a difference, but I want it to run for a while to see if switching to non-SoG makes a difference.


An unimplemented option will just be ignored, so it will not cause trouble per se. But if it is unsupported, the app will fall back to the older way of selecting the kernel size, and this could lead to errors compared with the Windows version.
If errors continue, try increasing the -period_iterations_num value.
SETI apps news
We're not gonna fight them. We're gonna transcend them.
ID: 1936801 · Report as offensive
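
To illustrate the last suggestion: this sketch assumes that -period_iterations_num sets how many separate kernel calls a long PulseFind search is split into, so a larger value means shorter individual calls. That reading is an assumption and should be checked against the app's ReadMe. The Python code below is only a conceptual illustration; `split_work` and the item counts are hypothetical and are not taken from the app's source.

```python
def split_work(total_items, num_iterations):
    """Split total_items into num_iterations contiguous (start, end) ranges.

    Hypothetical helper: it only illustrates that more iterations
    mean less work per call. It is not the app's real scheduling code.
    """
    if total_items < 0 or num_iterations <= 0:
        raise ValueError("total_items must be >= 0 and num_iterations > 0")
    base, extra = divmod(total_items, num_iterations)
    chunks = []
    start = 0
    for i in range(num_iterations):
        # The first `extra` chunks take one additional item each.
        size = base + (1 if i < extra else 0)
        if size == 0:
            continue  # nothing left to hand out
        chunks.append((start, start + size))
        start += size
    return chunks


def largest_chunk(total_items, num_iterations):
    """Size of the biggest single call after splitting."""
    return max(end - start for start, end in split_work(total_items, num_iterations))


if __name__ == "__main__":
    # Illustrative numbers only: 1000 work items split 20 ways vs 50 ways.
    for n in (20, 50):
        print(f"{n} iterations -> largest single call covers {largest_chunk(1000, n)} items")
```

Under this assumption, raising the value trades some launch overhead for shorter individual kernel calls, which is why it is a plausible knob to try when long calls cause errors.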
Profile RueiKe Special Project $250 donor
Volunteer tester
Joined: 14 Feb 16
Posts: 492
Credit: 378,512,430
RAC: 785
Taiwan
Message 1937036 - Posted: 25 May 2018, 8:51:36 UTC

I have set up a bench test on a random VLAR WU I had on my system. Here are the results running with ROCm 1.8 on Ubuntu 16.04.4 with no command line arguments:
----------------------------------------------------------------
Suspending BOINC
Listing wu-file(s) in /testWUs :
21jl16ad.13182.18067.14.41.184_vlar_CPU.wu

Listing executable(s) in /APPS :
MBv8_8.22r3584_sse2_clAMD_HD5_x86_64-pc-linux-gnu

Listing executable in /REF_APPS :
MBv8_8.05r3345_avx_linux64
----------------------------------------------------------------
Current WU: 21jl16ad.13182.18067.14.41.184_vlar_CPU.wu

----------------------------------------------------------------
Skipping default app MBv8_8.05r3345_avx_linux64, displaying saved result(s)
Elapsed Time: ....................... 2958 seconds
----------------------------------------------------------------
Running app with command : .......... MBv8_8.22r3584_sse2_clAMD_HD5_x86_64-pc-linux-gnu
./MBv8_8.22r3584_sse2_clAMD_HD5_x86_64-pc-linux-gnu 538.67 sec 115.71 sec 5.55 sec
Elapsed Time : ...................... 539 seconds
Speed compared to default : ......... 548 %
-----------------
Comparing results
                ------------- R1:R2 ------------     ------------- R2:R1 ------------
                Exact  Super  Tight  Good    Bad     Exact  Super  Tight  Good    Bad
        Spike      0      1      1      1      0        0      1      1      1      0
     Autocorr      0      0      0      0      0        0      0      0      0      0
     Gaussian      0      0      0      0      0        0      0      0      0      0
        Pulse      0      0      0      3      1        0      0      0      3      1
      Triplet      0      0      1      3      0        0      0      1      3      0
   Best Spike      0      1      1      1      0        0      1      1      1      0
Best Autocorr      0      0      1      1      0        0      0      1      1      0
Best Gaussian      1      1      1      1      0        1      1      1      1      0
   Best Pulse      0      0      0      1      0        0      0      0      1      0
 Best Triplet      0      0      0      1      0        0      0      0      1      0
                ----   ----   ----   ----   ----     ----   ----   ----   ----   ----
                   1      3      5     12      1        1      3      5     12      1

Unmatched signal(s) in R1 at line(s) 402
Unmatched signal(s) in R2 at line(s) 462
For R1:R2 matched signals only, Q= 14.95%
Result      : Weakly similar.

----------------------------------------------------------------
Done with 21jl16ad.13182.18067.14.41.184_vlar_CPU.wu

====================================================================
Hosts CPU data ...
model name	: AMD Ryzen Threadripper 1950X 16-Core Processor
cpu cores	: 16
cpu MHz		: 3012.220
cache size	: 512 KB
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid amd_dcm aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_l2 mwaitx hw_pstate retpoline retpoline_amd ssbd amd_ssbd vmmcall fsgsbase bmi1 avx2 smep bmi2 rdseed adx smap clflushopt sha_ni xsaveopt xsavec xgetbv1 xsaves clzero irperf xsaveerptr arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif overflow_recov succor smca

Done with Benchmark run! Removing temporary files!
Resuming BOINC



Here are the results when running on the same machine under Ubuntu 18.04 with rev 18.20 AMD PAL drivers:

KWSN-Linux-MBbench v3.0 cache-keeping edition
Running on Eos at Thu 24 May 2018 10:42:23 AM UTC
----------------------------------------------------------------
Starting benchmark run...
----------------------------------------------------------------
Suspending BOINC
Listing wu-file(s) in /testWUs :
21jl16ad.13182.18067.14.41.184_vlar_CPU.wu

Listing executable(s) in /APPS :
MBv8_8.22r3584_sse2_clAMD_HD5_x86_64-pc-linux-gnu

Listing executable in /REF_APPS :
MBv8_8.05r3345_avx_linux64
----------------------------------------------------------------
Current WU: 21jl16ad.13182.18067.14.41.184_vlar_CPU.wu

----------------------------------------------------------------
Skipping default app MBv8_8.05r3345_avx_linux64, displaying saved result(s)
Elapsed Time: ....................... 2958 seconds
----------------------------------------------------------------
Running app with command : .......... MBv8_8.22r3584_sse2_clAMD_HD5_x86_64-pc-linux-gnu
./MBv8_8.22r3584_sse2_clAMD_HD5_x86_64-pc-linux-gnu 567.01 sec 75.28 sec 74.44 sec
Elapsed Time : ...................... 567 seconds
Speed compared to default : ......... 521 %
-----------------
Comparing results
Result      : Strongly similar,  Q= 99.93%

----------------------------------------------------------------
Done with 21jl16ad.13182.18067.14.41.184_vlar_CPU.wu

====================================================================
Hosts CPU data ...
model name	: AMD Ryzen Threadripper 1950X 16-Core Processor
cpu cores	: 16
cpu MHz		: 1941.846
cache size	: 512 KB
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid amd_dcm aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb hw_pstate sme vmmcall fsgsbase bmi1 avx2 smep bmi2 rdseed adx smap clflushopt sha_ni xsaveopt xsavec xgetbv1 xsaves clzero irperf xsaveerptr arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif overflow_recov succor smca ssbd

Done with Benchmark run! Removing temporary files!
Resuming BOINC
ID: 1937036 · Report as offensive
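
The "Speed compared to default" figures in the two benchmark runs above can be reproduced from the elapsed times. The Python sketch below assumes the percentage is the reference time divided by the rounded test time, truncated to a whole percent. That rule is inferred from the numbers in the logs, not taken from the benchmark script, and `speed_percent` is a hypothetical helper name.

```python
def speed_percent(reference_seconds, test_seconds):
    """Reference time over test time, as a truncated whole percent.

    Truncation (rather than rounding) is an assumption that happens
    to match both values printed in the benchmark logs.
    """
    if test_seconds <= 0:
        raise ValueError("test_seconds must be positive")
    return reference_seconds * 100 // test_seconds


# Elapsed times in seconds, copied from the two benchmark logs.
REFERENCE_SECONDS = 2958  # saved result of MBv8_8.05r3345_avx_linux64
RUNS = {
    "ROCm 1.8, Ubuntu 16.04.4": 539,
    "AMD PAL 18.20, Ubuntu 18.04": 567,
}

if __name__ == "__main__":
    for name, elapsed in RUNS.items():
        print(f"{name}: {speed_percent(REFERENCE_SECONDS, elapsed)} %")
```

The small elapsed-time advantage of the ROCm run says nothing about validity: that run was only "Weakly similar" to the reference (Q= 14.95% for matched signals), while the PAL run was "Strongly similar" (Q= 99.93%).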


 