Radeon Software Crimson

Message boards : Number crunching : Radeon Software Crimson

Profile Mike Special Project $75 donor
Volunteer tester
Joined: 17 Feb 01
Posts: 34258
Credit: 79,922,639
RAC: 80
Germany
Message 1769582 - Posted: 4 Mar 2016, 21:38:13 UTC - in response to Message 1769517.  
Last modified: 4 Mar 2016, 21:40:17 UTC

i think it's an opencl 2.0 thing

16.2.1 with one wu feels faster than 14.4 with two wu, but i haven't been patient enough to verify that's the case. i'll probably go back and try it again


a 280x is tahiti, not hawaii

tahiti doesn't have the issues that hawaii does


Yes, I think you are right here.
My Tonga is very close to Tahiti and has only minor issues running multiple instances with Crimson.


With each crime and every kindness we birth our future.
ID: 1769582 · Report as offensive
Profile Sutaru Tsureku
Volunteer tester

Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 1770659 - Posted: 10 Mar 2016, 2:27:14 UTC

Crimson 16.3 Hotfix (Beta) is available...

http://support.amd.com/en-us/kb-articles/Pages/AMD_Radeon_Software_Crimson_Edition_16.3.aspx

I'll test it tomorrow (to see whether simultaneous tasks are possible)...

Maybe someone else is curious and will test it before me? ;-)
ID: 1770659 · Report as offensive
OzzFan Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 9 Apr 02
Posts: 15691
Credit: 84,761,841
RAC: 28
United States
Message 1770664 - Posted: 10 Mar 2016, 3:05:08 UTC - in response to Message 1770659.  

I'll test it tomorrow (to see whether simultaneous tasks are possible)...


I'm using 15.12 and I'm running 3 MB or 2 APs simultaneously.
ID: 1770664 · Report as offensive
Chris Adamek
Volunteer tester

Joined: 15 May 99
Posts: 251
Credit: 434,772,072
RAC: 236
United States
Message 1770665 - Posted: 10 Mar 2016, 3:07:20 UTC - in response to Message 1770659.  

You should give one or both of the new apps just put out for testing a try (see the OpenCL NV Multibeam v8 edition for Windows thread), as Raistmer has put out a couple of new ATI variants that should better utilize all the compute units in the cards during pulsefind. That might help you get a little more production out of your Furys even when running 1 WU at a time... Plus I'm curious to see how high-end ATI cards like the new apps lol. If the early beta results are any indication, the SoG version might be slower than the APU versions and the non-SoG ATI5 version, but very few people have run it, so time will tell which is fastest...

Chris
ID: 1770665 · Report as offensive
Chris Adamek
Volunteer tester

Joined: 15 May 99
Posts: 251
Credit: 434,772,072
RAC: 236
United States
Message 1770666 - Posted: 10 Mar 2016, 3:08:30 UTC - in response to Message 1770664.  

I'll test it tomorrow (to see whether simultaneous tasks are possible)...


I'm using 15.12 and I'm running 3 MB or 2 APs simultaneously.


I don't think Tahiti based cards have the same problem as the Hawaii and Fiji based ones...
ID: 1770666 · Report as offensive
Profile Mike Special Project $75 donor
Volunteer tester
Joined: 17 Feb 01
Posts: 34258
Credit: 79,922,639
RAC: 80
Germany
Message 1770718 - Posted: 10 Mar 2016, 8:49:06 UTC - in response to Message 1770666.  

I'll test it tomorrow (to see whether simultaneous tasks are possible)...


I'm using 15.12 and I'm running 3 MB or 2 APs simultaneously.


I don't think Tahiti based cards have the same problem as the Hawaii and Fiji based ones...


Correct.
Also, new builds don't fix this, as my tests have shown.
YMMV.


With each crime and every kindness we birth our future.
ID: 1770718 · Report as offensive
Profile Sutaru Tsureku
Volunteer tester

Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 1770798 - Posted: 10 Mar 2016, 19:02:10 UTC - in response to Message 1770665.  
Last modified: 10 Mar 2016, 19:04:20 UTC

Chris Adamek wrote:
You should give one or both of the new apps just put out for testing a try (see the OpenCL NV Multibeam v8 edition for Windows thread), as Raistmer has put out a couple of new ATI variants that should better utilize all the compute units in the cards during pulsefind. That might help you get a little more production out of your Furys even when running 1 WU at a time... Plus I'm curious to see how high-end ATI cards like the new apps lol. If the early beta results are any indication, the SoG version might be slower than the APU versions and the non-SoG ATI5 version, but very few people have run it, so time will tell which is fastest...

Chris

Until now I hadn't read the thread you mentioned, because of the 'NV' in the title.

I didn't know that there are also AMD/ATI apps in that thread...

Could you please give me the URL of the message in that thread which mentions the latest AMD/ATI apps (the download URL)? ...because it's a very long thread.

Thanks.

(Before I test the apps, I'll test 2 WUs/GPU with Crimson 16.3, which includes a new driver, v2004.6.
I tested the AMD/ATI SoG app at SETI-Beta in the past, but it needed 1 1/3 CPUs. The CPU time was higher than the whole calculation time.)
ID: 1770798 · Report as offensive
Chris Adamek
Volunteer tester

Joined: 15 May 99
Posts: 251
Credit: 434,772,072
RAC: 236
United States
Message 1770801 - Posted: 10 Mar 2016, 19:16:03 UTC - in response to Message 1770798.  

https://cloud.mail.ru/public/DMkN/x4BRCYuAV

There are a couple of versions to try, so you'll have to see if any of them help, but even the non-SoG version has the new pulsefind code that should help better utilize all your CU's. I'm patiently awaiting a Mac port so that I can give them a try too. Lol

Chris
ID: 1770801 · Report as offensive
Profile Sutaru Tsureku
Volunteer tester

Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 1772697 - Posted: 19 Mar 2016, 22:33:33 UTC - in response to Message 1770659.  

ID: 1772697 · Report as offensive
Profile Sutaru Tsureku
Volunteer tester

Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 1774933 - Posted: 29 Mar 2016, 10:29:55 UTC - in response to Message 1772697.  
Last modified: 29 Mar 2016, 10:30:32 UTC

ID: 1774933 · Report as offensive
Profile HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1774974 - Posted: 29 Mar 2016, 15:27:44 UTC - in response to Message 1774933.  

FYI

Crimson 16.3.2 - (NO Hotfix/Beta) is available...

http://support.amd.com/en-us/kb-articles/Pages/AMD-Radeon-Software-Crimson-Edition-16-3-2.aspx

;-)

I had a BSOD on the first boot after installation with my R9 390X. So I reverted and figured I would try 16.3.2 again later when I had more caffeine in me.
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the BP6/VP6 User Group: http://tinyurl.com/8y46zvu
ID: 1774974 · Report as offensive
Xron

Joined: 1 Sep 99
Posts: 6
Credit: 1,855,692
RAC: 0
United States
Message 1775000 - Posted: 29 Mar 2016, 23:11:12 UTC - in response to Message 1774933.  

Boinc 7.6.22 on Windows 10 Pro stopped detecting my ASUS HD7750-T-1GD5 card when I upgraded to Crimson Edition 16.3.2. I found this in the Boinc Manager event log:

3/29/2016 12:38:26 PM | SETI@home | Application uses missing ATI GPU
3/29/2016 12:38:26 PM | | App version needs OpenCL but GPU doesn't support it

The only workaround I found is to downgrade to Crimson Edition 15.12.
ID: 1775000 · Report as offensive
Profile RueiKe Special Project $250 donor
Volunteer tester
Joined: 14 Feb 16
Posts: 492
Credit: 378,512,430
RAC: 785
Taiwan
Message 1775526 - Posted: 1 Apr 2016, 7:50:55 UTC - in response to Message 1750035.  


My PC with four Fury X cards (from PowerColor) is currently top host #5.
And this with just 1 WU/GPU.
More isn't possible with the current driver.
Two Xeons, each with 6 CPU cores/12 threads, but HT is off for faster GPU app calculation, so 12 CPU cores.
1 core per GPU app, so 8 cores crunch CPU tasks.


BTW.
Crimson 15.12 is available (at least for Win8.1 x64).
I installed it for a few minutes, but it's the same v1912.5 driver as in Crimson v15.11 and v15.11.1 Beta.
So I guess 2+ WUs/GPU still isn't possible, right?


I have been having trouble getting decent RAC with my latest build, FX-8370 with 2 Fury Nanos (powercolor). Both cards have an average loading of about 55%, where my single Fury X system is at about 85%. The RAC of the dual Nano system is significantly lower than the single Fury X. I have tried Crimson 15.12 and the latest 16.3.2, with no difference. I have tried many CPU/GPU per app configurations. I get errors with more than 1 per GPU, so I have settled with 1 task to 1 CPU and 1 GPU. Just curious if you faced this issue and how you overcame it for your 4 Fury X system.
GitHub: Ricks-Lab
Instagram: ricks_labs
ID: 1775526 · Report as offensive
Profile RueiKe Special Project $250 donor
Volunteer tester
Joined: 14 Feb 16
Posts: 492
Credit: 378,512,430
RAC: 785
Taiwan
Message 1775728 - Posted: 2 Apr 2016, 12:24:13 UTC - in response to Message 1775526.  


I have been having trouble getting decent RAC with my latest build, FX-8370 with 2 Fury Nanos (powercolor). Both cards have an average loading of about 55%, where my single Fury X system is at about 85%. The RAC of the dual Nano system is significantly lower than the single Fury X. I have tried Crimson 15.12 and the latest 16.3.2, with no difference. I have tried many CPU/GPU per app configurations. I get errors with more than 1 per GPU, so I have settled with 1 task to 1 CPU and 1 GPU. Just curious if you faced this issue and how you overcame it for your 4 Fury X system.


I have posted a video summarizing my findings so far. Still struggling with getting significant loading on my dual Fury system.
https://youtu.be/LqO1yvRMBGQ
GitHub: Ricks-Lab
Instagram: ricks_labs
ID: 1775728 · Report as offensive
Profile HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1775736 - Posted: 2 Apr 2016, 14:24:19 UTC - in response to Message 1775728.  


I have been having trouble getting decent RAC with my latest build, FX-8370 with 2 Fury Nanos (powercolor). Both cards have an average loading of about 55%, where my single Fury X system is at about 85%. The RAC of the dual Nano system is significantly lower than the single Fury X. I have tried Crimson 15.12 and the latest 16.3.2, with no difference. I have tried many CPU/GPU per app configurations. I get errors with more than 1 per GPU, so I have settled with 1 task to 1 CPU and 1 GPU. Just curious if you faced this issue and how you overcame it for your 4 Fury X system.


I have posted a video summarizing my findings so far. Still struggling with getting significant loading on my dual Fury system.
https://youtu.be/LqO1yvRMBGQ

Something to take note of: the R9 290X is based on GCN 1.1, and the Fury cards are based on GCN 1.2.

When you are running a system with multiple GPUs, have you checked the Crossfire option in the driver? Perhaps it is being turned on automatically and slowing things down.

Some things I might try:
-Pairing the R9 290X & HD 7870 to see if the same lower utilization effect occurs.
-Run only 1 of the GPUs in a multi GPU configuration to see if the utilization is the same as when multi GPUs are running.
-Run with 0 CPU tasks.
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the BP6/VP6 User Group: http://tinyurl.com/8y46zvu
ID: 1775736 · Report as offensive
Profile RueiKe Special Project $250 donor
Volunteer tester
Joined: 14 Feb 16
Posts: 492
Credit: 378,512,430
RAC: 785
Taiwan
Message 1775744 - Posted: 2 Apr 2016, 15:00:58 UTC - in response to Message 1775736.  


Something to take note of: the R9 290X is based on GCN 1.1, and the Fury cards are based on GCN 1.2.

When you are running a system with multiple GPUs, have you checked the Crossfire option in the driver? Perhaps it is being turned on automatically and slowing things down.

Some things I might try:
-Pairing the R9 290X & HD 7870 to see if the same lower utilization effect occurs.
-Run only 1 of the GPUs in a multi GPU configuration to see if the utilization is the same as when multi GPUs are running.
-Run with 0 CPU tasks.


I have asked an AMD contact about Fury vs. R9 290X. If I hear back, I will share here.

Yes, crossfire is getting turned on, but I have manually disabled it. It did not seem to make much difference though.

I have difficulty pairing the R9 290 with the HD 7870. The 290X is too long to fit in my server and the system with the R9 290 doesn't have a spare slot. I have tried every other combination though.

Not sure I understand the last 2 suggestions:
-What do you mean by 1 GPU in multi GPU config? Do you mean 1 GPU with 0.5 GPUs per task? I have tried that with the Fury X and found it gets computation errors.
-How do you disable CPU tasks?
GitHub: Ricks-Lab
Instagram: ricks_labs
ID: 1775744 · Report as offensive
Profile HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1775760 - Posted: 2 Apr 2016, 16:12:24 UTC - in response to Message 1775744.  


Something to take note of: the R9 290X is based on GCN 1.1, and the Fury cards are based on GCN 1.2.

When you are running a system with multiple GPUs, have you checked the Crossfire option in the driver? Perhaps it is being turned on automatically and slowing things down.

Some things I might try:
-Pairing the R9 290X & HD 7870 to see if the same lower utilization effect occurs.
-Run only 1 of the GPUs in a multi GPU configuration to see if the utilization is the same as when multi GPUs are running.
-Run with 0 CPU tasks.


I have asked an AMD contact about Fury vs. R9 290X. If I hear back, I will share here.

Yes, crossfire is getting turned on, but I have manually disabled it. It did not seem to make much difference though.

I have difficulty pairing the R9 290 with the HD 7870. The 290X is too long to fit in my server and the system with the R9 290 doesn't have a spare slot. I have tried every other combination though.

Not sure I understand the last 2 suggestions:
-What do you mean by 1 GPU in multi GPU config? Do you mean 1 GPU with 0.5 GPUs per task? I have tried that with the Fury X and found it gets computation errors.
-How do you disable CPU tasks?


Run only 1 of the GPUs means exactly that: with 2 or 3 GPUs installed in a system, tell BOINC to use only one of them. You could use <exclude_gpu> or <ignore_ati_dev> in your cc_config.xml. You will need to restart BOINC when using either one of those settings.
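
A minimal sketch of the first option, in case it helps. The <url> and <type> values are my assumptions for a SETI@home/ATI setup, and device number 1 is only an example; check the BOINC event log at startup for the numbers BOINC assigns to your cards.

```xml
<!-- cc_config.xml in the BOINC data directory (sketch; values are assumptions) -->
<cc_config>
  <options>
    <!-- Keep this project from using ATI device 1; other devices stay available -->
    <exclude_gpu>
      <url>http://setiathome.berkeley.edu/</url>
      <device_num>1</device_num>
      <type>ATI</type>
    </exclude_gpu>
  </options>
</cc_config>
```

Putting <ignore_ati_dev>1</ignore_ati_dev> inside <options> instead would hide that device from BOINC for all projects, not just this one.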

To suspend all CPU processing there are a few different ways to do it.
You could increase <cpu_usage> in the app_config.xml to a value that would reserve all of the CPUs. Something like 2.66 or 2.67 for 3 GPUs in the 8 CPU system.
You could also lower the number of CPUs that BOINC is allowed to use by changing "Use at most X% of the CPUs" to a low value to prevent CPU tasks from running. With GPU tasks running, setting 1% works for me. The setting may also be labeled "On multiprocessor systems, use at most X% of the processors" in the GUI.
Lastly you can always suspend the CPU tasks.
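
For the <cpu_usage> route, the arithmetic is 8 CPUs / 3 GPU tasks = about 2.67 CPUs per task, so three running GPU tasks claim every CPU and nothing is left for CPU tasks. A sketch, assuming one task per GPU and the setiathome_v8 app name (use whatever app names your client actually lists):

```xml
<!-- app_config.xml in the project folder (sketch; app name is an assumption) -->
<app_config>
  <app>
    <name>setiathome_v8</name>
    <gpu_versions>
      <gpu_usage>1</gpu_usage>
      <!-- 3 tasks x 2.67 = 8.01, which budgets all 8 CPUs to the GPU tasks -->
      <cpu_usage>2.67</cpu_usage>
    </gpu_versions>
  </app>
</app_config>
```

This only changes how BOINC budgets CPUs when scheduling; it does not make the GPU app itself use more CPU time.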
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the BP6/VP6 User Group: http://tinyurl.com/8y46zvu
ID: 1775760 · Report as offensive
Profile RueiKe Special Project $250 donor
Volunteer tester
Joined: 14 Feb 16
Posts: 492
Credit: 378,512,430
RAC: 785
Taiwan
Message 1775861 - Posted: 3 Apr 2016, 0:56:08 UTC - in response to Message 1775760.  

Run only 1 of the GPUs means exactly that: with 2 or 3 GPUs installed in a system, tell BOINC to use only one of them. You could use <exclude_gpu> or <ignore_ati_dev> in your cc_config.xml. You will need to restart BOINC when using either one of those settings.

To suspend all CPU processing there are a few different ways to do it.
You could increase <cpu_usage> in the app_config.xml to a value that would reserve all of the CPUs. Something like 2.66 or 2.67 for 3 GPUs in the 8 CPU system.
You could also lower the number of CPUs that BOINC is allowed to use by changing "Use at most X% of the CPUs" to a low value to prevent CPU tasks from running. With GPU tasks running, setting 1% works for me. The setting may also be labeled "On multiprocessor systems, use at most X% of the processors" in the GUI.
Lastly you can always suspend the CPU tasks.


Thanks for the recommendations! I think excluding the other GPUs and limiting it to 1 at a time is a great idea.

In the past, I had allocated all CPUs to GPU tasks by adjusting the CPU ratio per task. I think at that time I was trying to run multiple tasks per GPU and the results were bad. I will try again with 1 task per GPU.

I will be away from my computer for a couple of days, so I will try this later in the week.
GitHub: Ricks-Lab
Instagram: ricks_labs
ID: 1775861 · Report as offensive
Profile Mike Special Project $75 donor
Volunteer tester
Joined: 17 Feb 01
Posts: 34258
Credit: 79,922,639
RAC: 80
Germany
Message 1775914 - Posted: 3 Apr 2016, 8:14:07 UTC
Last modified: 3 Apr 2016, 8:15:47 UTC

Since cpu_lock doesn't work as intended in r_3330, try -no_cpu_lock in conjunction with -hp.
This helped me on my FX system whilst using r_3330.

Check with Task Manager which GPU instance is pinned to which CPU core.

This is fixed in r3401 and later revisions.


With each crime and every kindness we birth our future.
ID: 1775914 · Report as offensive
Profile Sutaru Tsureku
Volunteer tester

Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 1775932 - Posted: 3 Apr 2016, 13:45:32 UTC - in response to Message 1775526.  
Last modified: 3 Apr 2016, 14:07:42 UTC

I myself wrote:
My PC with four Fury X cards (from PowerColor) is currently top host #5.
And this with just 1 WU/GPU.
More isn't possible with the current driver.
Two Xeons, each with 6 CPU cores/12 threads, but HT is off for faster GPU app calculation, so 12 CPU cores.
1 core per GPU app, so 8 cores crunch CPU tasks.


BTW.
Crimson 15.12 is available (at least for Win8.1 x64).
I installed it for a few minutes, but it's the same v1912.5 driver as in Crimson v15.11 and v15.11.1 Beta.
So I guess 2+ WUs/GPU still isn't possible, right?

RueiKe wrote:
I have been having trouble getting decent RAC with my latest build, FX-8370 with 2 Fury Nanos (powercolor). Both cards have an average loading of about 55%, where my single Fury X system is at about 85%. The RAC of the dual Nano system is significantly lower than the single Fury X. I have tried Crimson 15.12 and the latest 16.3.2, with no difference. I have tried many CPU/GPU per app configurations. I get errors with more than 1 per GPU, so I have settled with 1 task to 1 CPU and 1 GPU. Just curious if you faced this issue and how you overcame it for your 4 Fury X system.


It took a long time to find the best setup... ;-)

- - - - - - - - - -
BTW.
I tested Crimson 16.3.2 for around 3 1/2 days, the result:
SETI results are OK.
AstroPulse results go invalid.

I'm back to Crimson 15.12.
- - - - - - - - - -

As Mike already mentioned, the SETI r3330 ATI app doesn't work well with the default cpu_lock (at least on my system). All 4 GPU apps were pinned to Core #0.
(This could be the reason that your Fury Nanos have ~55% GPU load each.)
I need to use -no_cpu_lock in the cmdline .txt file (more on that slightly further down).

You use the stock SETI@home apps.
You could go with opti apps instead; you'd get at least a higher RAC for the CPUs (and have just the correct/best/fastest GPU apps installed and used).
Example here (in the middle of the site). If you decide to go with opti apps, we will help you.


If you would still like to stay with stock apps... :

(I'll tell you what I would do if I were running stock apps.)

You should make an app_config.xml file (see the bottom of the site) with:
<app_config>
  <app>
    <name>setiathome_v7</name>
    <gpu_versions>
      <gpu_usage>1</gpu_usage>
      <cpu_usage>1</cpu_usage>
    </gpu_versions>
  </app>
  <app>
    <name>setiathome_v8</name>
    <gpu_versions>
      <gpu_usage>1</gpu_usage>
      <cpu_usage>1</cpu_usage>
    </gpu_versions>
  </app>
  <app>
    <name>astropulse_v7</name>
    <gpu_versions>
      <gpu_usage>1</gpu_usage>
      <cpu_usage>1</cpu_usage>
    </gpu_versions>
  </app>
</app_config>

(I don't think there are still SETIv7 tasks around, but it's still possible to select them in the project prefs, so I included it.)

This reserves 1 CPU thread for each GPU app (theoretically, no CPU app/task runs on that CPU thread).
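
For comparison, here is a sketch of how the same file would request 2 tasks per GPU. It is only an illustration of the syntax: as reported earlier in this thread, more than 1 task per GPU produced errors on Fury cards with these drivers, so I am not suggesting it for those cards.

```xml
<!-- app_config.xml variant (sketch): two tasks share each GPU -->
<app_config>
  <app>
    <name>setiathome_v8</name>
    <gpu_versions>
      <!-- 0.5 GPU per task means BOINC starts 2 tasks per GPU -->
      <gpu_usage>0.5</gpu_usage>
      <!-- still budget 1 CPU thread for each GPU task -->
      <cpu_usage>1</cpu_usage>
    </gpu_versions>
  </app>
</app_config>
```

If I understand the app options correctly, -instances_per_device in the cmdline file would then have to be changed to match.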

My ap_cmdline_win_x86_SSE2_OpenCL_ATI.txt file in my setiathome.berkeley.edu project folder contains:
-cpu_lock -instances_per_device 1 -unroll 18 -ffa_block 2830 -ffa_block_fetch 2830 -tune 1 64 4 1 -tune 2 64 4 1 -oclFFT_plan 256 16 256 -hp

My mb_cmdline_win_x86_SSE2_OpenCL_ATi_HD5.txt file in my setiathome.berkeley.edu project folder contains:
-instances_per_device 1 -no_cpu_lock -sbs 512 -period_iterations_num 20 -spike_fft_thresh 4096 -tune 1 64 1 4 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 256 -oclfft_tune_ls 512 -oclfft_tune_bn 64 -oclfft_tune_cw 64 -hp

(With stock apps, the .txt files might have the same names.
But you could have several .txt files for the SETI (MB) apps.)

These cmdline settings should suit your Fury X and could be tested on your Fury Nanos.
ID: 1775932 · Report as offensive


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.