Multiple tasks for ATI R9 290X

ChrisD
Volunteer tester

Joined: 25 Sep 99
Posts: 158
Credit: 2,496,342
RAC: 0
Denmark
Message 1736986 - Posted: 25 Oct 2015, 18:18:49 UTC
Last modified: 25 Oct 2015, 18:21:54 UTC

The Lunatics optimized apps are installed and seem to work nicely. :)

However, reading through all the read-mes and instructions has left me somewhat confused, so maybe someone would be kind enough to help me out.

CPU: i7 3770K, GPU: R9 290X

I have ended up using anonymous CPU??
I have only ONE GPU task running (and 8 CPU tasks).

The instructions tell me to 'reserve' one core for GPU tasks. How?
Up to 4 GPU tasks can be run simultaneously. How?

I just changed my graphics card from an HD7970 to this new R9 290X, and according to SETI this should boost performance from .7 to 1.0, but much to my surprise, the new card takes longer to crunch the tasks????

There are a lot of 'config' files. Which ones are needed, and what content would work for me?

Thanks :) and happy crunching.

ChrisD

Darth Beaver
Joined: 20 Aug 99
Posts: 6728
Credit: 21,443,075
RAC: 3
Australia
Message 1737843 - Posted: 28 Oct 2015, 21:37:27 UTC - in response to Message 1736986.  
Last modified: 28 Oct 2015, 21:49:03 UTC

Sorry, ChrisD, that nobody has answered you.

First, yes, your R9 GPU will do more than 1 unit at a time.

You say you loaded Lunatics. A Lunatics install is known as "anonymous platform": when you look at your stats, both the GPU and CPU units will show "Anonymous platform", plus the type of unit, Astropulse or Multibeam.

To leave 1 CPU core free for your GPU, use the computing preferences in the client.

Click the Tools menu, then Computing preferences, go to the processor usage tab and look for "On multiprocessor systems, use at most ?? % of processors".

In your case, with 8 cores, that means setting it to 88% = 7 cores.

Doing more than 1 unit on the GPU is a little more complicated, but still easy.

You have to open a file called app_info.xml.
It's in the SETI project folder inside the BOINC data folder.

Open the file with WordPad.

Then look for the lines that look like this:




<app_version>
<app_name>astropulse_v7</app_name>
<version_num>710</version_num>
<platform>windows_x86_64</platform>
<avg_ncpus>0.04</avg_ncpus>
<max_ncpus>0.2</max_ncpus>
<plan_class>opencl_nvidia_100</plan_class>
<cmdline></cmdline>
<coproc>
<type>CUDA</type>
<count>0.5</count>
</coproc>
<file_ref>


The lines you need to change are

<type>CUDA</type>
<count>0.5</count>

You change the value in <count>??</count>, and you will find there are a few of them to change throughout the file.

The numbers to use are

1 = 1 unit
0.5 = 2 units
0.33 = 3 units
0.25 = 4 units

The example above is taken from my own system, so your file will probably have 1 instead of 0.5:

<count>1</count>

Once you have saved the file (no need to "save as"), just start BOINC again and it should start doing more than one unit on the GPU.
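For anyone who would rather script this edit than do it by hand, here is a minimal sketch of the idea. It is not part of the Lunatics package: the function name set_gpu_tasks and the cut-down sample below are made up for illustration, and a real app_info.xml contains many more entries. Back up the real file and stop BOINC before changing it.

```python
import xml.etree.ElementTree as ET

# Hypothetical, cut-down stand-in for app_info.xml. A real file holds
# several <app_version> blocks plus <file_info>/<file_ref> entries.
SAMPLE = """<app_info>
  <app_version>
    <app_name>astropulse_v7</app_name>
    <avg_ncpus>0.04</avg_ncpus>
    <coproc>
      <type>ATI</type>
      <count>1</count>
    </coproc>
  </app_version>
  <app_version>
    <app_name>astropulse_v7</app_name>
    <avg_ncpus>0.04</avg_ncpus>
    <coproc>
      <type>ATI</type>
      <count>1</count>
    </coproc>
  </app_version>
</app_info>"""


def set_gpu_tasks(xml_text: str, tasks_per_gpu: int) -> str:
    """Return xml_text with every <coproc><count> set to 1/tasks_per_gpu."""
    if tasks_per_gpu < 1:
        raise ValueError("tasks_per_gpu must be at least 1")
    root = ET.fromstring(xml_text)
    # Round down to two decimals so N tasks never add up to more than
    # one whole GPU (3 tasks -> 0.33, as in the list above).
    count = int(100 / tasks_per_gpu) / 100
    for coproc in root.iter("coproc"):
        node = coproc.find("count")
        if node is not None:
            node.text = format(count, "g")
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(set_gpu_tasks(SAMPLE, 2))
```

Only the <count> inside each <coproc> block is touched; <avg_ncpus> and everything else are left alone.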

To do more or fewer units on the CPU, just set the percentage in the client preferences.

Your system has 8 cores, so

100% = 8
88% = 7
75% = 6
63% = 5
50% = 4
38% = 3
25% = 2
13% = 1
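The percentages above can be worked out rather than memorised. The sketch below assumes the client rounds the result down to a whole number of CPUs and never goes below 1; that rounding rule is an assumption here, and both function names are made up for illustration.

```python
import math


def cores_used(logical_cpus: int, percent: float) -> int:
    """CPUs in use for a given percentage, assuming round-down with a floor of 1."""
    return max(1, math.floor(logical_cpus * percent / 100))


def percent_for(logical_cpus: int, wanted: int) -> int:
    """Smallest whole percentage that still yields `wanted` CPUs."""
    return math.ceil(100 * wanted / logical_cpus)


if __name__ == "__main__":
    # Reproduce the table for an 8-core (8 logical CPU) machine.
    for wanted in range(8, 0, -1):
        pct = percent_for(8, wanted)
        print(f"{pct}% = {cores_used(8, pct)}")
```

Under that assumption, 3 of 8 cores needs at least 37.5%, so a setting of 36% would give only 2.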

Hope that explains it

Edit: Oops, you're using an ATI card, so it may not say

<type>CUDA</type>
<count>0.5</count>

Where mine says <type>CUDA</type>, yours may have ATI or something like that. No matter, it's the line under it that you change:

<count>??</count>

Just remember to change all of them in the file.

You can use a config file too, but someone else will have to explain what you would put into a config.xml file for your R9.
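On the config file route: BOINC has an optional per-project file called app_config.xml, whose <gpu_usage> value plays the same role as <count>. The sketch below only generates such a file as text. The helper name make_app_config is made up, the app name is taken from the app_info.xml excerpt above, and whether this file takes effect alongside a Lunatics app_info.xml setup is an assumption worth checking against the BOINC client configuration documentation.

```python
import xml.etree.ElementTree as ET


def make_app_config(app_name: str, tasks_per_gpu: int, cpu_per_task: float) -> str:
    """Build app_config.xml text that asks for tasks_per_gpu tasks per GPU."""
    if tasks_per_gpu < 1:
        raise ValueError("tasks_per_gpu must be at least 1")
    root = ET.Element("app_config")
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name
    gpu = ET.SubElement(app, "gpu_versions")
    # Same fractions as the <count> list: 2 tasks -> 0.5, 3 -> 0.33, 4 -> 0.25.
    usage = int(100 / tasks_per_gpu) / 100
    ET.SubElement(gpu, "gpu_usage").text = format(usage, "g")
    ET.SubElement(gpu, "cpu_usage").text = format(cpu_per_task, "g")
    if hasattr(ET, "indent"):  # pretty-printing needs Python 3.9+
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(make_app_config("astropulse_v7", 2, 0.04))
```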

Louis Loria II
Volunteer tester
Joined: 20 Oct 03
Posts: 259
Credit: 9,208,040
RAC: 24
United States
Message 1737844 - Posted: 28 Oct 2015, 21:45:38 UTC - in response to Message 1737843.  

Glen, I'm glad you responded to this request. The R9 Fury is a beast. I feel that a great many people are missing out on multiple WUs per GPU. You were a great help to me also.

Happy Crunching!

Glen, I'm glad you responded to this request. Great explanation. You have been a great help to me also.

Happy Crunching!

Darth Beaver
Joined: 20 Aug 99
Posts: 6728
Credit: 21,443,075
RAC: 3
Australia
Message 1737847 - Posted: 28 Oct 2015, 21:56:16 UTC - in response to Message 1737844.  

Thanks, Louis. Sorry I haven't been around much; I've been mining crypto, but I'm back, so ask away if you need any help and I'll try to help if I can :-)

ChrisD
Volunteer tester
Joined: 25 Sep 99
Posts: 158
Credit: 2,496,342
RAC: 0
Denmark
Message 1740556 - Posted: 8 Nov 2015, 9:07:22 UTC

Thanks for the kind answers :)

The time you posted your answers is not shown here, so I do not know how long ago you wrote. Sorry for my delayed answer.

I have been trying out both suggestions, and they work. :)

With 88% CPU usage, only 7 tasks are crunching.

The ATI count of .5 works too; it now displays 0.04 CPU + .5 ATI, and 2 GPU tasks are shown as active.

So far, so good. Here are my findings:

Increasing GPU tasks to 2 doubles the crunching time for each task. GPU load is unchanged (around 90%), but CPU load is up.
All it seems to get me is that the gap in GPU load between 2 WUs is now gone, but that does not add much to the throughput.

Lowering the CPU tasks to 7 gets CPU load down from a solid 100% to around 96%.
The free time is distributed equally across the 8 cores, so removing a task does not free up a core; it just lowers the overall CPU utilization. ??

Whether I lower the CPU tasks to 7 or increase GPU tasks to 2, all I get is lower throughput.

At least I have tried :)

Thanks for your help.

ChrisD

Mike
Volunteer tester
Joined: 17 Feb 01
Posts: 34253
Credit: 79,922,639
RAC: 80
Germany
Message 1740560 - Posted: 8 Nov 2015, 9:32:28 UTC
Last modified: 8 Nov 2015, 9:33:39 UTC

To increase GPU load, add the command line values suggested in the read-mes.

Add them to a file called mb_cmdline_win_x86_SSE2_OpenCL_ATi_HD5.txt:

-sbs 256 -spike_fft_thresh 4096 -tune 1 64 1 4 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 256 -oclfft_tune_ls 512 -oclfft_tune_bn 64 -oclfft_tune_cw 64
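If you would rather not paste the line by hand, a small sketch like the one below writes it into the file. The file name and option string are the ones from this post; the helper name write_cmdline is made up, and the assumption that the file belongs in the SETI project folder next to the app should be checked against the read-me. The demo writes to a temporary folder, not to a real BOINC directory.

```python
import tempfile
from pathlib import Path

CMDLINE_FILE = "mb_cmdline_win_x86_SSE2_OpenCL_ATi_HD5.txt"
OPTIONS = (
    "-sbs 256 -spike_fft_thresh 4096 -tune 1 64 1 4 "
    "-oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 256 "
    "-oclfft_tune_ls 512 -oclfft_tune_bn 64 -oclfft_tune_cw 64"
)


def write_cmdline(project_dir: Path, options: str = OPTIONS) -> Path:
    """Write the option string as a single line into the cmdline file."""
    target = project_dir / CMDLINE_FILE
    target.write_text(options, encoding="ascii")
    return target


if __name__ == "__main__":
    # Demo only: use a throwaway folder instead of the real project folder.
    with tempfile.TemporaryDirectory() as tmp:
        path = write_cmdline(Path(tmp))
        print(path.name)
        print(path.read_text(encoding="ascii"))
```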


With each crime and every kindness we birth our future.

Darth Beaver
Joined: 20 Aug 99
Posts: 6728
Credit: 21,443,075
RAC: 3
Australia
Message 1740659 - Posted: 8 Nov 2015, 21:22:06 UTC

ChrisD, your times per task will increase if you do 2 GPU tasks, but over 24 hours you will actually do more work.

If you are going to do more than 1 unit on the GPU, do not also add a command line such as

-sbs 256 -spike_fft_thresh 4096 -tune 1 64 1 4 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 256 -oclfft_tune_ls 512 -oclfft_tune_bn 64 -oclfft_tune_cw 64

Doing 2 units on the GPU will push the utilisation up to around 95%, and the above cmdline will also push up the utilisation, so using both has no advantage.

Use app_info.xml to do more units, or add the cmdline, but don't use both.

As for the CPU, you can only do 1 unit per core, full stop, so the more cores you leave free, the less work it will do.

You should leave 1 core free for the GPU if you are doing APs; however, it's not so necessary when doing MBs.

There is a program called SIV64 that will help in assigning 1 unit to a particular core. This is handy if you wish to watch videos or do other things.

Example of times:

1 unit at a time, roughly 28 mins each = 2 units in 58 mins
2 units at a time, 50 mins each = 2 units in 50 mins, which is 8 mins faster than doing them one at a time
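What matters for RAC is units finished per day, not minutes per unit. The short sketch below redoes the comparison using the totals quoted above (2 units in 58 mins versus 2 units in 50 mins). The function name is made up, and the 28/56 minute pair in the second comparison is a hypothetical case in which each task simply takes twice as long when two run together.

```python
def units_per_day(units: int, minutes: float) -> float:
    """Units completed in 24 hours if `units` finish every `minutes`."""
    return units * 24 * 60 / minutes


if __name__ == "__main__":
    one_at_a_time = units_per_day(2, 58)
    two_at_a_time = units_per_day(2, 50)
    print(f"one at a time: {one_at_a_time:.1f} units/day")
    print(f"two at a time: {two_at_a_time:.1f} units/day")
    print(f"gain: {two_at_a_time / one_at_a_time - 1:.0%}")

    # Hypothetical case: run time exactly doubles with two tasks,
    # so running two at once brings no gain at all.
    print(units_per_day(1, 28), units_per_day(2, 56))
```

If the per-task time exactly doubles, the two figures come out the same, so running two at once only pays off when it closes idle gaps on the GPU.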

These are times I get, and I have just done a test to prove to someone that 2 units at a time is faster than a cmdline under Windows; however, if you're using Linux this may be different.

I have 7 cores running and do 2 units AP and 3 units MB at a time on the GPU. I have an AMD FX chip with 8 cores, one GTX 680 and one GTX 650 in that machine, and I'm getting 23,000, and my RAC has not peaked yet.

There is 1 more tweak you can use with CUDA: look in the MBCUDA file and follow the instructions for your card. Use abovenormal (one word) and 16, 200.

Remember, if you use more cores and do more units on the GPU your machine will get hot, so keep an eye on the temps and adjust the workload accordingly.

Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1740672 - Posted: 8 Nov 2015, 23:05:07 UTC - in response to Message 1740659.  
Last modified: 8 Nov 2015, 23:16:57 UTC


Do not add a comandline value such as -sbs 256 -spike_fft_thresh 4096 -tune 1 64 1 4 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 256 -oclfft_tune_ls 512 -oclfft_tune_bn 64 -oclfft_tune_cw 64.

If you are going to do more than 1 unit on a GPU

Doing 2 units on the GPU will push the utilisation up to around 95% and the above cmdline will also push up the utilisation so using both has no advantage .


That's wrong advice with incorrect logic.
GPU utilisation is not the only parameter that characterizes speed. The above-mentioned string allows the app to use the GPU device better, with a more optimal load on the compute units (CUs). The GPU utilization value doesn't show how many CUs are working or how effectively those CUs work. The GPU as a device will be "busy" even if some CUs are stalled. Running a few tasks per GPU allows covering the big gaps when the whole GPU device is idle, but AFAIK even quite new drivers don't allow loading different CUs with kernels from different OpenCL contexts (though all modern GPUs are capable of running different kernels on different CUs inside a single device).
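A toy model may make the distinction concrete. Everything in it is made up for illustration: a "GPU" with 4 CUs observed over 10 time slots, which is not how real drivers report load. It only shows how a device-level "busy" figure can stay the same while the share of CUs doing useful work differs a lot.

```python
def device_utilization(timeline):
    """Fraction of time slots in which at least one CU is busy."""
    busy_slots = sum(1 for slot in timeline if any(slot))
    return busy_slots / len(timeline)


def cu_occupancy(timeline):
    """Average fraction of CUs doing work across all time slots."""
    total = sum(len(slot) for slot in timeline)
    active = sum(sum(slot) for slot in timeline)
    return active / total


# Each inner list is one time slot; each entry is one CU (True = working).
poorly_tuned = [[True, False, False, False]] * 9 + [[False] * 4]
well_tuned = [[True] * 4] * 9 + [[False] * 4]

if __name__ == "__main__":
    for name, timeline in (("poorly tuned", poorly_tuned), ("well tuned", well_tuned)):
        print(name, device_utilization(timeline), cu_occupancy(timeline))
```

Both timelines report the same device utilization, yet the second one gets four times as much CU work done in the same period.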

EDIT: And what is this CUDA nonsense doing in an ATI R9 290X thread?? Mixing it all together just to say something, ugh?

Darth Beaver
Joined: 20 Aug 99
Posts: 6728
Credit: 21,443,075
RAC: 3
Australia
Message 1740677 - Posted: 8 Nov 2015, 23:18:41 UTC - in response to Message 1740672.  

Sorry, Raistmer, I was only going from what I was told by TBar, as he told me there is no point using both app_info.xml and a cmdline, but you're the guru, so what you say is what people should listen to.

TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1740683 - Posted: 8 Nov 2015, 23:33:46 UTC - in response to Message 1740677.  
Last modified: 8 Nov 2015, 23:45:15 UTC

Sorry Raistmer only going from what I was told by Tbar as he told me there is no point using both the appinf.xml and a cmdline but your the Guru so what you say ppl should listen to

Glenn, I don't believe I said ANYTHING to you about running MBs on AMD cards. I Did Keep repeating to you I was talking about how to get other Team Members GTX 750Ti to run 30 minute APs such as this, http://setiathome.berkeley.edu/results.php?hostid=7258715&offset=40&appid=20. As with your problem understanding the difference between a 750Ti and a 680, you also can't tell the difference between MBs on AMDs and APs on nVidias.
Please do not make things up about other people and post them on these boards.

Either show me where I made this comment or remove it.
no point using both the appinf.xml and a cmdline

Darth Beaver
Joined: 20 Aug 99
Posts: 6728
Credit: 21,443,075
RAC: 3
Australia
Message 1740687 - Posted: 8 Nov 2015, 23:47:54 UTC

So Tbar you didn't say this then

When you run a high unroll it will use most of the GPU, trying to run another task that also tries to use all the GPU will not work very well.


Yep I make things up and spread gossip that is untrue .

TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1740691 - Posted: 8 Nov 2015, 23:57:11 UTC - in response to Message 1740687.  
Last modified: 9 Nov 2015, 0:03:58 UTC

So Tbar you didn't say this then

When you run a high unroll it will use most of the GPU, trying to run another task that also tries to use all the GPU will not work very well.


Yep I make things up and spread gossip that is untrue .

How you could possibly get
no point using both the appinf.xml and a cmdline
out of
When you run a high unroll it will use most of the GPU, trying to run another task that also tries to use all the GPU will not work very well
when talking about a 750Ti is a complete mystery. Yes, I stand by not using an unroll of 12 on a 750Ti and then trying to run two tasks. I said NOTHING about an app_info there, did I? How is a high unroll used on an AP related to the cmdline on an MB?

Darth Beaver
Joined: 20 Aug 99
Posts: 6728
Credit: 21,443,075
RAC: 3
Australia
Message 1740698 - Posted: 9 Nov 2015, 0:12:15 UTC - in response to Message 1740691.  

TBar, if you're going to be silly ....

I too did not mention MBs or APs, and no, you didn't use the word app_info.xml, but you didn't need to: how else are you to do 2 units if not by using an app_info.xml?

The whole point of my post to ChrisD was just to try and give him some explanation of what he should do.

Use the app_info.xml if he doesn't wish to stuff around trying this cmdline or that cmdline and doesn't understand what they do, or stuff around and try using the cmdline.

My post was to point out that doing 1 unit (stock) over app_info.xml will give him a higher RAC. I didn't say one is faster than the other, and I said not to use both, as per your advice.

As for whether it's a 750 or a 680, it doesn't matter: if the cmdline is faster, then I would assume it would be faster on all cards of the Kepler class.

I can't win if I disagree with you I'm wrong if I agree with you I'm still wrong so $%^& &*^

TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1740704 - Posted: 9 Nov 2015, 0:34:33 UTC - in response to Message 1740698.  

Two things to remember, APs ARE NOT MBs.
What works on a nVidia 750Ti running APs Might NOT Work on a Hawaii running MBs.

Being Silly doesn't include getting annoyed when someone posts one of your comments totally out of context and then tries to attribute that comment to you.

Angela
Volunteer moderator
Volunteer tester
Joined: 16 Oct 07
Posts: 13130
Credit: 39,854,104
RAC: 31
United States
Message 1740719 - Posted: 9 Nov 2015, 1:25:16 UTC

I think there was a misunderstanding between friends.

Hoping for a peaceful turn in this thread...

Darth Beaver
Joined: 20 Aug 99
Posts: 6728
Credit: 21,443,075
RAC: 3
Australia
Message 1740722 - Posted: 9 Nov 2015, 1:29:42 UTC

Sorry, TBar, in future I shall not use your name for any advice you may have given me, in case I have misunderstood you.

Darth Beaver
Joined: 20 Aug 99
Posts: 6728
Credit: 21,443,075
RAC: 3
Australia
Message 1740745 - Posted: 9 Nov 2015, 3:33:29 UTC

Let's all remember what this thread is:

"Multiple tasks for ATI R9 290X" was the question, not whether one way is better than the other. It does not ask how to use the cmdline; it asks how to do multiple tasks on an ATI R9 290X.

TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1740779 - Posted: 9 Nov 2015, 6:11:55 UTC - in response to Message 1740556.  

Interesting results compared to others with Hawaii cards when trying to run multiple MBs. If you have a chance, you may want to read this thread starting with this post: http://setiathome.berkeley.edu/forum_thread.php?id=78300&postid=1732890#1732890. Note the links in that post. Basically, people with Hawaii and newer AMD cards usually receive many Errors and Invalids when running more than 1 MB at a time on those cards. That's probably why most people are not in a hurry to tell you to run multiple MBs on your Hawaii card. It's surprising that all you experienced was doubled run times; you're lucky. I'd suggest trying the cmdline Mike suggested and watching Dirk's Fury thread to see if there are any improvements in the situation. For me the most important parts of the MB cmdline are the -sbs, -oclfft_tune_gr, and -oclfft_tune_wg settings. The other settings don't seem to make much difference on my cards. In Windows, using 256 for those 3 settings should give you a little speed-up over the default settings.
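Taken literally, "256 for those 3 settings" gives a much shorter command line than the full one. A minimal sketch of building it, with a made-up helper name and nothing beyond the three option names mentioned in this post:

```python
def build_cmdline(settings: dict) -> str:
    """Join option names and values into a single command line string."""
    return " ".join(f"-{name} {value}" for name, value in settings.items())


# The three settings named above, all set to 256.
minimal = {"sbs": 256, "oclfft_tune_gr": 256, "oclfft_tune_wg": 256}

if __name__ == "__main__":
    print(build_cmdline(minimal))
```

Starting from a short line like this and adding the other options one at a time makes it easier to see which setting actually changes run times on a given card.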
Good Luck.

ChrisD
Volunteer tester
Joined: 25 Sep 99
Posts: 158
Credit: 2,496,342
RAC: 0
Denmark
Message 1740879 - Posted: 9 Nov 2015, 17:15:09 UTC
Last modified: 9 Nov 2015, 17:15:50 UTC

I finally got around to trying the command line

-sbs 256 -spike_fft_thresh 4096 -tune 1 64 1 4 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 256 -oclfft_tune_ls 512 -oclfft_tune_bn 64 -oclfft_tune_cw 64

to control my R9 290X, and it seems that crunching time is reduced by 20%. :) 1 GPU task only.

GPU utilization is unaltered, so it seems that the GPU just calculates the WU in a smarter, more efficient way.

Now, this is for the 44 Compute Unit Hawaii GPU.

I also have a Tahiti-powered card (HD7970), but this GPU has only 32 Compute Units, so the above command line makes this card choke.

I remember seeing a read-me where command lines for both a high-powered card like the R9 290X and a 'medium-powered' card like the 7970 were listed, but I cannot remember where I saw it.

Can anybody help me here?

Thanks, and happy crunching :)

ChrisD

Mike
Volunteer tester
Joined: 17 Feb 01
Posts: 34253
Credit: 79,922,639
RAC: 80
Germany
Message 1740893 - Posted: 9 Nov 2015, 18:12:52 UTC

I used the same values for my old 7970.
You probably need to add -period_iterations_num 40.

That's what I used.


With each crime and every kindness we birth our future.


 