too many invalid amd gpu units

Message boards : Number crunching : too many invalid amd gpu units

Changeling
Joined: 24 May 99
Posts: 27
Credit: 14,433,218
RAC: 0
United States
Message 1778165 - Posted: 11 Apr 2016, 14:01:40 UTC

Currently running Radeon 16.3. On computer "Wedge", typical results run about 200 valid to 50 invalid. My other active computer, "Storm Trooper", gets 0 invalids.

I have tried reverting to older drivers, but I cannot get CCC to install on anything below 15.7.1. The drivers install, just not CCC, and SETI doesn't see a graphics card without CCC. From searching the internet, being unable to get CCC installed on older versions is a common problem. I have used both AMD's registry scrubber and another one written by a frustrated AMD user. I spent roughly 1.5 days trying to deal with this, with no luck.

Running 15.7.1 or 16.1 was a disaster, with valid results at most around 60% and often considerably worse. So 16.3 is a big improvement, but I am still getting a large number of invalids.

Reinstalling the OS is not an option; it would take too much time on an active computer. My other active computer is using a much older version of CCC and has not been updated, on the principle of "don't fix it if it isn't broke". It has no problems at all.

Any suggestions? Please... Will give 100,000 internet points for a solution!
ID: 1778165
Brent Norman
Volunteer tester
Joined: 1 Dec 99
Posts: 2786
Credit: 685,657,289
RAC: 835
Canada
Message 1778171 - Posted: 11 Apr 2016, 14:28:23 UTC - in response to Message 1778165.  

Have you tried cleaning the dust out of your computer?
It could be a heat problem.
ID: 1778171
Changeling
Joined: 24 May 99
Posts: 27
Credit: 14,433,218
RAC: 0
United States
Message 1778187 - Posted: 11 Apr 2016, 21:04:06 UTC - in response to Message 1778171.  

It's a brand new build, less than 2 months old, I think.
Also, the CPUs and the graphics card never go above 60C. I deliberately underclocked the GPU and set the fan manually so that it stays below 60C, usually running around 57-58C.

I don't beat up my equipment; I want it to last a long time. The computer I have the most credit on was finally retired after 7 years with only 2 fan replacements during that whole time. I treat my builds gently. :--)

ID: 1778187
Mike
Volunteer tester
Joined: 17 Feb 01
Posts: 34253
Credit: 79,922,639
RAC: 80
Germany
Message 1778188 - Posted: 11 Apr 2016, 21:04:45 UTC

You have the same GPU that I have, so we should be able to get it fixed.
As far as I can see, you are running 2 instances, and that's about it.
You can either change to 1 instance or add a few parameters to the command-line text file.
The best thing would be to upgrade to a more recent app version.


With each crime and every kindness we birth our future.
ID: 1778188
Changeling
Joined: 24 May 99
Posts: 27
Credit: 14,433,218
RAC: 0
United States
Message 1778253 - Posted: 12 Apr 2016, 2:50:12 UTC - in response to Message 1778188.  

1 instance, arrrgh. I ran 2 on a wimpy 6850, and am running 2 on a 7870. Adding a few params sounds good to me. I leave one core free to control the GPU. I think the standard setting is .04 CPUs; do I need to increase that to maybe .1 or .2? Or is there something else I can do?

As for the latest app, do you mean the AMD Crimson control center? I have 16.3; the latest version is 16.4.1. I was afraid to upgrade, as I was worried it might make things worse again, and at least 16.3 is sort of, mostly stable...

Thanks for replying, Mike, and you too, Brent.

ID: 1778253
Mike
Volunteer tester
Joined: 17 Feb 01
Posts: 34253
Credit: 79,922,639
RAC: 80
Germany
Message 1778257 - Posted: 12 Apr 2016, 3:47:53 UTC

Check for a file called mb_cmdline_win_x86_SSE2_OpenCL_ATi_HD5.txt and add the following.

-sbs 384 -instances_per_device 2 -period_iterations_num 40 -spike_fft_thresh 2048 -tune 1 64 1 4 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 256 -oclfft_tune_ls 512 -oclfft_tune_bn 32 -oclfft_tune_cw 32 -no_cpu_lock

Also free up another CPU core.
For 2 GPU instances you need 2 free cores.
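
The edit described in this post can also be scripted rather than pasted by hand. What follows is only a minimal Python sketch: it joins the switches listed above into one line and writes them to the command-line file. The project directory path is an assumption (it depends on where BOINC keeps its data on a given machine), and the helper name `write_cmdline` is made up for illustration.

```python
from pathlib import Path

# Switches copied verbatim from the post above, one entry per switch.
PARAMS = [
    "-sbs 384",
    "-instances_per_device 2",
    "-period_iterations_num 40",
    "-spike_fft_thresh 2048",
    "-tune 1 64 1 4",
    "-oclfft_tune_gr 256",
    "-oclfft_tune_lr 16",
    "-oclfft_tune_wg 256",
    "-oclfft_tune_ls 512",
    "-oclfft_tune_bn 32",
    "-oclfft_tune_cw 32",
    "-no_cpu_lock",
]

CMDLINE_FILE = "mb_cmdline_win_x86_SSE2_OpenCL_ATi_HD5.txt"


def write_cmdline(project_dir: Path, params=PARAMS) -> str:
    """Write all switches as one space-separated line and return that line."""
    line = " ".join(params)
    (project_dir / CMDLINE_FILE).write_text(line, encoding="ascii")
    return line


if __name__ == "__main__":
    # ASSUMPTION: a typical Windows BOINC data directory; adjust to your install.
    project_dir = Path(r"C:\ProgramData\BOINC\projects\setiathome.berkeley.edu")
    if project_dir.is_dir():
        print(write_cmdline(project_dir))
    else:
        print(f"Project directory not found: {project_dir}")
```

Keeping one switch per list entry makes it easy to change a single value (for example `-instances_per_device`) without retyping the whole line.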


ID: 1778257
HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1778262 - Posted: 12 Apr 2016, 4:07:41 UTC - in response to Message 1778165.  
Last modified: 12 Apr 2016, 4:10:43 UTC

When you say you "can't get CCC to install on anything below 15.7.1", are you trying to install a driver over another one, or trying to install a different version of the control center than the one that comes in the driver package? Or do you mean you can't use a driver older than Cat 15.7 because it is the first release to support the R9 380?
I've installed just the drivers on many systems without the control center and never had an issue with BOINC being able to use the GPU.

A tool you might want to use is DDU, to clean out any mismatched driver components. Then you can do a clean driver install.

As far as driver releases go, I've used Cat 15.7, Cat 15.7.1, Crimson 15.12, & Crimson 16.3.2 with my R9 390X.
16.3.2 was an absolute mess. The first time I installed it, the system crashed. After cleaning out the drivers and reinstalling, I was having GUI lag issues, even when not running BOINC, just with idle Windows desktop usage.
15.7, 15.7.1, & 15.12 were OK. 15.12 had reduced CPU time for some projects vs 15.7.x. However, with 15.12 I experienced a higher rate of crashes while playing games, so I have since gone back to 15.7.1.
I'm also using Cat 15.7.1 on my HD6870. I haven't found any differences vs Cat 14.12 when running work.
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the [url=http://tinyurl.com/8y46zvu]BP6/VP6 User Group[/url]
ID: 1778262
Changeling
Joined: 24 May 99
Posts: 27
Credit: 14,433,218
RAC: 0
United States
Message 1778302 - Posted: 12 Apr 2016, 9:16:03 UTC - in response to Message 1778262.  

[quote]When you say you "can't get CCC to install on anything below 15.7.1" are you trying to install a driver over another one, or trying to install a different version of the control center than the one that comes in the driver package? Or do you mean you can't use a driver older than Cat 15.7 because it is the first release to support the R9 380?[/quote]


I did a clean install. I used the official AMD uninstaller, and then another third-party uninstaller after the AMD one failed to fix the problem. I cleaned up before each reinstall; there were no attempts to install over older drivers.
I tried:
14.12: no CCC installed, drivers installed, message from SETI "no graphics card found"
14.4: no CCC installed, drivers installed, message from SETI "no graphics card found"
15.7.1: CCC installed, drivers installed, all GPU work units immediately errored out
16.2: Crimson control panel installed, drivers installed, invalid vs valid 40% vs 60%
16.3: used currently, invalid vs valid roughly 20% vs 80%

On 14.12 and 14.4 I wasn't even given the option of installing the CCC control panel; there was no checkbox for it at all. Other people were having this problem and posting about it online. I tried all their suggestions, but no CCC panel was installed.
Again, all attempts were made after uninstalling the older drivers.

[quote]A tool you might want to use is DDU to clean out any mismatched driver components. Then you can do a clean driver install.[/quote]

I have not tried DDU, but I did use the official AMD uninstaller and another uninstaller people were recommending.

[quote]As far as driver release go I've used Cat 15.7, Cat 15.7.1, Crimson 15.12, & Crimson 16.3.2 with my R9 390X.
16.3.2 was an absolute mess. The first time I installed it the system crashed. After cleaning out the drivers and reinstalling I was having GUI lag issue. Even when not running BOINC. Just idle Windows desktop usage.
15.7, 15.7.1, & 15.12 were OK. 15.12 Had reduced CPU time for some projects vs 15.7.x. However with 15.12 I experienced a higher rate of crashes while playing games. So I have since gone back to 15.7.1.
I'm also using Cat 15.7.1 on my HD6870. I haven't found any differences vs Cat 14.12 when running work.[/quote]


My other currently running rig, with a 7870 card, is using version 14.12 and generates no invalids at all.

With the R9 380, I made multiple attempts at reinstalling over the course of a day and a half. The best result I got was with 16.3, but again, that one is generating 20% invalids.

I don't know what to do. I suppose I could try DDU; maybe it will work, but I doubt it.
ID: 1778302
Changeling
Joined: 24 May 99
Posts: 27
Credit: 14,433,218
RAC: 0
United States
Message 1778326 - Posted: 12 Apr 2016, 12:50:31 UTC - in response to Message 1778302.  

Update:

Using the DDU cleanup utility, I tried reinstalling again (fresh each time, no overwrite).

14.12: again no CCC installed, drivers installed, SETI sees no graphics card and will not run GPU units.

Another clean install:

15.7.1: again CCC installed, SETI sees the graphics card. Out of 30 units processed, 14 errored out and 16 completed successfully.

I'm going to go back to 16.3, but that still had a 20% invalid rate.

Any more suggestions?
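
To put the numbers reported in this thread on one scale, the failure rate is simply failed / (valid + failed). Here is a small Python sketch of that arithmetic. The function name is made up for illustration, the counts are the ones quoted in the posts, and errored and invalid results are lumped together as "failed" purely for comparison.

```python
def failure_rate(valid: int, failed: int) -> float:
    """Return failed results as a percentage of all results."""
    total = valid + failed
    if total == 0:
        raise ValueError("no results to evaluate")
    return 100.0 * failed / total


if __name__ == "__main__":
    # Counts quoted in the thread.
    print(f"16.3   (200 valid, 50 invalid): {failure_rate(200, 50):.1f}% failed")
    print(f"15.7.1 (16 valid, 14 errored):  {failure_rate(16, 14):.1f}% failed")
```

With these counts, 16.3 comes out at 20.0% and 15.7.1 at 46.7%, which matches the rough figures given above.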
ID: 1778326
Ulrich Metzner
Volunteer tester
Joined: 3 Jul 02
Posts: 1256
Credit: 13,565,513
RAC: 13
Germany
Message 1778328 - Posted: 12 Apr 2016, 13:13:07 UTC

Faulty RAM on the graphics card? :?
Aloha, Uli

ID: 1778328
Mike
Volunteer tester
Joined: 17 Feb 01
Posts: 34253
Credit: 79,922,639
RAC: 80
Germany
Message 1778329 - Posted: 12 Apr 2016, 13:18:20 UTC - in response to Message 1778328.  

[quote]Faulty RAM on the graphics card? :?[/quote]

Nope, that's driver related.

Read my post below.


ID: 1778329
HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1778348 - Posted: 12 Apr 2016, 15:28:23 UTC - in response to Message 1778326.  

I understand what you were/are doing now. The R9 300 series was released after Cat 14.12, so I would not expect 14.12 or older to work correctly with the newer cards. Some .inf and installer tweaks might get 14.9 or 14.12 working, as those drivers supported the R9 285, which is also a Tonga-based GPU and, I believe, could run 2 instances without problems. Even with 15.x releases, GCN 1.1 GPUs do not seem to like running multiple instances.

If you are still running 2 SETI@home tasks at once, have you tried the parameters Mike suggested?
ID: 1778348
Changeling
Joined: 24 May 99
Posts: 27
Credit: 14,433,218
RAC: 0
United States
Message 1778393 - Posted: 12 Apr 2016, 18:16:10 UTC - in response to Message 1778348.  

Oops, no, I missed his post between mine and yours. I feel really stupid about that.

OK, changes made: text added to the file "mb_cmdline_win_x86_SSE2_OpenCL_ATi_HD5.txt", but the file was empty to start with. Is that normal? I also left a second core open for the second instance. Now waiting to see results...
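
The "one free core per GPU instance" rule of thumb from this thread is easy to turn into a number. A rough Python sketch, assuming the rule applies as stated: it computes how many cores are left for CPU tasks and what share of the machine that is. The helper name is hypothetical, and how BOINC rounds a percentage setting is not something this sketch tries to model.

```python
import os


def cpu_cores_left(total_cores: int, gpu_instances: int) -> int:
    """Cores available for CPU tasks after reserving one core per GPU instance."""
    if total_cores <= 0 or gpu_instances < 0:
        raise ValueError("total_cores must be > 0 and gpu_instances >= 0")
    return max(total_cores - gpu_instances, 0)


if __name__ == "__main__":
    cores = os.cpu_count() or 1  # logical cores reported by the OS
    left = cpu_cores_left(cores, 2)  # 2 GPU instances, as in this thread
    share = 100.0 * left / cores
    print(f"{cores} cores, 2 GPU instances: {left} cores for CPU tasks ({share:.1f}%)")
```

On an 8-core machine with 2 GPU instances, for example, this leaves 6 cores for CPU work.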

ID: 1778393
Brent Norman
Volunteer tester
Joined: 1 Dec 99
Posts: 2786
Credit: 685,657,289
RAC: 835
Canada
Message 1778395 - Posted: 12 Apr 2016, 18:17:43 UTC - in response to Message 1778393.  

Yes, it is normal for that file to be empty; it is there for tweaking purposes.
ID: 1778395
Mike
Volunteer tester
Joined: 17 Feb 01
Posts: 34253
Credit: 79,922,639
RAC: 80
Germany
Message 1778448 - Posted: 12 Apr 2016, 20:13:37 UTC - in response to Message 1778393.  


Fine.

Let it run for a few days to see a difference.
This should also speed up processing.


ID: 1778448
Mike
Volunteer tester
Joined: 17 Feb 01
Posts: 34253
Credit: 79,922,639
RAC: 80
Germany
Message 1778460 - Posted: 12 Apr 2016, 20:39:35 UTC
Last modified: 12 Apr 2016, 20:39:59 UTC

BTW

Don't worry if you are getting long runners.
GBT data should be sent to AMD GPUs also, and those tasks are expected to be mostly VLARs.


ID: 1778460
Changeling
Joined: 24 May 99
Posts: 27
Credit: 14,433,218
RAC: 0
United States
Message 1778529 - Posted: 12 Apr 2016, 23:20:31 UTC - in response to Message 1778460.  

Thank you very much for the help, much appreciated!

ID: 1778529



 