Upgraded APU to AMD A10-7800 now computer crashes on GPU MB

Deadmann

Joined: 30 Jun 99
Posts: 13
Credit: 19,410,104
RAC: 7
United States
Message 1955257 - Posted: 13 Sep 2018, 21:31:37 UTC

Upgraded the APU from an A8-5500 to an A10-7800 because I wanted a bit more number crunching power and I found it cheap on eBay. I then updated the video drivers to version 18.5.1. Now the computer crashes when running GPU MB. I had been running MB8_win_x86_SSE2_OpenCL_ATi_HD5_r3557.exe with the old APU... That seems to be the correct binary for this GPU also. With the optimized binary the computer crashes within 3-20 min.

I've used the lunatics_win64_v0.45_beta6-for-sog_setup.exe to install the binaries and configure the app_info.xml.

Since I was having problems with the optimized binary, I shut down BOINC, renamed the app_info.xml, and restarted BOINC so it would use the default binaries. With the default binaries the computer took a couple of hours before it crashed.

The computer is: 8532490
OS: Win 10 Pro (fully updated)
Motherboard: Gigabyte GA-F2A88X-D3H
CPU (APU): A10-7800
Video: (part of the APU) AMD Radeon R7
Video drivers: win10-64bit-radeon-software-adrenalin-edition-18.5.1-may23
Power: Cooler Master 600W something (both APUs are 65W versions)

https://setiathome.berkeley.edu/show_host_detail.php?hostid=8532490

1. Should I be using a different driver version? There is an 18.8.2 version available.
2. Should I be using a different binary version?
3. Do I have a defective APU?
4. Is there a better (free) program for monitoring temperatures than SpeedFan?

Note: I've been using SpeedFan to monitor the temperatures... The values it's getting seem a bit off. When the entire APU is idle, SpeedFan shows the GPU temp in the 40s F, the Core in the 50s F, and the CPU at 65-80F. The room temperature is about 75F. They go up when the APU is in use. I'm wondering if the APU is thermally damaged.

Thank you
TimeLord04
Volunteer tester
Joined: 9 Mar 06
Posts: 21140
Credit: 33,933,039
RAC: 23
United States
Message 1955276 - Posted: 13 Sep 2018, 23:08:32 UTC
Last modified: 13 Sep 2018, 23:10:05 UTC

As someone who's just recently retired his AMD A6-6400K APU, WHEN my system was new 4+ years ago, I found that these APUs HEAT UP Fast and Hard! My A6 hovered in the High 80s C and Peaked at 90 C WHILE crunching!!! AND, I did NOT use the Integrated Radeon Card... Instead, I was having one core of the APU feeding an NVIDIA GTX-760... Found FAST that I had to have the Stock APU Cooler REPLACED with a CoolerMaster Hyper212 EVO!!! (Use Arctic Silver 5 for best results... ymmv)

BEST $30 I've EVER Spent on a System!!! With the Hyper212 in place, Temps dropped into the Mid 70s C. NOW, Intel or AMD, I ALWAYS install a Hyper212.

[EDIT:]

Assuming you're having any Heat related issues, NOT just Driver issues...


TL
TimeLord04
Have TARDIS, will travel...
Come along K-9!
Join Calm Chaos
Kissagogo27 (Special Project $75 donor)
Joined: 6 Nov 99
Posts: 716
Credit: 8,032,827
RAC: 62
France
Message 1955319 - Posted: 14 Sep 2018, 11:36:25 UTC

Try monitoring your temps with TThrottle, which can also limit the temperature: https://efmer.com/
Deadmann

Joined: 30 Jun 99
Posts: 13
Credit: 19,410,104
RAC: 7
United States
Message 1955398 - Posted: 14 Sep 2018, 21:46:21 UTC
Last modified: 14 Sep 2018, 21:46:46 UTC

Kissagogo27: Thank you for the pointer to TThrottle. I'll need to take a close look at that...

TimeLord04: I don't think this APU is overheating... I ran some more tests in the wee hours of the night. It crashed once when SpeedFan said the CPU temp was 118F (GPU 82F).

I updated the drivers to the latest win10-64bit-radeon-software-adrenalin-edition-18.9.1-sept12.exe but it still crashed....

I tried editing TdrDelay in the registry and increased it to 15 s. That didn't help.

KeyPath : HKEY_LOCAL_MACHINE\System\CurrentControlSet\Control\GraphicsDrivers
KeyValue : TdrDelay
ValueType : REG_DWORD
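
For anyone who would rather script that change than click through regedit, here is a minimal Python sketch. It only builds the `reg add` command line for the key and value listed above and does not run it. The function name `tdr_delay_command` is made up for illustration. The command would need an elevated prompt, and as far as I know a reboot before the new timeout takes effect.

```python
from typing import List


def tdr_delay_command(seconds: int) -> List[str]:
    """Build (but do not run) a `reg add` command that sets TdrDelay.

    Hypothetical helper: it only assembles the argument list for the
    standard Windows `reg` tool, using the key path from the post.
    """
    if seconds <= 0:
        raise ValueError("TdrDelay must be a positive number of seconds")
    return [
        "reg", "add",
        r"HKLM\System\CurrentControlSet\Control\GraphicsDrivers",
        "/v", "TdrDelay",      # value name
        "/t", "REG_DWORD",     # value type
        "/d", str(seconds),    # timeout in seconds
        "/f",                  # overwrite without prompting
    ]


if __name__ == "__main__":
    # Print the command for a 15 second timeout, as tried above.
    print(" ".join(tdr_delay_command(15)))
```

Printing the command instead of executing it keeps the sketch safe to try on any machine; the output can be pasted into an administrator command prompt if wanted.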

I've noticed that if I use the stock apps the CPU clocks down to about 2.5GHz for some reason... the temps are not high.

I'm going to try attaching a photo of the screen a few seconds after it crashed. Shoot... you can't attach images to this... If people want it I'll need to upload it some place.

Any other suggestions?
TimeLord04
Volunteer tester
Joined: 9 Mar 06
Posts: 21140
Credit: 33,933,039
RAC: 23
United States
Message 1955421 - Posted: 14 Sep 2018, 22:51:53 UTC - in response to Message 1955398.  

If my A6 were at 82 C while crunching, I'd be concerned... That's about where my A6 started at on the Stock Cooler... It tended to hover in the Mid 80s C and would occasionally PEAK at 90+ C!!! The AMDs are NOTORIOUS for having Heating/Cooling issues.

Again, if your A10 were my System, I'd DEFINITELY install, (or have installed by a Pro), a CoolerMaster Hyper212 EVO. The Unit is $29.99 off of Amazon, Plus Tax, Plus Shipping. (If you have Prime Membership, Shipping is FREE...) Also, pick up Arctic Silver 5 for the Thermal Paste to be used on the Hyper212. With the Hyper212, I can 99.999% Guarantee that your Temps WILL drop to the Mid 70s C WHILE Crunching.


TL

PS: You can Install CPUID Hardware Monitor to get FULL System Temps and Specs that can be monitored... (Now at Ver. 1.36.0)

Link: CPUID Hardware Monitor 1.36 Download Site.
Tom M
Volunteer tester

Joined: 28 Nov 02
Posts: 5124
Credit: 276,046,078
RAC: 462
Message 1955429 - Posted: 15 Sep 2018, 0:23:28 UTC - in response to Message 1955319.  

try to monitoring your temps with Tthrottle and limit temperature https://efmer.com/



+1
A proud member of the OFA (Old Farts Association).
Tom M
Volunteer tester

Joined: 28 Nov 02
Posts: 5124
Credit: 276,046,078
RAC: 462
Message 1955431 - Posted: 15 Sep 2018, 0:33:28 UTC - in response to Message 1955398.  

Any other suggestions?


Certainly :)

First get TThrottle installed, bring the temperatures back down out of the stratosphere, and replace the stock cooler with the recommended one(s) [can't have too many coolers here, add 2 instead of one]. Yes, I am KIDDING. Add just one.

Then you might try using the individual configuration in the BOINC Manager to run 75% of your CPU cores instead of 100%. (Still run the CPUs at 100%.) Some fairly substantial testing at Lunatics seems to indicate that when you are running a pure APU system you can get more production from the whole system that way.

I don't have info on the Lunatics apps, but with the stock apps, if the GPU is running at 100%, all the CPUs run at 22% or so. When you drop one core off, the CPUs come back up to speed and the GPU continues to crunch at high speed.

HTH,
Tom
Deadmann

Joined: 30 Jun 99
Posts: 13
Credit: 19,410,104
RAC: 7
United States
Message 1955447 - Posted: 15 Sep 2018, 1:59:45 UTC

TimeLord04: 82 Fahrenheit, not Celsius.... But those values coming from SpeedFan and GPU-Z seem incorrect... going down to the 30s and 40s when idle... (Hmm.... I wonder if it's actually displaying the Celsius values and putting an F behind them....)
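
A quick sanity check on that suspicion, as a minimal Python sketch. An air-cooled chip cannot sit below room temperature, so an idle reading in the 40s cannot really be Fahrenheit in a 75F room. The helper names and the sample readings of 45 and 55 (midpoints of "the 40s" and "the 50s") are only for illustration.

```python
def f_to_c(temp_f):
    """Convert Fahrenheit to Celsius."""
    return (temp_f - 32.0) * 5.0 / 9.0


def c_to_f(temp_c):
    """Convert Celsius to Fahrenheit."""
    return temp_c * 9.0 / 5.0 + 32.0


def plausible_as_fahrenheit(reading, room_f=75.0):
    """An air-cooled chip should not read below the room temperature."""
    return reading >= room_f


if __name__ == "__main__":
    # Illustrative idle readings; the room is about 75F.
    for label, reading in [("GPU idle", 45), ("Core idle", 55)]:
        ok = plausible_as_fahrenheit(reading)
        print(f"{label}: {reading} as F plausible? {ok}; "
              f"if really Celsius it would be {c_to_f(reading):.0f}F")
```

This only shows that the idle numbers make no sense as Fahrenheit. It does not prove they are Celsius, since the sensor mapping itself could simply be wrong for this board.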

I did install that CPUID Hardware Monitor and it's giving me completely different temperatures for the CPU and GPU.... Up in the 80s for CPU and 70-80s with the CPU busy (GPU idle). So MAYBE it is overheating... The computer is in the basement and I'm using remote desktop to look at things... I'll head down there soon and do some testing... I noticed that the fan is only running at about 2500 RPM. It should be running faster.

Here is the photo I took of the screen when it crashed.


Here is a photo I took of the screen with HWMonitor running... (2 SETI CPU tasks running, 0 GPU)


Tom M: I normally have the CPU setting at 70%. That way BOINC will run 2 CPU tasks and 1 GPU task.
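
The arithmetic behind that, as a rough Python sketch. It assumes BOINC rounds the usable core count down and never goes below one core; I have not checked the client source, so treat that rounding rule as an assumption. The A10-7800 has 4 cores, and `usable_cpus` is just an illustrative name.

```python
import math


def usable_cpus(total_cores, percent):
    """Cores available for CPU tasks at a given "use at most N% of CPUs".

    Assumption: the count is rounded down, with a minimum of one core.
    """
    return max(1, math.floor(total_cores * percent / 100.0))


if __name__ == "__main__":
    for pct in (70, 75, 100):
        print(f"{pct}% of 4 cores -> {usable_cpus(4, pct)} CPU task(s)")
```

Under that assumption 70% of 4 cores gives the 2 CPU tasks I see, and the suggested 75% would give 3.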

I'll do some more testing here soon and see what I get. (I wonder if I have a better AMD cooler sitting around...)

PS. I bought this motherboard and the original A8-5500 when I was pretty broke in early 2014... This is really reminding me that I don't like making changes to AMD systems... Too many problems....
Deadmann

Joined: 30 Jun 99
Posts: 13
Credit: 19,410,104
RAC: 7
United States
Message 1955476 - Posted: 15 Sep 2018, 6:58:20 UTC

OK. I found a different heatsink-fan. About twice the size of the retail boxed A8-5500 heatsink. The fan on it goes up to about 5500 RPM. The original fan maxed out about 3000 RPM. (A lot more noise for this new fan.) The temps are about 5-10C lower.

But it still crashed within 2 min. after enabling the SETI MB GPU.

When I have just 2 SETI MB CPU tasks going, the CPU is clocked at about 3.6GHz.
When I have 2 SETI MB CPU tasks and 1 SETI MB GPU task going, the CPU clock jumps between about 2.5GHz and 3.6GHz. Mostly it stays at 2.5GHz.

That method of putting photos here didn't work. So I'll just provide a link to where they are on Google Drive.
https://drive.google.com/drive/folders/1eBIEnv83Websm8O3BhzxNQrXbdZ5X83U?usp=sharing

I think I'm going to try sending this APU back as defective.
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13736
Credit: 208,696,464
RAC: 304
Australia
Message 1955477 - Posted: 15 Sep 2018, 7:36:29 UTC - in response to Message 1955476.  

You are generally better off not using any on-chip GPU for SETI processing, as the shared power, cache & thermal limits generally result in less work being done than just crunching on the CPU alone (unless it's a very low clock speed and core count chip, then the on-die GPU may be worth the impact on CPU output).
If the CPU temperatures are any higher than 70°C with all 4 CPU cores processing at full clock speed, I'd suggest a better aftermarket cooler.

NB- the easiest method to get a screen shot is using the Windows + PrtScn keys.
It will save the image as a file in a Screenshots folder in your Pictures folder.
Grant
Darwin NT
Tom M
Volunteer tester

Joined: 28 Nov 02
Posts: 5124
Credit: 276,046,078
RAC: 462
Message 1955510 - Posted: 15 Sep 2018, 13:49:40 UTC - in response to Message 1955476.  

But it still crashed within 2 min. after enabling the SETI MB GPU.


Does the whole system crash or just the GPU? I was never able to get my GPU to not crash. With tinkering I got it to the point where it would USUALLY run all day, but sometimes not. When the GPU crashed and restarted, the SETI processing for the GPU didn't restart.

I think I'm going to try sending this APU back as defective.


Or buy an external gpu and stop using the internal one.

HTH,
Tom



 