solution with reason for "display driver has stopped responding the system has recovered"

Message boards : Number crunching : solution with reason for "display driver has stopped responding the system has recovered"
far_raf
Joined: 26 Apr 00
Posts: 120
Credit: 47,977,058
RAC: 19
Canada
Message 1500577 - Posted: 6 Apr 2014, 7:04:56 UTC

I am sorry, wing people: my AMD laptop (6972907) is having driver problems. It runs the CPU tasks fine, but the AMD GPU driver keeps failing and restarting. When the GPU driver fails, it stops the GPU task from completing, so I have resorted to aborting tasks that are past due. I hate it.

I have tried everything: pause / restart, upgrade / downgrade the video driver, upgrade / downgrade BOINC, 1 core / 2 cores.

I am at my wits' end. I even tried to upgrade Windows 8 to 8.1, and it failed.

Oh, sorry, it's an HP Pavilion g6-2240nr, new last April/May. It uses the:

Authentic AMD A6-4400M APU with Radeon(tm) HD Graphics [Family 21 Model 16 Stepping 1] (2 processors)
AMD AMD Radeon HD (unknown) (512MB) driver: 1.4.1741 OpenCL: 1.2
Microsoft Windows 8 Core x64 Edition, (06.02.9200.00)
ID: 1500577
far_raf
Joined: 26 Apr 00
Posts: 120
Credit: 47,977,058
RAC: 19
Canada
Message 1500584 - Posted: 6 Apr 2014, 7:22:31 UTC

I just checked again - all my GPU tasks are failing.
ID: 1500584
Claggy
Volunteer tester
Joined: 5 Jul 99
Posts: 4654
Credit: 47,537,079
RAC: 4
United Kingdom
Message 1500657 - Posted: 6 Apr 2014, 13:11:32 UTC - in response to Message 1500652.  
Last modified: 6 Apr 2014, 13:14:11 UTC

I'm asking you to update BOINC to a version that is likely to display what the GPU is. AMD doesn't have an API like NVIDIA has, so each GPU has to be hard-coded into BOINC, and new BOINC versions are required to display new AMD GPUs.
The host details page only displays the CAL driver version. I did ask DA to display the OpenCL driver version; he didn't do as I asked, but did something different instead and displayed the OpenCL version the GPU supports.
The OpenCL driver version is displayed in the BOINC startup, which is why I asked for the startup.
The AMD MBv7 apps have been quite sensitive to the APP runtime version. At the moment I can deduce you've got a Catalyst version somewhere between Cat 12.6 and 13.4, with Cat 13.1 being the must-avoid version.
I've already reported to the BOINC devs that the app's stderr.txt is not shown when a task is aborted. If BOINC 7.2.42 also does this, I'll report it again.
At no point did I say that changing BOINC versions was going to fix it. I'm only trying to gather information; once we have the right information, we can supply the correct fix.

If you don't want help, I won't help.

Claggy
ID: 1500657
Mike (Special Project $75 donor)
Volunteer tester
Joined: 17 Feb 01
Posts: 34258
Credit: 79,922,639
RAC: 80
Germany
Message 1500658 - Posted: 6 Apr 2014, 13:19:12 UTC

The GPU is an HD 7520G which is rather slow.

Make sure you try a more recent driver version for the GPU.

Did you use Driver Sweeper or the AMD uninstall utility to get rid of driver remnants?


With each crime and every kindness we birth our future.
ID: 1500658
far_raf
Joined: 26 Apr 00
Posts: 120
Credit: 47,977,058
RAC: 19
Canada
Message 1501249 - Posted: 8 Apr 2014, 6:52:30 UTC

O.K. Claggy, I will make my message public. This is what I tried to pm you:

Claggy, I wish to say I am sorry for being rude to you; there is no excuse for it. The only thing that I can say is that I am extremely frustrated and I really have tried all versions, but I understand you need more data, hence your request. I will go back to the current version per your request.

We have talked before about my new Xeon and you were a great help.

In my feeble defense, I had come in from a hospital visit for a 2nd degree burn on my hand and was really, really cranky.

Be Well Claggy
ID: 1501249
far_raf
Joined: 26 Apr 00
Posts: 120
Credit: 47,977,058
RAC: 19
Canada
Message 1501252 - Posted: 8 Apr 2014, 6:55:24 UTC

To further cloud this issue, if I update the video driver from the HP site, BOINC does not even see the GPU. Upgrading from AMD does not solve the problem.

Robert
ID: 1501252
far_raf
Joined: 26 Apr 00
Posts: 120
Credit: 47,977,058
RAC: 19
Canada
Message 1501258 - Posted: 8 Apr 2014, 7:10:41 UTC - in response to Message 1500658.  

I did not use the driver sweep function; I will, at your suggestion.

Robert
ID: 1501258
far_raf
Joined: 26 Apr 00
Posts: 120
Credit: 47,977,058
RAC: 19
Canada
Message 1505950 - Posted: 19 Apr 2014, 7:27:57 UTC
Last modified: 19 Apr 2014, 8:04:08 UTC

I have resolved the stalled WU issue on my AMD laptop. My solution was to add a registry value called TdrDelay and raise it from the default of 2 seconds to 8. I would still suggest making certain that your drivers are current as a first step in resolving this issue.

The following is just some information that I found after reading from many sources.

Snips from the various Microsoft web pages:

.. snip...

Symptom
Your PC may temporarily hang or become unresponsive, and you receive the following error message:
Display driver stopped responding and has recovered

Resolution
To resolve this issue, follow the steps in the methods starting with method 1 and then proceeding with method 2 if that solution does not resolve the issue.
Method 1: Increase the GPU (Graphics Processing Unit) processing time by adjusting the Timeout Detection and Recovery registry value
Timeout Detection and Recovery is a Windows feature that can detect when video adapter hardware or a driver on your PC has taken longer than expected to complete an operation. When this happens, Windows attempts to recover and reset the graphics hardware. If the GPU is unable to recover and reset the graphics hardware in the time permitted (2 seconds), your system may become unresponsive, and display the error “Display driver stopped responding and has recovered.”

Giving the Timeout Detection and Recovery feature more time to complete this operation by adjusting the registry value, may resolve this issue.

..snip...

Limiting Repetitive GPU Hangs and Recoveries

Beginning with Windows Vista with Service Pack 1 (SP1) and Windows Server 2008, the user experience has been improved in situations where the GPU hangs frequently and rapidly. Repetitive GPU hangs indicate that the graphics hardware has not recovered successfully. In these situations, the end user must shut down and restart the operating system to fully reset the graphics hardware. If the operating system detects that six or more GPU hangs and subsequent recoveries occur within 1 minute, the operating system bug-checks the computer on the next GPU hang.
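
To make the rule quoted above concrete, here is a small Python sketch of the counting logic as I read it. It is only an illustration and not how Windows implements it: the function name, the reading of "within 1 minute" as a sliding 60-second window, and the inclusive boundary are all assumptions of mine.

```python
from collections import deque


def first_bugcheck_index(hang_times, limit_count=6, window_seconds=60.0):
    """Return the index of the hang that would trigger a bug check, or None.

    hang_times: ascending timestamps (in seconds) of GPU hangs. Each hang is
    assumed to be followed by a successful recovery unless it is the one
    that triggers the bug check.
    """
    recent = deque()  # timestamps of recovered hangs still inside the window
    for i, t in enumerate(hang_times):
        # Forget recoveries that are older than the window.
        while recent and t - recent[0] > window_seconds:
            recent.popleft()
        # Six or more recoveries already in the window: this "next hang"
        # is the one that bug-checks.
        if len(recent) >= limit_count:
            return i
        recent.append(t)
    return None
```

A hang every 5 seconds trips the limit on the seventh hang, while a hang every 20 seconds never does, which matches the idea that only rapid, repeated hangs are treated as an unrecovered GPU.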

..snip...

Important: This section, method, or task contains steps that tell you how to modify the registry. However, serious problems might occur if you modify the registry incorrectly. Therefore, make sure that you follow these steps carefully. For added protection, back up the registry before you modify it. Then, you can restore the registry if a problem occurs. For more information about how to back up the registry in Windows 7, see "Back up the registry".


TDR Registry Keys

You can use the following TDR-related registry keys for testing or debugging purposes only. That is, they should not be manipulated by any applications outside targeted testing or debugging.

TdrDelay

Specifies the number of seconds that the GPU can delay the preempt request from the GPU scheduler. This is effectively the timeout threshold. The default value is 2 seconds.


KeyPath : HKEY_LOCAL_MACHINE\System\CurrentControlSet\Control\GraphicsDrivers
KeyValue : TdrDelay
ValueType : REG_DWORD
ValueData : Number of seconds to delay. 2 seconds is the default value.
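
For anyone who would rather not type the value in by hand, here is a minimal Python sketch that only builds the text of a .reg file for the key and value listed above. The helper name build_reg_file is hypothetical, and the 8 is simply the value I settled on. The script does not touch the registry itself; importing the resulting file still needs administrator rights, and the backup advice above still applies.

```python
# Key path taken from the KeyPath line above.
KEY_PATH = r"HKEY_LOCAL_MACHINE\System\CurrentControlSet\Control\GraphicsDrivers"


def build_reg_file(value_name, seconds):
    """Return .reg file text that sets a REG_DWORD under the GraphicsDrivers key."""
    if not isinstance(seconds, int) or not 0 <= seconds <= 0xFFFFFFFF:
        raise ValueError("a REG_DWORD must be an integer from 0 to 4294967295")
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        "[" + KEY_PATH + "]",
        # .reg files store DWORD values as 8 hexadecimal digits.
        '"%s"=dword:%08x' % (value_name, seconds),
        "",
    ]
    return "\r\n".join(lines)


if __name__ == "__main__":
    # Hypothetical usage: print the text, then save it as e.g. tdrdelay.reg.
    print(build_reg_file("TdrDelay", 8))
```

Keeping the script to text generation means you can read exactly what will be imported before anything changes on the machine.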

I would postulate that when we add a massive workload to the GPU, it takes longer than normal to respond and the OS sees this as a fault; thus we need to give it more time to respond.

As a further note: in my research I saw that this fault is not isolated to AMD APUs; NVIDIA and AMD standalone cards were involved in the issue as well.

If this helps even 1 person I will be very happy.

Be Well
Robert
ID: 1505950
Chris Adamek
Volunteer tester
Joined: 15 May 99
Posts: 251
Credit: 434,772,072
RAC: 236
United States
Message 1507475 - Posted: 23 Apr 2014, 2:21:49 UTC - in response to Message 1505950.  

Hmm, I will have to give this a try. I've had the same issue on occasion with my NVIDIA cards. It's sporadic, which makes it even more annoying lol.

Thanks,

Chris
ID: 1507475
HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1507485 - Posted: 23 Apr 2014, 3:51:56 UTC

I have used the TdrLevel setting & disabled this feature, as I was having an issue when using uVNC that caused the video driver to crash.
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the BP6/VP6 User Group: http://tinyurl.com/8y46zvu
ID: 1507485
far_raf
Joined: 26 Apr 00
Posts: 120
Credit: 47,977,058
RAC: 19
Canada
Message 1507912 - Posted: 24 Apr 2014, 6:08:34 UTC - in response to Message 1507475.  

Good idea, Chris. I was surprised to find out that the issue was OS-based and not APU/GPU driver related. Good luck.

Robert
ID: 1507912
far_raf
Joined: 26 Apr 00
Posts: 120
Credit: 47,977,058
RAC: 19
Canada
Message 1507914 - Posted: 24 Apr 2014, 6:10:33 UTC - in response to Message 1507485.  

Hi HAL9000, could you please post some details on your solution? It will help not just me but others that run across this thread. I thank you in advance.

Robert
ID: 1507914
HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1507944 - Posted: 24 Apr 2014, 8:43:01 UTC - in response to Message 1507914.  
Last modified: 24 Apr 2014, 8:43:22 UTC

far_raf wrote:
"Hi HAL9000, could you please post some details on your solution? It will help not just me but others that run across this thread. I thank you in advance.

Robert"

Here is the whole MS write-up on the options for the video Timeout Detection & Recovery.
http://msdn.microsoft.com/en-us/library/windows/hardware/ff569918%28v=vs.85%29.aspx
I set TdrLevel to 0, which basically just turns the whole thing off.
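
If you want to check what a machine is currently set to before changing anything, here is a rough Python sketch. The TdrLevel meanings are as I understand them from the Microsoft write-up linked above; the function names are hypothetical, and the read only does anything on Windows (elsewhere it just returns None).

```python
import sys

# Subkey of HKEY_LOCAL_MACHINE that holds the TDR values.
KEY_PATH = r"System\CurrentControlSet\Control\GraphicsDrivers"

# TdrLevel meanings as described in the Microsoft TDR registry key write-up.
TDR_LEVELS = {
    0: "TdrLevelOff: detection disabled",
    1: "TdrLevelBugcheck: bug check on detected timeout",
    2: "TdrLevelRecoverVGA: recover to VGA (not implemented)",
    3: "TdrLevelRecover: recover on timeout (default)",
}


def describe_tdr_level(level):
    """Explain a TdrLevel value; None means the value is not set at all."""
    if level is None:
        level = 3  # an absent value behaves like the default
    return TDR_LEVELS.get(level, "unknown TdrLevel value: %r" % (level,))


def read_tdr_value(name):
    """Return the DWORD stored under `name`, or None if unset or not on Windows."""
    if sys.platform != "win32":
        return None
    import winreg  # standard library, Windows only
    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, KEY_PATH) as key:
            value, _value_type = winreg.QueryValueEx(key, name)
            return value
    except FileNotFoundError:
        return None


if __name__ == "__main__":
    print(describe_tdr_level(read_tdr_value("TdrLevel")))
    print("TdrDelay:", read_tdr_value("TdrDelay"))
```

Reading first makes it easy to note the old values, so you can put them back if turning detection off causes trouble.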

When I would VNC into my HTPC, dwm.exe would fluctuate in memory use up to 300-400MB and back down to its normal amount. The graph on the performance tab looked like a sawtooth. Then after a few minutes I would get the "display driver has stopped & been recovered" message.
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the BP6/VP6 User Group: http://tinyurl.com/8y46zvu
ID: 1507944
far_raf
Joined: 26 Apr 00
Posts: 120
Credit: 47,977,058
RAC: 19
Canada
Message 1512718 - Posted: 6 May 2014, 5:18:01 UTC
Last modified: 6 May 2014, 5:19:00 UTC

HAL9000, I took your advice and set TdrLevel to 0. I also wanted to post so I can change the title.

Robert
ID: 1512718