GPU tasks have started locking up my machine

Atario

Joined: 17 May 99
Posts: 6
Credit: 35,391,855
RAC: 1,064
United States
Message 1794724 - Posted: 9 Jun 2016, 14:18:04 UTC

In the last week or two, if I allow GPU units to work, they will eventually cause my machine to lock up — or at least appear to. The monitor loses signal, and the keyboard lights no longer respond (e.g., Caps Lock). I can, however, see ordinary-looking activity on the HDD light.

I thought maybe my power supply or GPU was going bad, so I replaced the power supply first, but nothing changed. Then I tried running FurMark as a burn-in test for the GPU. It drives the card to higher activity levels and higher temperatures than SETI@home itself does, but still only 72°C or so, and without any crashes. I left FurMark running and started GPU crunching again, which brought the total activity level and temperature down somewhat, but after a while it locked up again.

I'm running stock and no overclocking. BOINC version is 7.6.22 (x64). The GPU units report v8.12 (opencl_nvidia_sah). My GPU is an ASUS GTX 770.

As a side note, for quite a long time now, every now and then I will get a "driver has stopped responding and has recovered" error with a brief black screen first, but I had just been living with it till recently. Driver version is 362.00.

At the moment I've just turned GPU crunching off so I can avoid having to hard-reset my machine. Is there anything I can do to fix this, or is it something I'll have to wait for a version update for?
Bernie Vine
Volunteer moderator
Volunteer tester
Joined: 26 May 99
Posts: 9947
Credit: 103,452,613
RAC: 328
United Kingdom
Message 1794728 - Posted: 9 Jun 2016, 14:40:56 UTC - in response to Message 1794724.  

I had this on one of my machines.

When I looked in Event Viewer, the driver was crashing and recovering every 10 seconds until the machine locked up.

You could try updating to the newest NVIDIA driver, which is now 368.39.

However, the cause seems to be related to "Timeout Detection and Recovery" (TDR) on Windows.

This is from the Microsoft website:

Timeout Detection and Recovery is a Windows feature that can detect when video adapter hardware or a driver on your PC has taken longer than expected to complete an operation. When this happens, Windows attempts to recover and reset the graphics hardware. If the GPU is unable to recover and reset the graphics hardware in the time permitted (2 seconds), your system may become unresponsive, and display the error “Display driver stopped responding and has recovered.”

Giving the Timeout Detection and Recovery feature more time to complete this operation by adjusting the registry value, may resolve this issue.


If you go here:

https://support.microsoft.com/en-gb/kb/2665946

There are two methods for fixing this: one requires downloading the "fix-it" programme, the other involves a registry edit.

I adjusted mine to 8 seconds.
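For anyone who prefers the registry route, here is a minimal sketch of what that edit might look like as a .reg file, generated from Python. It assumes the value in question is the `TdrDelay` DWORD under `HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\GraphicsDrivers`, which is how I understand the registry method; check the KB article for your Windows version before importing anything. `build_tdr_reg` is a hypothetical helper name, not part of any Microsoft tool, and the script only builds the text; it does not touch the registry.

```python
# Hedged sketch: build the text of a .reg file that sets the TDR delay.
# Assumption: the setting is the TdrDelay DWORD under the GraphicsDrivers key.
# build_tdr_reg is a hypothetical helper, not part of any official tool.

GRAPHICS_DRIVERS_KEY = (
    r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\GraphicsDrivers"
)


def build_tdr_reg(delay_seconds: int) -> str:
    """Return .reg file text that sets TdrDelay to delay_seconds."""
    if not isinstance(delay_seconds, int) or not 0 < delay_seconds <= 0xFFFFFFFF:
        raise ValueError("delay_seconds must be a positive 32-bit integer")
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[{GRAPHICS_DRIVERS_KEY}]",
        # .reg files store DWORD values as 8 hexadecimal digits
        f'"TdrDelay"=dword:{delay_seconds:08x}',
        "",
    ]
    return "\r\n".join(lines)


if __name__ == "__main__":
    # Print the file contents for an 8-second delay; save and review before importing.
    print(build_tdr_reg(8))
```

Save the output as a .reg file only after reviewing it, and back up the registry first. As far as I know, a reboot is needed before a new delay takes effect.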

Since I increased the TDR delay, my machine still locks up occasionally, but much less frequently.

Hope this helps.
Atario

Joined: 17 May 99
Posts: 6
Credit: 35,391,855
RAC: 1,064
United States
Message 1794910 - Posted: 10 Jun 2016, 2:19:25 UTC - in response to Message 1794728.  

The latest driver actually causes many more errors for me. I had to roll that back.

I also set the timeout to 8 seconds some time ago…
Bernie Vine
Volunteer moderator
Volunteer tester
Joined: 26 May 99
Posts: 9947
Credit: 103,452,613
RAC: 328
United Kingdom
Message 1794977 - Posted: 10 Jun 2016, 7:06:48 UTC

Well, some say you need to set the TDR delay to 10 seconds.

I decided to start at 8 and increase if necessary.



 