NVIDIA driver crashing

Message boards : Number crunching : NVIDIA driver crashing

TimeLord04
Volunteer tester
Joined: 9 Mar 06
Posts: 20362
Credit: 33,933,039
RAC: 23
United States
Message 1745980 - Posted: 1 Dec 2015, 3:27:21 UTC - in response to Message 1745977.

Optimized apps are not allowed at Beta. However, you can use an app_config.xml file, which may suit your needs for running only CUDA units. Zalster, or someone else, will have to help you configure that for your system.

The app_config.xml I have was created for me by Joe Segur. I've modified it a couple of times, but I could NOT have created it from scratch on my own.


TL
TimeLord04
Have TARDIS, will travel...
Come along K-9!
Join Calm Chaos
ID: 1745980
David S
Volunteer tester
Joined: 4 Oct 99
Posts: 18352
Credit: 27,761,924
RAC: 12
United States
Message 1745977 - Posted: 1 Dec 2015, 3:17:30 UTC - in response to Message 1745972.

Forgot to mention, I have made no adjustments to the way the card itself operates since I bought the computer several years ago.

I run Seti and Einstein two tasks at a time, but I have never set Beta to do that.

Is there a way I can set Beta to run CUDA but not OpenCL? I don't see a pref for it (like Einstein has). Would I have to go with optimized apps? I've never done that on Beta (reasoning that the point of the Beta project is to test the next generation of Main apps, so let it do its own thing).
David
Sitting on my butt while others boldly go,
Waiting for a message from a small furry creature from Alpha Centauri.

ID: 1745977
Zalster
Volunteer tester
Joined: 27 May 99
Posts: 5445
Credit: 528,817,460
RAC: 242
United States
Message 1745972 - Posted: 1 Dec 2015, 2:18:52 UTC - in response to Message 1745963.
Last modified: 1 Dec 2015, 2:20:48 UTC

Ah, now it makes much more sense.

I thought you were just talking about Seti Main work units.

Now that you've said it was an OpenCL task on Beta, it all fits.

I guessed, and then saw I was correct, that it's the OpenCL_Nvidia_sah work units, aka VLARs.

Those really should be crunched on the newer Maxwell cards or Keplers. Even my machines were hard pressed by them. They use one full core apiece.

They also shouldn't be paired with work units from any other project.

I think there was a discussion about those on Beta some time back, if I remember correctly.

I would recommend not crunching any more of those on your GPU.

Zalster
ID: 1745972
Darth Beaver
Joined: 20 Aug 99
Posts: 6728
Credit: 21,443,075
RAC: 3
Australia
Message 1745966 - Posted: 1 Dec 2015, 1:46:24 UTC - in response to Message 1745963.

David, have a look at the event logs and see if they give you more of an idea.

Look at my post "Computer shut down WMI". I'm using a cmdline on the OpenCL APs and it crashed 3 times in the last day or two. The computer would restart and be OK until I started Seti.

I've made a change, so hopefully it's fixed now, but I will have to see if it crashes again in the next 24 hours.
ID: 1745966
David S
Volunteer tester
Joined: 4 Oct 99
Posts: 18352
Credit: 27,761,924
RAC: 12
United States
Message 1745963 - Posted: 1 Dec 2015, 1:24:00 UTC

Update:

Okay, first of all, it is definitely a problem when it blanks the screen every minute or so while I'm sitting in front of it, and even more of a problem when the whole computer freezes after a few hours.

It did that again last night, and just now, while it was shut down anyway, I took the time to open it up and blow out the dust. There was surprisingly little of it in there.

As soon as I started it up again, it started crashing again. I suspended the GPU and it quit. I updated the driver, but it didn't help. It was weird that it was crashing because I thought I had suspended all the individual GPU tasks. Finally, I noticed a Seti Beta OpenCL task. It would pop up as running, the screen would blank, and when it recovered the task wasn't running. I tracked it down and suspended it, and the crashing stopped.

Then I unsuspended one Seti task. No crash. I suspended that and unsuspended one Einstein. No crash. I suspended that and unsuspended one Beta Cuda50. No crash. Once more, I suspended that and unsuspended the OpenCL. Crash. I suspended that again and unsuspended every GPU task I had except the OpenCLs. No crashing.

This just leaves the question of why it suddenly started crashing when I had not made any changes to the system in months (even the Windows updates are a few weeks old).

I will post this info in Beta too.
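
The one-at-a-time isolation described above can be sketched in Python. This is only an illustration of the procedure, not a BOINC tool: the function name `find_crashing_tasks`, the task labels, and the `observed` table are made up for the example, and in practice the "does it crash alone?" check is done by hand in the BOINC Manager.

```python
from typing import Callable, Iterable, List


def find_crashing_tasks(
    tasks: Iterable[str],
    crashes_when_alone: Callable[[str], bool],
) -> List[str]:
    """Try each task by itself and collect the ones that reproduce the crash."""
    culprits = []
    for task in tasks:
        # Manual step in real life: suspend everything else, resume only
        # this task, and watch whether the display driver resets.
        if crashes_when_alone(task):
            culprits.append(task)
    return culprits


# Hypothetical record of what was seen when each task type ran alone,
# mirroring the observations in the post (labels are illustrative).
observed = {
    "Seti": False,
    "Einstein": False,
    "Beta Cuda50": False,
    "Beta OpenCL": True,
}

culprits = find_crashing_tasks(observed, lambda name: observed[name])
print(culprits)
```

Testing each task type alone, rather than all together, is what separates "the GPU is failing" from "one application triggers the driver reset".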
David
Sitting on my butt while others boldly go,
Waiting for a message from a small furry creature from Alpha Centauri.

ID: 1745963
Darth Beaver
Joined: 20 Aug 99
Posts: 6728
Credit: 21,443,075
RAC: 3
Australia
Message 1744598 - Posted: 24 Nov 2015, 22:40:45 UTC

It happens to me too, Doc. If it's doing too much work at SETI and I then try watching a video or playing a game, or if I've been mining and then stop mining and watch a video, it crashes. It seems that if you ask it to do too much, it will crash and recover at a lower resolution. This has happened to me while playing games.

Until it actually errors out a unit, don't worry. So far it hasn't done that; Seti just keeps on keeping on with no problems.
ID: 1744598
Ulrich Metzner
Volunteer tester
Joined: 3 Jul 02
Posts: 1256
Credit: 13,565,513
RAC: 13
Germany
Message 1744490 - Posted: 24 Nov 2015, 11:18:40 UTC

Have you checked the fans and blown out the dust bunnies?
Aloha, Uli

ID: 1744490
KLiK
Volunteer tester
Joined: 31 Mar 14
Posts: 1301
Credit: 22,994,597
RAC: 60
Croatia
Message 1744485 - Posted: 24 Nov 2015, 10:33:03 UTC

happens to me also...just keep crunching! :/


non-profit org. Play4Life in Zagreb, Croatia, EU
ID: 1744485
Zalster
Volunteer tester
Joined: 27 May 99
Posts: 5445
Credit: 528,817,460
RAC: 242
United States
Message 1744423 - Posted: 24 Nov 2015, 2:21:30 UTC - in response to Message 1744417.

I've seen that when bumping up the memory clock on the cards via Nvidia Inspector.

A couple of questions about your GPU computing. Are you overclocking the cards? Do you use NI or GPU-Z to raise any of the settings above factory? How many work units per card are you crunching? What about the ratio of CPU to GPU for each work unit?
ID: 1744423
David S
Volunteer tester
Joined: 4 Oct 99
Posts: 18352
Credit: 27,761,924
RAC: 12
United States
Message 1744417 - Posted: 24 Nov 2015, 1:58:54 UTC - in response to Message 1744416.

Okay, thanks.

Update: I suspended the GPU and allowed the CPU to go back to work and nothing has happened. I suppose one option would be to go to the Einstein site and disable GPU work, then abort all the GPU tasks I have and see if Seti and Beta work okay. Don't want to do that if there's an easier way.
David
Sitting on my butt while others boldly go,
Waiting for a message from a small furry creature from Alpha Centauri.

ID: 1744417
betreger
Joined: 29 Jun 99
Posts: 10359
Credit: 29,581,041
RAC: 66
United States
Message 1744416 - Posted: 24 Nov 2015, 1:50:31 UTC - in response to Message 1744415.

I used to get this with my GT430 on Einstein when I was running 2 tasks at a time and OCing about 10%; I was pushing it as hard as I dared. Since then it has been relegated to Seti, running only 1 task but still OCing, and I have not had that happen.
ID: 1744416
David S
Volunteer tester
Joined: 4 Oct 99
Posts: 18352
Credit: 27,761,924
RAC: 12
United States
Message 1744415 - Posted: 24 Nov 2015, 1:43:51 UTC

Last night, I was physically sitting in front of my i7 main cruncher (which I only do about once a week), playing Solitaire while waiting for the washing machine to finish. Suddenly, the display went blank. After a few seconds, it recovered and a balloon said the driver had crashed and recovered. This happened several times before the washer finished and I went upstairs.

Sometime during the night, the computer froze. I tried to get in with Teamviewer and it had been offline since around 2-3 a.m.

I got home just now and had to hold the power button to shut it down. After restarting it, everything appeared to be normal (except Boinc took longer than normal to get going). Then the display blanked a couple more times. I have now suspended Boinc and it hasn't happened since.

Details: GT440 running driver 353.30, Win 7 64 bit.

Questions: has this happened to anyone else? Is there a newer driver that is known to be okay for Seti and Einstein? Could this be a sign that the GPU itself is dying? Any other suggestions?
David
Sitting on my butt while others boldly go,
Waiting for a message from a small furry creature from Alpha Centauri.

ID: 1744415
©2020 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.