Issues w/ GTX 690 and SETI@home v7 7.00 (cuda50) 10no13ac.xxx?

Profile Tylendol

Joined: 8 Nov 11
Posts: 4
Credit: 6,368,471
RAC: 12
United States
Message 1458398 - Posted: 29 Dec 2013, 4:26:22 UTC

Hey Folks...

My system just pulled a bunch of new files and immediately began crashing the video driver over and over. This is definitely not normal; I've been running this driver since mid-November, along with the newer BOINC Manager v7.2.33 (x64).

Thought I would see if anyone else was having this issue. I currently have SETI suspended and don't want to abort anything if I can figure out how to fix it and retry.

Thanks!
Tylendol

Here's my system info, if anyone has any advice:
NVIDIA System Information report created on: 12/28/2013 20:18:19
System name: PMASON-AURORAR4

[Display]
Operating System: Windows 8 Pro, 64-bit
DirectX version: 11.0
GPU processor: GeForce GTX 690 (GPU 1 of 2)
Driver version: 331.82
Direct3D API version: 11.1
Direct3D feature level: 11_0
CUDA Cores: 1536
Core clock: 915 MHz
Memory data rate: 6008 MHz
Memory interface: 256-bit
Memory bandwidth: 192.26 GB/s
Total available graphics memory: 4096 MB
Dedicated video memory: 2048 MB GDDR5
System video memory: 0 MB
Shared system memory: 2048 MB
Video BIOS version: 80.04.1E.00.1F
IRQ: 40
Bus: PCI Express x16 Gen3
Device Id: 10DE 1188 095B10DE
Part Number: 2000 0000
GPU processor: GeForce GTX 690 (GPU 2 of 2)
Driver version: 331.82
Direct3D API version: 11.1
Direct3D feature level: 11_0
CUDA Cores: 1536
Core clock: 915 MHz
Memory data rate: 6008 MHz
Memory interface: 256-bit
Memory bandwidth: 192.26 GB/s
Total available graphics memory: 4096 MB
Dedicated video memory: 2048 MB GDDR5
System video memory: 0 MB
Shared system memory: 2048 MB
Video BIOS version: 80.04.1E.00.21
IRQ: (value garbled in the report)
Bus: PCI Express x16 Gen3
Device Id: 10DE 1188 095B10DE
Part Number: 2000 0000

[Components]

NvGFTrayPluginr.dll 10.11.15.0 NVIDIA GeForce Experience
NvGFTrayPlugin.dll 10.11.15.0 NVIDIA GeForce Experience
nvui.dll 8.17.13.3165 NVIDIA User Experience Driver Component
nvxdsync.exe 8.17.13.3165 NVIDIA User Experience Driver Component
nvxdplcy.dll 8.17.13.3165 NVIDIA User Experience Driver Component
nvxdbat.dll 8.17.13.3165 NVIDIA User Experience Driver Component
nvxdapix.dll 8.17.13.3165 NVIDIA User Experience Driver Component
NVCPL.DLL 8.17.13.3165 NVIDIA User Experience Driver Component
nvCplUIR.dll 6.9.850.0 NVIDIA Control Panel
nvCplUI.exe 7.5.760.0 NVIDIA Control Panel
nvWSSR.dll 6.14.13.1106 NVIDIA Workstation Server
nvWSS.dll 6.14.13.3165 NVIDIA Workstation Server
nvViTvSR.dll 6.14.13.1106 NVIDIA Video Server
nvViTvS.dll 6.14.13.3165 NVIDIA Video Server
nvDispSR.dll 6.14.13.1106 NVIDIA Display Server
NVMCTRAY.DLL 8.17.13.3165 NVIDIA Media Center Library
nvDispS.dll 6.14.13.3165 NVIDIA Display Server
PhysX 09.13.0725 NVIDIA PhysX
NVCUDA.DLL 8.17.13.3182 NVIDIA CUDA 6.0.1 driver
nvGameSR.dll 6.14.13.1106 NVIDIA 3D Settings Server
nvGameS.dll 6.14.13.3165 NVIDIA 3D Settings Server
ID: 1458398 · Report as offensive
Profile Wiggo
Joined: 24 Jan 00
Posts: 34744
Credit: 261,360,520
RAC: 489
Australia
Message 1458405 - Posted: 29 Dec 2013, 4:59:57 UTC

Have you installed or updated any other software lately?

Cheers.
ID: 1458405 · Report as offensive
Profile Tylendol

Joined: 8 Nov 11
Posts: 4
Credit: 6,368,471
RAC: 12
United States
Message 1458412 - Posted: 29 Dec 2013, 6:18:18 UTC - in response to Message 1458405.  

Hmmm...

It might have been a couple of new GPUGRID tasks; I thought it was SETI because the SETI tasks hit computation errors during the graphics card driver failures. But SETI seems to be running fine right now.

Will keep an eye on it.
Thx
ID: 1458412 · Report as offensive
bill

Joined: 16 Jun 99
Posts: 861
Credit: 29,352,955
RAC: 0
United States
Message 1458417 - Posted: 29 Dec 2013, 7:44:18 UTC - in response to Message 1458412.  

Windows Updates and anti-virus updates can screw things up too.
ID: 1458417 · Report as offensive
chromespringer
Joined: 3 Dec 05
Posts: 296
Credit: 55,183,482
RAC: 0
United States
Message 1458453 - Posted: 29 Dec 2013, 13:33:23 UTC

I experienced GPU driver crashes a few weeks back similar to what you described, and claggy gave me some useful info in the thread "gpu issues and apologizes":

Delete the compilations the app made; they'll be renewed by the app/driver.

For OpenCL MB they are similar to:

MultiBeam_Kernels_r1843.clHD5_Capeverde.bin_V7

MB_clFFTplan_Capeverde_8_r1843.bin
MB_clFFTplan_Capeverde_16_r1843.bin
MB_clFFTplan_Capeverde_32_r1843.bin

all the way up to

MB_clFFTplan_Capeverde_524288_r1843.bin

r1843_IntelRCoreTMi72600KCPU340GHz.wisdom
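
Before deleting anything, it may help to list which files in the project folder match those names. The following is only a rough sketch: the function name `find_cached_compilations` and the wildcard patterns are my own assumptions, generalized from the example file names above (the device name and revision number will differ per system). It only lists files and deletes nothing.

```python
from fnmatch import fnmatch
from pathlib import Path

# Assumed patterns, generalized from the example names in this post.
# "Capeverde" and "r1843" are specific to one system, hence the wildcards.
CACHE_PATTERNS = (
    "MultiBeam_Kernels_*.bin_V7",
    "MB_clFFTplan_*.bin",
    "*.wisdom",
)


def find_cached_compilations(project_dir):
    """Return the sorted names of files in project_dir that match the patterns.

    This is a dry run: nothing is deleted, so the list can be reviewed first.
    """
    return sorted(
        entry.name
        for entry in Path(project_dir).iterdir()
        if entry.is_file()
        and any(fnmatch(entry.name, pattern) for pattern in CACHE_PATTERNS)
    )
```

Pass it the folder where the app keeps its files and review the returned names before removing anything by hand, preferably with BOINC suspended.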
ID: 1458453 · Report as offensive
Profile Mike Special Project $75 donor
Volunteer tester
Joined: 17 Feb 01
Posts: 34253
Credit: 79,922,639
RAC: 80
Germany
Message 1458456 - Posted: 29 Dec 2013, 13:37:29 UTC

This only applies to OpenCL apps, not CUDA.


With each crime and every kindness we birth our future.
ID: 1458456 · Report as offensive
Profile Tylendol

Joined: 8 Nov 11
Posts: 4
Credit: 6,368,471
RAC: 12
United States
Message 1458519 - Posted: 29 Dec 2013, 18:06:24 UTC - in response to Message 1458412.  

Well, I confirmed it. It looks as though under GPUGRID v8.14 (cuda55) Long runs, if your PC has an abnormal restart for any reason and the BOINC Manager doesn't suspend the jobs normally, the jobs become corrupt and have to be aborted. Not only that, something gets so out of sync that you have to reset the project to clear it; otherwise any new jobs for the same app will constantly reset the video card driver (aptly named a video card driver death loop) until BSOD.

Sorry to get folks worried. I couldn't find a way to archive or delete my thread, but at least you can see the resolution, albeit for another project.

Thanks!
Tylendol
ID: 1458519 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1458534 - Posted: 29 Dec 2013, 18:38:37 UTC - in response to Message 1458519.  

This has actually been a known problem at GPUGrid since about the end of September: Abrupt computer restart - Tasks stuck - Kernel not found.

It isn't usually necessary to reset the GPUGrid project - simply aborting the task(s) in progress at the time of the abnormal restart is usually sufficient. But even that can require some creative effort to regain control of the computer for long enough to perform the 'abort'.

A new v8.15 of their application has been developed and seemed - in Beta testing - to cure the problem, but they have not yet tested it sufficiently to be confident about deploying it for the long tasks. If you live in an area where power interruptions are common, you could switch to the 'short' application queue (which has the new application), pending final release.
ID: 1458534 · Report as offensive



 