GPU died?

stranger7777

Joined: 18 Aug 01
Posts: 2
Credit: 2,027,109
RAC: 0
Russia
Message 1649750 - Posted: 5 Mar 2015, 22:49:12 UTC

I also started a thread at http://einstein.phys.uwm.edu/forum_thread.php?id=11173, but with no success.
I will try to explain here, because this is something I can't understand myself.
My host (http://setiathome.berkeley.edu/hosts_user.php?userid=1143) recently started failing every WU with a video driver error as soon as each task starts.
I can download a GPU task (for CUDA) and start it, but after a couple of seconds the task fails, accompanied by a video driver error. I tried drivers from 330 to 347.52 with no success. I have now tried both SETI@home and Einstein@Home. The bug is not project related, but I guess it is CUDA related, because CPU tasks and OpenCL tasks for the Intel GPU work flawlessly.
I checked the GPU memory with VMT (http://mikelab.kiev.ua/).
It can check video memory using DirectX, OpenGL and CUDA.
The memory is OK when checked with DirectX and OpenGL. But the CUDA test errors out immediately.
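To separate a CUDA problem from a BOINC or project problem, a similar check can be run by hand. The sketch below is only an illustration and is not what VMT does internally. It assumes Python with ctypes, loads the NVIDIA driver library (nvcuda.dll on Windows, libcuda.so.1 on Linux) and walks through a few CUDA driver API calls: cuInit, cuDeviceGetCount, cuDeviceGet, cuCtxCreate, cuMemAlloc, cuMemsetD8 and cuMemcpyDtoH. The names load_cuda_driver and cuda_smoke_test, the 1 MB buffer and the 0xA5 pattern are my own choices.

```python
import ctypes
import sys


def load_cuda_driver():
    """Load the CUDA driver library, or return None if it is not installed."""
    if sys.platform.startswith("win"):
        loader, names = ctypes.WinDLL, ["nvcuda.dll"]
    else:
        loader, names = ctypes.CDLL, ["libcuda.so.1", "libcuda.so"]
    for name in names:
        try:
            return loader(name)
        except OSError:
            continue
    return None


def cuda_smoke_test(size=1 << 20, pattern=0xA5):
    """Run a few driver API calls and name the first step that fails."""
    cuda = load_cuda_driver()
    if cuda is None:
        return "no CUDA driver library found"

    def check(step, rc):
        # Every driver API call returns a CUresult; 0 means CUDA_SUCCESS.
        if rc != 0:
            raise RuntimeError(f"{step} failed with CUresult {rc}")

    ctx = ctypes.c_void_p()
    dptr = ctypes.c_size_t(0)  # CUdeviceptr is pointer-sized
    try:
        check("cuInit", cuda.cuInit(0))
        count = ctypes.c_int(0)
        check("cuDeviceGetCount", cuda.cuDeviceGetCount(ctypes.byref(count)))
        if count.value == 0:
            return "no CUDA device found"
        dev = ctypes.c_int(0)
        check("cuDeviceGet", cuda.cuDeviceGet(ctypes.byref(dev), 0))
        check("cuCtxCreate", cuda.cuCtxCreate_v2(ctypes.byref(ctx), 0, dev))
        check("cuMemAlloc",
              cuda.cuMemAlloc_v2(ctypes.byref(dptr), ctypes.c_size_t(size)))
        check("cuMemsetD8",
              cuda.cuMemsetD8_v2(dptr, ctypes.c_ubyte(pattern), ctypes.c_size_t(size)))
        host = (ctypes.c_ubyte * size)()
        check("cuMemcpyDtoH",
              cuda.cuMemcpyDtoH_v2(host, dptr, ctypes.c_size_t(size)))
        if bytes(host) != bytes([pattern]) * size:
            return "data read back from the GPU does not match"
        return "ok"
    except RuntimeError as err:
        return str(err)
    finally:
        # Release whatever was created before the failure.
        if dptr.value:
            cuda.cuMemFree_v2(dptr)
        if ctx.value:
            cuda.cuCtxDestroy_v2(ctx)


if __name__ == "__main__":
    print(cuda_smoke_test())
```

This does not launch a kernel of its own, so an "ok" result does not prove the card is healthy. A failure at one of these steps, with no BOINC involved, would at least show that the problem sits in the driver or the hardware and not in the project applications.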
A clean driver installation doesn't help. Checking this doesn't help. Clearing out BOINC, rebooting, checking the disk and then reinstalling BOINC to a different folder doesn't help.
This host worked for years without any such issues. Even now I'm working on it, and I can even play demanding games on it (like http://www.WorldOfTanks.eu) for hours.
I've tried changing the slot (the motherboard has two) and the PSU. The coolers are working fine. The temperature is 36 °C and the case is always open. Nothing helps.
I don't know where to dig or whom to ask for help.
I have stopped GPU crunching until this issue is resolved.
Replacing the video card is not a suitable solution for me.
New ideas are highly appreciated.
Jord
Volunteer tester
Joined: 9 Jun 99
Posts: 15184
Credit: 4,362,181
RAC: 3
Netherlands
Message 1649774 - Posted: 6 Mar 2015, 0:30:38 UTC - in response to Message 1649750.  

But the CUDA test errors out immediately.

It seems to me that it is hardware; more precisely, a problem with the CUDA cores in the GPU. As the name implies, these are the cores used for running CUDA. They aren't used for OpenGL or DirectX 3D rendering, so gaming can continue without problems.

Apropos, if you want to run work (when it's available) on the Intel GPU, that's also an option. The HD Graphics 2500 can do OpenCL calculations.
BilBg
Volunteer tester
Joined: 27 May 07
Posts: 3720
Credit: 9,385,827
RAC: 0
Bulgaria
Message 1650200 - Posted: 7 Mar 2015, 1:44:14 UTC - in response to Message 1649774.  

... these are the cores used for running CUDA. They aren't used for OpenGL or DirectX 3D rendering

I think the 'CUDA cores' are the same silicon as 'Shaders' so they have to be used in games.
OTOH 'Video Memory stress Test' (mikelab.kiev.ua) may not use (much) 'Shaders' while testing video RAM by DirectX or OpenGL
I think it creates some textures and fills the RAM with them.
(I remember that on my ATI AMD Radeon HD 6570 the OpenGL video RAM test ran much faster than the DirectX one.)


'CUDA cores' have to be used in PhysX and OpenCL, so I suggest some more tests.
Of course, they have to be run on the NVIDIA GPU and not on the CPU or the Intel GPU:

FluidMark (PhysX)
http://www.ozone3d.net/benchmarks/physx-fluidmark/

LuxMark v3.x (OpenCL) (Binaries only for Windows 64bit)
http://www.luxrender.net/wiki/LuxMark

LuxMark v2.x (OpenCL) (Have Windows 32bit executables)
http://www.luxrender.net/wiki/LuxMark_v2

ShaderToyMark ("OpenGL benchmark based on hefty pixel shaders")
http://www.ozone3d.net/benchmarks/shadertoymark/


CUDA-Z
http://cuda-z.sourceforge.net/

CUDA-Z does not work (it can't detect CUDA) on my NVIDIA GeForce 8400 GS (511MB), driver 266.58, so I don't know when it works.
http://setiathome.berkeley.edu/show_host_detail.php?hostid=7396088
 


- ALF - "Find out what you don't do well ..... then don't do it!" :)
 
Jord
Message 1650205 - Posted: 7 Mar 2015, 1:58:26 UTC - in response to Message 1650200.  

I think the 'CUDA cores' are the same silicon as 'Shaders' so they have to be used in games.

As far as I could find, that used to be the case, but these days in the modern Nvidia cards the CUDA cores are really for CUDA, not for rendering of OpenGL/DX3D. They can render graphics, but are primarily used for GPGPU calculations.
stranger7777
Message 1651000 - Posted: 9 Mar 2015, 12:46:10 UTC
Last modified: 9 Mar 2015, 12:52:04 UTC

First observations so far.
CUDA-Z detects my GPU and all its cores but crashes the video driver when showing the "Performance" page. There are no errors from CUDA-Z itself, but "Performance" shows zeros in every cell after this failure.
ShaderToyMark works smoothly, with no errors, and the music is in my favorite 8-bit style. The benchmark went well, with 60 to 136 fps.
LuxMark crashes the video driver when testing the NVIDIA GPU. Using the HD 2500 or the CPU works fine, but the rendered pictures are weird.
FluidMark works only if PhysX is computed on the CPU. When testing the GPU it renders for only a second or two (I see the first hundred or so frames) and then the video driver crashes.

BUT! MilkyWay@home just completed about 10 WUs on the machine and all the tasks were checked as correct.

P.S. To detect CUDA on the GeForce 8400 you have to install a newer driver. I found that version 340.50 works fine on both the 750 Ti and the GeForce 8400 simultaneously (though under XP).
arkayn
Volunteer tester
Joined: 14 May 99
Posts: 4438
Credit: 55,006,323
RAC: 0
United States
Message 1651075 - Posted: 9 Mar 2015, 16:49:04 UTC - in response to Message 1651000.  

First observations so far.
CUDA-Z detects my GPU and all its cores but crashes the video driver when showing the "Performance" page. There are no errors from CUDA-Z itself, but "Performance" shows zeros in every cell after this failure.
ShaderToyMark works smoothly, with no errors, and the music is in my favorite 8-bit style. The benchmark went well, with 60 to 136 fps.
LuxMark crashes the video driver when testing the NVIDIA GPU. Using the HD 2500 or the CPU works fine, but the rendered pictures are weird.
FluidMark works only if PhysX is computed on the CPU. When testing the GPU it renders for only a second or two (I see the first hundred or so frames) and then the video driver crashes.

BUT! MilkyWay@home just completed about 10 WUs on the machine and all the tasks were checked as correct.

P.S. To detect CUDA on the GeForce 8400 you have to install a newer driver. I found that version 340.50 works fine on both the 750 Ti and the GeForce 8400 simultaneously (though under XP).


MilkyWay@home uses OpenCL, not CUDA.

BilBg
Message 1651357 - Posted: 10 Mar 2015, 11:53:33 UTC - in response to Message 1651000.  
Last modified: 10 Mar 2015, 11:54:22 UTC

P.S. To detect CUDA on the GeForce 8400 you have to install a newer driver. I found that version 340.50 works fine on both the 750 Ti and the GeForce 8400 simultaneously (though under XP).

CUDA works fine for SETI@home as you can see:
http://setiathome.berkeley.edu/results.php?hostid=7396088&offset=0&show_names=0&state=4&appid=

(Windows XP, NVIDIA GeForce 8400 GS (511MB) driver: 266.58)

I just said that CUDA-Z does not detect it (which doesn't bother me).
I will not change from 266.58, which is the best driver for older GPUs.

! In fact 340.50 should NOT be used for older GPUs !
@Pre-FERMI nVidia GPU users: Important warning:
http://setiathome.berkeley.edu/forum_thread.php?id=75633
 


BilBg
Message 1651402 - Posted: 10 Mar 2015, 15:02:38 UTC - in response to Message 1651000.  
Last modified: 10 Mar 2015, 15:32:21 UTC

LuxMark crashes the video driver when testing the NVIDIA GPU. Using the HD 2500 or the CPU works fine, but the rendered pictures are weird.
FluidMark works only if PhysX is computed on the CPU. When testing the GPU it renders for only a second or two (I see the first hundred or so frames) and then the video driver crashes.

BUT! MilkyWay@home just completed about 10 WUs on the machine and all the tasks were checked as correct.

When you say the "video driver crashes", do you mean a real crash, or is it a driver reset caused by the default TdrDelay of 2 seconds?
http://setiathome.ssl.berkeley.edu/forum_thread.php?id=75391&postid=1561837#1561837
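
For reference, TdrDelay is a REG_DWORD value under HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\GraphicsDrivers on Windows Vista and later. When the value is absent, Windows uses the default of 2 seconds. A minimal sketch for reading it, assuming Python on Windows (the function name read_tdr_delay is mine, and on other systems it simply returns None):

```python
import sys

TDR_KEY = r"SYSTEM\CurrentControlSet\Control\GraphicsDrivers"
DEFAULT_TDR_DELAY = 2  # seconds; used by Windows when the value is not set


def read_tdr_delay():
    """Return the effective TdrDelay in seconds, or None when not on Windows."""
    if not sys.platform.startswith("win"):
        return None
    import winreg  # only available on Windows

    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, TDR_KEY) as key:
            value, _value_type = winreg.QueryValueEx(key, "TdrDelay")
            return int(value)
    except FileNotFoundError:
        # Key or value missing: the default applies.
        return DEFAULT_TDR_DELAY


if __name__ == "__main__":
    delay = read_tdr_delay()
    if delay is None:
        print("Not running on Windows; TdrDelay does not apply.")
    else:
        print(f"GPU work that blocks the driver for more than {delay} s triggers a reset.")
```

If a test dies after roughly that many seconds every time, a timeout looks more likely than a random hardware fault. That is only a rule of thumb, not a diagnosis.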

Since some OpenCL works (MilkyWay@home) and some fails (LuxMark), it seems to me that either part of the driver files or part of the GPU is defective.
(Not every test uses all the parts/functions of the driver/GPU. I don't know which program tests/uses all of them.)
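
To make the pattern easier to see, the results reported so far in this thread can be tallied per API. This is only a bookkeeping sketch in Python. The table is my reading of the posts above, and it assumes the MilkyWay@home tasks ran on the NVIDIA card, which the thread does not actually state. The names RESULTS, tally and verdict are made up for the example.

```python
from collections import defaultdict

# (test, API, passed) as reported in this thread for the NVIDIA GPU.
RESULTS = [
    ("VMT memory test", "DirectX", True),
    ("VMT memory test", "OpenGL", True),
    ("VMT memory test", "CUDA", False),
    ("SETI@home / Einstein@Home tasks", "CUDA", False),
    ("CUDA-Z performance page", "CUDA", False),
    ("ShaderToyMark", "OpenGL", True),
    ("FluidMark on GPU", "PhysX", False),
    ("LuxMark", "OpenCL", False),
    # Assumption: these ran on the NVIDIA card, not on the Intel HD 2500.
    ("MilkyWay@home tasks", "OpenCL", True),
]


def tally(results):
    """Group test names by API and outcome."""
    table = defaultdict(lambda: {"pass": [], "fail": []})
    for test, api, passed in results:
        table[api]["pass" if passed else "fail"].append(test)
    return dict(table)


def verdict(entry):
    """Label an API as 'works', 'fails' or 'mixed'."""
    if entry["pass"] and entry["fail"]:
        return "mixed"
    return "works" if entry["pass"] else "fails"


if __name__ == "__main__":
    for api, entry in tally(RESULTS).items():
        print(f"{api:8} {verdict(entry)}")
```

Only OpenCL comes out mixed, which is what points at something partial rather than a completely dead compute path.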

You may also try the OpenGL & OpenCL tests ("Demos") in GPU Caps Viewer
http://www.ozone3d.net/gpu_caps_viewer/

(! "Furry Cube" is a HEAVY OpenGL test similar to FurMark)

FurMark (HEAVY GPU Stress Test, OpenGL, watch the temperature!)
http://www.ozone3d.net/benchmarks/fur/

OCCT (HEAVY - watch the temperature!)
http://www.ocbase.com/

For me OCCT 3.1.0 gives 10-15 °C higher temperature than OCCT 4.3.1 and OCCT 4.4.0
http://cub0.spaces.ru/files/?read=19526476&sid=2866152017028451

OCCTPT3.1.0-spaces.ru.zip scan:
https://www.virustotal.com/en/file/01f76cfbc49fba716c6a863e4b521717ffaaa4ca96ec543a0cddd9ea342de7cc/analysis/1426001321/


I also found CompuBench, but it is only for 64-bit:
http://compubench.com/

 
 

