Instability issues when using gpu

NalaAddict
Joined: 22 Apr 09
Posts: 3
Credit: 33,722
RAC: 0
United States
Message 887739 - Posted: 23 Apr 2009, 21:54:45 UTC
Last modified: 23 Apr 2009, 22:01:01 UTC

I'm running 6.6.20 at the moment. The first night it ran for 12 hours ... I woke up to an unresponsive system that was apparently still getting a signal from the video card but was locked solid with a black screen.

The next morning the same thing happened after 3 hours rather than 12. Today, it occurred within 3 minutes of engaging. The sluggishness issue is also present if I engage it now, but that part was explained... this locking up of the system, on the other hand, remains unexplained as far as I can tell.

I need to find the culprit or I won't have the option of doing CUDA units any further.

The majority of my system stats can be found at: http://www.l33tsig.net/sig/1/adarious.png
popandbob
Volunteer tester

Joined: 19 Mar 05
Posts: 551
Credit: 4,673,015
RAC: 0
Canada
Message 887811 - Posted: 24 Apr 2009, 3:40:42 UTC

First, try updating your drivers to the latest (182.50, I think).
Second, are you overclocking? If so, try reducing the OC. If it's factory OCed, try reducing it.
Third, check stability with other CUDA apps (GPUGRID/Folding@home); it's possible you may have a faulty card.

~Bob


Do you Good Search for Seti@Home? http://www.goodsearch.com/?charityid=888957
Or Good Shop? http://www.goodshop.com/?charityid=888957
Jord
Volunteer tester
Joined: 9 Jun 99
Posts: 15184
Credit: 4,362,181
RAC: 3
Netherlands
Message 887844 - Posted: 24 Apr 2009, 6:32:06 UTC

Also check for heat, the number one thing that'll crash your PC. A GTX 285 will dissipate a lot of heat, and if that heat can't go anywhere, it'll build up inside your case and crash your computer.
NalaAddict
Joined: 22 Apr 09
Posts: 3
Credit: 33,722
RAC: 0
United States
Message 888143 - Posted: 25 Apr 2009, 2:06:34 UTC - in response to Message 887844.  
Last modified: 25 Apr 2009, 2:09:53 UTC

Thanks for the replies. I've done some preliminary testing, including temperature monitoring. I'm not entirely sure temperature is totally to blame here, but it is an issue. I think it may be worth mentioning that this is an XFX card and that it vents out the back of the PC, not that this accounts for all the heat while it's doing this type of work. I have not managed to get any game to produce the reactions (fan speed increases, etc.) from the card that the projects do.

The card usually hits and maintains 78 to 79°C when BOINC is running and its fans are left strictly to their default controls (i.e. no overrides/manual settings such as with Nvidia System Tools). I do not OC my cards ... I leave them at factory OC settings. I may consider reducing the factory OC settings if I continue to have problems.

Last night I downloaded the Folding@home software, which seems to be independent of the BOINC software, and ran it. Although similar temps were reached, the system stayed stable for a longer period, and the card didn't seem to be working anywhere near as hard. Is that project part of BOINC, or is it independent of it like SETI classic was?

Earlier today I was encoding video with the open-source DVD Flick software. I got one white screen, but not until it was done with both the video/audio conversion and was writing the ISO file to disk. I've since done more encoding without it happening, but I took a few more actions in an attempt to solve the white screen.

1. I uninstalled the Nvidia drivers I was using and upgraded to the 185.68 beta drivers. So far they are a whole lot more stable, and multiple problems I had been having went away. The 'sluggishness' frequently reported when using the GPU still happens, but it seems to be significantly reduced with these drivers.

2. I installed Nvidia System Tools and, when BOINC is set to use the GPU, I set the fan speed to a constant 100 percent (otherwise the fan speed bursts up and down and never gets anywhere near that fast).

It is currently about 78°F in here, and it was 80 most of the day today. I have been running BOINC with SETI@home as the only attached project for around an hour or so. Currently temps are sitting at a consistent 64°C; at best they sat at 60°C.

Since taking those steps I haven't seen another white screen. I'll add another reply here if I do. Any comments?
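For anyone who would rather log these readings than watch them by hand, here is a minimal sketch. It assumes a driver that ships the nvidia-smi tool with --query-gpu support, which is newer than the drivers discussed in this thread. The function names and the 75 C warning threshold are made up for illustration and are not a vendor limit.

```python
import subprocess
import time

# Assumption: nvidia-smi is on the PATH and supports --query-gpu.
QUERY = ["nvidia-smi",
         "--query-gpu=temperature.gpu,fan.speed",
         "--format=csv,noheader,nounits"]

def parse_reading(line):
    """Turn one CSV line such as '64, 100' into (temp_c, fan_percent).

    Some cards report the fan as '[N/A]'; that becomes None.
    """
    temp_text, fan_text = [part.strip() for part in line.split(",")]
    fan = int(fan_text) if fan_text.isdigit() else None
    return int(temp_text), fan

def too_hot(temp_c, limit_c=75):
    """Hypothetical warning threshold, chosen only for this example."""
    return temp_c >= limit_c

def log_forever(interval_s=30):
    """Print a timestamped reading for the first GPU every interval_s seconds."""
    while True:
        out = subprocess.run(QUERY, capture_output=True,
                             text=True, check=True).stdout
        temp_c, fan = parse_reading(out.splitlines()[0])
        stamp = time.strftime("%H:%M:%S")
        flag = "  <-- hot" if too_hot(temp_c) else ""
        print(f"{stamp}  {temp_c} C  fan {fan}%{flag}")
        time.sleep(interval_s)
```

Calling log_forever() in a console window and redirecting the output to a file would leave a record of the last readings before a lockup, which may help tell a heat problem apart from a driver problem.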
Fred W
Volunteer tester

Joined: 13 Jun 99
Posts: 2524
Credit: 11,954,210
RAC: 0
United Kingdom
Message 888196 - Posted: 25 Apr 2009, 8:21:37 UTC - in response to Message 888143.  

The problem is that, with a vid card running at those temps, it may not be the vid card that is having the problem. I'm running a BFG GTX295 card on an Asus MB Intel system with the CPU highly overclocked. I was aware that the North Bridge was running pretty close to the edge before I added this GPU, and I now believe that the heat pumped into the case by the GPU (similar in design to the XFX, in that heat is supposed to exit out the back) was pushing the North Bridge over the edge and causing random reboots. I thought the thermal management in my Antec P180 case (with extra fans) was quite good before, but taking the side off the case has stopped the reboots. So, you may need to take extra steps to get the heat out of the case.
Just my .02.

F.



 