SETI toasted my RTX 2060 in a few days!

Message boards : Number crunching : SETI toasted my RTX 2060 in a few days!
Profile BigWaveSurfer

Joined: 29 Nov 01
Posts: 186
Credit: 36,311,381
RAC: 141
United States
Message 2033932 - Posted: 25 Feb 2020, 13:40:08 UTC

I went away on vacation for an extended weekend and left Seti running on my new system with the RTX 2060 - came back to a failed card. Is this common? I know I can throttle the CPU usage (and I do, to 50%); is there a way to throttle the GPU usage so as not to stress the card too much? Does Seti not play well with that particular card? Wondering if I should just disable the GPU crunching as I do not want this to happen again.
ID: 2033932 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 2033936 - Posted: 25 Feb 2020, 13:59:41 UTC - in response to Message 2033932.  

Thank goodness, for your sake, that it would still have been under warranty.

I had a GTX 670 fail on me, about 7 years ago - and one month out of warranty. I lost a GTX 970 to a failed fan last year, after about 3½ years continuous running. Nothing else has failed, even the three GTX 750 Tis which were running on motherboards knocked out by a power surge.

I'd say you were just unlucky with a 'Friday build' - nothing to stop you replacing it and picking up where it left off.
ID: 2033936 · Report as offensive
Profile Bill Special Project $75 donor
Volunteer tester
Joined: 30 Nov 05
Posts: 282
Credit: 6,916,194
RAC: 60
United States
Message 2033943 - Posted: 26 Feb 2020, 2:58:53 UTC

My guess is that the card was not over-stressed. I think that because there are other "special sauce" builds of Seti you can run that stress the GPU more than the stock application. That is to say, they complete the same task in less time.

I agree with Richard, you just happened to have been sent a bad card. I've been running a 1660 Ti for several months now 24/7 with no issues. I would just RMA the card.
Seti@home classic: 1,456 results, 1.613 years CPU time
ID: 2033943 · Report as offensive
W-K 666 Project Donor
Volunteer tester

Joined: 18 May 99
Posts: 19062
Credit: 40,757,560
RAC: 67
United Kingdom
Message 2033946 - Posted: 26 Feb 2020, 3:07:45 UTC

My 2060 has been running continuously, except for the occasional re-boot, since I updated this computer on 27th April 2019.

As suggested, RMA it and get a replacement.
ID: 2033946 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13736
Credit: 208,696,464
RAC: 304
Australia
Message 2033950 - Posted: 26 Feb 2020, 3:24:59 UTC - in response to Message 2033932.  

BigWaveSurfer wrote: "Wondering if I should just disable the GPU crunching as I do not want this to happen again."
When you get another card, run something like GPU-Z and check what temperature it is running at.
At 70°C or less it should be good for 24/7/365 operation for a decade or two.
As long as it's not a lemon, any video card that's kept cool will last for years even with continuous operation.
Grant
Darwin NT
ID: 2033950 · Report as offensive
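The temperature check suggested above can also be scripted. The following is a minimal sketch, assuming an NVIDIA driver whose `nvidia-smi` tool is on the PATH and accepts `--query-gpu=temperature.gpu --format=csv,noheader,nounits`. The function names are hypothetical, and the 70°C threshold is simply the figure quoted in the post, not a vendor specification.

```python
import subprocess

# Threshold taken from the post above; an assumption, not a vendor spec.
SAFE_LIMIT_C = 70


def parse_temps(smi_output):
    """Parse nvidia-smi CSV output: one integer temperature per GPU, one per line."""
    return [int(line.strip()) for line in smi_output.splitlines() if line.strip()]


def read_gpu_temps():
    """Ask nvidia-smi for the current GPU temperatures in degrees Celsius.

    Assumes nvidia-smi is installed; raises if it is missing or fails.
    """
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=temperature.gpu",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return parse_temps(result.stdout)


def too_hot(temps, limit=SAFE_LIMIT_C):
    """Return only the readings that exceed the limit."""
    return [t for t in temps if t > limit]
```

On a machine with an NVIDIA card, `too_hot(read_gpu_temps())` would list any readings above the threshold. GPU-Z shows the same sensor value in a GUI on Windows.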
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 2033965 - Posted: 26 Feb 2020, 3:57:54 UTC

I just lost my first RTX 2080 Hybrid. Purchased 2/2019. Rather early for the pump to fail. Normally they fail around 18 months. RMA tomorrow.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 2033965 · Report as offensive
tito
Volunteer tester

Joined: 28 Jul 02
Posts: 24
Credit: 19,536,875
RAC: 138
Poland
Message 2033987 - Posted: 26 Feb 2020, 6:47:35 UTC

What I do is reduce the power limit. Then the card stays cool, the power section is not stressed so much, and the fans are less noisy.
My 1660 Ti runs at the minimum 58% power limit in Afterburner, while the 1080 Ti is at 200 W under Linux.
They crunch less, but the advantages are significant. Additionally, I have had no damage to any of my hardware for years.
ID: 2033987 · Report as offensive
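To put the two power-limit figures above on the same scale, here is a small sketch that converts between a percentage limit and watts and builds the Linux command that would apply it. It assumes reference board power of 120 W for a GTX 1660 Ti and 250 W for a GTX 1080 Ti (partner cards may differ), and it assumes `nvidia-smi -i <index> -pl <watts>` is available, which normally needs root and a value inside the card's allowed range. The function names are hypothetical.

```python
# Assumed reference board power in watts; check your own card's default limit.
DEFAULT_WATTS = {"GTX 1660 Ti": 120, "GTX 1080 Ti": 250}


def percent_to_watts(default_watts, percent):
    """Convert an Afterburner-style percentage limit to watts."""
    return default_watts * percent / 100


def watts_to_percent(default_watts, watts):
    """Convert an absolute limit in watts to a percentage of the default."""
    return 100 * watts / default_watts


def power_limit_command(gpu_index, watts):
    """Build (but do not run) the nvidia-smi command that sets a power limit."""
    return ["nvidia-smi", "-i", str(gpu_index), "-pl", str(round(watts))]


# 58% on the 1660 Ti is roughly 70 W under the assumed 120 W default.
limit_1660 = percent_to_watts(DEFAULT_WATTS["GTX 1660 Ti"], 58)
# 200 W on the 1080 Ti is 80% of the assumed 250 W default.
share_1080 = watts_to_percent(DEFAULT_WATTS["GTX 1080 Ti"], 200)
```

The command list is only constructed here, so nothing changes on the card until it is passed to something like `subprocess.run` with sufficient privileges.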
Ian&Steve C.
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 2034033 - Posted: 26 Feb 2020, 15:24:19 UTC - in response to Message 2033987.  

I do the same thing. But I actually add an overclock to the cards to recover the performance lost from power limiting. The power limit still applies even when overclocked. Win-win.
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

ID: 2034033 · Report as offensive
Profile BigWaveSurfer

Joined: 29 Nov 01
Posts: 186
Credit: 36,311,381
RAC: 141
United States
Message 2034038 - Posted: 26 Feb 2020, 15:52:40 UTC

Thanks for all your tips and suggestions! I set up the replacement yesterday so hopefully will have the new card soon. I have shut down seti for now until I get the new card, running the on-board graphics currently. Hopefully this card will last longer!
ID: 2034038 · Report as offensive
juan BFP Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 2034040 - Posted: 26 Feb 2020, 15:56:28 UTC

When you run Seti (or any program that heavily uses the GPU resources), it is recommended to increase the fan speed to keep the card cooler. Any program like Afterburner or EVGA Precision (there are a lot of others) will do that for you. What is important: keep your GPU running as cool as possible.
ID: 2034040 · Report as offensive
Profile tazzduke
Volunteer tester

Joined: 15 Sep 07
Posts: 190
Credit: 28,269,068
RAC: 5
Australia
Message 2034084 - Posted: 26 Feb 2020, 20:51:54 UTC - in response to Message 2033946.  

Might I ask what temps you are getting on your RTX 2060?

Regards
ID: 2034084 · Report as offensive
W-K 666 Project Donor
Volunteer tester

Joined: 18 May 99
Posts: 19062
Credit: 40,757,560
RAC: 67
United Kingdom
Message 2034092 - Posted: 26 Feb 2020, 21:45:01 UTC - in response to Message 2034084.  
Last modified: 26 Feb 2020, 22:17:53 UTC

52C (126F) with the fan on auto. It is only air cooled; I haven't seen the need to put the fan to max.

I hate fan noise, and the case is close enough that I can reach the far side without moving.
I also hate water cooling. I used to work on high-power radio transmitters with water at 300psi and a radiator as big as a large bed.

[Edit] Forgot to say I use Silverstone FHP141 case fans where possible; they are 38mm (1½") deep.
ID: 2034092 · Report as offensive
Profile tazzduke
Volunteer tester

Joined: 15 Sep 07
Posts: 190
Credit: 28,269,068
RAC: 5
Australia
Message 2034102 - Posted: 26 Feb 2020, 23:01:12 UTC - in response to Message 2034092.  

Thank you for that, it gives me food for thought with my 2060.
ID: 2034102 · Report as offensive
Profile BigWaveSurfer

Joined: 29 Nov 01
Posts: 186
Credit: 36,311,381
RAC: 141
United States
Message 2034214 - Posted: 27 Feb 2020, 17:42:17 UTC

I just received and swapped the card. Running Seti for about the last 10 minutes along with GPU-Z.

GPU temp seems to be holding steady around 65C with the fan speed at 29% - ambient air temp is 68F.

I have Seti set to only run 25% of the CPU - currently running around 85C on the CPU.

I typically have Seti not run the GPU when I am on the computer, just when I am away. I will re-enable that once I make sure this card does not get too hot.

Thanks!
ID: 2034214 · Report as offensive
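The readings in this post mix Fahrenheit (ambient) and Celsius (GPU and CPU), so a quick conversion makes them easier to compare. This is a minimal sketch using the standard conversion formulas; the function names are hypothetical and the input values are the ones reported above.

```python
def f_to_c(deg_f):
    """Convert degrees Fahrenheit to degrees Celsius."""
    return (deg_f - 32) * 5 / 9


def c_to_f(deg_c):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return deg_c * 9 / 5 + 32


def rise_over_ambient(component_c, ambient_c):
    """Temperature rise of a component above room temperature, in Celsius."""
    return component_c - ambient_c


ambient_c = f_to_c(68)                      # 20.0
gpu_rise = rise_over_ambient(65, ambient_c)  # 45.0
cpu_rise = rise_over_ambient(85, ambient_c)  # 65.0
```

Looking at the rise over ambient rather than the raw reading makes it easier to compare machines in rooms at different temperatures.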
Profile Gary Charpentier Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 25 Dec 00
Posts: 30650
Credit: 53,134,872
RAC: 32
United States
Message 2034306 - Posted: 27 Feb 2020, 23:09:06 UTC - in response to Message 2033932.  

BigWaveSurfer wrote: "I went away on vacation for an extended weekend"

Just a thought, was the A/C on? Thinking the ambient temp might have been partly at fault. Of course in winter that isn't it, but on trips in summer it can be an issue.
99.99% it was a bad card, and I see you have a replacement.
ID: 2034306 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13736
Credit: 208,696,464
RAC: 304
Australia
Message 2034354 - Posted: 28 Feb 2020, 4:15:37 UTC - in response to Message 2034214.  

BigWaveSurfer wrote: "GPU temp seems to be holding steady around 65C with the fan speed at 29% - ambient air temp is 68F. I have Seti set to only run 25% of the CPU - currently running around 85C on the CPU."

GPU is very cool, CPU is extremely hot for such a light workload.
Grant
Darwin NT
ID: 2034354 · Report as offensive
Profile Wiggo
Joined: 24 Jan 00
Posts: 34744
Credit: 261,360,520
RAC: 489
Australia
Message 2034355 - Posted: 28 Feb 2020, 4:17:39 UTC - in response to Message 2034354.  

Grant (SSSF) wrote: "GPU is very cool, CPU is extremely hot for such a light workload."
Far too hot.

Cheers.
ID: 2034355 · Report as offensive
Ianab
Volunteer tester

Joined: 11 Jun 08
Posts: 732
Credit: 20,635,586
RAC: 5
New Zealand
Message 2034382 - Posted: 28 Feb 2020, 8:21:24 UTC

I'm going with Seti didn't "toast" your GPU, it just caused a manufacturing fault to show up earlier than it otherwise might have. In this case, while it was still in warranty, so it was a GOOD thing.

If you had bought the card and decided to run Blender and create the next LOTR movie, it would also have run at 100% (for several years), and also died.

So either manufacturers are selling cards that are over specced, or it was faulty.

It's a bit like buying a car and having the engine blow up. The dealer then says, "Oh, you took the engine to the red-line, sorry, that's not covered." Of course it's covered: the "red-line" is what the manufacturer says the engine can safely run at (and they put a rev limiter there to keep you honest). Same with a computer: if they say it should run at full load at "x" clock speed, then it should. Even if your room gets too warm etc., the thermal protection "should" kick in and drop the clock speed to keep things safe. Now if you overclock and mess with the cooling? Yeah, you could cook something.

Now not every laptop / AIO has sufficient cooling, but the thermal throttling should still stop them melting.

BTW, I often use a few days of SETI work as a "burn in test" on a new / repaired machine. If it can run a weekend of that, then it's "good", and some useful work units got done. Seems more use than just doing random prime number tests.
ID: 2034382 · Report as offensive
Profile tazzduke
Volunteer tester

Joined: 15 Sep 07
Posts: 190
Credit: 28,269,068
RAC: 5
Australia
Message 2034393 - Posted: 28 Feb 2020, 11:53:15 UTC - in response to Message 2034382.  

Well I have the following

MSI RTX 2060 Ventus XS 6GB

Running in a PC case in a room with an ambient temp of around 26C, with the side cover off, it still manages to get to around 77C - 80C.

Well, that's not good, but then I did a web search on this model, and it turns out it's very common for this model of card: cheap cooling solution.

Even checked on the MSI forums, where a couple of respondents have said that it is still okay to run.

Ummm, it's still a hard pill to swallow after paying so much for them here in Australia.

Well, looks like either I stick this cruncher in the back fridge or move to the South Pole lol.

Still have some more troubleshooting to do, just in case there is a dodgy temp sensor. Last time I touched it, it didn't feel that hot though.

Cheers
ID: 2034393 · Report as offensive
rob smith Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer moderator
Volunteer tester

Joined: 7 Mar 03
Posts: 22200
Credit: 416,307,556
RAC: 380
United Kingdom
Message 2034395 - Posted: 28 Feb 2020, 12:12:36 UTC

While the finger test can be good, it can be somewhat misleading :-(
GPU temperature sensors tend to be well buried within the chips themselves and so the surface temperature will be somewhat lower than that where the probes are.

Book your passage to Antarctica ;-)
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 2034395 · Report as offensive


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.