SETI@Home Bluescreen Crash


Mighty
Joined: 15 May 99
Posts: 13
Credit: 3,986,588
RAC: 5,297
United States
Message 1531568 - Posted: 24 Jun 2014, 9:09:36 UTC

Win 7 Pro x64 on an Intel 4770K clocked to 3.5 GHz with 16 GB. I have an Nvidia GTX 780 with 3 GB, but I have it suspended in BOINC.

When I fire up SETI@Home it'll run for a short time, I think one to three minutes, and then Windows crashes to a blue screen. I've tried with and without the GPU. I haven't tried GPU only. I figure if it crashes on the CPU, then something is seriously not right.

I tried it when I first got the machine, in January. A couple of weeks ago, mid-June, I downloaded a fresh copy of BOINC and tried SETI@Home again. Same thing.

Other than BOINC/SETI@Home the machine is very stable. I'm a game player, so I push the hardware regularly with current first-person shooters and flying and driving sims.

I've searched the board, and can't find anyone else describing a bluescreen problem. Can anyone give me some ideas on what to do to troubleshoot this? I'd love to have SETI@Home cranking away on this machine.

Drake
____________
Drake

Ageless
Joined: 9 Jun 99
Posts: 12298
Credit: 2,590,616
RAC: 850
Netherlands
Message 1531583 - Posted: 24 Jun 2014, 11:10:43 UTC - in response to Message 1531568.

Without knowing what the blue screen actually says, it's difficult to diagnose. So please download and run BlueScreenView and tell us what it reports for the crash.
____________
Jord

Fighting for the correct use of the apostrophe, together with Weird Al Yankovic

Mighty
Joined: 15 May 99
Posts: 13
Credit: 3,986,588
RAC: 5,297
United States
Message 1531728 - Posted: 25 Jun 2014, 1:17:53 UTC - in response to Message 1531583.

I don't see any way to attach the reports here. I posted one from January and one from June on my website.

So, in most of the crashes from January it's Bug Check Code 0x00000124 in hal.dll. In one in January and this last one in June, same code in ntoskrnl.exe.

I've been skimming some of the Google hits on that error, and nothing is jumping out at me, so far. Any suggestions on how to narrow it down?

Thanks,

Drake
____________
Drake

John McLeod VII
Volunteer developer
Volunteer tester
Joined: 15 Jul 99
Posts: 24496
Credit: 520,877
RAC: 44
United States
Message 1533763 - Posted: 29 Jun 2014, 21:55:29 UTC
Last modified: 29 Jun 2014, 21:56:00 UTC

It might be an overheating problem. When was the last time you chased the dust bunnies out of your computer case? Is the computer overclocked?
____________


BOINC WIKI

Mighty
Joined: 15 May 99
Posts: 13
Credit: 3,986,588
RAC: 5,297
United States
Message 1533774 - Posted: 29 Jun 2014, 22:37:44 UTC - in response to Message 1533763.

In the past, I let cleaning go to the point that the heat sink in one machine was literally 90% clogged with dust. Since then, for the last six years or so, I routinely blow out the dust in my machines monthly or bi-monthly. I have an alarm remind me. I'm pretty sure I did it at the beginning of this month. Last month at the worst.

Also, I was getting the same crashes in the first few days I owned this machine. So, that wasn't dust.

The machine is overclocked by 20%, so nothing too aggressive. I had the boutique place I bought it from do that for me. It's one of their standard up-sells.

As I said, I run current 3D games on this machine all the time, and it has been very stable. Looking at the list in BlueScreenView, in six months there are two .dmp's that are not associated with attempts to run SETI@Home.

While, to me, the games seem a fairly thorough test, maybe you can suggest another number-hungry app that I can run to test the overheating idea? Or a CPU diagnostic that you respect? I agree that the few-minute delay sounds a lot like overheating.
____________
Drake

John McLeod VII
Volunteer developer
Volunteer tester
Joined: 15 Jul 99
Posts: 24496
Credit: 520,877
RAC: 44
United States
Message 1533787 - Posted: 29 Jun 2014, 22:56:31 UTC - in response to Message 1533774.

Games tend to do much shorter computations than SETI does. They also do not always use all CPUs all the time. Can you get a temperature monitoring program for your machine? There are some that are free. This will tell us if it is a temp problem.

If you want to do a load test, try consume.exe from MS.
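
To illustrate the point about games not loading every core all the time, here is a minimal Python sketch that keeps one busy loop running on each logical CPU for a fixed period. It is not consume.exe and it is far lighter than a dedicated stress tester; the names `burn` and `load_all_cores` are made up for this example. Watch a temperature monitor while it runs and stop it if temperatures climb too high.

```python
import multiprocessing
import sys
import time


def burn(seconds):
    """Spin on floating-point math for `seconds`; return the loop count."""
    deadline = time.perf_counter() + seconds
    loops = 0
    x = 1.0001
    while time.perf_counter() < deadline:
        x = (x * x) % 7.0 + 1.0001  # keeps the FPU busy and keeps x bounded
        loops += 1
    return loops


def load_all_cores(seconds, workers=None):
    """Run one burn() per logical CPU in parallel; return per-worker loop counts."""
    workers = workers or multiprocessing.cpu_count()
    with multiprocessing.Pool(workers) as pool:
        return pool.map(burn, [seconds] * workers)


# Only runs when a duration is given on the command line,
# for example:  python load.py 120
if __name__ == "__main__" and len(sys.argv) > 1:
    counts = load_all_cores(float(sys.argv[1]))
    print(f"{len(counts)} workers finished, {sum(counts)} loops total")
```

Separate processes are used rather than threads because Python threads would share a single core for this kind of work. On Windows the `if __name__ == "__main__"` guard is required for `multiprocessing` to start its workers cleanly.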
____________


BOINC WIKI

Ageless
Joined: 9 Jun 99
Posts: 12298
Credit: 2,590,616
RAC: 850
Netherlands
Message 1533792 - Posted: 29 Jun 2014, 23:08:21 UTC

From http://www.tomshardware.co.uk/answers/id-2039223/whea-uncorrectable-error-bsod-message.html:

This is a bugcheck generated directly by the CPU. Most often it is caused by incorrect voltage being applied to the CPU.
This can happen because of incorrect settings being applied to the BIOS (often resetting the BIOS to the defaults or updating the BIOS helps). Sometimes it indicates that a power supply is outside of its correct voltage range (a power supply voltage regulator or motherboard voltage regulator is in the process of failing).
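
For reference, bug check 0x00000124 is the code Windows names WHEA_UNCORRECTABLE_ERROR. A small Python sketch for turning the code shown by BlueScreenView into its symbolic name could look like the following. The table is deliberately partial, and the helper name `describe_bug_check` and the "hardware-level" label are assumptions made for this example, not part of any tool.

```python
# Symbolic names for a few common Windows bug check (stop) codes.
# This table is partial and only meant as an illustration.
BUG_CHECK_NAMES = {
    0x0000000A: "IRQL_NOT_LESS_OR_EQUAL",
    0x0000003B: "SYSTEM_SERVICE_EXCEPTION",
    0x00000050: "PAGE_FAULT_IN_NONPAGED_AREA",
    0x0000009C: "MACHINE_CHECK_EXCEPTION",
    0x000000D1: "DRIVER_IRQL_NOT_LESS_OR_EQUAL",
    0x00000101: "CLOCK_WATCHDOG_TIMEOUT",
    0x00000124: "WHEA_UNCORRECTABLE_ERROR",
}

# Codes in the table above that are raised for errors reported by the hardware.
MACHINE_CHECK_CODES = {0x0000009C, 0x00000124}


def describe_bug_check(code_text):
    """Turn a string such as '0x00000124' into a short description."""
    code = int(code_text, 16)
    name = BUG_CHECK_NAMES.get(code)
    if name is None:
        return f"0x{code:08X} (not in this partial table)"
    if code in MACHINE_CHECK_CODES:
        return f"0x{code:08X} {name} (hardware-level)"
    return f"0x{code:08X} {name}"


print(describe_bug_check("0x00000124"))
```

A hardware-level code does not say which component is at fault; voltage, temperature and a failing part all remain candidates.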

____________
Jord

Fighting for the correct use of the apostrophe, together with Weird Al Yankovic

Mighty
Joined: 15 May 99
Posts: 13
Credit: 3,986,588
RAC: 5,297
United States
Message 1533812 - Posted: 29 Jun 2014, 23:44:45 UTC - in response to Message 1533792.

I grabbed RealTemp and Prime95, and that triggered a Blue Screen. The temp jumped immediately to 100C and it died about 10 seconds later.

I'll look at that Tom's Hardware thread and my motherboard docs and see if I can tweak some settings up or down and try to stabilize this thing.

I appreciate the patient help.
____________
Drake

Ageless
Joined: 9 Jun 99
Posts: 12298
Credit: 2,590,616
RAC: 850
Netherlands
Message 1533822 - Posted: 30 Jun 2014, 0:13:43 UTC - in response to Message 1533812.
Last modified: 30 Jun 2014, 0:15:21 UTC

100C is never good. Sounds like either the CPU fan isn't doing its work (is it even spinning?), or that there's a problem with the thermal paste between the CPU and the heat sink.

(I'm helping you in my pauses in The Elder Scrolls Online. So if it takes me a while to answer, it's probably because trolls are slaying me. ;-))
____________
Jord

Fighting for the correct use of the apostrophe, together with Weird Al Yankovic

Mighty
Joined: 15 May 99
Posts: 13
Credit: 3,986,588
RAC: 5,297
United States
Message 1533872 - Posted: 30 Jun 2014, 5:39:48 UTC - in response to Message 1533822.

All the fans are spinning. This is a liquid cooled system, so I have a clear view to the radiator to see that the radiator is clear. Obviously, I can't check the thermal paste that easily.

I've searched around, and apparently 100C is where this CPU starts throttling. It's touching 40C while I'm typing this.

I sent an email to my builder to see what they say. I expect to hear back from them tomorrow.

...Later

Just played a few hours of Far Cry 3. The temp was in the 60s most of the time, and went above 70 briefly, and only occasionally.
____________
Drake

Marc McLean
Joined: 8 Nov 02
Posts: 4
Credit: 17,824
RAC: 323
United Kingdom
Message 1560335 - Posted: 21 Aug 2014, 22:52:34 UTC - in response to Message 1533872.

I suffered a similar overheating problem a while back on my system. After trying everything else, I found that my incorrect application of thermal paste was the problem. I redid it and all was good. Also, you say all fans are spinning; check that your case is pulling and pushing air to and from the system correctly. It might not be the cause, but it will help reduce temps overall.

Fred J. Verster
Volunteer tester
Joined: 21 Apr 04
Posts: 3238
Credit: 31,751,093
RAC: 4,178
Netherlands
Message 1561023 - Posted: 23 Aug 2014, 9:45:05 UTC - in response to Message 1560335.
Last modified: 23 Aug 2014, 9:59:22 UTC

I started again yesterday, after an absence of 18 months, and installed the latest 64-bit BOINC on 64-bit Win 7.
PC: i7-2600 (HT on), 16 GB DDR3-1600 DRAM and 2x HD 5870 GPUs.
PSU: 1000 W.
I'd set the CPU to a higher turbo frequency, 3.9 instead of 3.8 GHz, and also upped some CPU offset voltages. This immediately caused not only overheating problems but real instability, with lock-ups, blue screens and crashes.
Loading the BIOS settings for this particular CPU made the system respond normally again. The CPU, with stock cooling (HT on), still overheated immediately; choosing 70% CPU load and adding an auxiliary fan took care of that.
By the way, CoreTemp and Intel Desktop Utilities reported 101C for all 4 cores (HT on), but the temperature changed as fast as the core load did and caused no added instability.

The HD 5870 GPUs are running at ~75% of their crunching capacity and run smoothly. The whole system draws about 500 W and is very stable. I also use it for browsing, forum posts, e-mail and other minor tasks, which causes no problems with BOINC set to run 'all the time'. I use one monitor on each GPU, and there is enough headroom to use them quite normally, although graphics such as webcam streams or videos do not display smoothly.

It took less than 8 hours to get me posting on these forums; enough work was done and validated.

Mighty
Joined: 15 May 99
Posts: 13
Credit: 3,986,588
RAC: 5,297
United States
Message 1561027 - Posted: 23 Aug 2014, 10:04:31 UTC

Sorry I haven't updated. I got an email response from the boutique place that built my machine. They want me to call in to talk to a tech. Keep meaning to do it, but I get distracted. Real Soon Now.

Drake

John McLeod VII
Volunteer developer
Volunteer tester
Joined: 15 Jul 99
Posts: 24496
Credit: 520,877
RAC: 44
United States
Message 1561339 - Posted: 24 Aug 2014, 1:43:27 UTC - in response to Message 1561023.

100 C is too hot.

Boosting the CPU voltage can lead to instability.
____________


BOINC WIKI

BilBg
Volunteer tester
Joined: 27 May 07
Posts: 2679
Credit: 6,058,051
RAC: 3,984
Bulgaria
Message 1561372 - Posted: 24 Aug 2014, 3:07:23 UTC - in response to Message 1561023.

Get TThrottle and don't let the CPU go over 80°C
http://efmer.com/b/

(In this case, set 'Use 100% CPU time' in BOINC. TThrottle will reduce CPU time on its own once the temperature you type in the box is reached.)
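
The general idea of temperature-based throttling can be sketched in a few lines of Python. This is not TThrottle's actual algorithm, only an illustrative assumption of how such a controller might behave: it lowers the share of CPU time while the reading is above the limit and raises it again once the reading drops. The function names, the 5-point step and the 10% floor are all made up for the example; only the 80°C limit comes from the advice above.

```python
def next_duty(duty, temp_c, limit_c=80.0, step=5.0, min_duty=10.0, max_duty=100.0):
    """Return the CPU-time percentage to use for the next interval.

    Back off by `step` points when the reading is above the limit,
    otherwise creep back up by `step` points, staying within the bounds.
    """
    if temp_c > limit_c:
        duty -= step
    else:
        duty += step
    return max(min_duty, min(max_duty, duty))


def simulate(readings, limit_c=80.0):
    """Apply next_duty() to a sequence of temperature readings, starting at 100%."""
    duty = 100.0
    history = []
    for temp_c in readings:
        duty = next_duty(duty, temp_c, limit_c)
        history.append(duty)
    return history


# Two readings above the limit, then one below it.
print(simulate([85, 90, 78]))  # prints [95.0, 90.0, 95.0]
```

A real tool reads the sensors itself and applies the result to the running tasks; this sketch only shows the decision step.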
____________



- ALF - "Find out what you don't do well ..... then don't do it!" :)



Copyright © 2014 University of California