BOINC causing BSODs!!

Jim Franklin
Joined: 3 Apr 99
Posts: 108
Credit: 10,843,395
RAC: 39
United Kingdom
Message 106565 - Posted: 2 May 2005, 21:47:16 UTC

I have been running BOINC for about 6 weeks now on several machines. Over the last few days this machine has crashed without warning on numerous occasions, and despite trying all possible solutions, the only one that has worked is to halt BOINC, as it is the cause of the crashes.

I am now seeing this on two of my other machines, namely a Centrino laptop and a P4 Dell Optiplex.

I have posted this on the SETI boards and had similar feedback from other users that BOINC causes BSODs, machine lock-ups, memory errors and other similar fatal errors.

Have those that use these forums also come across odd behavior?

Also, if any of the Berkeley team visit here, what the hell is going on? I rely on this machine for work, as I do the laptop, which is why they run XP Pro. If I keep having these problems then I will be uninstalling BOINC until they are resolved, as I cannot risk a catastrophic failure.
ID: 106565
dazphotog
Joined: 13 Mar 02
Posts: 73
Credit: 99,224
RAC: 0
United States
Message 106570 - Posted: 2 May 2005, 21:53:54 UTC

I am having the same thing... this morning I allowed BOINC network access and my computer froze solid. Last week the same thing happened, but when the computer was rebooted I lost all of my WUs in BOINC. I don't get a BSOD; my computer will just freeze solid. Sometimes I can move the cursor, sometimes even that is frozen.
ID: 106570
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13736
Credit: 208,696,464
RAC: 304
Australia
Message 106576 - Posted: 2 May 2005, 22:05:06 UTC - in response to Message 106565.  
Last modified: 2 May 2005, 22:05:47 UTC

> Also, if any of the Berkeley team visit here, what the hell is going on?

What are the error messages you get when they BSOD?
What messages do you get in your system logs?


EDIT- Oh, and do you have the screen saver enabled?
Grant
Darwin NT
ID: 106576
Sir Ulli
Volunteer tester
Joined: 21 Oct 99
Posts: 2246
Credit: 6,136,250
RAC: 0
Germany
Message 106577 - Posted: 2 May 2005, 22:05:35 UTC
Last modified: 2 May 2005, 22:07:16 UTC

I have no problems...

I have been using BOINC since 22.03.2004.

I have NEVER had a freeze caused by BOINC.

Is your computer overclocked? Is your computer Prime stable?

To check with Prime (Prime95): run the Torture Test, under Options > Torture Test, for at least 24 hours. If it runs without any errors, then your computer is Prime stable.

If you get errors, your computer is not ready for BOINC.

Greetings from Germany NRW
Ulli S@h Berkeley's Staff Friends Club m7 ©







ID: 106577
Jord
Volunteer tester
Joined: 9 Jun 99
Posts: 15184
Credit: 4,362,181
RAC: 3
Netherlands
Message 106581 - Posted: 2 May 2005, 22:10:32 UTC

Saying that your computer gets BSODs without actually stating what they are isn't very helpful, neither to those of us who may want to help nor to the developers.

So I had a hardlock BSOD last week. Only one. It was a STOP 0x0000001A MEMORY_MANAGEMENT Error Message under Windows 2000 SP4. But since I rebooted, no other BSODs have pained me.

Now on the other hand, you ought to read the End User License Agreement. (You see this when you install it and at the bottom of the page I linked you to just now). It states amongst other things:

------------------------------------
Disclaimer of Warranty

THIS SOFTWARE AND THE ACCOMPANYING FILES ARE DISTRIBUTED "AS IS"
AND WITHOUT WARRANTIES AS TO PERFORMANCE OR MERCHANTABILITY OR ANY
OTHER WARRANTIES WHETHER EXPRESSED OR IMPLIED.
NO WARRANTY OF FITNESS FOR A PARTICULAR PURPOSE IS OFFERED.

Restrictions

You may use this software on a computer system only if you own the system
or have the permission of the owner.
------------------------------------

In other words, if you don't trust it, don't run it. If you wreck your company computer using BOINC, it's your own fault. If you don't make regular backups, etc...
ID: 106581
Divide Overflow
Volunteer tester
Joined: 3 Apr 99
Posts: 365
Credit: 131,684
RAC: 0
United States
Message 106583 - Posted: 2 May 2005, 22:12:37 UTC

Odd that this is suddenly happening seemingly without cause.
Aside from the existing questions already asked here, how are your ambient temps? As Spring / Summer starts to warm up the Northern Hemisphere, many borderline machines can begin to fault.


ID: 106583
Astro
Volunteer tester
Joined: 16 Apr 02
Posts: 8026
Credit: 600,015
RAC: 0
Message 106609 - Posted: 2 May 2005, 22:32:25 UTC - in response to Message 106565.  

> I have been running BOINC for about 6 weeks now on several machines. Over the
> last few days this machine has crashed without warning on numerous occasions,
> and despite trying all possible solutions, the only one that has worked is to
> halt BOINC, as it is the cause of the crashes.


Hi Jim, do you have temp probes on your case/CPU? If so, you might wanna check the temp. This has happened many times to different people in the past year. They all say "that can't be it", then we hear back that they're running at 150C or some other high figure. Have you blown out the CPU and RAM lately? If not, you might want to.

SETI and other projects run the processor harder than almost all other programs you can run, and that running creates heat. So if you stop running BOINC and have no trouble, then that's where I'd start.

Hope this helps

tony


ID: 106609
Jim Franklin
Joined: 3 Apr 99
Posts: 108
Credit: 10,843,395
RAC: 39
United Kingdom
Message 106619 - Posted: 2 May 2005, 22:43:00 UTC

Thanks for the replies, guys. Generally the BSOD produces a physical memory dump with a Kernel Error message or a 0x0000000000 Management Error message.

This machine was overclocked, but had run happily for months. I have run both 3DMark 2003 and 2005 on this machine with it overclocked and had no issues.

CPU = P4 Prescott 3.0GHz @ 3.6GHz (Originally)
Memory = 1024MB OCZ Cas2 DDR PC5000 @400MHz
Mobo = Asus P4P800 Deluxe V2.0 (Latest BIOS update)
Graphics = ATI Radeon 9800 Delux 256MB DDR

Temps for CPU @ Full load = 54C, memory = 56C, Mobo = 32C

This machine has now been clocked back to its stock 3.0 GHz and the core temp has dropped to 43C; however, it still locks up or BSODs when I run BOINC.

The laptop and the Dell have never been overclocked, as they cannot be.

All machines are running XP Pro with SP2 installed, and all hardware has the latest drivers.

This is 100% BOINC related, as I have run SETI classic units on this machine after the BSODs and they have not crashed. I have run extreme rendering programs on all of them and again no crash. I have run 3DMark on all, again no crashes or freezes. On this machine I ran a rendering sequence in AutoCAD whilst also editing another picture, streaming TV across the internet and running a FULL virus scan... OK, she was slow as an old pig as a result, but she did not crash, freeze, BSOD or fail in any way.

There is no doubt this is purely BOINC related, and this will seriously affect the success of the program if it happens more often.

I only run SETI, so it would be interesting to know if those running other projects have seen similar problems, or whether it is purely related to SETI on BOINC.
ID: 106619
Sir Ulli
Volunteer tester
Joined: 21 Oct 99
Posts: 2246
Credit: 6,136,250
RAC: 0
Germany
Message 106623 - Posted: 2 May 2005, 22:47:44 UTC - in response to Message 106619.  

> Thanks for the replies, guys. Generally the BSOD produces a physical memory
> dump with a Kernel Error message or a 0x0000000000 Management Error message.
>
> This machine was overclocked, but had run happily for months. I have run both
> 3DMark 2003 and 2005 on this machine with it overclocked and had no issues.
>
> CPU = P4 Prescott 3.0GHz @ 3.6GHz (Originally)
> Memory = 1024MB OCZ Cas2 DDR PC5000 @400MHz
> Mobo = Asus P4P800 Deluxe V2.0 (Latest BIOS update)
> Graphics = ATI Radeon 9800 Delux 256MB DDR
>
> Temps for CPU @ Full load = 54C, memory = 56C, Mobo = 32C
>
> This machine has now been clocked back to its stock 3.0 GHz and the core temp
> has dropped to 43C; however, it still locks up or BSODs when I run BOINC.
>
> The laptop and the Dell have never been overclocked, as they cannot be.
>
> All machines are running XP Pro with SP2 installed, and all hardware has the
> latest drivers.
>
> This is 100% BOINC related, as I have run SETI classic units on this machine
> after the BSODs and they have not crashed. I have run extreme rendering
> programs on all of them and again no crash. I have run 3DMark on all, again no
> crashes or freezes. On this machine I ran a rendering sequence in AutoCAD
> whilst also editing another picture, streaming TV across the internet and
> running a FULL virus scan... OK, she was slow as an old pig as a result, but
> she did not crash, freeze, BSOD or fail in any way.
>
> There is no doubt this is purely BOINC related, and this will seriously affect
> the success of the program if it happens more often.
>
> I only run SETI, so it would be interesting to know if those running other
> projects have seen similar problems, or whether it is purely related to SETI
> on BOINC.
>

Have you ever run Prime?

Just for info.

Greetings from Germany NRW
Ulli S@h Berkeley's Staff Friends Club m7 ©


ID: 106623
Kajunfisher
Volunteer tester
Joined: 29 Mar 05
Posts: 1407
Credit: 126,476
RAC: 0
United States
Message 106634 - Posted: 2 May 2005, 23:01:45 UTC

Hey Jim,

Sorry you're having problems, but could you tell us what version of BOINC you have installed, and possibly any error messages you may see on the "Messages" tab of the program?
No matter where you go, there you are...
ID: 106634
Jim Franklin
Joined: 3 Apr 99
Posts: 108
Credit: 10,843,395
RAC: 39
United Kingdom
Message 106635 - Posted: 2 May 2005, 23:02:45 UTC

Ulli, Prime is now running on the Dell and the laptop, so I will let you know if there are any problems when I get in from work tomorrow night, UK time.
ID: 106635
Astro
Volunteer tester
Joined: 16 Apr 02
Posts: 8026
Credit: 600,015
RAC: 0
Message 106636 - Posted: 2 May 2005, 23:03:52 UTC

Jim, if it's not heat related then you're the only one who's gotten this error that I know of. If you find the answer then I can pass it on to others should it happen to someone else. Please let me know the answer. I've been watching these threads for over a year, and haven't heard of this before (other than heat-related failures).

good luck

tony
ID: 106636
Astro
Volunteer tester
Joined: 16 Apr 02
Posts: 8026
Credit: 600,015
RAC: 0
Message 106637 - Posted: 2 May 2005, 23:04:17 UTC
Last modified: 2 May 2005, 23:09:53 UTC

damn double click.

Jim, I've taken one laptop back to Best Buy so many times for the BSOD that they lemoned it and gave me a new one (this one). Even it has seen the BSOD when the vents get partially plugged. There is a thread called "does Boinc harm a laptop"; you can search for it and see if it has an answer.

tony
ID: 106637
MikeSW17
Volunteer tester
Joined: 3 Apr 99
Posts: 1603
Credit: 2,700,523
RAC: 0
United Kingdom
Message 106638 - Posted: 2 May 2005, 23:04:29 UTC - in response to Message 106619.  

One step at a time, a process of elimination:

1) It is not hardware related as it occurs on 3 entirely different systems.
2) It is not Windows XP SP2 - more than a few people run BOINC on XP OK.
3) It is not inherently BOINC or SETI as more than a few people run these OK.

That leaves:

a) Environment - power and ambient temperatures.
b) Conflicting software - anti-virus, or any package that runs on all 3 systems - includes viruses, trojans, key-loggers....
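
For what it's worth, point (b) can be checked mechanically. Here is a minimal sketch in Python, assuming you have written down what is installed on each box. The machine labels and package names below are hypothetical placeholders, not anyone's real inventory. It lists whatever is present on every machine, since only something common to all three could explain a fault that shows up on all three.

```python
# Hypothetical inventories: replace these with the real lists of software
# (drivers, anti-virus, utilities) installed on each machine.
installed = {
    "desktop": {"BOINC", "anti-virus A", "firewall B", "CAD suite"},
    "laptop": {"BOINC", "anti-virus A", "firewall B", "VPN client"},
    "dell": {"BOINC", "anti-virus A"},
}


def common_software(inventories):
    """Return the set of packages that appear on every machine."""
    sets = list(inventories.values())
    if not sets:
        return set()
    return set.intersection(*sets)


# BOINC itself is on every machine by definition, so drop it and look at
# whatever else is shared.
suspects = common_software(installed) - {"BOINC"}
print(sorted(suspects))
```

Anything left in `suspects` is worth disabling or uninstalling one item at a time, re-running BOINC after each change.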




ID: 106638
trux
Volunteer tester
Joined: 6 Feb 01
Posts: 344
Credit: 1,127,051
RAC: 0
Czech Republic
Message 106648 - Posted: 2 May 2005, 23:32:29 UTC

I run BOINC/S@H on 7 XP machines without any BSODs, but it regularly freezes two of my 3 FreeBSD boxes. I tried both pre-compiled and self-compiled clients, with no difference - they freeze, leaving no option other than rebooting. As a consequence, on rebooting, the client drops all WUs and downloads a full batch of new ones. I guess I could possibly avoid the reloading by turning off the disk write cache, but I haven't tried it yet.

I plan to play with compiler optimizations and other options, but since it is a time-consuming process, it won't be any time soon.

The first box is an old dual-CPU (2x Celeron II/500MHz) box. Previously there was Win XP on it, and BOINC/S@H worked just fine on it. The machine has no temperature problems. It crashes at each WU, usually just at the end.

The other crashing FreeBSD box has an AMD Athlon XP 1800+. It is a remote server, so I cannot verify the temperature, but it never crashed before even under load. I cannot exclude though that the temperature is the issue here. Unlike the first machine, it does not freeze at each unit, or at the end of the calculation.

BOINC/S@H runs fine on the 3rd FreeBSD box with Pentium 4/3GHz

trux
ID: 106648
Franz Bauer
Joined: 8 Feb 01
Posts: 127
Credit: 9,690,361
RAC: 0
Canada
Message 106786 - Posted: 3 May 2005, 5:48:50 UTC - in response to Message 106565.  

Hi Jim:

I had the same problem with my computer. Running the Hyper-Threading Technology Test Utility from Intel, I discovered that the 845PE chipset on my ASUS P4PE-X motherboard was defective, which was causing it to freeze. I have since disabled HT in the BIOS and it rarely crashes now.

You can download the utility programs from the Intel web site to help diagnose your computer system if you are having problems with running BOINC with multiple CPUs. These are OS specific.

http://downloadfinder.intel.com/scripts-df/Product_Filter.asp?ProductID=483

Hyper-Threading Technology Test Utility
Intel® Processor Identification Utility

http://downloadfinder.intel.com/scripts-df/Product_Search.asp?Prod_nm=chipset*ID*utility

Intel® Chipset Identification Utility
Intel® Chipset Software Installation Utility

The stats on my computer are:
Intel Pentium 4HT, 2600MHz (13 x 200), Northwood HyperThreading
ASUS P4PE-X, 11/10/2003 – I845PE / ICH4 / IT8708 – P4PE-X
Micron Tech 16VDDT6464AG-265C4, 514MB, PC2100 (133MHz)
Northbridge: Chipset Intel Brookdale i845PE, revision 03
Southbridge: Intel 82801DB, ICH4

Franz

ID: 106786
Jim Franklin
Joined: 3 Apr 99
Posts: 108
Credit: 10,843,395
RAC: 39
United Kingdom
Message 106815 - Posted: 3 May 2005, 8:15:42 UTC - in response to Message 106638.  

> One step at a time, a process of elimination:
>
> 1) It is not hardware related as it occurs on 3 entirely different systems.
> 2) It is not Windows XP SP2 - more than a few people run BOINC on XP OK.
> 3) It is not inherently BOINC or SETI as more than a few people run these
> OK.
>
> That leaves:
>
> a) Environment - power and ambient temperatures.
> b) Conflicting software - anti-virus, or any package that runs on all 3
> systems - includes viruses, trojans, key-loggers....
>
>


Mike, I have ruled out the hardware and also mostly ruled out XP Pro; however, I have not ruled out the BOINC software, as the machines ONLY BSOD or freeze when running BOINC.

There are no viruses, trojans or other nasties on my machines. They are checked regularly with decent software, so this is not an issue. I have always stopped BOINC before running scans every few days, so it is not the scanning software clashing with BOINC that causes the lock-ups and BSODs.

I have posted this message on other boards and got feedback that it does occur on other systems. It may not be common, but it is regular enough for others to notice.

I have been running SETI since 1999, I have over 30K classic units to my name and have run up to 50 machines (self-built), so I know how to identify any potential conflicts and problems with them.

What concerns me is that the Dell ONLY RUNS XP Pro SP2, BOINC and Trend AntiVirus (which only runs when I tell it). There are no other drivers loaded except the motherboard ones, yet it has the same problem as the laptop and this machine.

There is an old saying in England,

"If it walks like a duck, looks like a duck and quacks like a duck, then it is assuredly a duck"

BOINC is the cause, of that I have no doubt. It is simply a case of finding the cause within BOINC: is it the background software that is the issue, or is it solely related to how BOINC interacts with the SETI client and work units?
ID: 106815
Paul D. Buck
Volunteer tester
Joined: 19 Jul 00
Posts: 3898
Credit: 1,158,042
RAC: 0
United States
Message 106907 - Posted: 3 May 2005, 13:26:00 UTC - in response to Message 106815.  

> Mike, I have ruled out the hardware and also mostly ruled out XP Pro; however,
> I have not ruled out the BOINC software, as the machines ONLY BSOD or freeze
> when running BOINC.

If you have ruled out everything else, and you are one of the few having this problem, then you have to find out more about what is happening. Saying you get a BSOD with no precise information is useless.

Is it the BOINC Software?
The Science Application?
What projects are you running?
What are the lengths of your project Queues?
What type of network are you using?
What are your network settings?
Exactly what is your hardware?
Are you running the screen saver?
Do you have an ATI Video card?
etc.

At the moment, effectively all you have identified is that you have a problem, it causes BSOD and you are sure that it is BOINC.

The problem on this side is that we have helped people with similar problems in the past, who also were sure that it was BOINC when it was something else coming in conflict with BOINC. Which is why people asked you how did you identify that it could only be BOINC, and not something else colliding with BOINC.

For example, did you format the HD, install the OS clean, and then, and only then, install BOINC, run it, and have it die?

I know you say that this is more than an isolated instance, but I have run SETI@Home Classic, and now BOINC, on 20-30 computers, most of them self-assembled, and have never seen this problem.
ID: 106907
FloridaBear
Joined: 28 Mar 02
Posts: 117
Credit: 6,480,773
RAC: 0
United States
Message 106916 - Posted: 3 May 2005, 13:33:37 UTC - in response to Message 106907.  


> I know you say that this is more than an isolated instance, but I have run
> SETI@Home Classic, and now BOINC, on 20-30 computers, most of them
> self-assembled, and have never seen this problem.

Just as a quick note, I've also run this on probably 15-20 different hyperthreaded Dells (6 full-time at the moment) and never once seen a blue screen while running BOINC. Not that this helps solve the problem, but I'm pretty sure it's not Dell-specific.
ID: 106916
MikeSW17
Volunteer tester
Joined: 3 Apr 99
Posts: 1603
Credit: 2,700,523
RAC: 0
United Kingdom
Message 106929 - Posted: 3 May 2005, 14:02:34 UTC

Jim,
Ok, we know a lot more about your set-up now.
My choice of the word 'inherently' was to say that it is unlikely to be a simple coding error in BOINC or the SETI app, or it would likely show up more commonly - this does not preclude bugs due to the interaction of these with either the hardware or other software.

Driver problems also cannot be simply ruled out - not long ago I had crashes caused by a faulty CD driver - when I had not even put a CD in the drive for days - go figure.

The actual detail of the BSOD message is really needed - from all 3 systems, so we can see if there is a common factor. The system error log should also contain an entry for each of these crashes. As I'm sure you know, the BCcode provides the cause(s) of the failure.
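
One way to look for that common factor, sketched in Python. It assumes the STOP lines have been copied by hand from each blue screen into plain-text notes. The sample notes and machine labels are invented for illustration, and the lookup table holds only a handful of well-known bug-check codes, so it is far from complete.

```python
import re
from collections import Counter

# A few well-known Windows bug-check (STOP) codes. Deliberately incomplete.
KNOWN_STOP_CODES = {
    0x0000000A: "IRQL_NOT_LESS_OR_EQUAL",
    0x0000001A: "MEMORY_MANAGEMENT",
    0x00000050: "PAGE_FAULT_IN_NONPAGED_AREA",
    0x0000007F: "UNEXPECTED_KERNEL_MODE_TRAP",
    0x0000009C: "MACHINE_CHECK_EXCEPTION",
    0x000000D1: "DRIVER_IRQL_NOT_LESS_OR_EQUAL",
}

# Matches hand-written notes such as "STOP 0x0000001A" or "STOP: 0x1a".
STOP_RE = re.compile(r"\bSTOP:?\s*0x([0-9A-Fa-f]{1,8})\b", re.IGNORECASE)


def parse_stop_code(line):
    """Return the STOP code in a line as an int, or None if there is none."""
    match = STOP_RE.search(line)
    return int(match.group(1), 16) if match else None


def machines_per_code(notes_by_machine):
    """Count on how many machines each STOP code was seen at least once."""
    seen_on = Counter()
    for lines in notes_by_machine.values():
        codes = {c for c in map(parse_stop_code, lines) if c is not None}
        seen_on.update(codes)
    return seen_on


# Invented sample notes, one list of crash notes per machine.
notes = {
    "desktop": ["STOP 0x0000001A MEMORY_MANAGEMENT", "STOP 0x0000000A"],
    "laptop": ["STOP: 0x0000001A"],
    "dell": ["froze solid, no blue screen", "STOP 0x0000001a"],
}

for code, count in machines_per_code(notes).most_common():
    name = KNOWN_STOP_CODES.get(code, "unknown")
    print(f"0x{code:08X} {name}: seen on {count} machine(s)")
```

A code that turns up on all three machines is the one to chase first; codes seen on only one box are more likely to be local to that box.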

When the system re-boots, does it display a message to the effect that "the system has recovered from a serious error"? Does it offer to submit the error to MS? If so, does the MS system reply with any information as to the cause?


ID: 106929
 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.