Computer crashes a lot
Author | Message |
---|---|
qbit Send message Joined: 19 Sep 04 Posts: 630 Credit: 6,868,528 RAC: 0 |
For a few weeks now my computer has been crashing pretty often. It doesn't happen every day, but 3-4 times a week. I usually let my computer crunch while I'm at work or at night, so I'm not there (or not awake) when the crashes happen. That means the computer sits at the logon screen for hours, which wastes electricity and also shrinks my RAC. This is the computer I'm talking about: http://setiathome.berkeley.edu/show_host_detail.php?hostid=7327094 I usually run one task of vLHC on CPU and two AP tasks on the GPU. I use the following commandline for AP: -unroll 10 -ffa_block 8192 -ffa_block_fetch 4096 -tune 1 64 4 1 -use_sleep As long as the system doesn't crash, everything is OK. Crunching speed is good and temperatures are rather low. I have no idea what causes the crashes, so today, after coming home from work and seeing the Windows logon screen once again, I installed WhoCrashed to analyze the crash. This is what I got: Any ideas? |
HAL9000 Send message Joined: 11 Sep 99 Posts: 6534 Credit: 196,805,888 RAC: 57 |
As a workaround until this is sorted out, you could configure the machine for auto logon. That way, when it crashes, it will log in and restart BOINC. Running control userpasswords2 is one way you can set up the system for auto logon. The information WhoCrashed is giving you is located in the Windows Event log. You don't really need a 3rd party application to read it, but whichever way you prefer works. That error doesn't point to anything specific. Kernel crashes can be tricky to pin down. If you look in the Windows Event log, are there any events prior to the crash that may show what the system was doing? Perhaps an anti-virus scan or something along those lines was running? SETI@home classic workunits: 93,865 CPU time: 863,447 hours Join the BP6/VP6 User Group: http://tinyurl.com/8y46zvu |
rob smith Send message Joined: 7 Mar 03 Posts: 22220 Credit: 416,307,556 RAC: 380 |
The clue is at the end of the penultimate line: "caused by a thermal issue". There are loads of potential causes of thermal issues, ranging from clogged fans and filters to excessive overclocking, a failed fan, a fan running too slow... and more. Bob Smith Member of Seti PIPPS (Pluto is a Planet Protest Society) Somewhere in the (un)known Universe? |
qbit Send message Joined: 19 Sep 04 Posts: 630 Credit: 6,868,528 RAC: 0 |
@Hal: I didn't even know that there is a Windows event log. I just checked it and it shows 29 errors in the last 24 hours, but I have no idea if (and if so, which) one of those has to do with the crash. I also couldn't find anything like an AV scan, but to be honest, I don't really know how to read this log (yet). @rob: When I'm at home and I watch my computer, temperatures are always OK. Cores are usually below 50 degrees and sometimes go up to 53 or 54, the GPU is usually between 50 and 59, and I never saw more than 64 (at least not since I started using -use_sleep in the commandline). So I don't think that's the problem here. |
TBar Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
I usually run one task of vLHC on CPU and two AP tasks on the GPU.... By my count that setup requires 3 CPU cores...You have 2. Does it crash when the GPU is working 2 Blanked APs at once? |
qbit Send message Joined: 19 Sep 04 Posts: 630 Credit: 6,868,528 RAC: 0 |
Good morning, folks! TBar, well, I'm not sure, but yes, there have been a lot of highly blanked APs lately. Could that be the reason, or at least part of it? I thought one free core was enough for doing AP on GPU. BTW: Do you guys think my commandline is OK? I remember reading that the computer may crash if the numbers are too high. |
Mike Send message Joined: 17 Feb 01 Posts: 34258 Credit: 79,922,639 RAC: 80 |
Good morning, folks! It's possible, yes. Your AP settings are at max for this GPU, but your CPU is slowing things down at times. Did you try to run GPU only? With each crime and every kindness we birth our future. |
qbit Send message Joined: 19 Sep 04 Posts: 630 Credit: 6,868,528 RAC: 0 |
I'm trying this right now. Today the computer did not crash. But as I said, it didn't crash every day, so I have to try this for a few days before I can be sure. |
TBar Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
I ran a couple Dual core machines for a while with two GPU cards. Whenever I tried running 2 GPU APs and 1 CPU AP everything was fine until the GPUs hit 2 Highly Blanked APs at the same time. Then the CPU spent a lot of time thrashing at 100%. If your system is the least bit unstable it will probably crash. I didn't have any problem running 1 AP, 1 MB, and 1 CPU AP. I would suggest you try running just 1 GPU AP at a time. Your GTX 750 isn't going to gain anything by running 2 APs at a time anyway. Here is the ReadMe_AstroPulse_OpenCL_NV.txt; For best performance it is important to free 2 CPU cores running multiple instances. |
tbret Send message Joined: 28 May 99 Posts: 3380 Credit: 296,162,071 RAC: 40 |
I think everyone else has you covered and what I'm about to say almost certainly isn't your problem, but I thought it might be worth mentioning in case you continue to have trouble even after freeing CPU resources to run your GPUs. I had a machine that kept BSODing on me in the middle of crunching and I couldn't figure out why. Nothing I did seemed to help. I ran memory diagnostics, chkdsk for hours, I re-configured BOINC, I cleaned, I cooled, I replaced the power supply... Nothing helped. Then I thought of something that I wouldn't say had the intellectual bandwidth of a "hunch" but was something closer to desperation: I pulled and reseated the RAM (that hadn't been done in several years). My BSODs went away. |
Mike Send message Joined: 17 Feb 01 Posts: 34258 Credit: 79,922,639 RAC: 80 |
Yes, there was a reason I added this to the ReadMes. With each crime and every kindness we birth our future. |
qbit Send message Joined: 19 Sep 04 Posts: 630 Credit: 6,868,528 RAC: 0 |
I gave this a try today and so far it has worked, running fine for 11 hours without crashing. But will running just 1 AP task at a time on the GPU really give the same RAC as running two tasks? I know that with MB, running two tasks at a time is definitely faster. So it's different on AP? It would be great if that is really so, because then I could do Seti and vLHC without crashing. I would prefer this to running 2 tasks on GPU and nothing on CPU, because I really like CERN and would like to contribute there a bit as well. Anyway, I now have to test for a few days to see if this setup is really stable and what the RAC looks like. Thx to everybody for helping me here! |
Mike Send message Joined: 17 Feb 01 Posts: 34258 Credit: 79,922,639 RAC: 80 |
Of course running 2 instances of AP on GPU is better for RAC. Also more efficient. With each crime and every kindness we birth our future. |
TBar Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
Of course running 2 instances of AP on GPU is better for RAC. The only advantage with a low end card is when running Blanked APs. Soon, the problem with Blanked APs will be history. I would like to see a comparison with a low end card running AP_v7. |
qbit Send message Joined: 19 Sep 04 Posts: 630 Credit: 6,868,528 RAC: 0 |
Your GTX 750 isn't going to gain anything by running 2 APs at a time anyway. Of course running 2 instances of AP on GPU is better for RAC. The only advantage with a low end card is when running Blanked APs. Soon, the problem with Blanked APs will be history. I would like to see a comparison with a low end card running AP_v7. I'm a bit confused now..... |
Mike Send message Joined: 17 Feb 01 Posts: 34258 Credit: 79,922,639 RAC: 80 |
Of course running 2 instances of AP on GPU is better for RAC. Not true, it depends on the CPU not GPU. With each crime and every kindness we birth our future. |
Mike Send message Joined: 17 Feb 01 Posts: 34258 Credit: 79,922,639 RAC: 80 |
Your GTX 750 isn't going to gain anything by running 2 APs at a time anyway. The 750 will benefit running 2 instances for sure. No doubt. With each crime and every kindness we birth our future. |
TBar Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
Of course running 2 instances of AP on GPU is better for RAC. Have you tested multiple instances on a low end card with AP_v7? With the Single instance tuned to run at 98% load? |
rob smith Send message Joined: 7 Mar 03 Posts: 22220 Credit: 416,307,556 RAC: 380 |
The proportion of processing done by the GPU is inversely proportional to the amount of blanking. So for a very highly blanked task very little use is made of the GPU and lots of use is made of the CPU - thus a high performance GPU is little better (in terms of overall processing time) than a low end one. (For low blanked AP tasks most of the work is done on the GPU, and thus a high-end GPU will show a marked improvement in overall processing time) Bob Smith Member of Seti PIPPS (Pluto is a Planet Protest Society) Somewhere in the (un)known Universe? |
Mike Send message Joined: 17 Feb 01 Posts: 34258 Credit: 79,922,639 RAC: 80 |
Of course running 2 instances of AP on GPU is better for RAC. Yes, sure. I tested 8 different NV GPUs last week with AP7. With each crime and every kindness we birth our future. |
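
TBar's core count from earlier in the thread (one vLHC task on the CPU plus two AP tasks on the GPU needing 3 cores on a 2-core machine) can be written out as a small Python sketch. It assumes the worst case described in the thread, where two highly blanked APs each keep a full CPU core busy. The names `Workload`, `cores_needed` and `is_oversubscribed` are made up for illustration and are not part of BOINC or the AstroPulse application.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    cpu_tasks: int  # tasks that run purely on the CPU (e.g. one vLHC task)
    gpu_tasks: int  # GPU tasks running at the same time (e.g. two AP tasks)
    # Assumption: in the worst case (highly blanked APs) each GPU task
    # keeps one full CPU core busy.
    cores_per_gpu_task: float = 1.0


def cores_needed(w: Workload) -> float:
    """Total CPU cores the workload can demand in the worst case."""
    return w.cpu_tasks + w.gpu_tasks * w.cores_per_gpu_task


def is_oversubscribed(w: Workload, physical_cores: int) -> bool:
    """True if the workload can demand more cores than the machine has."""
    return cores_needed(w) > physical_cores


if __name__ == "__main__":
    original = Workload(cpu_tasks=1, gpu_tasks=2)   # qbit's original setup
    suggested = Workload(cpu_tasks=1, gpu_tasks=1)  # TBar's suggestion
    for name, w in (("original", original), ("suggested", suggested)):
        print(name, cores_needed(w), is_oversubscribed(w, physical_cores=2))
```

Under that assumption the original setup asks for 3 cores on a dual-core CPU, while running one AP at a time alongside vLHC fits exactly. Whether a GPU task really needs a full core depends on the blanking level and on options such as -use_sleep, so this is a rough budget rather than a measurement.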
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.