BSOD with Seti, tried reinstall of everything...


lupo
Joined: 29 Aug 10
Posts: 91
Credit: 4,736,407
RAC: 0
United States
Message 1058811 - Posted: 22 Dec 2010, 19:17:16 UTC
Last modified: 22 Dec 2010, 19:26:11 UTC

So I was getting regular BSODs when running SETI with the GPU tasks. I decided I would just format my machine (it needed it) and do a clean reinstall. I zipped up all my WU and set them aside and put them back in place once I put BOINC and Lunatics back on the machine. I noticed that SETI will not process these 3K or so WU I saved. Maybe it has something to do with them being associated with a different version of the software, I'm not sure.

Does anyone know how to get the WU working again under this new install of BOINC? I put them back in the same folder as before, but BOINC does not see them, nor is it downloading any new WU (guess the servers are down again).

Thanks,

Adam

Edit:

Figured It would be a good idea to post my messages:

12/22/2010 2:22:06 PM Starting BOINC client version 6.10.58 for windows_x86_64
12/22/2010 2:22:06 PM log flags: file_xfer, sched_ops, task
12/22/2010 2:22:06 PM Libraries: libcurl/7.19.7 OpenSSL/0.9.8l zlib/1.2.3
12/22/2010 2:22:06 PM Data directory: E:\ProgramData\BOINC
12/22/2010 2:22:06 PM Running under account lupo
12/22/2010 2:22:06 PM Processor: 12 GenuineIntel Intel(R) Core(TM) i7 CPU X 980 @ 3.33GHz [Family 6 Model 44 Stepping 2]
12/22/2010 2:22:06 PM Processor: 256.00 KB cache
12/22/2010 2:22:06 PM Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss htt tm pni ssse3 cx16 sse4_1 sse4_2 syscall nx lm vmx tm2 popcnt aes pbe
12/22/2010 2:22:06 PM OS: Microsoft Windows 7: Enterprise x64 Edition, (06.01.7600.00)
12/22/2010 2:22:06 PM Memory: 5.99 GB physical, 11.98 GB virtual
12/22/2010 2:22:06 PM Disk: 186.31 GB total, 170.04 GB free
12/22/2010 2:22:06 PM Local time is UTC -5 hours
12/22/2010 2:22:06 PM NVIDIA GPU 0: GeForce GTX 470 (driver version 26099, CUDA version 3020, compute capability 2.0, 1249MB, 726 GFLOPS peak)
12/22/2010 2:22:06 PM NVIDIA GPU 1: GeForce GTX 470 (driver version 26099, CUDA version 3020, compute capability 2.0, 1249MB, 726 GFLOPS peak)
12/22/2010 2:22:06 PM NVIDIA GPU 2: GeForce GTX 470 (driver version 26099, CUDA version 3020, compute capability 2.0, 1248MB, 726 GFLOPS peak)
12/22/2010 2:22:06 PM SETI@home Found app_info.xml; using anonymous platform
12/22/2010 2:22:06 PM AQUA@home URL http://aqua.dwavesys.com/; Computer ID 82466; resource share 100
12/22/2010 2:22:06 PM SETI@home URL http://setiathome.berkeley.edu/; Computer ID 5707955; resource share 100
12/22/2010 2:22:06 PM SETI@home General prefs: from SETI@home (last modified 21-Dec-2010 23:08:43)
12/22/2010 2:22:06 PM SETI@home Host location: none
12/22/2010 2:22:06 PM SETI@home General prefs: using your defaults
12/22/2010 2:22:06 PM Reading preferences override file
12/22/2010 2:22:06 PM Preferences:
12/22/2010 2:22:06 PM max memory usage when active: 4908.06MB
12/22/2010 2:22:06 PM max memory usage when idle: 5521.57MB
12/22/2010 2:22:06 PM max disk usage: 50.00GB
12/22/2010 2:22:06 PM (to change preferences, visit the web site of an attached project, or select Preferences in the Manager)
12/22/2010 2:22:06 PM Not using a proxy
12/22/2010 2:22:06 PM Suspending computation - user request
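
For anyone comparing cards from a startup log like the one above, here is a minimal Python sketch that pulls out the per-GPU fields. It is not part of BOINC: the regular expression and the function name `parse_gpu_lines` are assumptions based only on the line format shown in this post, and other client versions may print these lines differently.

```python
import re

# Pattern assumed from the "NVIDIA GPU n:" lines in the log above;
# other BOINC client versions may format these lines differently.
GPU_LINE = re.compile(
    r"NVIDIA GPU (?P<index>\d+): (?P<model>.+?) \("
    r"driver version (?P<driver>\d+), "
    r"CUDA version (?P<cuda>\d+), "
    r"compute capability (?P<cc>[\d.]+), "
    r"(?P<mem_mb>\d+)MB, "
    r"(?P<gflops>\d+) GFLOPS peak\)"
)


def parse_gpu_lines(log_text):
    """Return one dict per GPU line found in the given log text."""
    gpus = []
    for match in GPU_LINE.finditer(log_text):
        info = match.groupdict()
        # Convert the purely numeric fields; model and cc stay as strings.
        for key in ("index", "driver", "cuda", "mem_mb", "gflops"):
            info[key] = int(info[key])
        gpus.append(info)
    return gpus


if __name__ == "__main__":
    sample = (
        "12/22/2010 2:22:06 PM NVIDIA GPU 0: GeForce GTX 470 (driver version 26099, "
        "CUDA version 3020, compute capability 2.0, 1249MB, 726 GFLOPS peak)\n"
        "12/22/2010 2:22:06 PM NVIDIA GPU 2: GeForce GTX 470 (driver version 26099, "
        "CUDA version 3020, compute capability 2.0, 1248MB, 726 GFLOPS peak)\n"
    )
    for gpu in parse_gpu_lines(sample):
        print(gpu["index"], gpu["model"], gpu["mem_mb"], "MB")
```

Having the fields as numbers makes it easy to spot small differences between otherwise identical cards, such as the 1248MB reported for GPU 2 against 1249MB for the other two.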

Gundolf Jahn
Joined: 19 Sep 00
Posts: 3184
Credit: 358,195
RAC: 15
Germany
Message 1058816 - Posted: 22 Dec 2010, 19:43:30 UTC - in response to Message 1058811.

I zipped up all my WU and set them aside and put them back in place once I put BOINC and Lunatics back on the machine.

I hope you backed up your whole data directory, otherwise the stored WUs are lost.

Gruß,
Gundolf

Olli
Joined: 25 Apr 07
Posts: 143
Credit: 2,089,162
RAC: 0
Finland
Message 1058818 - Posted: 22 Dec 2010, 19:56:35 UTC - in response to Message 1058811.

If you get the BSOD, then there is something strange going on in your own computer. Missing some of the earlier work shouldn't be a bother; just wait to get some new WUs.
____________

SciManStev
Project donor
Volunteer tester
Joined: 20 Jun 99
Posts: 4840
Credit: 81,521,299
RAC: 38,010
United States
Message 1058826 - Posted: 22 Dec 2010, 20:09:36 UTC

My 980 kept blue screening like that with two 480's. It didn't matter what speed things were running at. It would still blue screen. I turned off hyperthreading, and the problem went away.

Steve
____________
Warning, addicted to SETI crunching!
Crunching as a member of GPU Users Group.
GPUUG Website

gizbar
Joined: 7 Jan 01
Posts: 586
Credit: 21,087,774
RAC: 0
United Kingdom
Message 1058832 - Posted: 22 Dec 2010, 20:14:39 UTC

What power supply are you running with that? According to your post, you have 3 GTX470's in that rig? It may be stressing a rail too much, trying to run all 3.

If it only BSOD's on running GPU units, I'd start looking there. Try running one card at a time and seeing if it BSOD's, then 2, then 3. If it runs 2 successfully, then try a different combination, but still 2. Then try all the 2 card combinations, then try 3 cards again.
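
The test order described above (each card alone, then every pair, then all three) can be written out as a short Python sketch. The function name `gpu_test_plan` is made up for illustration, and the card numbers are just the GPU indices 0 to 2 from the log earlier in the thread; how you actually take a card out of use for each run is left out.

```python
from itertools import combinations


def gpu_test_plan(gpu_count):
    """List GPU subsets to test: singles first, then pairs, up to all cards."""
    indices = range(gpu_count)
    plan = []
    for size in range(1, gpu_count + 1):
        # combinations() yields every subset of the given size exactly once.
        plan.extend(combinations(indices, size))
    return plan


if __name__ == "__main__":
    for step, gpus in enumerate(gpu_test_plan(3), start=1):
        print(f"Run {step}: enable GPU(s) {', '.join(map(str, gpus))}")
```

For three cards that is seven runs in total. If every single-card run is stable but one particular pair is not, that points at a slot, cable or rail shared by those two rather than at a bad card.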

Even if I get a WU fail for some reason, it only gives me a computation error. It does not give me a BSOD.

Yes, Seti is in a downtime at the moment. They are doing the regular backup and trying to re-stripe Oscar to a different size. It's taking a while, because all the data has to be copied to Carolyn, then Oscar has his stripes changed, then all the data has to be copied back again. NOT a trivial operation.

Giz.

____________


A proud GPU User Server Donor!

Claggy
Project donor
Volunteer tester
Joined: 5 Jul 99
Posts: 4085
Credit: 32,991,469
RAC: 5,858
United Kingdom
Message 1058837 - Posted: 22 Dec 2010, 20:42:13 UTC - in response to Message 1058816.
Last modified: 22 Dec 2010, 20:43:09 UTC

I zipped up all my WU and set them aside and put them back in place once I put BOINC and Lunatics back on the machine.

I hope you backed up your whole data directory, otherwise the stored WUs are lost.

Gruß,
Gundolf


Well, the OP has a new host now, so I guess not.

He could try dropping his old Wu's into his project folder and then merging his hosts. The project might resend his lost tasks, which would save downloading the Wu's again.

Claggy

Gundolf Jahn
Joined: 19 Sep 00
Posts: 3184
Credit: 358,195
RAC: 15
Germany
Message 1058846 - Posted: 22 Dec 2010, 21:02:37 UTC - in response to Message 1058837.

He could try dropping his old Wu's into his project folder...

You cannot drop old WUs into the project folder if you haven't saved the client_state.xml too. And if you've saved it, you don't need to merge, since the host info is saved too.
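
As a rough illustration of backing up the whole data directory rather than just the WU files, here is a minimal Python sketch. The function name `backup_boinc_data_dir` is hypothetical and this is not a BOINC tool. It assumes the client has been shut down first, so that client_state.xml is not being rewritten mid-copy, and that the archive is written somewhere outside the data directory.

```python
import shutil
from pathlib import Path


def backup_boinc_data_dir(data_dir, archive_base):
    """Zip the whole data directory and return the path of the archive.

    Raises FileNotFoundError if client_state.xml is missing, since the
    saved WUs are of no use to the client without it.
    """
    data_dir = Path(data_dir)
    if not (data_dir / "client_state.xml").is_file():
        raise FileNotFoundError(f"no client_state.xml in {data_dir}")
    # make_archive appends ".zip" to archive_base and returns the full name.
    archive = shutil.make_archive(str(archive_base), "zip", root_dir=str(data_dir))
    return Path(archive)
```

With the data directory from the first post, a call might look like `backup_boinc_data_dir(r"E:\ProgramData\BOINC", r"D:\boinc_backup")`, where the `D:\boinc_backup` destination is only an example.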

Gruß,
Gundolf

Claggy
Project donor
Volunteer tester
Joined: 5 Jul 99
Posts: 4085
Credit: 32,991,469
RAC: 5,858
United Kingdom
Message 1058850 - Posted: 22 Dec 2010, 21:08:16 UTC - in response to Message 1058846.

He could try dropping his old Wu's into his project folder...

You cannot drop old WUs into the project folder if you haven't saved the client_state.xml too. And if you've saved it, you don't need to merge, since the host info is saved too.

Gruß,
Gundolf


But if he merges the two hosts into one, the Wu's from the old host will become part of the new host. Will the server resend the Wu's from the old host, or will they be listed as timed out?

Claggy

SAHBster
Joined: 6 Jan 03
Posts: 12
Credit: 3,509,078
RAC: 3,101
United Kingdom
Message 1058855 - Posted: 22 Dec 2010, 21:27:33 UTC - in response to Message 1058850.

I too have a variation on this, Black Screen Of Death. Only happens when running GPU w/u's. Comp. switched to never sleep or hibernate. Have been running on this new pc for nearly a year and it has only started happening this last 7 days or so. Card is Nvidia GT220. Using optimised Lunatics client. Have rolled back driver for card but makes no difference.

When coming back to the PC, even after just 5 - 10 minutes, the screen just comes up black, then shows no signal detected and then goes black again. I have to hold the 'off button' to shut down. It restarts with options for safe mode et al., but always starts up fine 'normally'. Device Manager tells me the graphics card is working fine.

Any pointers welcome.

Thanks Chaps !!

22/12/2010 21:04:58 Starting BOINC client version 6.10.58 for windows_x86_64
22/12/2010 21:04:58 log flags: file_xfer, sched_ops, task
22/12/2010 21:04:58 Libraries: libcurl/7.19.7 OpenSSL/0.9.8l zlib/1.2.3
22/12/2010 21:04:58 Data directory: C:\ProgramData\BOINC
22/12/2010 21:04:58 Running under account
22/12/2010 21:04:58 Processor: 4 GenuineIntel Intel(R) Core(TM)2 Quad CPU Q8300 @ 2.50GHz [Family 6 Model 23 Stepping 10]
22/12/2010 21:04:58 Processor: 2.00 MB cache
22/12/2010 21:04:58 Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss htt tm pni ssse3 cx16 sse4_1 syscall nx lm vmx tm2 pbe
22/12/2010 21:04:58 OS: Microsoft Windows 7: Home Premium x64 Edition, (06.01.7600.00)
22/12/2010 21:04:58 Memory: 6.00 GB physical, 12.00 GB virtual
22/12/2010 21:04:58 Disk: 931.51 GB total, 875.74 GB free
22/12/2010 21:04:58 Local time is UTC +0 hours
22/12/2010 21:04:59 NVIDIA GPU 0: GeForce GT 220 (driver version 19107, CUDA version 2030, compute capability 1.2, 1024MB, 115 GFLOPS peak)
22/12/2010 21:04:59 SETI@home Found app_info.xml; using anonymous platform


____________

Olli
Joined: 25 Apr 07
Posts: 143
Credit: 2,089,162
RAC: 0
Finland
Message 1058864 - Posted: 22 Dec 2010, 21:39:18 UTC - in response to Message 1058855.

Try disabling all ACPI and power-saving features in the BIOS first. Does it still happen?
____________

SAHBster
Joined: 6 Jan 03
Posts: 12
Credit: 3,509,078
RAC: 3,101
United Kingdom
Message 1058871 - Posted: 22 Dec 2010, 21:46:16 UTC - in response to Message 1058864.
Last modified: 22 Dec 2010, 22:05:30 UTC

Sorry Olli for being an idiot. Please can you explain to me where I can find the ACPI and BIOS?

Thank you for your quick response and patience.

Ok Olli, will google the ACPI and BIOS stuff.

Additionally, I just set the GPU to crunch whilst the PC is in use, and the screen shut down within 30 seconds. Had to reboot to get back on here!!

The machine will crunch CPU work fine and can be left for hours. I have just reset the GPU to run only when the machine is idle. From that it doesn't look as if it is a power saving problem, does it?

Olli
Joined: 25 Apr 07
Posts: 143
Credit: 2,089,162
RAC: 0
Finland
Message 1058880 - Posted: 22 Dec 2010, 22:08:10 UTC - in response to Message 1058871.

So, the clue is somewhere within the seven days you have not been able to run it ok. What happened seven days ago?
____________

Olli
Joined: 25 Apr 07
Posts: 143
Credit: 2,089,162
RAC: 0
Finland
Message 1058881 - Posted: 22 Dec 2010, 22:10:52 UTC - in response to Message 1058871.

It can still be about power saving. Try turning off any power saving features in your computer's screen saver settings.
____________

SAHBster
Joined: 6 Jan 03
Posts: 12
Credit: 3,509,078
RAC: 3,101
United Kingdom
Message 1058884 - Posted: 22 Dec 2010, 22:20:40 UTC - in response to Message 1058881.


Thanks Olli, will try the power saving route first. Then try to go through error logs to see if I can get any clues.

Thanks for your assistance, much appreciated.

Olli
Joined: 25 Apr 07
Posts: 143
Credit: 2,089,162
RAC: 0
Finland
Message 1058890 - Posted: 22 Dec 2010, 22:40:03 UTC - in response to Message 1058871.
Last modified: 22 Dec 2010, 23:20:08 UTC

Still, have you got GPU-Z in use? It can monitor the graphics adapter's temperatures. High temps might be the issue.

When was the last time you opened your computer case and vacuumed all the fans and heatsinks properly? Excess dust? I do it twice a month.
____________

Olli
Joined: 25 Apr 07
Posts: 143
Credit: 2,089,162
RAC: 0
Finland
Message 1058899 - Posted: 22 Dec 2010, 22:57:49 UTC - in response to Message 1058884.

Here is CPU-Z, if you don't already have it.
____________

ignorance is no excuse
Joined: 4 Oct 00
Posts: 9529
Credit: 44,433,274
RAC: 0
Korea, North
Message 1059017 - Posted: 23 Dec 2010, 5:47:47 UTC - in response to Message 1058899.

You may have already mentioned this, but you'll need to turn off the screensaver. Just set it to go blank after 20 minutes. You definitely can't run a BOINC/SETI screensaver if you are crunching the WU on your GPU.
____________
In a rich man's house there is no place to spit but his face.
Diogenes Of Sinope

End terrorism by building a school

lupo
Joined: 29 Aug 10
Posts: 91
Credit: 4,736,407
RAC: 0
United States
Message 1059063 - Posted: 23 Dec 2010, 10:09:12 UTC

I did a fresh install of Windows 7, wiped BOINC and reinstalled BOINC and Lunatics. I'm hoping that the issue has now gone away. I guess I'll know for sure once they start distributing WU. I got 3 tonight, which I ripped through in like 30 minutes, so I'll need something more substantial. IIRC, they are back online tomorrow.

For tonight I'll just run the GPUs doing dnet. That will stress them hard. I'm running a 1200W PSU and I can tell from the APC I have it plugged into that it's not drawing more than 700W at any given time, so it's not that.
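
The headroom figure behind that is simple arithmetic, sketched here in Python with a made-up helper name, `psu_load_fraction`. One assumption worth stating: the APC reads draw at the wall, which includes the PSU's own conversion losses, so the real load on the PSU's outputs is somewhat lower than this fraction suggests.

```python
def psu_load_fraction(wall_draw_watts, psu_rating_watts):
    """Fraction of the PSU's rated output implied by the measured wall draw."""
    if psu_rating_watts <= 0:
        raise ValueError("PSU rating must be positive")
    return wall_draw_watts / psu_rating_watts


if __name__ == "__main__":
    # 700 W peak at the wall against a 1200 W rating.
    print(f"{psu_load_fraction(700, 1200):.0%}")  # prints 58%
```

This only covers total draw. It says nothing about how the load is split across individual rails, which was the concern raised earlier in the thread.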

Thanks for the help though, guys. Appreciate it.

Claggy
Project donor
Volunteer tester
Joined: 5 Jul 99
Posts: 4085
Credit: 32,991,469
RAC: 5,858
United Kingdom
Message 1059074 - Posted: 23 Dec 2010, 10:37:35 UTC - in response to Message 1059063.

If you try merging your two hosts, there's a possibility that the server will resend the Wu's from your old host.

Claggy

SAHBster
Joined: 6 Jan 03
Posts: 12
Credit: 3,509,078
RAC: 3,101
United Kingdom
Message 1061128 - Posted: 29 Dec 2010, 23:48:56 UTC - in response to Message 1058899.


Olli,

Thanks for the pointer. Downloaded GPU-Z; the card is running at 64 - 65 C, fan speed 50pct.

A curious thing though, my nephew is in local government IT support here in the U.K. He was visiting over the holidays, reseated GPU and memory modules. All appears to be running fine now.

Thanks for your patience Olli.


Copyright © 2014 University of California