System issue ?

The_bestest
Joined: 7 Oct 06
Posts: 36
Credit: 82,706,887
RAC: 79
United States
Message 1231606 - Posted: 13 May 2012, 16:26:02 UTC

Good Morning,

Relative SETI noob here, writing to pick your collective brains.

My main crunching machine seems to have taken a RAC dump in the past couple of weeks (yes, I know it's not about the RAC, but it IS an indicator of possible issues).
Total RAC was 43,400 on April 25; it is now just over 38,000 and falling fast, especially in the past 10 days.

Overview of my main cruncher (my other two machines are minor):
Computer ID 5632153, User ID 8495676
Windows 7 SP1 64-bit
6 GB RAM
2x EVGA GeForce GTX 460 v2
NVIDIA driver 296.10
Intel Core i7 920 CPU, not overclocked
Lunatics x41g
BOINC 6.10.60
Pending credit: 0, according to SETI@home

BOINC startup log (shows both GPUs):
5/13/2012 10:28:52 AM Starting BOINC client version 6.10.60 for windows_x86_64
5/13/2012 10:28:52 AM Config: use all coprocessors
5/13/2012 10:28:52 AM log flags: file_xfer, sched_ops, task
5/13/2012 10:28:52 AM Libraries: libcurl/7.19.7 OpenSSL/0.9.8l zlib/1.2.3
5/13/2012 10:28:52 AM Data directory: C:\ProgramData\BOINC
5/13/2012 10:28:52 AM Running under account Scott
5/13/2012 10:28:52 AM Processor: 8 GenuineIntel Intel(R) Core(TM) i7 CPU 920 @ 2.67GHz [Family 6 Model 26 Stepping 5]
5/13/2012 10:28:52 AM Processor: 256.00 KB cache
5/13/2012 10:28:52 AM Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss htt tm pni ssse3 cx16 sse4_1 sse4_2 syscall nx lm vmx tm2 popcnt pbe
5/13/2012 10:28:52 AM OS: Microsoft Windows 7: x64 Edition, Service Pack 1, (06.01.7601.00)
5/13/2012 10:28:52 AM Memory: 5.99 GB physical, 14.97 GB virtual
5/13/2012 10:28:52 AM Disk: 698.54 GB total, 617.00 GB free
5/13/2012 10:28:52 AM Local time is UTC -5 hours
5/13/2012 10:28:52 AM NVIDIA GPU 0: GeForce GTX 460 v2 (driver version 29610, CUDA version 4020, compute capability 2.1, 1024MB, 738 GFLOPS peak)
5/13/2012 10:28:52 AM NVIDIA GPU 1: GeForce GTX 460 v2 (driver version 29610, CUDA version 4020, compute capability 2.1, 1024MB, 738 GFLOPS peak)
5/13/2012 10:28:52 AM SETI@home Found app_info.xml; using anonymous platform
5/13/2012 10:28:54 AM SETI@home URL http://setiathome.berkeley.edu/; Computer ID 5632153; resource share 100
5/13/2012 10:28:54 AM SETI@home General prefs: from SETI@home (last modified 18-Mar-2012 20:54:57)
5/13/2012 10:28:54 AM SETI@home Computer location: home
5/13/2012 10:28:54 AM SETI@home General prefs: no separate prefs for home; using your defaults
5/13/2012 10:28:54 AM Reading preferences override file
5/13/2012 10:28:54 AM Preferences:
5/13/2012 10:28:54 AM max memory usage when active: 5214.00MB
5/13/2012 10:28:54 AM max memory usage when idle: 5827.41MB
5/13/2012 10:28:54 AM max disk usage: 10.00GB
5/13/2012 10:28:54 AM (to change preferences, visit the web site of an attached project, or select Preferences in the Manager)
5/13/2012 10:28:54 AM Not using a proxy
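
For anyone comparing driver versions across hosts, the two "NVIDIA GPU" lines in the startup log above can be picked apart with a short script. This is only a sketch: the function name parse_gpu_lines and the regular expression are hypothetical, written against the two lines shown here, and turning 29610 into 296.10 assumes the log always drops the dot in front of a two-digit minor version. That matches this host but has not been checked against other drivers.

```python
import re

# Pattern written against the log lines in this post; other BOINC versions
# may word the line differently (assumption).
GPU_LINE = re.compile(
    r"NVIDIA GPU (\d+): (.+?) \(driver version (\d+), CUDA version (\d+)"
)


def parse_gpu_lines(log_text):
    """Return one dict per 'NVIDIA GPU n:' line found in a BOINC startup log."""
    gpus = []
    for match in GPU_LINE.finditer(log_text):
        raw_driver = match.group(3)  # e.g. "29610"
        gpus.append({
            "index": int(match.group(1)),
            "model": match.group(2),
            # Assumption: the last two digits are the minor version.
            "driver": f"{raw_driver[:-2]}.{raw_driver[-2:]}",
            "cuda_raw": match.group(4),
        })
    return gpus


if __name__ == "__main__":
    sample = (
        "5/13/2012 10:28:52 AM NVIDIA GPU 0: GeForce GTX 460 v2 "
        "(driver version 29610, CUDA version 4020, compute capability 2.1, "
        "1024MB, 738 GFLOPS peak)"
    )
    for gpu in parse_gpu_lines(sample):
        print(gpu)
```

If the list comes back shorter than the number of cards installed, BOINC itself did not see a GPU at startup, which is a different problem from a task losing the device later on.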

Status of work units:
In Progress 3414
Pending 1360
Valid 345
Invalid 0
Error 7

Does this look like an issue on my end, or is it just a cyclical downtick? I can't remember anything this extreme happening before without an outage of some type.

The timing is close to the date of the Windows updates (May 10), so I was considering a system restore to an earlier date.

Thoughts appreciated, and I'll be glad to supply any additional info as needed.

Thanks much
Scott

ID: 1231606
red-ray
Joined: 24 Jun 99
Posts: 308
Credit: 9,029,848
RAC: 0
United Kingdom
Message 1231607 - Posted: 13 May 2012, 16:30:58 UTC
Last modified: 13 May 2012, 16:46:09 UTC

This is probably a result of the limits being removed and wingmen taking longer to return their results. Is your pending count going up?

http://setiathome.berkeley.edu/results.php?hostid=5632153

My systems typically earn about 100K per day, but today looks like it will be 85K, and my pending count has gone up by about 400.
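
To see why slower validation alone can pull RAC down, here is a rough sketch of how a recent-average figure reacts when granted credit dips. It assumes RAC behaves like an exponentially weighted average with a one-week half-life, updated once a day. BOINC's real calculation updates whenever credit is granted, so treat the numbers as illustrative only. The function names and the 85% figure fed in at the bottom are made up for the example.

```python
HALF_LIFE_DAYS = 7.0  # assumption: one-week half-life for the average


def step_rac(rac, credit_granted_today, half_life_days=HALF_LIFE_DAYS):
    """Advance a simplified recent-average-credit value by one day."""
    weight = 0.5 ** (1.0 / half_life_days)  # share of the old average kept
    return rac * weight + (1.0 - weight) * credit_granted_today


def simulate_rac(start_rac, daily_credits):
    """Return the RAC value after each day in daily_credits."""
    history = []
    rac = start_rac
    for credit in daily_credits:
        rac = step_rac(rac, credit)
        history.append(rac)
    return history


if __name__ == "__main__":
    # Hypothetical scenario: a host steady at 43,400 has only 85% of its
    # usual credit granted each day for ten days while wingmen catch up.
    for day, rac in enumerate(simulate_rac(43400, [0.85 * 43400] * 10), start=1):
        print(f"day {day:2d}: RAC ~ {rac:,.0f}")
```

In this model the average drifts toward the lower daily figure and recovers once the pending results validate, so a rising pending count together with a falling RAC does not by itself point to a fault on the host.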
ID: 1231607
The_bestest
Joined: 7 Oct 06
Posts: 36
Credit: 82,706,887
RAC: 79
United States
Message 1231614 - Posted: 13 May 2012, 16:40:10 UTC - in response to Message 1231607.  

Not sure yet, I'll keep an eye on that.

Thanks for your input

Scott
ID: 1231614
dnolan
Joined: 30 Aug 01
Posts: 1228
Credit: 47,779,411
RAC: 32
United States
Message 1231617 - Posted: 13 May 2012, 16:46:37 UTC

Do you allow your monitor(s) to go to sleep? The version of the driver you're using apparently has an issue with that. Could be what's causing your issue here?

-Dave
ID: 1231617
The_bestest
Joined: 7 Oct 06
Posts: 36
Credit: 82,706,887
RAC: 79
United States
Message 1231623 - Posted: 13 May 2012, 17:11:25 UTC - in response to Message 1231617.  

I just updated to this version of the NVIDIA driver this morning, to resolve a totally different issue. To be honest, I don't remember the previous version number (it could have been 285.62). I have turned off 'Sleep' for the monitor to be safe, though, and will start turning it off manually.

Thanks for the suggestion.

Scott
ID: 1231623
red-ray
Joined: 24 Jun 99
Posts: 308
Credit: 9,029,848
RAC: 0
United Kingdom
Message 1231624 - Posted: 13 May 2012, 17:12:28 UTC - in response to Message 1231617.  
Last modified: 13 May 2012, 17:13:58 UTC

dnolan wrote: "Do you allow your monitor(s) to go to sleep? The version of the driver you're using apparently has an issue with that. Could be what's causing your issue here?"

I feel this is unlikely, as with the driver's DVI sleep issue you get lots of errors, like the ones http://setiathome.berkeley.edu/show_host_detail.php?hostid=5851682 is getting.

It's fixed in the 301.24 driver, but that is a Beta release.
ID: 1231624
The_bestest
Joined: 7 Oct 06
Posts: 36
Credit: 82,706,887
RAC: 79
United States
Message 1231635 - Posted: 13 May 2012, 17:35:35 UTC

One more minor piece of information:

I have a few results that are returning the following:

<core_client_version>6.10.60</core_client_version>
<![CDATA[
<stderr_txt>
Cuda error 'Couldn't get cuda device count
' in file 'c:/[Projects]/X_CudaMB/client/cuda/cudaAcceleration.cu' in line 146 : no CUDA-capable device is detected.
setiathome_CUDA: cudaGetDeviceCount() call failed.
setiathome_CUDA: No CUDA devices found
setiathome_CUDA: Found 0 CUDA device(s):
In cudaAcc_initializeDevice(): Boinc passed DevPref 2
setiathome_CUDA: CUDA Device 2 specified, checking...
Device cannot be used
Cuda device initialisation retry 1 of 6, waiting 5 secs...
Cuda error 'Couldn't get cuda device count
' in file 'c:/[Projects]/X_CudaMB/client/cuda/cudaAcceleration.cu' in line 146 : no CUDA-capable device is detected.
setiathome_CUDA: cudaGetDeviceCount() call failed.
setiathome_CUDA: No CUDA devices found
setiathome_CUDA: Found 0 CUDA device(s):

So it looks like a very few work units, mostly today it appears, have for some reason not seen or used the GPU. I made two changes to my machine today: first, I upgraded the NVIDIA driver, and second, I installed BoincTasks on my primary cruncher. BoincTasks has since been removed. If I see many more tasks fail to recognize CUDA, I will roll back the video driver.
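
For checking more than a handful of results by hand, the stderr text above can be summarised with a short script. This is a sketch only: summarize_cuda_stderr is a hypothetical helper, and the strings it looks for are copied from the output quoted in this post, so other application builds may word their messages differently.

```python
import re


def summarize_cuda_stderr(stderr_text):
    """Summarise CUDA device-detection problems in one task's stderr text."""
    # Device counts the application reported, e.g. "Found 0 CUDA device(s)".
    found = [int(n) for n in re.findall(r"Found (\d+) CUDA device\(s\)", stderr_text)]
    # How many times the device-count call itself failed.
    failed_calls = stderr_text.count("cudaGetDeviceCount() call failed")
    # Retry numbers from lines such as "retry 1 of 6".
    retries = [int(n) for n in re.findall(r"retry (\d+) of \d+", stderr_text)]
    return {
        "device_count_failures": failed_calls,
        "min_devices_found": min(found) if found else None,
        "last_retry": max(retries) if retries else 0,
        "gpu_unavailable": failed_calls > 0 or (bool(found) and min(found) == 0),
    }


if __name__ == "__main__":
    sample = (
        "setiathome_CUDA: cudaGetDeviceCount() call failed.\n"
        "setiathome_CUDA: No CUDA devices found\n"
        "setiathome_CUDA: Found 0 CUDA device(s):\n"
        "Cuda device initialisation retry 1 of 6, waiting 5 secs...\n"
    )
    print(summarize_cuda_stderr(sample))
```

Running this over the saved stderr of each errored task and counting how many report gpu_unavailable would show whether the failures cluster after the driver change.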

Again, thank you everyone for your suggestions

Scott
ID: 1231635
Claggy
Volunteer tester
Joined: 5 Jul 99
Posts: 4654
Credit: 47,537,079
RAC: 4
United Kingdom
Message 1231636 - Posted: 13 May 2012, 17:40:42 UTC - in response to Message 1231635.  
Last modified: 13 May 2012, 17:41:48 UTC

You're running the 296.10 driver, which has the Sleeping Monitor Bug: once the monitor goes to sleep, the CUDA device becomes unavailable.
Either stop the monitor from going to sleep and physically turn it off instead, or downgrade to the 290.53 or earlier drivers, or upgrade to the 301.xx or later drivers.
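
As a rough way to check a driver version against that advice, the sketch below flags anything newer than 290.53 and older than the 301 series. The function names are hypothetical, and treating every release in that gap as affected is an assumption drawn from the version bounds given here, not a confirmed list of affected drivers.

```python
def parse_driver_version(text):
    """Turn a string such as '296.10' into a comparable (major, minor) tuple."""
    major, minor = text.strip().split(".")
    return int(major), int(minor)


# Bounds taken from the advice above: 290.53 or earlier is reported fine,
# and so is anything in the 301 series or later.
LAST_UNAFFECTED_OLD = (290, 53)
FIRST_UNAFFECTED_MAJOR = 301


def may_have_sleeping_monitor_bug(driver_version):
    """Return True if the version falls between the two reported safe bounds."""
    version = parse_driver_version(driver_version)
    return version > LAST_UNAFFECTED_OLD and version[0] < FIRST_UNAFFECTED_MAJOR


if __name__ == "__main__":
    for candidate in ("285.62", "290.53", "296.10", "301.24"):
        if may_have_sleeping_monitor_bug(candidate):
            print(candidate, "possibly affected")
        else:
            print(candidate, "outside the reported range")
```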

Claggy
ID: 1231636
