Problem with Cuda WU

Nighthunter

Joined: 21 Jan 00
Posts: 2
Credit: 16,442,084
RAC: 0
Poland
Message 1649640 - Posted: 5 Mar 2015, 18:36:33 UTC

All CUDA WUs are invalid or inconclusive, link:
https://setiathome.berkeley.edu/results.php?hostid=7507044

Can anyone tell me where the problem is?

A typical stderr looks like this one:

Stderr output

<core_client_version>7.4.36</core_client_version>
<![CDATA[
<stderr_txt>
setiathome_CUDA: Found 1 CUDA device(s):
Device 1: GeForce GTX 560 Ti, 1024 MiB, regsPerBlock 32768
computeCap 2.1, multiProcs 8
pciBusID = 1, pciSlotID = 0
clockRate = 1800 MHz
In cudaAcc_initializeDevice(): Boinc passed DevPref 1
setiathome_CUDA: CUDA Device 1 specified, checking...
Device 1: GeForce GTX 560 Ti is okay
SETI@home using CUDA accelerated device GeForce GTX 560 Ti
pulsefind: blocks per SM 4 (Fermi or newer default)
pulsefind: periods per launch 100 (default)
Priority of process set to BELOW_NORMAL (default) successfully
Priority of worker thread set successfully

setiathome enhanced x41zc, Cuda 4.20

Detected setiathome_enhanced_v7 task. Autocorrelations enabled, size 128k elements.
Work Unit Info:
...............
WU true angle range is : 2.713047
re-using dev_GaussFitResults array for dev_AutoCorrIn, 4194304 bytes
re-using dev_GaussFitResults+524288x8 array for dev_AutoCorrOut, 4194304 bytes
Thread call stack limit is: 1k
cudaAcc_free() called...
cudaAcc_free() running...
cudaAcc_free() PulseFind freed...
cudaAcc_free() Gaussfit freed...
cudaAcc_free() AutoCorrelation freed...
cudaAcc_free() DONE.
Cuda sync'd & freed.
Preemptively acknowledging a safe Exit. ->
SETI@Home Informational message -9 result_overflow
NOTE: The number of results detected equals the storage space allocated.

Flopcounter: 157553587196.818360

Spike count: 0
Autocorr count: 30
Pulse count: 0
Triplet count: 0
Gaussian count: 0
Worker preemptively acknowledging an overflow exit.->
called boinc_finish
Exit Status: 0
boinc_exit(): requesting safe worker shutdown ->
boinc_exit(): received safe worker shutdown acknowledge ->
Cuda threadsafe ExitProcess() initiated, rval 0

</stderr_txt>
]]>
rob smith
Volunteer moderator
Volunteer tester

Joined: 7 Mar 03
Posts: 22189
Credit: 416,307,556
RAC: 380
United Kingdom
Message 1649689 - Posted: 5 Mar 2015, 20:12:11 UTC

The first thing that strikes me is that all of them have very short run times - a quick look at my rig with a GTX 760 shows substantially longer run times.
Next, the number of "autocorrs" is 30, which matches the result_overflow message in your stderr ("the number of results detected equals the storage space allocated"), so the calculation will have stopped at that point. This points to a problem with the GPU or the driver...
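
If you want to check a batch of saved stderr outputs for this pattern, here is a rough Python sketch. It only looks at the "result_overflow" message and the "<Name> count:" lines shown in the stderr above. The function names and the "only one signal type is non-zero" rule are my own assumptions for illustration, not anything the project defines, and a single stderr cannot prove that an overflow is false.

```python
import re

# Matches the summary lines printed near the end of the stderr,
# e.g. "Autocorr count: 30".
COUNT_RE = re.compile(
    r"^(Spike|Autocorr|Pulse|Triplet|Gaussian) count:\s*(\d+)\s*$",
    re.MULTILINE,
)


def summarize_stderr(stderr_text):
    """Return (overflowed, counts) for one task's stderr text."""
    counts = {name: int(n) for name, n in COUNT_RE.findall(stderr_text)}
    overflowed = "result_overflow" in stderr_text
    return overflowed, counts


def looks_suspicious(stderr_text):
    """Hypothetical heuristic: the task overflowed and a single signal
    type accounts for every reported result."""
    overflowed, counts = summarize_stderr(stderr_text)
    nonzero = [name for name, n in counts.items() if n > 0]
    return overflowed and len(nonzero) == 1


# Excerpt of the stderr posted above.
sample = """\
SETI@Home Informational message -9 result_overflow
Spike count: 0
Autocorr count: 30
Pulse count: 0
Triplet count: 0
Gaussian count: 0
"""

overflowed, counts = summarize_stderr(sample)
print(overflowed, counts["Autocorr"], looks_suspicious(sample))
```

If nearly every task from one host trips this check while the wingmen report normal results, that is a hint the overflows come from the card and not from the data.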

A few questions:
- did you notice when this string of invalid results started?
- have you recently changed anything?
- I see you are running the latest WHQL driver. Did this driver come from Nvidia/GeForce or from Microsoft?

Others may be along with further comments, suggestions or questions...
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
BilBg
Volunteer tester
Joined: 27 May 07
Posts: 3720
Credit: 9,385,827
RAC: 0
Bulgaria
Message 1649710 - Posted: 5 Mar 2015, 21:08:11 UTC - in response to Message 1649640.  
Last modified: 5 Mar 2015, 21:16:38 UTC

 
The GeForce GTX 560 Ti is notorious for producing false overflows (e.g. "Autocorr count: 30")
caused by an overly optimistic factory overclock combined with too low a GPU core voltage.

You need to lower the clock (MHz) of the "shaders" / "CUDA cores" and/or raise the GPU voltage.
For this you may use MSI Afterburner, EVGA Precision or a similar tool:
http://setiathome.berkeley.edu/forum_thread.php?id=65894&postid=1165854#1165854


jason_gee (the programmer of the CUDA apps):
"The current school of thought is that some of these factory overclocked 560ti's need some extra GPU core voltage for stability running Cuda, the theory being that the manufacturers may have spent the effort getting them 'game stable' but overlooked Cuda/PhysX stability"
http://setiathome.berkeley.edu/forum_thread.php?id=66124&postid=1170945#1170945

Links to voltage/MHz table posted by user "f_n_t":
http://setiathome.berkeley.edu/forum_thread.php?id=63429&postid=1118632#1118632


Yours: clockRate = 1800 MHz

NVIDIA says:
Processor Clock (MHz) 1645 or 1464

http://setiathome.berkeley.edu/forum_thread.php?id=69380&postid=1294814#1294814
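
To put a number on that gap, here is a small Python sketch that pulls the clockRate line out of the stderr and compares it with the two NVIDIA reference clocks quoted above. The function names are made up for illustration, and which of the two reference clocks applies to your card is an assumption you have to check yourself.

```python
import re


def reported_clock_mhz(stderr_text):
    """Extract the value from a line like 'clockRate = 1800 MHz'."""
    match = re.search(r"clockRate\s*=\s*(\d+)\s*MHz", stderr_text)
    return int(match.group(1)) if match else None


def overclock_percent(reported_mhz, reference_mhz):
    """Percentage by which the reported clock exceeds the reference."""
    return (reported_mhz - reference_mhz) / reference_mhz * 100.0


stderr_excerpt = "pciBusID = 1, pciSlotID = 0\nclockRate = 1800 MHz\n"
reported = reported_clock_mhz(stderr_excerpt)

# Reference processor clocks quoted from NVIDIA in this post.
for reference in (1645, 1464):
    margin = overclock_percent(reported, reference)
    print(f"{reported} MHz vs {reference} MHz reference: {margin:+.1f}%")
```

For 1800 MHz this works out to roughly +9.4% over 1645 MHz and +23.0% over 1464 MHz, so the card is running well above either reference value.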

 
 


- ALF - "Find out what you don't do well ..... then don't do it!" :)
 
Nighthunter

Joined: 21 Jan 00
Posts: 2
Credit: 16,442,084
RAC: 0
Poland
Message 1649818 - Posted: 6 Mar 2015, 4:21:50 UTC - in response to Message 1649710.  

Thanks a lot for the answer.

Best regards.
