Same error on multiple machine with same workunit?

Questions and Answers : GPU applications : Same error on multiple machine with same workunit?
bupjae

Joined: 10 Mar 02
Posts: 1
Credit: 1,919,961
RAC: 0
Korea, South
Message 1179063 - Posted: 18 Dec 2011, 13:42:57 UTC
Last modified: 18 Dec 2011, 13:43:08 UTC

Some of my workunits reported errors, so I want to examine what happened.

According to the result of this unit, I think there might be something wrong with the CUDA application, but I'm not sure.

Has anyone else had a result like this? Could it be caused by a program bug?
Profile perryjay
Volunteer tester
Joined: 20 Aug 02
Posts: 3377
Credit: 20,676,751
RAC: 0
United States
Message 1179088 - Posted: 18 Dec 2011, 17:24:00 UTC - in response to Message 1179063.  

In this case, it is the one running stock and getting what looks like a good result that is actually wrong. He is running Debian with the old 6.4.5 BOINC client. The rest of the wingmen found too many triplets, and the WU will probably run through a couple more people before it reports too many errors and is wiped.


PROUD MEMBER OF Team Starfire World BOINC
Profile BilBg
Volunteer tester
Joined: 27 May 07
Posts: 3720
Credit: 9,385,827
RAC: 0
Bulgaria
Message 1179434 - Posted: 20 Dec 2011, 0:37:01 UTC - in response to Message 1179063.  
Last modified: 20 Dec 2011, 0:53:19 UTC


The Linux CPU result (stock SETI@home Enhanced v6.03) validated OK against the Windows 7 GPU result from an NVIDIA GeForce GTX 260 running the optimized app: SETI@home Enhanced, Anonymous platform (NVIDIA GPU), Lunatics Multibeam x38g Preview, Cuda 3.20.

(It's known that the stock CUDA apps, which are older, are more prone to the "-12 error" than the Lunatics optimized CUDA apps, which are actively developed, newer, and faster:
SETI@home error -12 Unknown error
cudaAcc_find_triplets erroneously found a triplet twice in find_triplets_kernel
)

Stderr output
<core_client_version>6.4.5</core_client_version>
<![CDATA[
<stderr_txt>
setiathome_enhanced 6.03 Revision: 306 g++ (Debian 4.3.1-6) 4.3.1
libboinc: BOINC 6.3.8

Work Unit Info:
...............
WU true angle range is :  2.724785
Optimal function choices:
-----------------------------------------------------
name                
-----------------------------------------------------
              v_BaseLineSmooth (no other)
   v_vGetPowerSpectrumUnrolled 0.00045 0.00000 
                   v_ChirpData 0.01915 0.00000 
            v_vTranspose4x8ntw 0.01132 0.00000 
                BH SSE folding 0.00141 0.00000 

Flopcounter: 10432754946380.425781

Spike count:    4
Pulse count:    0
Triplet count:  7
Gaussian count: 0
called boinc_finish

</stderr_txt>
]]>



Stderr output
<core_client_version>6.12.34</core_client_version>
<![CDATA[
<stderr_txt>
setiathome_CUDA: Found 2 CUDA device(s):
  Device 1: GeForce GTX 260, 842 MiB, regsPerBlock 16384
     computeCap 1.3, multiProcs 27 
     clockRate = 1350000 
  Device 2: GeForce GTX 260, 842 MiB, regsPerBlock 16384
     computeCap 1.3, multiProcs 27 
     clockRate = 1350000 
In cudaAcc_initializeDevice(): Boinc passed DevPref 2
setiathome_CUDA: CUDA Device 2 specified, checking...
   Device 2: GeForce GTX 260 is okay
SETI@home using CUDA accelerated device GeForce GTX 260
Priority of process raised successfully
Priority of worker thread raised successfully
Cuda Active: Plenty of total Global VRAM (>300MiB).
 All early cuFft plans postponed, to parallel with first chirp.

 )       _   _  _)_ o  _  _ 
(__ (_( ) ) (_( (_  ( (_ (  
 not bad for a human...  _) 

Multibeam x38g Preview, Cuda 3.20

Legacy setiathome_enhanced V6 mode.
Work Unit Info:
...............
WU true angle range is :  2.724785

Flopcounter: 10432115851830.561000

Spike count:    4
Pulse count:    0
Triplet count:  7
Gaussian count: 0
Worker preemptively acknowledging a normal exit.->
called boinc_finish
boinc_exit(): requesting safe worker shutdown ->
boinc_exit(): received safe worker shutdown acknowledge ->

</stderr_txt>
]]>
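
The two stderr outputs above end with the same summary counts (4 spikes, 0 pulses, 7 triplets, 0 gaussians), which is the quickest thing to check when comparing two results by hand. A minimal sketch of that check in Python follows. It is only an illustration: the function names `signal_counts` and `counts_agree` are hypothetical, the sample strings are excerpts of the logs quoted above, and comparing summary counts is an assumption made for this example, not how the project's validator decides whether results match.

```python
import re

# Matches summary lines such as "Triplet count:  7" in a stderr_txt section.
COUNT_RE = re.compile(
    r"^(Spike|Pulse|Triplet|Gaussian) count:\s*(\d+)\s*$", re.MULTILINE
)


def signal_counts(stderr_text):
    """Return a dict like {'Spike': 4, ...} built from the summary lines."""
    return {name: int(value) for name, value in COUNT_RE.findall(stderr_text)}


def counts_agree(stderr_a, stderr_b):
    """True only if both outputs contain summary counts and they are equal."""
    a = signal_counts(stderr_a)
    b = signal_counts(stderr_b)
    return bool(a) and a == b


# Excerpt of the stock Linux CPU result quoted above.
cpu_stderr = """\
Flopcounter: 10432754946380.425781

Spike count:    4
Pulse count:    0
Triplet count:  7
Gaussian count: 0
called boinc_finish
"""

# Excerpt of the optimized Windows GPU result quoted above.
gpu_stderr = """\
Flopcounter: 10432115851830.561000

Spike count:    4
Pulse count:    0
Triplet count:  7
Gaussian count: 0
Worker preemptively acknowledging a normal exit.->
called boinc_finish
"""

print(signal_counts(cpu_stderr))
print(counts_agree(cpu_stderr, gpu_stderr))
```

A result that stopped with the "-12" triplet error would not reach these summary lines, so `counts_agree` returns False for it instead of reporting a false match.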


 


- ALF - "Find out what you don't do well ..... then don't do it!" :)
 



 