First invalid

Message boards : Number crunching : First invalid
[B^S] madmac
Volunteer tester
Joined: 9 Feb 04
Posts: 1175
Credit: 4,754,897
RAC: 0
United Kingdom
Message 1580941 - Posted: 2 Oct 2014, 13:53:14 UTC

After doing SETI for a while, I got my first invalid. I think I know what caused it: my stderr output was not complete. The WU is 1603747341. It did not show how many spikes etc. were found. Am I correct in my train of thought?
ID: 1580941
Darth Beaver
Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor
Joined: 20 Aug 99
Posts: 6728
Credit: 21,443,075
RAC: 3
Australia
Message 1580979 - Posted: 2 Oct 2014, 15:46:19 UTC - in response to Message 1580941.  

Not exactly. Here are your spikes and the error:



Spike count: 28
Autocorr count: 0
Pulse count: 0
Triplet count: 0
Gaussian count: 2

Worker preemptively acknowledging an overflow exit.->
called boinc_finish
Exit Status: 0
boinc_exit(): requesting safe worker shutdown ->
boinc_exit(): received safe worker shutdown acknowledge ->
Cuda threadsafe ExitProcess() initiated, rval 0
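
For anyone who wants to check a task's stderr for these summary lines, here is a minimal Python sketch. It is only an illustration: the names `signal_counts` and `looks_complete` are made up for this example, and the completeness test (all five counts plus the "called boinc_finish" line) is an assumption based on the output above, not the project's actual validation logic.

```python
import re

# Matches summary lines such as "Spike count: 28" in a task's stderr text.
COUNT_LINE = re.compile(
    r"^[ \t]*(Spike|Autocorr|Pulse|Triplet|Gaussian) count:[ \t]*(\d+)[ \t]*$",
    re.MULTILINE,
)


def signal_counts(stderr_text):
    """Return {signal name: count} for every summary line found."""
    return {name: int(value) for name, value in COUNT_LINE.findall(stderr_text)}


def looks_complete(stderr_text):
    """Heuristic only (an assumption): all five counts and a boinc_finish line."""
    counts = signal_counts(stderr_text)
    return len(counts) == 5 and "called boinc_finish" in stderr_text


# Sample taken from the output above.
cuda_stderr = """Spike count: 28
Autocorr count: 0
Pulse count: 0
Triplet count: 0
Gaussian count: 2

Worker preemptively acknowledging an overflow exit.->
called boinc_finish
Exit Status: 0
"""

# A stderr that stops before the summary, like the one that went invalid.
truncated_stderr = """ar=0.391853 NumCfft=205083
Restarted at 68.13 percent.
"""

print(signal_counts(cuda_stderr))
print(looks_complete(cuda_stderr), looks_complete(truncated_stderr))
```

A stderr that ends at a "Restarted at ..." line with no counts after it would fail this check, which is the situation described in the first post.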

ID: 1580979
[B^S] madmac
Volunteer tester
Joined: 9 Feb 04
Posts: 1175
Credit: 4,754,897
RAC: 0
United Kingdom
Message 1581003 - Posted: 2 Oct 2014, 16:29:37 UTC
Last modified: 2 Oct 2014, 16:37:36 UTC

It must have happened when I shut the machine down or did a Windows upgrade, as everything on BOINC is done automatically. I cannot see that. This is the output I can see:
<core_client_version>7.2.33</core_client_version>
<![CDATA[
<stderr_txt>

Build features: SETI7 Non-graphics FFTW USE_AVX x86
CPUID: AMD FX(tm)-8320 Eight-Core Processor

Cache: L1=64K L2=2048K

CPU features: FPU TSC PAE CMPXCHG8B APIC SYSENTER MTRR CMOV/CCMP MMX FXSAVE/FXRSTOR SSE SSE2 HT SSE3 SSSE3 FMA3 SSE4.1 SSE4.2 AVX SSE4A XOP FMA4
ar=0.391853 NumCfft=205083 NumGauss= 1206213756 NumPulse= 231057134337 NumTriplet= 19809917059072
In v_BaseLineSmooth: NumDataPoints=1048576, BoxCarLength=8192, NumPointsInChunk=32768
Windows optimized S@H Enhanced application by Alex Kan
Version info: AVXx (Intel, Core 2-optimized v8-nographics) V5.13 by Alex Kan
AVXx Win32 Build 1846 , Ported by : Raistmer, JDWhale

SETI7 update by Raistmer
Work Unit Info:
...............
Credit multiplier is : 2.85
WU true angle range is : 0.391853
Gaussian: peak=2.972491, mean=0.4961086, ChiSq=1.314535, time=76.34, d_freq=1419143435.46,
score=1.528075, null_hyp=2.280114, chirp=-38.524, fft_len=16k

Build features: SETI7 Non-graphics FFTW USE_AVX x86
CPUID: AMD FX(tm)-8320 Eight-Core Processor

Cache: L1=64K L2=2048K

CPU features: FPU TSC PAE CMPXCHG8B APIC SYSENTER MTRR CMOV/CCMP MMX FXSAVE/FXRSTOR SSE SSE2 HT SSE3 SSSE3 FMA3 SSE4.1 SSE4.2 AVX SSE4A XOP FMA4
ar=0.391853 NumCfft=205083 NumGauss= 1206213756 NumPulse= 231057134337 NumTriplet= 19809917059072
In v_BaseLineSmooth: NumDataPoints=1048576, BoxCarLength=8192, NumPointsInChunk=32768
Restarted at 68.13 percent.

</stderr_txt>
]]>
So could it be that I got the invalid when someone else caused it?
Computer no. 7280900 (CPU, using anonymous platform) is the first of the 3. It looks like it was the Cuda that caused the problem and I got the blame, so is there any chance of getting my points?
ID: 1581003
jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1581036 - Posted: 2 Oct 2014, 17:19:35 UTC - in response to Message 1580979.  
Last modified: 2 Oct 2014, 17:25:39 UTC

There's no error apparent in either the Cuda (2nd) result shown in Darth Beaver's post or the stock CPU (3rd, decider) result. The stock CPU result agreed it was a genuine overflow result (lots of signals).
http://setiathome.berkeley.edu/workunit.php?wuid=1603747341

No idea why that Opt CPU run of MadMac's would appear to have processed full length (?) and found no spikes at all. Perhaps the hard drive needs some checking?
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1581036
[B^S] madmac
Volunteer tester
Joined: 9 Feb 04
Posts: 1175
Credit: 4,754,897
RAC: 0
United Kingdom
Message 1581051 - Posted: 2 Oct 2014, 17:53:30 UTC

Any idea how to check my hard drive? All the others after that one seem OK. Maybe it was me checking to see if I had any viruses etc. with skybot, or upgrading my Windows.
ID: 1581051
Josef W. Segur
Volunteer developer
Volunteer tester
Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 1581148 - Posted: 2 Oct 2014, 19:50:10 UTC - in response to Message 1581036.  

The stock CPU run time IMO indicates the overflow was roughly at 90% progress.

I assume that MadMac's result also overflowed. In that case, the truncated stderr does matter. It would be missing the "result_overflow" keyword and the result file would have no best_autocorr because it did actually overflow. Then the validator would mark it invalid using the protective logic necessitated by idiots who tried to process SETI@home v7 tasks with SETI@home v6 applications.
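
A rough Python sketch of the rule described above, for illustration only. The function name `protective_check`, its arguments, and the "compare"/"invalid" return values are all hypothetical; the real validator is server-side code and is not reproduced here. The sketch only encodes the stated idea: a result with neither the "result_overflow" keyword in stderr nor a best_autocorr in the result file looks like output from a v6 application and is rejected.

```python
def protective_check(stderr_text, result_has_best_autocorr):
    """Hypothetical pre-check run before results are compared with each other.

    Returns "compare" when the result may go on to normal comparison,
    or "invalid" when it is rejected outright.
    """
    reported_overflow = "result_overflow" in stderr_text

    if reported_overflow:
        # An overflowed v7 task legitimately has no best_autocorr.
        return "compare"
    if result_has_best_autocorr:
        # A full-length v7 run always reports a best_autocorr.
        return "compare"
    # Neither marker present: indistinguishable from a v6 application's output.
    return "invalid"


# Overflowed task whose stderr was captured completely.
print(protective_check("... result_overflow ...", False))

# Overflowed task whose stderr was truncated before the keyword was written.
print(protective_check("Restarted at 68.13 percent.", False))
```

Under these assumptions, a truncated stderr on an overflowed task lands in the last branch even though the science result itself may have been fine.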

I'll note that the r2549 builds in the Lunatics 0.42 installer should better encourage BOINC to actually capture all of the stderr. Although MinGW/GCC doesn't provide a commode.obj, I modified BOINC's diagnostics.cpp to do the freopen of stderr.txt in "ac" mode. The r2549 builds are also generally faster than r1846 builds, and those in the 64 bit installer are definitely faster than the 32 bit builds.
                                                                  Joe
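
As background to the "ac" mode mentioned above: in Microsoft's C runtime, "a" in the mode string means append and "c" enables the commit flag, so that a flush writes the buffer through to disk. The sketch below is a Python analogue of that idea, not BOINC's diagnostics.cpp. The helper names `append_committed` and `demo` are made up for this example; the pattern shown is append, then flush, then `os.fsync`.

```python
import os
import tempfile


def append_committed(path, line):
    """Append one line and push it to disk before returning."""
    with open(path, "a", encoding="utf-8") as log:
        log.write(line + "\n")
        log.flush()             # empty Python's buffer into the OS
        os.fsync(log.fileno())  # ask the OS to commit the data to the device


def demo():
    """Write two lines to a temporary stderr.txt and read them back."""
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "stderr.txt")
        append_committed(path, "Spike count: 28")
        append_committed(path, "called boinc_finish")
        with open(path, encoding="utf-8") as log:
            return log.read().splitlines()


print(demo())
```

The trade-off is speed: committing on every write is slower than buffered output, but lines already written are less likely to be lost if the process or machine stops abruptly.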
ID: 1581148
jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1581239 - Posted: 2 Oct 2014, 21:46:21 UTC - in response to Message 1581148.  
Last modified: 2 Oct 2014, 21:53:01 UTC

Oh yeah, the old boincapi and the missing stderr err bits. Amazing that keeps popping up.
ID: 1581239
jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1581243 - Posted: 2 Oct 2014, 21:52:34 UTC - in response to Message 1581051.  
Last modified: 2 Oct 2014, 21:53:49 UTC

After Joe's comment, I would suggest that for this specific example you can't do much, and your host is unlikely to be at fault through accident or misadventure on your part. As always, keep an eye out for new apps/installers when they are released. There are obscure problems in BOINC code that crop up sometimes in weird ways like this, and sometimes workarounds are found. In other cases the server logic is not ideal, so strange things can happen.
ID: 1581243

©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.