different kinds of errors

Author	Message
David S Volunteer tester Joined: 4 Oct 99 Posts: 18352 Credit: 27,761,924 RAC: 12	Message 1343259 - Posted: 5 Mar 2013, 15:07:45 UTC

On this WU, five of us got MAX_TRIPLETS errors and one got 28 pulses and 3 triplets for a -9 overflow. So how come the -9 counts as a completion when the others don't?

David
Sitting on my butt while others boldly go,
Waiting for a message from a small furry creature from Alpha Centauri.

HAL9000 Volunteer tester Joined: 11 Sep 99 Posts: 6534 Credit: 196,805,888 RAC: 57	Message 1343261 - Posted: 5 Mar 2013, 15:12:52 UTC Last modified: 5 Mar 2013, 16:05:55 UTC

The other results were done by GPUs, which throw a -12 exit status when they hit too many triplets. That status is classified as an error. The -9 exit status is not classified as an error; it indicates that 30 or more results were found, and processing stops at that point.

SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the BP6/VP6 User Group: http://tinyurl.com/8y46zvu

William Volunteer tester Joined: 14 Feb 13 Posts: 2037 Credit: 17,689,662 RAC: 0	Message 1343282 - Posted: 5 Mar 2013, 16:44:57 UTC

-9 is only 'informational'. After finding more than 30 signals the task is deemed too noisy and stopped. The results are still returned, and the '-9' is used for outlier detection so that the short runtime won't influence APR. It's not an error, it just has an error code. Error codes can be used to carry other information as well ;)

-12 is that infamous triplet design flaw (as discussed in the panic thread). It may occur on a task that would overflow, but it can crop up on any task, depending on how the triplets are distributed. The app can't handle it and quits. That's a hard error that kills the task. At least the app quits gracefully...

A person who won't read has no advantage over one who can't read. (Mark Twain)
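
The distinction described above (an exit status that carries information versus one that marks a real failure) could be sketched roughly as follows. This is only an illustration in Python: the constant names, the `TaskReport` type and the averaging function are assumptions made for the example, not the project's actual server code, and a plain mean of runtimes stands in for however APR is really computed.

```python
from dataclasses import dataclass

# Exit statuses as described in this thread; the names are illustrative only.
RESULT_OVERFLOW = -9      # task deemed too noisy, stopped early, results returned
TOO_MANY_TRIPLETS = -12   # hard error reported by the older GPU apps


@dataclass
class TaskReport:
    exit_status: int
    runtime_s: float


def classify(report):
    """Map an exit status to the outcome the thread describes."""
    if report.exit_status == 0:
        return "success"
    if report.exit_status == RESULT_OVERFLOW:
        # Not an error: the result counts, but the runtime is an outlier.
        return "success (overflow, runtime outlier)"
    return "error"


def mean_runtime_for_speed_estimate(reports):
    """Average only normal completions, so short overflow runs are ignored.

    Hypothetical stand-in for the way outliers are kept out of APR.
    """
    usable = [r.runtime_s for r in reports if r.exit_status == 0]
    return sum(usable) / len(usable) if usable else None
```

With one normal run, one -9 overflow and one -12 error, only the normal run would feed the speed estimate in this sketch, while the overflow would still count as a returned result.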

David S Volunteer tester Joined: 4 Oct 99 Posts: 18352 Credit: 27,761,924 RAC: 12	Message 1343732 - Posted: 7 Mar 2013, 14:19:18 UTC

Okay then, how come this one's 4th host got a -9 overflow for 31 triplets instead of a -12 MAX_TRIPLETS?

David
Sitting on my butt while others boldly go,
Waiting for a message from a small furry creature from Alpha Centauri.

HAL9000 Volunteer tester Joined: 11 Sep 99 Posts: 6534 Credit: 196,805,888 RAC: 57	Message 1343734 - Posted: 7 Mar 2013, 14:47:35 UTC - in response to Message 1343732. Last modified: 7 Mar 2013, 14:54:00 UTC

That host is using x41zc, which looks to have more code to handle triplets than the stock or older releases.

    Thread call stack limit is: 1k
    Find triplets Cuda kernel encountered too many triplets, or bins above threshold, reprocessing this PoT on CPU...
    cudaAcc_free() called...
    cudaAcc_free() running...
    cudaAcc_free() PulseFind freed...
    cudaAcc_free() Gaussfit freed...
    cudaAcc_free() AutoCorrelation freed...
    cudaAcc_free() DONE.

SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the BP6/VP6 User Group: http://tinyurl.com/8y46zvu
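
The log line about "reprocessing this PoT on CPU" suggests a fallback pattern: try the fast path, and if its result buffer fills up, redo the same piece of work on a slower path instead of aborting. A minimal Python sketch of that pattern is below. Everything in it is hypothetical: the function names, the `capacity` parameter and the simple "bins above threshold" test are stand-ins, and it does not reproduce the real triplet search or the x41zc code.

```python
class BufferFull(Exception):
    """Raised by the fast path when its fixed-size result buffer fills up."""


def bins_above_threshold_fast(power, threshold, capacity):
    # Stand-in for a GPU kernel with a fixed-capacity output buffer.
    hits = []
    for index, value in enumerate(power):
        if value > threshold:
            if len(hits) == capacity:
                raise BufferFull
            hits.append(index)
    return hits


def bins_above_threshold_slow(power, threshold):
    # Stand-in for the CPU path, which has no fixed buffer limit here.
    return [index for index, value in enumerate(power) if value > threshold]


def find_bins(power, threshold, capacity=4):
    """Return (hits, path). Falls back to the slow path instead of failing."""
    try:
        return bins_above_threshold_fast(power, threshold, capacity), "fast"
    except BufferFull:
        return bins_above_threshold_slow(power, threshold), "cpu fallback"
```

In this sketch an older app would correspond to letting `BufferFull` escape and kill the task, while the newer behaviour catches it and carries on.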

Richard Haselgrove Volunteer tester Joined: 4 Jul 99 Posts: 14653 Credit: 200,643,578 RAC: 874	Message 1343736 - Posted: 7 Mar 2013, 15:00:47 UTC - in response to Message 1343732.

"Okay then, how come this one's 4th host got a -9 overflow for 31 triplets instead of a -12 MAX_TRIPLETS?"

It's a difficult workunit, with all those triplets. Different versions of the science application handle the processing in different ways: in particular, the GPU (cuda) applications do a lot of the triplet checking earlier in the run than the CPU applications. So, from the top:

#1, stock v6.10 cuda. Found triplets early, called error.
#2, optimised x41g cuda. Found triplets early, called error.
#3, stock v6.03 CPU. Found 6 pulses before getting bogged down in triplets. Called overflow.
#4, optimised x41zc cuda. Found the triplets early, but was able to handle it by calling overflow instead of error.

It's unfortunate that the different processing order means that the reported pulses caused the #3 and #4 results to be only 'weakly similar', and so yet another copy was sent out as a tiebreaker. That makes five downloads for what is, basically, a pretty uninteresting workunit, with (almost certainly) too much RFI swamping any science of value.

The various developers (Eric Korpela for the stock app, and the various members of the Lunatics crew) are aware of the validation problems caused by the different processing architectures: but to make efficient use of the parallel processing capabilities of GPUs, things have to be shuffled around quite a lot, and sometimes - though mercifully rarely - artefacts like this are unavoidable.
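
To see why processing order alone can make two honest results disagree, here is a toy model in Python. It assumes, purely for illustration, that an app works through one signal type at a time and stops as soon as it holds more than 30 signals; the function name, the `limit` parameter and the made-up workunit are not taken from any real application, which interleaves its searches in a more complicated way.

```python
SIGNAL_LIMIT = 30  # the thread describes overflow after more than 30 signals


def run(search_order, signals_by_type, limit=SIGNAL_LIMIT):
    """Collect signals in the given order; stop once the limit is exceeded."""
    reported = []
    for kind in search_order:
        for signal in signals_by_type.get(kind, []):
            reported.append((kind, signal))
            if len(reported) > limit:
                return reported, "overflow"
    return reported, "complete"


# A made-up noisy workunit: a few pulses and far too many triplets.
noisy_wu = {"pulse": list(range(6)), "triplet": list(range(40))}

cpu_like, cpu_status = run(["pulse", "triplet"], noisy_wu)
gpu_like, gpu_status = run(["triplet", "pulse"], noisy_wu)
```

Both runs overflow with 31 signals, but the pulse-first run reports 6 pulses plus 25 triplets while the triplet-first run reports 31 triplets and no pulses, so a validator comparing the two signal lists would not see a clean match.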

Richard Haselgrove Volunteer tester Joined: 4 Jul 99 Posts: 14653 Credit: 200,643,578 RAC: 874	Message 1345031 - Posted: 10 Mar 2013, 18:10:50 UTC

WU 1179402281 is an interesting example of this. These are the returned results, before they get purged:

2855533288 | 6910484 | 1 Mar 2013, 16:39:55 UTC | 2 Mar 2013, 1:56:38 UTC | Completed, marked as invalid | 92.82 | 1.14 | 0.00 | SETI@home Enhanced Anonymous platform (NVIDIA GPU)
2855533289 | 6366977 | 1 Mar 2013, 16:39:56 UTC | 2 Mar 2013, 5:11:26 UTC | Error while computing | 3.25 | 1.58 | --- | SETI@home Enhanced Anonymous platform (NVIDIA GPU)
2856466467 | 5378646 | 2 Mar 2013, 12:28:10 UTC | 7 Mar 2013, 22:20:40 UTC | Completed and validated | 17.19 | 16.16 | 0.11 | SETI@home Enhanced v6.03
2863422788 | 4830263 | 8 Mar 2013, 5:23:44 UTC | 10 Mar 2013, 6:46:52 UTC | Completed and validated | 50.99 | 48.58 | 0.11 | SETI@home Enhanced v6.03

I'm at the top with 31 triplets found by x41zc, and reported as success. Next came x41g, with the triplet finding error. The WU was finally completed by two steady old stock CPU crunchers, who agreed on 31 pulses. Even the single-core P4 spent less time on it than my i7 with GTX 670. Sometimes it doesn't pay to be too clever :P

©2024 University of California

SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.