Computation Errors on the Rise?

Author	Message
Tklop Send message Joined: 11 May 03 Posts: 175 Credit: 613,952 RAC: 0	Message 651710 - Posted: 30 Sep 2007, 18:18:34 UTC Last modified: 30 Sep 2007, 18:22:40 UTC I've not encountered a ton, but there are several WU's in my Results page which are showing as "Computation Error" -- a result that matches several others' crunching attempts... Bad work units maybe? They all start with the same date: 07mr07aj... Anybody else running into these? Just a heads up for all of you... [edit] Here's a few of them: 07mr07aj.7643.20931.3.6.20 07mr07aj.9032.7843.5.6.6 07mr07aj.9032.16023.5.6.12 [/edit] Keep on crunching, all... SETI@Home Forever! ___Tklop (Step-Founder, U.S. Air Force team) ID: 651710 ·

Osiris30 Send message Joined: 19 Aug 07 Posts: 264 Credit: 41,917,631 RAC: 0	Message 651716 - Posted: 30 Sep 2007, 18:23:04 UTC - in response to Message 651710. I've not encountered a ton, but there are several WU's in my Results page which are showing as "Computation Error" -- a result that matches several others' crunching attempts... Bad work units maybe? They all start with the same date: 07mr07aj... Anybody else running into these? Just a heads up for all of you... [edit] Here's a few of them: 07mr07aj.7643.20931.3.6.20 07mr07aj.9032.7843.5.6.6 07mr07aj.9032.16023.5.6.12 [/edit] Yesterday's outage caused some WUs with a bad header to go out... I had about 10 myself that surfaced last time I checked. ID: 651716 ·

Tklop Send message Joined: 11 May 03 Posts: 175 Credit: 613,952 RAC: 0	Message 651719 - Posted: 30 Sep 2007, 18:23:53 UTC - in response to Message 651716. Last modified: 30 Sep 2007, 18:24:04 UTC Yesterday's outage caused some WUs with a bad header to go out... I had about 10 myself that surfaced last time I checked. Ah... Thanks! Keep on crunching, all... SETI@Home Forever! ___Tklop (Step-Founder, U.S. Air Force team) ID: 651719 ·

Osiris30 Send message Joined: 19 Aug 07 Posts: 264 Credit: 41,917,631 RAC: 0	Message 651758 - Posted: 30 Sep 2007, 19:07:43 UTC - in response to Message 651719. Yesterday's outage caused some WUs with a bad header to go out... I had about 10 myself that surfaced last time I checked. Ah... Thanks! You're welcome.. and I just checked again and I have 39 of 'em :( ID: 651758 ·

davidrobertson Send message Joined: 9 Sep 99 Posts: 1 Credit: 520,727 RAC: 0	Message 656804 - Posted: 9 Oct 2007, 15:22:51 UTC - in response to Message 651758. Yesterday's outage caused some WUs with a bad header to go out... I had about 10 myself that surfaced last time I checked. Ah... Thanks! You're welcome.. and I just checked again and I have 39 of 'em :( So do we have to do anything with or to them? I have one 17mr07ab.30863.409083.14.6.24_0 ID: 656804 ·

TeamDGC Send message Joined: 27 Oct 99 Posts: 19 Credit: 7,091,042 RAC: 0	Message 656876 - Posted: 9 Oct 2007, 21:24:02 UTC - in response to Message 651710. Anybody else running into these? Yepp! :-( All time #1 M.U.R.C. Cruncher! ID: 656876 ·

Nicholas Roberts Send message Joined: 25 Jun 06 Posts: 4 Credit: 1,195,498 RAC: 0	Message 656891 - Posted: 9 Oct 2007, 21:50:30 UTC - in response to Message 656876. Anybody else running into these? Yepp! :-( All my 17xxxxx's have failed within 3 seconds of starting. Should I abort my remaining 17xxxxx's before they attempt to crunch? Is there any benefit in aborting them rather than letting them fail? 13's, 29's and 18's working OK. Regards, Nicholas ID: 656891 ·

Viking Send message Joined: 2 Nov 03 Posts: 17 Credit: 1,051,900 RAC: 1	Message 656919 - Posted: 9 Oct 2007, 22:27:15 UTC - in response to Message 651710. Everything sent today is failing with either client error or compute error. 17mr07ab.30757.4162.12.6.80 17mr07ab.30087.416036.11.6.145 18mr07aa.10677.21749.3.6.156 18mr07aa.10744.5389.5.6.174 18mr07aa.10568.21340.4.6.146 17mr07ab.14071.413991.3.6.121 And others with that workunit are also having their crunching fail. It's broke. Get the duct tape. * Viking * ID: 656919 ·

Viking Send message Joined: 2 Nov 03 Posts: 17 Credit: 1,051,900 RAC: 1	Message 656988 - Posted: 10 Oct 2007, 0:03:09 UTC - in response to Message 656919. Update: I deleted the setiathome_5.27 exe and redownloaded it (which took forever) but it's working okay, at least on this latest workunit - 08mr07ad.14035.1300.3.6.154. I guess we'll see what happens... * Viking * ID: 656988 ·

Josef W. Segur Volunteer developer Volunteer tester Send message Joined: 30 Oct 99 Posts: 4504 Credit: 1,414,761 RAC: 0	Message 656989 - Posted: 10 Oct 2007, 0:03:44 UTC - in response to Message 656891. Anybody else running into these? Yepp! :-( All my 17xxxxx's have failed within 3 seconds of starting. Should I abort my remaining 17xxxxx's before they attempt to crunch? Is there any benefit in aborting them rather than letting them fail? 13's, 29's and 18's working OK. Regards, Nicholas There could also be good 17xxxxx's, so it's better to do some checking. The problem causes the WU to be smaller than usual, so you could look in the boinc\projects\setiathome.berkeley.edu folder for those which are less than the usual size, then abort just those. It doesn't matter much if you abort or let them attempt to crunch, either will be seen as a client-side error and cause a reissue to someone else (until the sixth error is returned). Another approach is to suspend the good WUs so BOINC tries to crunch the other stuff, then resume the good ones. That way you get the same error as if you'd just let them run normally, but sooner. Joe ID: 656989 ·
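Joe's size check could be scripted rather than done by eye. The following is only a minimal Python sketch, not a BOINC tool: the function name `find_undersized`, the `pattern` argument, and the "less than half the median size" cutoff are assumptions made for illustration. The post itself only says the bad WUs are smaller than usual.

```python
from pathlib import Path
from statistics import median


def find_undersized(folder, pattern="*", ratio=0.5):
    """Return the names of files much smaller than their neighbours.

    Hypothetical helper: a file is flagged when its size is below
    ratio * (median size of all files matching pattern). The cutoff
    is an assumption, not a value taken from SETI@home or BOINC.
    """
    files = [p for p in Path(folder).glob(pattern) if p.is_file()]
    if not files:
        return []
    sizes = {p.name: p.stat().st_size for p in files}
    cutoff = ratio * median(sizes.values())
    # Sorted so the output is stable from run to run.
    return sorted(name for name, size in sizes.items() if size < cutoff)
```

The project folder also holds the application itself and other files, so in practice `pattern` would need narrowing to the workunit files only, for example `find_undersized(folder, pattern="17mr07*")` for the batch Nicholas mentions. The script only lists candidates; aborting them is still done in the BOINC manager.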

RandyC Send message Joined: 20 Oct 99 Posts: 714 Credit: 1,704,345 RAC: 0	Message 657007 - Posted: 10 Oct 2007, 0:41:46 UTC - in response to Message 656989. There could also be good 17xxxxx's, so it's better to do some checking. The problem causes the WU to be smaller than usual, so you could look in the boinc\projects\setiathome.berkeley.edu folder for those which are less than the usual size, then abort just those. It doesn't matter much if you abort or let them attempt to crunch, either will be seen as a client-side error and cause a reissue to someone else (until the sixth error is returned). Another approach is to suspend the good WUs so BOINC tries to crunch the other stuff, then resume the good ones. That way you get the same error as if you'd just let them run normally, but sooner. Joe This is like playing whack-a-mole. Every time I abort one, another zero length WU downloads. And my max daily quota is dropping too. I'm going to let them crash normally. At least that way my max daily quota will suffer less. ID: 657007 ·

Francesco Forti Send message Joined: 24 May 00 Posts: 334 Credit: 204,421,005 RAC: 15	Message 657191 - Posted: 10 Oct 2007, 5:24:49 UTC Last modified: 10 Oct 2007, 5:29:50 UTC I too have seen some compute errors in the last week, and again now, after the restart from last Tuesday's outage. A few minutes ago this host http://setiathome.berkeley.edu/show_host_detail.php?hostid=1852935 had four errors like: <core_client_version>5.10.20</core_client_version> <![CDATA[ <message> - exit code -6 (0xfffffffa) </message> <stderr_txt> SETI@home error -6 Bad workunit header !swi.data_type || !found || !swi.nsamples File: ..\seti_header.cpp Line: 235 </stderr_txt> ]]> I use 5.10.20 with optimized seti (2.4V) Optimized SETI@Home Enhanced application Optimizers: Ben Herndon, Josef Segur, Alex Kan, Simon Zadra Version: Windows SSE 32-bit based on S@H V5.15 'Noo? No - Ni!' Revision: R-2.4V|xK|FFT:IPP_SSE|Ben-Joe Bye, Franz PS: I should add... all four of these units were sent to me on 9 Oct 2007 22:08:59 UTC ... after the last outage. They are new jobs, not old ones. ID: 657191 ·

Alinator Volunteer tester Send message Joined: 19 Apr 05 Posts: 4178 Credit: 4,647,982 RAC: 0	Message 657193 - Posted: 10 Oct 2007, 5:28:22 UTC Yep, there was a mess of clinkers early on after the restart for awhile, but the ones I've been getting since around midnight UTC seem to be clean so far. Alinator ID: 657193 ·

Osiris30 Send message Joined: 19 Aug 07 Posts: 264 Credit: 41,917,631 RAC: 0	Message 657201 - Posted: 10 Oct 2007, 6:06:32 UTC - in response to Message 657193. Yep, there was a mess of clinkers early on after the restart for awhile, but the ones I've been getting since around midnight UTC seem to be clean so far. Alinator I'm still getting 5-10% bad WUs, but nothing to worry about. They'll burn through quick enough.. ID: 657201 ·

Matthias Lehmkuhl Volunteer tester Send message Joined: 5 Oct 99 Posts: 28 Credit: 10,832,348 RAC: 53	Message 657278 - Posted: 10 Oct 2007, 11:00:34 UTC I got two of them <message> process exited with code 250 (0xfa, -6) </message> or <message> - exit code -6 (0xfffffffa) </message> <stderr_txt> SETI@home error -6 Bad workunit header !swi.data_type || !found || !swi.nsamples File: seti_header.cpp Line: 235 wuid=163695795 wuid=163212600 Matthias Matthias ID: 657278 ·
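The codes quoted above (250, 0xfa, 0xfffffffa and -6) are all the same value read at different widths: -6 in two's complement is 0xfa in 8 bits and 0xfffffffa in 32 bits. A short Python sketch of the arithmetic follows. The helper names `as_signed` and `to_unsigned` are made up for illustration, and the sketch makes no claim about why a particular client reports one width rather than the other.

```python
def as_signed(code, bits):
    """Interpret an unsigned exit code as a two's-complement integer."""
    code &= (1 << bits) - 1              # keep only the low `bits` bits
    if code >= 1 << (bits - 1):          # top bit set means negative
        return code - (1 << bits)
    return code


def to_unsigned(value, bits):
    """Show how a signed value appears when stored in `bits` bits."""
    return value & ((1 << bits) - 1)


print(as_signed(250, 8))            # -6
print(as_signed(0xfffffffa, 32))    # -6
print(hex(to_unsigned(-6, 8)))      # 0xfa
print(hex(to_unsigned(-6, 32)))     # 0xfffffffa
```

So both message forms point at the same "SETI@home error -6 Bad workunit header" condition.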
