Bad Batch? All the WUs I get are being trashed

Author	Message
Tigher Volunteer tester Joined: 18 Mar 04 Posts: 1547 Credit: 760,577 RAC: 0	Message 213133 - Posted: 13 Dec 2005, 18:59:58 UTC Last modified: 13 Dec 2005, 19:17:24 UTC
On both Linux and Windows systems, every WU I download is being trashed. Unrecoverable error: exit codes 144 (no minus), -112 and -6 have all been seen. Anyone else seeing the same or similar? Ideas on what could be wrong?

[B^S] Spydermb Volunteer tester Joined: 16 Jul 99 Posts: 496 Credit: 10,860,148 RAC: 0	Message 213143 - Posted: 13 Dec 2005, 19:11:21 UTC - in response to Message 213133.
Tigher, I am also getting these unrecoverable errors, mostly -112. I have one box that hasn't had any issues. I think I am going to reboot the one that does and see if it corrects itself. I wish I knew what the issue is. At least it isn't just me. LOL!
BOINC SYNERGY is an International Team and We Welcome All BOINC Participants!

Tigher Volunteer tester Joined: 18 Mar 04 Posts: 1547 Credit: 760,577 RAC: 0	Message 213150 - Posted: 13 Dec 2005, 19:18:09 UTC Last modified: 13 Dec 2005, 19:21:04 UTC
Ah, not alone then. All mine begin with 22ap04ab 11574 and then 29697 or 26320. I just had one download (different range) that appears to be running OK. So it may be a very bad batch of data?

Grenadier Volunteer tester Joined: 15 May 99 Posts: 63 Credit: 5,445,784 RAC: 0	Message 213154 - Posted: 13 Dec 2005, 19:21:47 UTC
I got a ton of these -6 errors this morning too. I shut down work requests for a few hours, and now everything seems fine again. (Although now I'm sometimes getting 'No work available'.)

Nightlord Joined: 17 Oct 01 Posts: 117 Credit: 1,316,241 RAC: 0	Message 213156 - Posted: 13 Dec 2005, 19:22:53 UTC Last modified: 13 Dec 2005, 19:29:36 UTC
Yes, I'm getting them on one machine that recently downloaded; all others are OK:
13/12/2005 18:10:43|SETI@home|Starting result 22ap04ab.11574.28833.311076.14_0 using setiathome version 411
13/12/2005 18:10:43|SETI@home|Starting result 27se04ab.23281.3714.940902.88_2 using setiathome version 411
13/12/2005 18:10:44|SETI@home|Unrecoverable error for result 22ap04ab.11574.28833.311076.14_0 ( - exit code -6 (0xfffffffa))
13/12/2005 18:10:44|SETI@home|Unrecoverable error for result 27se04ab.23281.3714.940902.88_2 ( - exit code -6 (0xfffffffa))
13/12/2005 18:10:44||request_reschedule_cpus: process exited
13/12/2005 18:10:44|SETI@home|Finished download of 27se04ab.23281.3714.940902.84
13/12/2005 18:10:44|SETI@home|Throughput 0 bytes/sec
13/12/2005 18:10:44|SETI@home|Finished download of 27se04ab.23281.3714.940902.86
13/12/2005 18:10:44|SETI@home|Throughput 0 bytes/sec
13/12/2005 18:10:44|SETI@home|Computation for result
Note the zero bytes per second download too. This machine is as stable as it gets: no overclock, and it has run many thousands of WUs with Tetsuji's app and hundreds with Crunch3r's new app with no issues. All other machines are also running the same client. Do I smell a rat?

Hans Dorn Volunteer developer Volunteer tester Joined: 3 Apr 99 Posts: 2262 Credit: 26,448,570 RAC: 0	Message 213160 - Posted: 13 Dec 2005, 19:24:35 UTC Last modified: 13 Dec 2005, 19:29:48 UTC
Hi, I got a bunch of these on one host:
[SETI@home] Unrecoverable error for result 29se04aa.5782.13442.404844.8_1 (process exited with code 250 (0xfa))
Probably the same thing. It's back to normal now, BTW.
Regards
Hans
P.S.: Yikes! I still have lots of them in store. Look out for zero-sized WUs in your project folder.
P.P.S.: I'll try to delete them to force a new download.

Albatros Joined: 2 Jul 00 Posts: 7 Credit: 245,899 RAC: 0	Message 213163 - Posted: 13 Dec 2005, 19:25:34 UTC
13.12.2005 20:19:03||Starting BOINC client version 5.3.1 for windows_intelx86
13.12.2005 20:19:03||libcurl/7.14.0 OpenSSL/0.9.8 zlib/1.2.3
13.12.2005 20:19:03||Data directory: D:\Programme\BOINC
13.12.2005 20:19:03|SETI@home|Found app_info.xml; using anonymous platform
13.12.2005 20:19:03||Processor: 1 AuthenticAMD AMD Sempron(tm) 2500+
13.12.2005 20:19:03||Memory: 447.48 MB physical, 1.41 GB virtual
13.12.2005 20:19:03||Disk: 40.00 GB total, 25.05 GB free
13.12.2005 20:19:03|rosetta@home|Computer ID: 97839; location: home; project prefs: default
13.12.2005 20:19:03|Predictor @ Home|Computer ID: 190087; location: home; project prefs: default
13.12.2005 20:19:03|SETI@home|Computer ID: 1028322; location: home; project prefs: default
13.12.2005 20:19:03||General prefs: from rosetta@home (last modified 2005-12-12 20:30:59)
13.12.2005 20:19:03||General prefs: no separate prefs for home; using your defaults
13.12.2005 20:19:03||Remote control not allowed; using loopback address
13.12.2005 20:19:03|rosetta@home|Deferring computation for result 1ogw__topology_sample_38826_1
13.12.2005 20:19:03|Predictor @ Home|Resuming computation for result bprion_4_84032_2 using mfoldB125 version 428
13.12.2005 20:19:03|SETI@home|Sending scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi
13.12.2005 20:19:03|SETI@home|Reason: To fetch work
13.12.2005 20:19:03|SETI@home|Requesting 86400 seconds of new work
13.12.2005 20:19:04|rosetta@home|Restarting result 1ogw__topology_sample_38826_1 using rosetta version 480
13.12.2005 20:19:04|Predictor @ Home|Pausing result bprion_4_84032_2 (left in memory)
13.12.2005 20:19:13|SETI@home|Scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi succeeded
13.12.2005 20:19:14|SETI@home|Started download of 22ap04ab.11574.30993.997154.27
13.12.2005 20:19:14|SETI@home|Started download of 22ap04ab.11574.30993.997154.32
13.12.2005 20:19:17|SETI@home|Finished download of 22ap04ab.11574.30993.997154.27
13.12.2005 20:19:17|SETI@home|Throughput 0 bytes/sec
13.12.2005 20:19:17|SETI@home|Finished download of 22ap04ab.11574.30993.997154.32
13.12.2005 20:19:17|SETI@home|Throughput 0 bytes/sec
13.12.2005 20:19:17|SETI@home|Started download of 22ap04ab.11574.30993.997154.34
13.12.2005 20:19:17|SETI@home|Started download of 22ap04ab.11574.30993.997154.26
13.12.2005 20:19:18|SETI@home|Deferring communication with project for 10 minutes and 0 seconds
13.12.2005 20:19:18||request_reschedule_cpus: files downloaded
13.12.2005 20:19:18||request_reschedule_cpus: files downloaded
13.12.2005 20:19:20|SETI@home|Finished download of 22ap04ab.11574.30993.997154.34
13.12.2005 20:19:20|SETI@home|Throughput 0 bytes/sec
13.12.2005 20:19:21||request_reschedule_cpus: files downloaded
13.12.2005 20:19:22|SETI@home|Finished download of 22ap04ab.11574.30993.997154.26
13.12.2005 20:19:22|SETI@home|Throughput 51468 bytes/sec
13.12.2005 20:19:23||request_reschedule_cpus: files downloaded
"Throughput 0 bytes/sec" - that sounds very strange to me... When I had a look at the directory, I saw that 2 of the 3 files downloaded have a size of 0 bytes.
Uli

Nightlord Joined: 17 Oct 01 Posts: 117 Credit: 1,316,241 RAC: 0	Message 213174 - Posted: 13 Dec 2005, 19:33:58 UTC - in response to Message 213163.
Confirmed: my -6 error WUs are also zero bytes on disk.
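For anyone who wants to check their own project folder for zero-sized WUs, here is a minimal Python sketch. It only lists regular files whose size is 0 bytes. The function name `find_zero_byte_files` is made up for illustration, and the folder path is whatever your own BOINC project directory happens to be; nothing here is part of the BOINC client.

```python
from pathlib import Path


def find_zero_byte_files(project_dir):
    """Return the sorted names of regular files in project_dir that are 0 bytes long.

    project_dir is an assumption: pass the path of your own project folder.
    Subdirectories are not searched.
    """
    root = Path(project_dir)
    return sorted(
        entry.name
        for entry in root.iterdir()
        if entry.is_file() and entry.stat().st_size == 0
    )
```

Calling `find_zero_byte_files(path_to_project_folder)` returns a list of file names that you can compare against the results erroring out in the message log.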

Tigher Volunteer tester Joined: 18 Mar 04 Posts: 1547 Credit: 760,577 RAC: 0	Message 213176 - Posted: 13 Dec 2005, 19:35:10 UTC Last modified: 13 Dec 2005, 19:35:26 UTC
Hmmm, now I got an exit code 250 from a WU on a Linux box. I reckon I might win this hand: a royal flush!

kev1701e Joined: 28 Dec 99 Posts: 138 Credit: 10,216,553 RAC: 0	Message 213177 - Posted: 13 Dec 2005, 19:35:15 UTC
Here's what I've been getting:
<core_client_version>5.2.13</core_client_version>
<message>
 - exit code -6 (0xfffffffa)
</message>
<stderr_txt>
SETI@home error -6 Bad workunit header
!swi.data_type || !found || !swi.nsamples
File: ..\seti_header.cpp
Line: 194
From these WUs:
22ap04ab.11574.19952.222162.36_2
22ap04ab.11574.19952.222162.41_1
22ap04ab.11574.19952.222162.42_2
22ap04ab.11574.26368.684642.79_2
27se04ab.6580.2850.648582.219_0
kev
kev X2 4400+,4200+ @2.75GHz, XP1800+ @1.65GHz, P4 @1.6GHz
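The different exit codes reported in this thread (144, -112, -6 and 250) are arithmetically consistent with being just two values seen at different widths: a 32-bit signed code on Windows, and only the low 8 bits surviving in the exit status on Linux. The Python sketch below only demonstrates that arithmetic. The helper names `as_unsigned` and `as_signed` are made up for illustration, and this is an observation about the numbers, not a confirmed description of how the client reports them.

```python
def as_unsigned(value, bits):
    """Keep only the low `bits` bits of value, read as an unsigned integer."""
    return value & ((1 << bits) - 1)


def as_signed(value, bits):
    """Read the low `bits` bits of value as a two's-complement signed integer."""
    value = as_unsigned(value, bits)
    sign_bit = 1 << (bits - 1)
    return value - (1 << bits) if value & sign_bit else value


print(hex(as_unsigned(-6, 32)))  # 0xfffffffa, as shown in the Windows log lines
print(as_unsigned(-6, 8))        # 250, the code seen on the Linux hosts
print(as_unsigned(-112, 8))      # 144, the "no minus" code
print(as_signed(250, 8))         # -6
print(as_signed(144, 8))         # -112
```

Read this way, 250 and -6 would be the same error, and so would 144 and -112.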

Grenadier Volunteer tester Joined: 15 May 99 Posts: 63 Credit: 5,445,784 RAC: 0	Message 213193 - Posted: 13 Dec 2005, 19:46:08 UTC
Spoke too soon. I'm still getting these too. Not sure if they're really bad WUs, or if they're just bad downloads (in which case turning off new work would help).

Hans Dorn Volunteer developer Volunteer tester Joined: 3 Apr 99 Posts: 2262 Credit: 26,448,570 RAC: 0	Message 213199 - Posted: 13 Dec 2005, 19:50:09 UTC
I tried to manually download one of these using TMR's ftodir and wget, but I'm getting 404 errors. These WUs seem to have disappeared before they could be downloaded. Usually you can re-download a WU until about 2 weeks after it first went out.
Regards
Hans

Tigher Volunteer tester Joined: 18 Mar 04 Posts: 1547 Credit: 760,577 RAC: 0	Message 213202 - Posted: 13 Dec 2005, 19:52:59 UTC - in response to Message 213199.
Hmmmm, strange goings-on! Well, I guess we wait until they run out of bad units before we get good ones.

Rayburner Volunteer tester Joined: 25 Nov 03 Posts: 18 Credit: 11,745,976 RAC: 0	Message 213203 - Posted: 13 Dec 2005, 19:53:15 UTC
Looks like a bad download to me; according to the log it was downloaded at 0 bytes/sec. In the project folder the WU is only 0 bytes in size.
13.12.2005 20:41:51|SETI@home|Started download of 22ap04ab.11574.32290.373588.127
13.12.2005 20:41:54|SETI@home|Finished download of 22ap04ab.11574.32290.373588.127
13.12.2005 20:41:54|SETI@home|Throughput 0 bytes/sec
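The pattern several people spotted by eye, a "Finished download" line followed by "Throughput 0 bytes/sec", can also be pulled out of a saved message log automatically. The Python sketch below assumes the log is pipe-separated as in the excerpts in this thread and that each throughput line refers to the most recent "Finished download" line; the function name `zero_throughput_downloads` is made up for illustration.

```python
import re

FINISHED = re.compile(r"Finished download of (\S+)")
THROUGHPUT = re.compile(r"Throughput (\d+) bytes/sec")


def zero_throughput_downloads(log_lines):
    """Return the names of files whose download finished at 0 bytes/sec."""
    suspects = []
    last_finished = None
    for line in log_lines:
        # Each line looks like "timestamp|project|message"; keep the message part.
        message = line.rstrip("\n").split("|", 2)[-1]
        match = FINISHED.search(message)
        if match:
            last_finished = match.group(1)
            continue
        match = THROUGHPUT.search(message)
        if match and last_finished is not None:
            if int(match.group(1)) == 0:
                suspects.append(last_finished)
            last_finished = None
    return suspects


sample = [
    "13.12.2005 20:41:51|SETI@home|Started download of 22ap04ab.11574.32290.373588.127",
    "13.12.2005 20:41:54|SETI@home|Finished download of 22ap04ab.11574.32290.373588.127",
    "13.12.2005 20:41:54|SETI@home|Throughput 0 bytes/sec",
    "13.12.2005 20:19:22|SETI@home|Finished download of 22ap04ab.11574.30993.997154.26",
    "13.12.2005 20:19:22|SETI@home|Throughput 51468 bytes/sec",
]
print(zero_throughput_downloads(sample))  # ['22ap04ab.11574.32290.373588.127']
```

A zero throughput figure is only a hint; checking the file size on disk is the more direct test.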

Clyde C. Phillips, III Joined: 2 Aug 00 Posts: 1851 Credit: 5,955,047 RAC: 0	Message 213207 - Posted: 13 Dec 2005, 19:58:20 UTC
Notice the workunits with identical numbers except for the last one to three digits at the end (0 to 255). I got three today. The data for those units are from the same segment of tape, i.e., they were taken at the same time. The band on the tape is 2.5 MHz wide, but the splitter splits the data on the tape into 256 bandlets, each 2,500,000 / 256 ≈ 9766 Hz wide, one per workunit. So if a workunit is contaminated, it is highly probable that some, or all, of its bandmates will be contaminated too.
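The arithmetic above, and the idea of "bandmates", can be worked through in a few lines of Python. This is only a sketch: it assumes, following the description above, that the last dotted field of a result name (before the `_n` suffix) is the bandlet number from 0 to 255 and that everything before it identifies the tape segment. The function name `group_bandmates` is made up for illustration.

```python
from collections import defaultdict

BAND_HZ = 2_500_000  # width of the recorded band, from the post above
BANDLETS = 256       # number of sub-bands the splitter produces

width_hz = BAND_HZ / BANDLETS
print(width_hz)         # 9765.625
print(round(width_hz))  # 9766


def group_bandmates(result_names):
    """Group result names by tape segment, collecting their bandlet numbers.

    Assumes names look like "<segment>.<bandlet>_<n>".
    """
    groups = defaultdict(list)
    for name in result_names:
        workunit = name.rsplit("_", 1)[0]           # drop the "_n" result suffix
        segment, bandlet = workunit.rsplit(".", 1)  # split off the last field
        groups[segment].append(int(bandlet))
    return dict(groups)


failed = [
    "22ap04ab.11574.19952.222162.36_2",
    "22ap04ab.11574.19952.222162.41_1",
    "22ap04ab.11574.19952.222162.42_2",
    "22ap04ab.11574.26368.684642.79_2",
    "27se04ab.6580.2850.648582.219_0",
]
print(group_bandmates(failed))
```

With the failed results listed earlier in the thread, three of the five fall into the same segment, which fits the bandmate explanation.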

Hans Dorn Volunteer developer Volunteer tester Joined: 3 Apr 99 Posts: 2262 Credit: 26,448,570 RAC: 0	Message 213208 - Posted: 13 Dec 2005, 19:58:26 UTC Last modified: 13 Dec 2005, 20:02:04 UTC
This is from ethereal. Weird...
GET /sah/download_fanout/5a/22ap04ab.11574.18450.161082.250 HTTP/1.0
User-Agent: BOINC client (i686-pc-linux-gnu 4.27)
Host: setiboincdata.ssl.berkeley.edu:80
Connection: close
Accept: */*

HTTP/1.1 302 Found
Date: Tue, 13 Dec 2005 19:23:14 GMT
Server: Apache/1.3.33 (Unix) mod_fastcgi/2.4.2
Location: http://boinc2.ssl.berkeley.edu/sah/download_fanout/5a/22ap04ab.11574.18450.161082.250
Connection: close
Content-Type: text/html; charset=iso-8859-1

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<HTML><HEAD>
<TITLE>302 Found</TITLE>
</HEAD><BODY>
<H1>Found</H1>
The document has moved <A HREF="http://boinc2.ssl.berkeley.edu/sah/download_fanout/5a/22ap04ab.11574.18450.161082.250">here</A>.<P>
<HR>
<ADDRESS>Apache/1.3.33 Server at setiboincdata.ssl.berkeley.edu Port 80</ADDRESS>
</BODY></HTML>

Should the client be able to parse this?
Regards
Hans
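As a rough illustration of what "parsing this" involves, the Python sketch below splits a raw response like the one captured above into status code and headers, then picks out the redirect target. It is not the BOINC client's code, and whether client 4.27 follows redirects is not established in this thread. The function names are made up for illustration. The rule encoded in `redirect_target` follows RFC 2616 section 10.3.3, which requires user confirmation for a 302 only when the request method is something other than GET or HEAD.

```python
def parse_response_head(raw):
    """Split a raw HTTP response into (status_code, reason, headers)."""
    head = raw.split("\r\n\r\n", 1)[0]
    lines = head.split("\r\n")
    _version, status, reason = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return int(status), reason, headers


def redirect_target(method, status, headers):
    """Return the URL a client may follow automatically, or None.

    Only the 302 case discussed in the thread is handled here.
    """
    if status == 302 and method in ("GET", "HEAD"):
        return headers.get("location")
    return None


raw = "\r\n".join([
    "HTTP/1.1 302 Found",
    "Date: Tue, 13 Dec 2005 19:23:14 GMT",
    "Server: Apache/1.3.33 (Unix) mod_fastcgi/2.4.2",
    "Location: http://boinc2.ssl.berkeley.edu/sah/download_fanout/5a/22ap04ab.11574.18450.161082.250",
    "Connection: close",
    "Content-Type: text/html; charset=iso-8859-1",
    "",
    "<HTML>...</HTML>",  # body shortened; it is not needed to find the target
])

status, reason, headers = parse_response_head(raw)
print(status, reason)
print(redirect_target("GET", status, headers))
```

Since the capture shows a plain GET, a client that implements redirects could follow the Location header here without asking anyone.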

Tigher Volunteer tester Joined: 18 Mar 04 Posts: 1547 Credit: 760,577 RAC: 0	Message 213213 - Posted: 13 Dec 2005, 20:03:05 UTC - in response to Message 213208. Last modified: 13 Dec 2005, 20:05:31 UTC
Hmmm, 302. Exists, but not here. Do not cache the new URI. Do not redirect without user agreement. It gets even stranger! I guess the client can't make sense of this. If it could, it would have to ask us for agreement to go to the new location. Otherwise it busts the RFC wide open, and ALL security. So I think the client will dump it.

Matt Lebofsky Volunteer moderator Project administrator Project developer Project scientist Joined: 1 Mar 99 Posts: 1444 Credit: 957,058 RAC: 0	Message 213218 - Posted: 13 Dec 2005, 20:07:04 UTC
Yup. Bad batch of workunits. Among other things, last week one of the storage devices filled up, so there are about 20,000 workunits that are 0-length files. Apparently we need to handle this case a bit better, but in the meantime we're just causing the clients to DOS us for a little bit before these workunits error out and get deleted. Think of it as a system-wide stress test.
- Matt
-- BOINC/SETI@home network/web/science/development person
-- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude

Tigher Volunteer tester Joined: 18 Mar 04 Posts: 1547 Credit: 760,577 RAC: 0	Message 213221 - Posted: 13 Dec 2005, 20:08:35 UTC - in response to Message 213218.
Matt, I would have thought you'd had enough of "stress" lately. :)

[B^S] Spydermb Volunteer tester Joined: 16 Jul 99 Posts: 496 Credit: 10,860,148 RAC: 0	Message 213238 - Posted: 13 Dec 2005, 20:18:22 UTC - in response to Message 213218.
Thanks for the FYI, Matt. Relax, man, things will be OK...

SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.