Bad Batch? All the WUs I get are being trashed
Author | Message |
---|---|
Joined: 18 Mar 04 Posts: 1547 Credit: 760,577 RAC: 0 |
|
Joined: 16 Jul 99 Posts: 496 Credit: 10,860,148 RAC: 0 |
"On both Linux and Windows systems every downloaded WU I get is being trashed. Unrecoverable Error : Exit Code 144 (no minus) or -112 or -6 have all been seen." Tigher, I am also getting these unrecoverable errors, mostly -112. I have one box that hasn't had any issues. I think I am going to reboot the one that does and see if it corrects itself. I wish I knew what the issue is. At least it isn't just me. LOL! BOINC SYNERGY is an International Team and We Welcome All BOINC Participants! |
Joined: 18 Mar 04 Posts: 1547 Credit: 760,577 RAC: 0 |
|
Grenadier Joined: 15 May 99 Posts: 63 Credit: 5,445,784 RAC: 0 |
I got a ton of these -6 errors this morning too. Shut down work requests for a few hours, and now everything seems fine again. (Although now I'm getting 'No work available' sometimes.) |
Joined: 17 Oct 01 Posts: 117 Credit: 1,316,241 RAC: 0 |
Yes, I'm getting them on one machine that recently downloaded; all others are OK: 13/12/2005 18:10:43|SETI@home|Starting result 22ap04ab.11574.28833.311076.14_0 using setiathome version 411 13/12/2005 18:10:43|SETI@home|Starting result 27se04ab.23281.3714.940902.88_2 using setiathome version 411 13/12/2005 18:10:44|SETI@home|Unrecoverable error for result 22ap04ab.11574.28833.311076.14_0 ( - exit code -6 (0xfffffffa)) 13/12/2005 18:10:44|SETI@home|Unrecoverable error for result 27se04ab.23281.3714.940902.88_2 ( - exit code -6 (0xfffffffa)) 13/12/2005 18:10:44||request_reschedule_cpus: process exited 13/12/2005 18:10:44|SETI@home|Finished download of 27se04ab.23281.3714.940902.84 13/12/2005 18:10:44|SETI@home|Throughput 0 bytes/sec 13/12/2005 18:10:44|SETI@home|Finished download of 27se04ab.23281.3714.940902.86 13/12/2005 18:10:44|SETI@home|Throughput 0 bytes/sec 13/12/2005 18:10:44|SETI@home|Computation for result Note the zero bytes per second download too. This machine is as stable as it gets: no overclock, and it has run many thousands of WUs with Tetsuji's app and hundreds with Crunch3r's new app with no issues. All other machines are also running the same client. Do I smell a rat? |
Hans Dorn Joined: 3 Apr 99 Posts: 2262 Credit: 26,448,570 RAC: 0 |
Hi, I got a bunch of these on one host: [SETI@home] Unrecoverable error for result 29se04aa.5782.13442.404844.8_1 (process exited with code 250 (0xfa)) Probably the same thing. It's back to normal now BTW. Regards Hans P.S.: Yikes! I still have lots of them in store: look out for zero-sized WUs in your project folder. P.P.S.: I'll try to delete them to force a new download. |
Albatros Joined: 2 Jul 00 Posts: 7 Credit: 245,899 RAC: 0 |
13.12.2005 20:19:03||Starting BOINC client version 5.3.1 for windows_intelx86 13.12.2005 20:19:03||libcurl/7.14.0 OpenSSL/0.9.8 zlib/1.2.3 13.12.2005 20:19:03||Data directory: D:\\Programme\\BOINC 13.12.2005 20:19:03|SETI@home|Found app_info.xml; using anonymous platform 13.12.2005 20:19:03||Processor: 1 AuthenticAMD AMD Sempron(tm) 2500+ 13.12.2005 20:19:03||Memory: 447.48 MB physical, 1.41 GB virtual 13.12.2005 20:19:03||Disk: 40.00 GB total, 25.05 GB free 13.12.2005 20:19:03|rosetta@home|Computer ID: 97839; location: home; project prefs: default 13.12.2005 20:19:03|Predictor @ Home|Computer ID: 190087; location: home; project prefs: default 13.12.2005 20:19:03|SETI@home|Computer ID: 1028322; location: home; project prefs: default 13.12.2005 20:19:03||General prefs: from rosetta@home (last modified 2005-12-12 20:30:59) 13.12.2005 20:19:03||General prefs: no separate prefs for home; using your defaults 13.12.2005 20:19:03||Remote control not allowed; using loopback address 13.12.2005 20:19:03|rosetta@home|Deferring computation for result 1ogw__topology_sample_38826_1 13.12.2005 20:19:03|Predictor @ Home|Resuming computation for result bprion_4_84032_2 using mfoldB125 version 428 13.12.2005 20:19:03|SETI@home|Sending scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi 13.12.2005 20:19:03|SETI@home|Reason: To fetch work 13.12.2005 20:19:03|SETI@home|Requesting 86400 seconds of new work 13.12.2005 20:19:04|rosetta@home|Restarting result 1ogw__topology_sample_38826_1 using rosetta version 480 13.12.2005 20:19:04|Predictor @ Home|Pausing result bprion_4_84032_2 (left in memory) 13.12.2005 20:19:13|SETI@home|Scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi succeeded 13.12.2005 20:19:14|SETI@home|Started download of 22ap04ab.11574.30993.997154.27 13.12.2005 20:19:14|SETI@home|Started download of 22ap04ab.11574.30993.997154.32 13.12.2005 20:19:17|SETI@home|Finished download of 22ap04ab.11574.30993.997154.27 13.12.2005 
20:19:17|SETI@home|Throughput 0 bytes/sec 13.12.2005 20:19:17|SETI@home|Finished download of 22ap04ab.11574.30993.997154.32 13.12.2005 20:19:17|SETI@home|Throughput 0 bytes/sec 13.12.2005 20:19:17|SETI@home|Started download of 22ap04ab.11574.30993.997154.34 13.12.2005 20:19:17|SETI@home|Started download of 22ap04ab.11574.30993.997154.26 13.12.2005 20:19:18|SETI@home|Deferring communication with project for 10 minutes and 0 seconds 13.12.2005 20:19:18||request_reschedule_cpus: files downloaded 13.12.2005 20:19:18||request_reschedule_cpus: files downloaded 13.12.2005 20:19:20|SETI@home|Finished download of 22ap04ab.11574.30993.997154.34 13.12.2005 20:19:20|SETI@home|Throughput 0 bytes/sec 13.12.2005 20:19:21||request_reschedule_cpus: files downloaded 13.12.2005 20:19:22|SETI@home|Finished download of 22ap04ab.11574.30993.997154.26 13.12.2005 20:19:22|SETI@home|Throughput 51468 bytes/sec 13.12.2005 20:19:23||request_reschedule_cpus: files downloaded "Throughput 0 bytes/sec" sounds very strange to me. When I had a look at the directory, I saw that 2 of the 3 files downloaded have a size of 0 bytes. Uli |
Joined: 17 Oct 01 Posts: 117 Credit: 1,316,241 RAC: 0 |
Confirmed: my -6 error WUs are also zero bytes on disk. |
Joined: 18 Mar 04 Posts: 1547 Credit: 760,577 RAC: 0 |
|
Joined: 28 Dec 99 Posts: 138 Credit: 10,216,553 RAC: 0 |
Here's what I've been getting: <core_client_version>5.2.13</core_client_version> <message> - exit code -6 (0xfffffffa) </message> <stderr_txt> SETI@home error -6 Bad workunit header !swi.data_type || !found || !swi.nsamples File: ..\\seti_header.cpp Line: 194 From these WUs: 22ap04ab.11574.19952.222162.36_2 22ap04ab.11574.19952.222162.41_1 22ap04ab.11574.19952.222162.42_2 22ap04ab.11574.26368.684642.79_2 27se04ab.6580.2850.648582.219_0 kev kev X2 4400+,4200+ @2.75GHz, XP1800+ @1.65GHz, P4 @1.6GHz |
Grenadier Joined: 15 May 99 Posts: 63 Credit: 5,445,784 RAC: 0 |
Spoke too soon. I'm still getting these too. Not sure if they're really bad WUs, or if they're just bad downloads (in which case turning off new work would help). |
Hans Dorn Joined: 3 Apr 99 Posts: 2262 Credit: 26,448,570 RAC: 0 |
I tried to manually download one of these using TMR's ftodir and wget, but I'm getting 404 errors. These WUs seem to have disappeared before they could be downloaded. Usually you can re-download a WU until about 2 weeks after it first went out. Regards Hans |
Joined: 18 Mar 04 Posts: 1547 Credit: 760,577 RAC: 0 |
"I tried to manually download one of these using TMR's ftodir and wget, but I'm getting 404 errors." Hmmmm, strange goings-on! Well, I guess we wait until they run out of bad units before we get good ones. |
Rayburner Joined: 25 Nov 03 Posts: 18 Credit: 11,745,976 RAC: 0 |
Looks like a bad download to me; according to the log it was downloaded at 0 bytes/sec. In the project folder the WU is 0 bytes in size. 13.12.2005 20:41:51|SETI@home|Started download of 22ap04ab.11574.32290.373588.127 13.12.2005 20:41:54|SETI@home|Finished download of 22ap04ab.11574.32290.373588.127 13.12.2005 20:41:54|SETI@home|Throughput 0 bytes/sec |
Joined: 2 Aug 00 Posts: 1851 Credit: 5,955,047 RAC: 0 |
Notice the workunits with identical numbers except for the last one to three digits at the end (0 to 255). I got three today. The data for those units are from the same segment of tape, i.e., were taken at the same time. The band on the tape is 2.5 MHz wide, but the splitter splits the data on the tape into 256 bandlets, each 2,500,000 / 256 ≈ 9766 Hz wide, one per workunit. So if a workunit is contaminated, it is highly probable that some, or all, of its bandmates will be contaminated, too. |
Hans Dorn Joined: 3 Apr 99 Posts: 2262 Credit: 26,448,570 RAC: 0 |
This is from ethereal. Weird...
Should the client be able to parse this? Regards Hans |
Joined: 18 Mar 04 Posts: 1547 Credit: 760,577 RAC: 0 |
"This is from ethereal. Weird..." Hmmm, 302. Exists but not here. Do not cache new URI. Do not redirect without user agreement. Gets even stranger! I guess the client can't make sense of this. If it could, it would have to ask us for agreement to go to the new location. Otherwise it busts the RFC wide open and ALL security. So I think the client will dump it. |
Joined: 1 Mar 99 Posts: 1444 Credit: 957,058 RAC: 0 |
Yup. Bad batch of workunits. Among other things, last week one of the storage devices filled up, so there are about 20,000 workunits that are 0-length files. Apparently we need to handle this case a bit better, but in the meantime we're just causing the clients to DoS us for a little bit before these workunits error out and get deleted. Think of it as a system-wide stress test. - Matt -- BOINC/SETI@home network/web/science/development person -- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude |
Joined: 18 Mar 04 Posts: 1547 Credit: 760,577 RAC: 0 |
"Yup. Bad batch of workunits. Among other things, last week one of the storage devices filled up, so there are about 20,000 workunits that are 0-length files. Apparently we need to handle this case a bit better, but in the meantime we're just causing the clients to DoS us for a little bit before these workunits error out and get deleted." Matt, I would have thought you would have had enough of "stress" lately. :) |
Joined: 16 Jul 99 Posts: 496 Credit: 10,860,148 RAC: 0 |
"Yup. Bad batch of workunits. Among other things, last week one of the storage devices filled up, so there are about 20,000 workunits that are 0-length files. Apparently we need to handle this case a bit better, but in the meantime we're just causing the clients to DoS us for a little bit before these workunits error out and get deleted." Thanks for the FYI, Matt. Relax man, things will be OK... BOINC SYNERGY is an International Team and We Welcome All BOINC Participants! |
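
For anyone who wants to check their own project folder for the zero-sized WUs described in this thread, here is a rough Python sketch. It is not part of BOINC or the SETI@home client. The function names (`subband_width_hz`, `segment_of`, `zero_length_by_segment`) are made up for illustration, and the assumption that a WU file name has five dot-separated fields ending in a numeric subband index comes only from the names quoted in the logs above. It lists 0-byte WU files grouped by tape segment, so affected "bandmates" show up together, and it also redoes the 2,500,000 / 256 arithmetic.

```python
from collections import defaultdict
from pathlib import Path

TAPE_BAND_HZ = 2_500_000  # band width quoted in the thread (2.5 MHz)
SUBBANDS = 256            # bandlets per tape segment, per the thread


def subband_width_hz():
    """Width in Hz of the bandlet carried by one workunit."""
    return TAPE_BAND_HZ / SUBBANDS


def segment_of(file_name):
    """Return the tape-segment prefix of a WU file name, or None.

    Assumption: names look like '22ap04ab.11574.30993.997154.27',
    i.e. five dot-separated fields with a numeric subband index last.
    """
    parts = file_name.split(".")
    if len(parts) != 5 or not parts[-1].isdigit():
        return None
    return ".".join(parts[:-1])


def zero_length_by_segment(project_dir):
    """Group the 0-byte WU files found in project_dir by tape segment."""
    groups = defaultdict(list)
    for path in sorted(Path(project_dir).iterdir()):
        if not path.is_file() or path.stat().st_size != 0:
            continue
        segment = segment_of(path.name)
        if segment is not None:
            groups[segment].append(path.name)
    return dict(groups)


if __name__ == "__main__":
    import sys

    print(f"bandlet width: {subband_width_hz()} Hz")
    if len(sys.argv) > 1:
        # Pass your own project folder path as the first argument.
        for segment, names in zero_length_by_segment(sys.argv[1]).items():
            print(segment, "->", len(names), "zero-length file(s)")
```

The sketch only reports files; it deletes nothing. Whether removing the empty files forces a clean re-download depends on the client and on the server still having the data, which, per the 404 reports above, it may not.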
©2025 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.