Computation Error - Bad Workunit Header
Author | Message |
---|---|
Jim Volfan Send message Joined: 22 May 99 Posts: 52 Credit: 24,239,706 RAC: 90 |
I just downloaded 4 workunits and all 4 errored out with Bad Workunit Header 3/7/2008 5:56:47 PM|SETI@home|Reason: Unrecoverable error for result 13fe08ac.24787.2526.4.7.242_1 ( - exit code -6 (0xfffffffa)) 3/7/2008 5:56:47 PM|SETI@home|Computation for task 13fe08ac.24787.2526.4.7.242_1 finished 3/7/2008 5:56:47 PM|SETI@home|Output file 13fe08ac.24787.2526.4.7.242_1_0 for task 13fe08ac.24787.2526.4.7.242_1 absent 3/7/2008 6:08:20 PM|SETI@home|Reason: Unrecoverable error for result 13fe08ac.24787.2526.4.7.81_0 ( - exit code -6 (0xfffffffa)) 3/7/2008 6:08:20 PM|SETI@home|Computation for task 13fe08ac.24787.2526.4.7.81_0 finished 3/7/2008 6:08:20 PM|SETI@home|Output file 13fe08ac.24787.2526.4.7.81_0_0 for task 13fe08ac.24787.2526.4.7.81_0 absent 3/7/2008 6:13:16 PM|SETI@home|Reason: Unrecoverable error for result 13fe08ac.24787.2526.4.7.223_0 ( - exit code -6 (0xfffffffa)) 3/7/2008 6:13:16 PM|SETI@home|Computation for task 13fe08ac.24787.2526.4.7.223_0 finished 3/7/2008 6:13:16 PM|SETI@home|Output file 13fe08ac.24787.2526.4.7.223_0_0 for task 13fe08ac.24787.2526.4.7.223_0 absent 3/7/2008 6:15:16 PM|SETI@home|Reason: Unrecoverable error for result 13fe08ac.24787.2526.4.7.187_1 ( - exit code -6 (0xfffffffa)) 3/7/2008 6:15:16 PM|SETI@home|Computation for task 13fe08ac.24787.2526.4.7.187_1 finished 3/7/2008 6:15:16 PM|SETI@home|Output file 13fe08ac.24787.2526.4.7.187_1_0 for task 13fe08ac.24787.2526.4.7.187_1 absent <core_client_version>5.8.16</core_client_version> <![CDATA[ <message> - exit code -6 (0xfffffffa) </message> <stderr_txt> SETI@home error -6 Bad workunit header !swi.data_type || !found || !swi.nsamples File: ..\\seti_header.cpp Line: 235 </stderr_txt> ]]> <core_client_version>5.8.16</core_client_version> <![CDATA[ <message> - exit code -6 (0xfffffffa) </message> <stderr_txt> SETI@home error -6 Bad workunit header !swi.data_type || !found || !swi.nsamples File: ..\\seti_header.cpp Line: 235 </stderr_txt> ]]> <core_client_version>5.8.16</core_client_version> 
<![CDATA[ <message> - exit code -6 (0xfffffffa) </message> <stderr_txt> SETI@home error -6 Bad workunit header !swi.data_type || !found || !swi.nsamples File: ..\\seti_header.cpp Line: 235 </stderr_txt> ]]> <core_client_version>5.8.16</core_client_version> <![CDATA[ <message> - exit code -6 (0xfffffffa) </message> <stderr_txt> SETI@home error -6 Bad workunit header !swi.data_type || !found || !swi.nsamples File: ..\\seti_header.cpp Line: 235 </stderr_txt> ]]> All 4 workunits were from the same series, 13fe08ac.24787.2526.4.7 I don't know what might have caused the issue, but I hope this helps others. |
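The log above reports the same exit code twice, as -6 and as 0xfffffffa. These are one 32-bit value read as signed and as unsigned. A minimal sketch of the conversion follows; the function names are illustrative only and are not part of BOINC or the SETI@home application.

```python
def to_unsigned32(code: int) -> int:
    """Return the unsigned 32-bit view of a signed exit code."""
    return code & 0xFFFFFFFF


def to_signed32(value: int) -> int:
    """Return the signed 32-bit view of an unsigned exit code."""
    value &= 0xFFFFFFFF
    # If the top bit is set, the value is negative in two's complement.
    return value - 0x100000000 if value & 0x80000000 else value


if __name__ == "__main__":
    print(hex(to_unsigned32(-6)))    # unsigned view of the code in the log
    print(to_signed32(0xFFFFFFFA))   # signed view of the same code
```

The same conversion applies to any exit code that a client log prints in both forms.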
Alinator Send message Joined: 19 Apr 05 Posts: 4178 Credit: 4,647,982 RAC: 0 |
Well, thanks for bringing that to our attention. There have been a few posts from folks over the last 24 to 36 hours which tended to suggest there may have been some 'shaky' work generated. Your case certainly adds some more evidence to that. ;-) Alinator |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874 |
We've had some cases in the past when workunit data files have been noticeably malformed: smaller than the normal 367KB, or even of zero size. Anyone who notices a task from the same series as Jim's (13fe08ac.24787) in their task list might like to check the data file in their SETI project directory and report back. |
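The file-size check suggested above could be automated along these lines. This is only a sketch under stated assumptions: the project directory path is something each user has to supply, the function name is made up for illustration, and the 360 KB threshold is a guess chosen to sit a little below the "normal 367KB" quoted in the post.

```python
from pathlib import Path


def find_suspect_files(project_dir, series_prefix, min_bytes=360 * 1024):
    """List (name, size) for files of one series that look too small.

    project_dir   -- path to the SETI project directory (user supplied)
    series_prefix -- e.g. "13fe08ac.24787", as used in the thread
    min_bytes     -- assumed threshold; anything smaller is reported
    """
    suspects = []
    for path in sorted(Path(project_dir).glob(series_prefix + "*")):
        if path.is_file():
            size = path.stat().st_size
            if size < min_bytes:
                suspects.append((path.name, size))
    return suspects
```

Calling `find_suspect_files(your_project_dir, "13fe08ac.24787")` returns an empty list when every matching file is at least the threshold size; a zero-length file shows up with size 0.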
Fred J. Verster Send message Joined: 21 Apr 04 Posts: 3252 Credit: 31,903,643 RAC: 0 |
We've had some cases in the past when workunit data files have been noticeably malformed: smaller than the normal 367KB, or even of zero size. Hi there, I have about 15 WUs: 8-3-2008 1:13:32|SETI@home|Scheduler request succeeded: got 7 new tasks 8-3-2008 1:13:34|SETI@home|Started download of 13fe08ac.24787.11524.4.7.47 8-3-2008 1:13:34|SETI@home|Started download of 13fe08ac.24787.11524.4.7.40 8-3-2008 1:13:39|SETI@home|Finished download of 13fe08ac.24787.11524.4.7.47 8-3-2008 1:13:39|SETI@home|Finished download of 13fe08ac.24787.11524.4.7.40 8-3-2008 1:13:39|SETI@home|Started download of 01ap07ad.18916.5389.8.7.210 8-3-2008 1:13:39|SETI@home|Started download of 13fe08ac.24787.11524.4.7.20 8-3-2008 1:13:43|SETI@home|Finished download of 01ap07ad.18916.5389.8.7.210 8-3-2008 1:13:43|SETI@home|Finished download of 13fe08ac.24787.11524.4.7.20 8-3-2008 1:13:43|SETI@home|Started download of 13fe08ac.24787.11524.4.7.65 8-3-2008 1:13:43|SETI@home|Started download of 13fe08ac.24787.11524.4.7.15 8-3-2008 1:13:47|SETI@home|Sending scheduler request: To fetch work. Requesting 3752 seconds of work, reporting 1 completed tasks 8-3-2008 1:13:48|SETI@home|Finished download of 13fe08ac.24787.11524.4.7.65 8-3-2008 1:13:48|SETI@home|Finished download of 13fe08ac.24787.11524.4.7.15 8-3-2008 1:13:48|SETI@home|Started download of 13fe08ac.24787.11524.4.7.25 8-3-2008 1:13:53|SETI@home|Finished download of 13fe08ac.24787.11524.4.7.25 8-3-2008 1:13:53|SETI@home|Scheduler request succeeded: got 1 new tasks 8-3-2008 1:13:55|SETI@home|Started download of 13fe08ac.24787.11524.4.7.175 8-3-2008 1:13:59|SETI@home|Finished download of 13fe08ac.24787.11524.4.7.175 8-3-2008 1:14:04|SETI@home|Sending scheduler request: To fetch work. 
Requesting 1758 seconds of work, reporting 0 completed tasks 8-3-2008 1:14:10|SETI@home|Scheduler request succeeded: got 1 new tasks 8-3-2008 1:14:12|SETI@home|Started download of 13fe08ac.24787.11524.4.7.212 8-3-2008 1:14:16|SETI@home|Finished download of 13fe08ac.24787.11524.4.7.212 8-3-2008 1:14:21|SETI@home|Sending scheduler request: To fetch work. Requesting 204 seconds of work, reporting 0 completed tasks 8-3-2008 1:14:27|SETI@home|Scheduler request succeeded: got 1 new tasks 8-3-2008 1:14:29|SETI@home|Started download of 13fe08ac.24787.11524.4.7.251 8-3-2008 1:14:32|SETI@home|Finished download of 13fe08ac.24787.11524.4.7.251 These look like they are from a similar batch. I can't find whether any of them has been 'crunched' and, if OK, sent back to Berkeley. I'll look in the BOINC log files to see if any has been processed. If so, I'll let ya know ;) |
Bert Send message Joined: 12 Oct 06 Posts: 84 Credit: 813,295 RAC: 0 |
We've had some cases in the past when workunit data files have been noticeably malformed: smaller than the normal 367KB, or even of zero size. I have two 13fe08ac but the next 5 digits are 15641. Size seems OK, 367 KB. |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874 |
I thought I'd better check if it was the Radar-blanking test tape, but no - that was 28ja08aa. |
Sutaru Tsureku Send message Joined: 6 Apr 07 Posts: 7105 Credit: 147,663,825 RAC: 5 |
I got 19 [EDIT: 20] x Client error - Compute error on my QX6700 too.. ..because of: <message> - exit code -6 (0xfffffffa) </message> <stderr_txt> SETI@home error -6 Bad workunit header !swi.data_type || !found || !swi.nsamples File: ..\\seti_header.cpp Line: 235 |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874 |
I got 19 x Client error - Compute error on my QX6700 too.. I checked the 13 on the first three pages, and they're all from 13fe08ac.24787 too. I'm not searching the remaining 874 tasks for the six I've missed, but it seems likely we're just seeing one rogue splitter on one tape. |
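Checking which series a long list of failed tasks belongs to can be done mechanically rather than page by page. The sketch below assumes, as the thread does when it writes "13fe08ac.24787", that the first two dot-separated fields of a task name identify the tape and the series; the function names are hypothetical.

```python
from collections import Counter


def series_of(task_name: str) -> str:
    """Return the 'tape.series' prefix of a task name.

    Assumption: the first two dot-separated fields identify the series,
    e.g. '13fe08ac.24787.2526.4.7.242_1' -> '13fe08ac.24787'.
    """
    parts = task_name.split(".")
    return ".".join(parts[:2])


def count_by_series(task_names):
    """Count how many task names fall into each series."""
    return Counter(series_of(name) for name in task_names)


if __name__ == "__main__":
    failed = [
        "13fe08ac.24787.2526.4.7.242_1",
        "13fe08ac.24787.2526.4.7.81_0",
        "13fe08ac.24787.2526.4.7.223_0",
        "13fe08ac.24787.2526.4.7.187_1",
    ]
    for series, count in count_by_series(failed).most_common():
        print(series, count)
```

If nearly all failures land in one series, that supports the "one rogue splitter on one tape" reading; a spread across many series would point elsewhere.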
Sutaru Tsureku Send message Joined: 6 Apr 07 Posts: 7105 Credit: 147,663,825 RAC: 5 |
I got 19 x Client error - Compute error on my QX6700 too.. ..they are on the first 8 pages.. |
Jim-R. Send message Joined: 7 Feb 06 Posts: 1494 Credit: 194,148 RAC: 0 |
This definitely sounds like splitter problems. I received a bunch of errors like this on several occasions. One that I got to check on had a bunch of WUs sent out with the binary data intact but just no header in it. On another occasion I received a bunch of files that were zero length! They didn't contain any header or data. On both occasions it was determined that a splitter had acted up and created the defective files. ERIC! Looks like you need to use the ole boot on one of the splitters again! Jim Some people plan their life out and look back at the wealth they've had. Others live life day by day and look back at the wealth of experiences and enjoyment they've had. |
dnolan Send message Joined: 30 Aug 01 Posts: 1228 Credit: 47,779,411 RAC: 32 |
We've had some cases in the past when workunit data files have been noticeably malformed: smaller than the normal 367KB, or even of zero size. Got 13 of these between 2 of my machines, all look like the correct size to me, guess I'll wait and see what happens. [edit] Mine are all of the 13fe08ac.24787 series, though... not 15641 -Dave |
Sutaru Tsureku Send message Joined: 6 Apr 07 Posts: 7105 Credit: 147,663,825 RAC: 5 |
I got 19 [EDIT: 20] x Client error - Compute error on my QX6700 too.. 19 x 13fe08ac.24787.xxxx 1 x 13fe08ac.8515.xxxx |
KWSN Ekky Ekky Ekky Send message Joined: 25 May 99 Posts: 944 Credit: 52,956,491 RAC: 67 |
In the last few hours: 13fe08ac.24787.18477.4.7.170 13fe08ac.8515.20931.3.7.146 13fe08ac.8515.890.3.7.214 but a few days ago: 01ap07aa.20556.16841.5.7.0 (created 05/03/08) Not had a computation error for months prior to the above. |
SATAN Send message Joined: 27 Aug 06 Posts: 835 Credit: 2,129,006 RAC: 0 |
Just had 68 go through with another load still downloading. I was hoping to fill the cache as well over the weekend, never mind. |
Fred J. Verster Send message Joined: 21 Apr 04 Posts: 3252 Credit: 31,903,643 RAC: 0 |
Just had 68 go through with another load still downloading. I was hoping to fill the cache as well over the weekend, never mind. Had a batch of 13fe08ac.24787 and others from 13fe08ac, see my post from earlier in this thread. Didn't see any computing errors so far; most of them are still waiting for upload, as their deadline isn't reached until 30 March 2008. They almost all have triplets, with triplet power up to 11. How do you 'display' such images/triplets? I've taken screenshots with Print Screen and saved part of the images as *.png files. Can I upload them, or do I have to 'own' a URL to use it? |
Fred W Send message Joined: 13 Jun 99 Posts: 2524 Credit: 11,954,210 RAC: 0 |
Just had 68 go through with another load still downloading. I was hoping to fill the cache as well over the weekend, never mind. I use ImageShack. Free and convenient. F. |
Iona Send message Joined: 12 Jul 07 Posts: 790 Credit: 22,438,118 RAC: 0 |
I have also had a load of 'Compute Errors' on my 'second' PC. In frustration, I virtually took the PC apart, tested some parts (RAM and CPU) in its virtual twin, brushed out and vacuum cleaned the HSF and cooling fans and reinstalled BOINC. Guess what?! More of the same.... d'oh! I then checked and found that some of the WUs had returned results, which was not the case at the time - they're showing the same error. All those different PCs can't all be wrong, eh? Oh well, I'm off to visit my family for a couple of days, from tomorrow, so hopefully they'll get things sorted out by the time I return. Don't take life too seriously, as you'll never come out of it alive! |
SATAN Send message Joined: 27 Aug 06 Posts: 835 Credit: 2,129,006 RAC: 0 |
Have reached download limit for today, so will have to wait till tomorrow to get some more of the dodgy ones. |
Matthias Lehmkuhl Send message Joined: 5 Oct 99 Posts: 28 Credit: 10,832,348 RAC: 53 |
I got one WU with error - exit code -1073741819 (0xc0000005): 13fe08ac.23325. It's not the same error, but it is also from 13fe08ac. One result is with setiathome_5.27_windows_intel, boinc 5.8.16, XP SP2; mine is with KWSN_2.4V_SSSE3_MB, boinc 5.10.30, Vista. Both crunched results got the same error immediately after the start. wuid=234464976 Matthias |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874 |
I got three from the bad tape (13fe08ac) on host 3751792. They are WUs 234331734, 234211111, and 234081520 - identified as 13fe08ac.6464, 13fe08ac.24787 and 13fe08ac.8515 respectively. They all failed with the 'bad workunit header' error -6. So I'm looking at the header for the 13fe08ac.24787 task, and comparing it with the saved header for 772524490, a 13fe08ac.18960 task which crunched OK a few days ago. They're noticeably different. Red for the bad header, green for the good header. [color=red] <tape_info> <name>13fe08ac</name> <start_time>2454509.997042</start_time> <last_block_time>2454509.997042</last_block_time> <last_block_done>16841</last_block_done> <missed>0</missed> <tape_quality>0</tape_quality> <beam>1</beam> <--------------------------------------difference </tape_info> <name>13fe08ac.24787.16841.4.7</name> <---------------difference <data_desc>[/color] [color=green] <tape_info> <name>13fe08ac</name> <start_time>2454510.0702579</start_time> <last_block_time>2454510.0702579</last_block_time> <last_block_done>47005</last_block_done> <missed>0</missed> <tape_quality>0</tape_quality> <sb_id>0</sb_id> <------------------------------------difference </tape_info> <name>13fe08ac</name> <-------------------------------difference <data_desc>[/color] [color=red] <receiver_cfg> <s4_id>4</s4_id> <name>Arecibo 1.4GHz Array, Beam 0, Pol 1</name> <beam_width>0.0500000007</beam_width> <center_freq>1420</center_freq> <latitude>18.3538056</latitude> <longitude>-66.7552222</longitude> <elevation>497</elevation> <diameter>168</diameter> <az_orientation>180</az_orientation> <az_corr_coeff length=99 encoding="x-csv"> -37,-6.05,92.35,-731.21,-1013.97,-24.53,-11.19,9.18,106.04,3.02,-1.74, -3.46,1.29 </az_corr_coeff> <zen_corr_coeff length=99 encoding="x-csv"> -57.55,-95.56,-4.13,141.69,677.51,-10.41,-7.71,-10.39,0.08,0.43,-0.62, 0.03,-0.36 </zen_corr_coeff> <array_az_ellipse>0</array_az_ellipse> <-------------addition <array_za_ellipse>0</array_za_ellipse> 
<-------------addition <array_angle>0</array_angle> <-----------------------addition </receiver_cfg>[/color] [color=green] <receiver_cfg> <s4_id>8</s4_id> <name>Arecibo 1.4GHz Array, Beam 2, Pol 1</name> <beam_width>0.0500000007</beam_width> <center_freq>1420</center_freq> <latitude>18.3538056</latitude> <longitude>-66.7552222</longitude> <elevation>497</elevation> <diameter>168</diameter> <az_orientation>180</az_orientation> <az_corr_coeff length=105 encoding="x-csv"> -37,-6.05,92.35,-731.21,-1013.97,-24.53,-11.19,9.18,106.04,3.02,-1.74, -3.46,1.29 </az_corr_coeff> <zen_corr_coeff length=105 encoding="x-csv"> -57.55,-95.56,-4.13,141.69,677.51,-10.41,-7.71,-10.39,0.08,0.43,-0.62, 0.03,-0.36 </zen_corr_coeff> </receiver_cfg>[/color] [color=red] <splitter_cfg> <version>0</version> <---------------------------difference <data_type></data_type> <------------------------empty <fft_len>0</fft_len> <---------------------------difference <ifft_len>0</ifft_len> <-------------------------difference <filter></filter> <------------------------------empty <window></window> <------------------------------empty <samples_per_wu>0</samples_per_wu> <-------------addition <highpass>0</highpass> <-------------------------addition </splitter_cfg>[/color] [color=green] <splitter_cfg> <version>0.200000003</version> <-----------------difference <data_type>encoded</data_type> <fft_len>2048</fft_len> <------------------------difference <ifft_len>8</ifft_len> <-------------------------difference <filter>fft</filter> <window>welsh</window> </splitter_cfg>[/color] The rest looks plausible, so I guess it's that splitter block which is causing the damage. Unless it's the new line at the bottom of analysis_cfg: <credit_rate>2.8499999</credit_rate>. Yea. |
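The comparison above suggests a quick sanity check on the `<splitter_cfg>` block of a workunit header. The following is a rough sketch only: it assumes that an empty `data_type` or a zero `fft_len`/`ifft_len` marks a bad header, as in the red example, and it uses regular expressions rather than an XML parser because the header contains unquoted attributes such as `length=99`. The function name is hypothetical and this is not the check the SETI@home application itself performs.

```python
import re


def splitter_cfg_problems(header_text: str) -> list:
    """Return a list of human-readable problems found in <splitter_cfg>."""
    match = re.search(r"<splitter_cfg>(.*?)</splitter_cfg>", header_text, re.DOTALL)
    if match is None:
        return ["no <splitter_cfg> block found"]
    block = match.group(1)

    def field(name):
        # Text content of <name>...</name> inside the block, or None if absent.
        m = re.search(rf"<{name}>(.*?)</{name}>", block, re.DOTALL)
        return m.group(1).strip() if m else None

    problems = []
    if not field("data_type"):
        problems.append("data_type is missing or empty")
    for name in ("fft_len", "ifft_len"):
        value = field(name)
        if value is None or not value.isdigit() or int(value) == 0:
            problems.append(f"{name} is missing, non-numeric, or zero")
    return problems
```

Run against the two `<splitter_cfg>` blocks quoted above, the green header should produce an empty list and the red one should report `data_type`, `fft_len`, and `ifft_len`.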
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.