Computation Error - Bad Workunit Header

Message boards : Number crunching : Computation Error - Bad Workunit Header
Keith T.
Volunteer tester
Joined: 23 Aug 99
Posts: 962
Credit: 537,293
RAC: 9
United Kingdom
Message 724604 - Posted: 11 Mar 2008, 12:04:59 UTC - in response to Message 724589.  

I thought that bad WU's got cancelled after 5 errors, but I've just checked my only one of these and it has now reached 5 errors with a new Unsent generated. http://setiathome.berkeley.edu/workunit.php?wuid=234337343

The one that has been sent was sent before the last one that reported in. This would be normal operation, because before that BOINC would be trying to get two units to form quorum.
It should now mark the unsent unit as not needed.


Three seconds after the 5th compute error was reported (11 Mar 2008 9:11:53 UTC), another task was created (11 Mar 2008 9:11:56 UTC).

That task has not been sent yet, but I thought the 5 error rule should not have created it at all.
Sir Arthur C Clarke 1917-2008
ID: 724604
Alinator
Volunteer tester
Joined: 19 Apr 05
Posts: 4178
Credit: 4,647,982
RAC: 0
United States
Message 724611 - Posted: 11 Mar 2008, 13:11:54 UTC
Last modified: 11 Mar 2008, 13:12:54 UTC

Apparently the rule is inclusive. All the clinkers I have didn't get poofed until the sixth error came back.

I suppose that makes sense when you consider the case of an IR of 2 with Max Errors of 1. If you poofed the WU when the first error occurred, and that error was due to a host-side problem rather than a bad WU, the wingman would always get burned if it had already run successfully, or had started to run and would complete successfully.

I wonder if the inclusive hypothesis holds for the IR of 1 Max Error 0 case. I guess you'd have to ask that one over at Rosetta. ;-)
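
The "inclusive" behaviour described above can be sketched as a strictly-greater-than check. This is a minimal Python sketch, not BOINC's server code: the function name `should_cancel` and its parameters are made up for illustration, and the comparison itself is an assumption based on what was observed in this thread (WUs with a limit of 5 only going away after the sixth error).

```python
def should_cancel(error_count: int, max_errors: int) -> bool:
    """Hypothetical check: cancel only once the error count exceeds the limit.

    Assumption drawn from this thread: a WU with a limit of 5 errors
    survives the 5th error and is cancelled when the 6th comes back.
    """
    return error_count > max_errors


# Limit of 5: the 5th error still leads to a resend, the 6th cancels.
for errors in (4, 5, 6):
    print(errors, should_cancel(errors, max_errors=5))

# The IR 2 / Max Errors 1 case: a single host-side error does not
# cancel the WU, so the wingman's good result is not wasted.
print(should_cancel(1, max_errors=1))
```

Under the same assumption, a Max Errors of 0 would cancel the WU on the very first error, which is the case the Rosetta question is about.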

Alinator
ID: 724611
Fred W
Volunteer tester
Joined: 13 Jun 99
Posts: 2524
Credit: 11,954,210
RAC: 0
United Kingdom
Message 724642 - Posted: 11 Mar 2008, 14:57:09 UTC

I noted in an earlier post, last night, that Windows search using empty XML tags would not identify the duff WU's. As I am continuing to be issued resends of these little b******rs (7 so far today), I have written a batch file to run from the command window:
----------------------------------------------------------
@echo off
REM 
REM ** The following line must point to the drive letter of the disk or partition where your Boinc folder is located.
C:
REM 
REM ** The following line must contain the full path to your setiathome.berkeley.edu folder.
CD "\Program Files\BOINC\projects\setiathome.berkeley.edu"
@echo The following WU's should be aborted:
@echo.
REM 
REM ** Now the command to find the duff files
findstr /S "<data_type></data_type>" 13fe08ac*
REM 
REM ** And keep the window open until a key is pressed
pause

----------------------------------------------------------

I have saved this as 13fe08ac.bat (note the extension - especially if you run your system with the default "hide known file types") on my Windows Desktop.
If you want to use this file, then copy the text into Notepad (or an equivalent simple text editor), ensure that the lines following the REM ** comments are edited to suit your particular system, and save it with the .bat extension.
Double-clicking on this file lists all the faulty WU's in your cache. You can then sort the files by name in the Grid view of Boinc Manager and abort the matching ones.
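
For anyone not on Windows, or who would rather not use findstr, the same search can be sketched in Python. This is only an illustration under stated assumptions: the function name `find_bad_workunits` is made up here, the empty `<data_type></data_type>` tag and the `13fe08ac` prefix are taken from the batch file above, and the folder path at the bottom is the default one from that batch file and needs editing to suit your system.

```python
from pathlib import Path

# The empty tag that the findstr command in the batch file searches for.
EMPTY_TAG = b"<data_type></data_type>"


def find_bad_workunits(project_dir, prefix="13fe08ac"):
    """Return sorted names of files under project_dir whose name starts
    with prefix and whose contents include the empty data_type tag."""
    root = Path(project_dir)
    if not root.is_dir():
        return []
    bad = []
    # rglob also walks subfolders, like findstr /S does.
    for path in root.rglob(prefix + "*"):
        if path.is_file() and EMPTY_TAG in path.read_bytes():
            bad.append(path.name)
    return sorted(bad)


if __name__ == "__main__":
    # Default install path from the batch file; edit to suit your system.
    folder = r"C:\Program Files\BOINC\projects\setiathome.berkeley.edu"
    print("The following WU's should be aborted:")
    for name in find_bad_workunits(folder):
        print(name)
```

As with the batch file, this only lists the suspect files; the matching tasks still have to be aborted by hand in Boinc Manager.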

Hope this is of use to someone.

F.
ID: 724642
[KWSN]John Galt 007
Volunteer tester
Joined: 9 Nov 99
Posts: 2444
Credit: 25,086,197
RAC: 0
United States
Message 724648 - Posted: 11 Mar 2008, 15:12:16 UTC - in response to Message 724642.  

Fred...thanks. Found another of those buggers. Script works great!!
Clk2HlpSetiCty:::PayIt4ward

ID: 724648
Fred W
Volunteer tester
Joined: 13 Jun 99
Posts: 2524
Credit: 11,954,210
RAC: 0
United Kingdom
Message 724649 - Posted: 11 Mar 2008, 15:14:05 UTC - in response to Message 724648.  


No problemo. Happy to be of assistance.

F.
ID: 724649
DEDamouth
Volunteer tester
Joined: 22 Jan 02
Posts: 2
Credit: 8,411,568
RAC: 0
United States
Message 724665 - Posted: 11 Mar 2008, 16:02:43 UTC

I've had half a dozen workunits fail over the past few days - all in the 13fe08ac series. Here's the most recent:

3/11/2008 12:31:42 AM|SETI@home|Reason: Unrecoverable error for result 13fe08ac.6032.4571.7.7.25_0 ( - exit code -6 (0xfffffffa))

/Dave
/Dave
Dave@Damouth.com
ID: 724665
OzzFan
Volunteer tester
Joined: 9 Apr 02
Posts: 15691
Credit: 84,761,841
RAC: 28
United States
Message 724667 - Posted: 11 Mar 2008, 16:04:28 UTC - in response to Message 724642.  
Last modified: 11 Mar 2008, 16:14:34 UTC

You can always do this:

@echo off
REM %~1 is the first argument with any surrounding quotes removed.
if "%~1"=="" if exist "C:\Program Files\BOINC\projects\setiathome.berkeley.edu\." goto default
if not "%~1"=="" if exist "%~1\BOINC\projects\setiathome.berkeley.edu\." goto CL_ARG
echo.
echo Error!  Cannot find default or specified directory.
echo Please check your spelling and try again.
REM Show the user where this program searches by default and what they typed at runtime:
echo BOINC cannot be found in "C:\Program Files" or in "%~1".
goto end

:default
C:
CD "\Program Files\BOINC\projects\setiathome.berkeley.edu"
echo The following WU's should be aborted:
echo.
REM 
REM ** Now the command to find the duff files
findstr /S "<data_type></data_type>" 13fe08ac*
goto end

:CL_ARG
echo The following WU's should be aborted:
echo.
REM ** Search only the 13fe08ac files under the folder given on the command line
findstr /S "<data_type></data_type>" "%~1\BOINC\projects\setiathome.berkeley.edu\13fe08ac*"
goto end

:end
pause


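This will allow you to use either the default directory or a user-specified one, given after the batch file name. Note that the argument is the folder that contains BOINC (for example "C:\Program Files"), not the project folder itself.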
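
The default-or-argument choice made by the batch file above can also be sketched in Python. This is a minimal illustration, not a replacement for the script: the names `resolve_project_dir`, `DEFAULT_PARENT` and `PROJECT_SUBPATH` are invented for this example, and the default path is the one assumed in the batch file.

```python
import sys
from pathlib import Path

# Default location used in the batch file.
DEFAULT_PARENT = r"C:\Program Files"
# Sub-path the batch file appends to the chosen parent folder.
PROJECT_SUBPATH = Path("BOINC", "projects", "setiathome.berkeley.edu")


def resolve_project_dir(args, default_parent=DEFAULT_PARENT):
    """Return the project folder, or None if it does not exist.

    args is the list of command-line arguments after the script name;
    the first one, if present, replaces the default parent folder.
    """
    parent = Path(args[0]) if args else Path(default_parent)
    candidate = parent / PROJECT_SUBPATH
    return candidate if candidate.is_dir() else None


if __name__ == "__main__":
    project_dir = resolve_project_dir(sys.argv[1:])
    if project_dir is None:
        print("Error! Cannot find default or specified directory.")
    else:
        print("Searching in", project_dir)
```

Returning None rather than raising keeps the calling code close to the batch file's flow, where a missing folder simply leads to the error message.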
ID: 724667
zoom3+1=4
Volunteer tester
Joined: 30 Nov 03
Posts: 65771
Credit: 55,293,173
RAC: 49
United States
Message 724670 - Posted: 11 Mar 2008, 16:17:51 UTC - in response to Message 724642.  
Last modified: 11 Mar 2008, 16:22:06 UTC

Nice script, but since I'm using the 6.1.0 that Crunch3r released, I can sort BOINC by what it has, then abort (or suspend and then abort) at will. So the script is nice, but it isn't something I need. Of course, I can't run a benchmark either, and I've tried a few times; it must be a bug in 6.1.0 (I can live with that one). And I've found over 36 of the stinkers too.
The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's
ID: 724670
Fred W
Volunteer tester
Joined: 13 Jun 99
Posts: 2524
Credit: 11,954,210
RAC: 0
United Kingdom
Message 724683 - Posted: 11 Mar 2008, 21:20:39 UTC - in response to Message 724667.  

Nice tweak, OzzFan. As I don't expect to be changing the location of the Boinc folder, I didn't see any problem baking the path into the batch file. But a nice alternative to offer.

F.
ID: 724683
Fred W
Volunteer tester
Joined: 13 Jun 99
Posts: 2524
Credit: 11,954,210
RAC: 0
United Kingdom
Message 724688 - Posted: 11 Mar 2008, 21:27:22 UTC - in response to Message 724670.  


You don't need the Crunch3r 6.1.0 to sort - that's standard in Grid view (at least on the stock 5.10.x). Problem is, that doesn't tell you whether a particular WU is good or bad. It would appear that only the 13fe08ac WU's that were split between 7th and 9th March have the error, and it is pointless aborting any others - indeed, you will be reducing your daily max quota more than you need to, for no purpose! The script identifies the particular WU's that have the problem - THEN you can use the sort capability in Boinc Manager to help you mark and abort them.

F.
ID: 724688
zoom3+1=4
Volunteer tester
Joined: 30 Nov 03
Posts: 65771
Credit: 55,293,173
RAC: 49
United States
Message 724699 - Posted: 11 Mar 2008, 21:42:17 UTC - in response to Message 724688.  


Well then, I've already done what your script does, but then I have the time and only 3 PCs, so it's not a big deal. As to the quota, I haven't had any problems with it, so I'm not worrying about it, as it hasn't shown up.
The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's
ID: 724699
Fred W
Volunteer tester
Joined: 13 Jun 99
Posts: 2524
Credit: 11,954,210
RAC: 0
United Kingdom
Message 724703 - Posted: 11 Mar 2008, 21:45:56 UTC - in response to Message 724699.  


I assume what you are saying is that you have aborted all 13fe08ac WU's whether they were bad or not?

F.
ID: 724703
OzzFan
Volunteer tester
Joined: 9 Apr 02
Posts: 15691
Credit: 84,761,841
RAC: 28
United States
Message 724707 - Posted: 11 Mar 2008, 21:50:16 UTC - in response to Message 724683.  

I figured I was doing a favor for those who have changed their BOINC path due to running Vista, while keeping it simple for those who haven't changed their BOINC path.

Just trying to help on a larger scale. ;-)
ID: 724707
zoom3+1=4
Volunteer tester
Joined: 30 Nov 03
Posts: 65771
Credit: 55,293,173
RAC: 49
United States
Message 724708 - Posted: 11 Mar 2008, 21:50:46 UTC - in response to Message 724703.  
Last modified: 11 Mar 2008, 21:51:45 UTC


Yes, of course, but then your script wasn't out until after I had done it, so what? If any were good they'll be reissued to somebody else. It doesn't matter that much; I just like to crunch fast, not to waste CPU time or look for credit. But then it's just something to pass the time here, a hobby, just like model railroading or an aquarium.
The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's
ID: 724708
OzzFan
Volunteer tester
Joined: 9 Apr 02
Posts: 15691
Credit: 84,761,841
RAC: 28
United States
Message 724709 - Posted: 11 Mar 2008, 21:52:56 UTC - in response to Message 724670.  

I figured Fred was offering it as a simple, single command to take care of the problem instead of all that clicking and pointing (which is the great thing about scripts!).
ID: 724709
Fred W
Volunteer tester
Joined: 13 Jun 99
Posts: 2524
Credit: 11,954,210
RAC: 0
United Kingdom
Message 724713 - Posted: 11 Mar 2008, 22:00:15 UTC - in response to Message 724709.  

Well, even after the script has run you still have to click and point and abort. But, as I have found today, there will be resends of these little sods for a good few days to come so I plan on running the batch file regularly over the next week or two (easier than eyeballing the list in Boinc Manager!)

F.
ID: 724713
Fred W
Volunteer tester
Joined: 13 Jun 99
Posts: 2524
Credit: 11,954,210
RAC: 0
United Kingdom
Message 724715 - Posted: 11 Mar 2008, 22:01:56 UTC - in response to Message 724707.  
Last modified: 11 Mar 2008, 22:02:56 UTC

All contributions gratefully received :)

F.
ID: 724715
zoom3+1=4
Volunteer tester
Joined: 30 Nov 03
Posts: 65771
Credit: 55,293,173
RAC: 49
United States
Message 724880 - Posted: 12 Mar 2008, 2:25:05 UTC - in response to Message 724642.  
Last modified: 12 Mar 2008, 2:30:08 UTC

OK, I tried it. It does operate across an Ethernet connection (by dropping the file onto the desktop of each PC and then clicking on the file; one needs their permissions set first). Of course, I still have to go to each PC and abort them in person. KVMs are really useful here: no turning a PC off to install a video card. VM software is nice, but I don't have the interest and, like I said, I have a KVM that I invested in. But the fact is that, the way the column sort works in 6.1.0 (x64 version), I can find all of them at a glance, as they (the 13fe08ac type files) can all be near the top.
The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's
ID: 724880
Sutaru Tsureku
Volunteer tester
Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 724913 - Posted: 12 Mar 2008, 4:05:47 UTC

@ JokerCPoC

Use Crunch3r's V6.1.0.xx_V3.
V5 will be coming soon.

RED PILL

The benchmark is working well.
ID: 724913
KWSN Ekky Ekky Ekky
Joined: 25 May 99
Posts: 944
Credit: 52,956,491
RAC: 67
United Kingdom
Message 725058 - Posted: 12 Mar 2008, 15:45:36 UTC - in response to Message 724332.  

I am thinking of establishing a group dedicated to the joys of the 13fe08ac set. Its avowed aim will be to achieve a total of zero credits for all eternity.

I've just seen that my work machine (which I cannot get to at present to abort it) has the following past and current results:

created 8 Mar 2008 23:08:56 UTC
name 13fe08ac.14310.20113.9.7.140

Result ID | Computer | Sent | Time reported or deadline | Server state | Outcome | Client state | CPU time (sec) | Claimed credit | Granted credit
778895416 | 4239834 | 9 Mar 2008 19:49:30 UTC | 10 Mar 2008 6:01:45 UTC | Over | Client error | Compute error | 23.00 | 0.10 | ---
780044493 | 3800150 | 11 Mar 2008 0:30:42 UTC | 11 Mar 2008 0:32:18 UTC | Over | Client error | Compute error | 16.34 | 0.07 | ---
780719848 | 2982130 | 11 Mar 2008 20:57:30 UTC | 11 Mar 2008 20:59:44 UTC | Over | Client error | Compute error | 42.08 | 0.05 | ---
781501062 | 4069706 | 12 Mar 2008 13:07:21 UTC | 19 Mar 2008 13:07:21 UTC | In Progress | Unknown | New | --- | --- | ---
778895415 | 2165675 | 9 Mar 2008 19:49:32 UTC | 9 Mar 2008 19:51:27 UTC | Over | Client error | Compute error | 28.44 | 0.06 | ---
780799058 | 3904771 | 11 Mar 2008 21:29:22 UTC | 18 Mar 2008 21:29:22 UTC | In Progress | Unknown | New | --- | --- | ---
779661240 | 4245110 | 10 Mar 2008 15:01:05 UTC | 10 Mar 2008 20:29:35 UTC | Over | Client error | Compute error | 18.91 | 0.08 | ---

This one looks set to run and run! Is there any record for the number of attempts to complete a bad unit?



ID: 725058


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.