can someone explain this

mmonroe

Joined: 22 Oct 06
Posts: 20
Credit: 5,132,754
RAC: 0
United States
Message 850816 - Posted: 8 Jan 2009, 12:45:44 UTC

I downloaded this AP unit on 1/03 and it was crunching away, over 90% complete. Then I downloaded another and the WU appeared to reset. Can someone tell me what happened?

http://setiathome.berkeley.edu/workunit.php?wuid=390572481
Cosmic_Ocean
Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 850844 - Posted: 8 Jan 2009, 14:21:42 UTC - in response to Message 850816.  

Maybe have a look in the messages tab? If a WU starts over, it usually puts a message in the message tab mentioning something along the lines of "0 size statefile", so it doesn't have a checkpoint of how far it has gone and what it has found. Without those two things, it has no way of knowing how much has been done, and just starts over.
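
For illustration, the restart behaviour described above can be sketched roughly like this. This is only a minimal sketch and not the actual Astropulse or BOINC code: the JSON state file, the `fraction_done` key and the `load_progress` name are all assumptions made up for the example.

```python
import json
import os


def load_progress(state_path):
    """Return the saved fraction done, or 0.0 if no usable checkpoint exists.

    Hypothetical helper: the file format and key name are assumptions.
    """
    try:
        if os.path.getsize(state_path) == 0:
            # A zero-size state file carries no information: start over.
            return 0.0
        with open(state_path, "r", encoding="utf-8") as f:
            state = json.load(f)
        return float(state["fraction_done"])
    except (OSError, ValueError, KeyError, TypeError):
        # Missing, unreadable, or corrupt state file: also start over.
        return 0.0
```

With a reader like this, an empty or missing state file is indistinguishable from a brand-new task, which would match a WU at over 90% suddenly showing up as reset.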

The zeroed statefile issue is one that happens from time to time, but I still haven't figured out a legitimate reason why it shows up sometimes and then not other times. On Linux/Mac systems, it's usually a file permissions issue, but Windows machines usually don't have this problem.

One of the things the message tab says to do is "if this happens repeatedly, you may need to reset the project."
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving-up)
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14690
Credit: 200,643,578
RAC: 874
United Kingdom
Message 850858 - Posted: 8 Jan 2009, 14:56:42 UTC
Last modified: 8 Jan 2009, 14:59:22 UTC

The consensus at Lunatics is that it isn't a problem reading the statefile (as the error message will say), but an earlier problem which caused the file to be improperly written (or to be precise, not written). BOINC can close down running applications extremely abruptly on occasion, and if the application just happens to have started writing a checkpoint file - poof.

As Eric Korpela put it:

... the terminate with no mercy policy sucks and we should find if there is a way around it, or at least a way to allow I/O to finish.

Needless to say, the optimisers have attempted to put extra safeguards in their code to minimise the risk of this happening.
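
One common safeguard of this general kind is to write the checkpoint to a temporary file and rename it over the old one only once the write has finished. The sketch below shows that technique; it is not the optimisers' actual code, and the JSON format and the `write_checkpoint` name are assumptions for the example.

```python
import json
import os
import tempfile


def write_checkpoint(state_path, state):
    """Write state to state_path so a reader never sees a half-written file.

    Hypothetical helper illustrating write-to-temp-then-rename.
    """
    directory = os.path.dirname(os.path.abspath(state_path))
    # The temp file must live in the same directory so the rename
    # stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(state, f)
            f.flush()
            os.fsync(f.fileno())  # push the data to disk before renaming
        # Swap the finished file into place in a single step.
        os.replace(tmp_path, state_path)
    except BaseException:
        # On failure, remove the temp file and leave the old checkpoint alone.
        try:
            os.unlink(tmp_path)
        except OSError:
            pass
        raise
```

If the process is killed while the temporary file is still being written, the previous checkpoint is left untouched (a stray `.tmp` file may remain), so the task would lose only the work since the last good checkpoint rather than starting from zero.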
mmonroe

Message 850868 - Posted: 8 Jan 2009, 15:48:24 UTC - in response to Message 850858.  

Thanks all. I did look in the messages tab and got this:

1/8/2009 7:29:11 AM|SETI@home|Task ap_23no08ad_B1_P0_00185_20090103_17637.wu_0 exited with zero status but no 'finished' file
1/8/2009 7:29:11 AM|SETI@home|If this happens repeatedly you may need to reset the project.


I guess I will abort and just go forward
Cosmic_Ocean
Message 850933 - Posted: 8 Jan 2009, 19:34:42 UTC - in response to Message 850858.  

Interesting to know. I'll have to make a mental sticky note of that for future reference.

©2025 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.