zero status

Questions and Answers : Unix/Linux : zero status
jkmobrien

Joined: 8 Dec 00
Posts: 7
Credit: 260,674
RAC: 0
Ireland
Message 50411 - Posted: 1 Dec 2004, 12:23:02 UTC

Hi,

Recently I've been looking at the message logs and have seen this quite a bit (probably once a day or more).

Result 09ap04aa.1991.23786.36084.79_0 exited with zero status but no 'finished' file
2004-12-01 11:32:50 [SETI@home] If this happens repeatedly you may need to reset the project.
2004-12-01 11:32:50 [SETI@home] Restarting result 09ap04aa.1991.23786.36084.79_0 using setiathome version 4.02


I have reset the project but it didn't help. I'm running my own compilation of 4.13 but get the same problem with the precompiled version.

Does anyone else get this? Is it on the "to be fixed" list?

Thanks,

John
ID: 50411
ChristianB
Joined: 11 Jul 01
Posts: 139
Credit: 90,213
RAC: 0
Germany
Message 50426 - Posted: 1 Dec 2004, 14:20:19 UTC

This message can have several causes, and it's on the list. On my Windows boxes it happens every time I start a defrag run. I think the client can't write to the disk and/or cache during the defrag.

ID: 50426
Zardoz

Joined: 21 Nov 03
Posts: 13
Credit: 17,383,109
RAC: 0
United States
Message 50640 - Posted: 2 Dec 2004, 6:39:28 UTC

Hi John,

Are there any messages that appear prior to the "...exited with zero status but no 'finished' file" message? These might help you identify the cause.

These specific messages pile up all day long on my system and it doesn't seem to hurt anything. I run the seti@home client on my laptop, and setting the proxy and updating the resolv.conf in the boinc chroot'd directory whenever I'm on the network at work is a bother I don't need. In my case the boinc client has two problems to cope with: it can't get the FQDN for the host from the DNS servers at work (the chroot resolv.conf points to my home DNS servers), and it can't get through the company's firewall without a proxy.
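For anyone wanting to confirm the name-resolution half of this, a minimal sketch of a resolution check follows. It assumes Python is available and that the script is run inside the same environment the client uses (for example the chroot), since that is where the resolv.conf in question applies. The function name `can_resolve` and its `resolver` parameter are made up for this illustration; only `socket.getaddrinfo` and `socket.gaierror` come from the standard library.

```python
import socket


def can_resolve(hostname, resolver=socket.getaddrinfo):
    """Return True if hostname resolves, False if the resolver reports an error.

    `resolver` is a hypothetical hook so the check can be exercised without
    a network; by default it is the standard socket.getaddrinfo.
    """
    try:
        resolver(hostname, 80)
    except socket.gaierror:
        # Name lookup failed (host not found, no DNS server reachable, ...).
        return False
    return True


# Example call, using the upload host named in the log excerpt:
# print(can_resolve("setiboincdata.ssl.berkeley.edu"))
```

This only tests DNS. It says nothing about whether the firewall or proxy would let the upload through once the name resolves.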

Anyway, in my case several other messages complaining about the situation lead up to the "...exited with zero status but no 'finished' file" notice. The client carries on regardless, eventually completes the WU, and then goes on to the next one. I've not noticed any appreciable difference in the average WU computation time at work compared to at home.

I found this example in my log from earlier today:

2004-12-02 00:46:31 [---] Can't resolve hostname setiboincdata.ssl.berkeley.edu (host not found or server failure)
2004-12-02 00:46:31 [SETI@home] Couldn't start upload for http://setiboincdata.ssl.berkeley.edu/sah_cgi/file_upload_handler: error -113
2004-12-02 00:46:31 [SETI@home] Backing off 34 minutes and 29 seconds on transfer of file 10ap04aa.3795.19072.853420.101_0_0
2004-12-02 00:46:32 [SETI@home] Result 10ap04aa.3795.19072.853420.99_0 exited with zero status but no 'finished' file
2004-12-02 00:46:32 [SETI@home] If this happens repeatedly you may need to reset the project.
2004-12-02 00:46:32 [SETI@home] Restarting result 10ap04aa.3795.19072.853420.99_0 using setiathome version 4.07

2004-12-02 00:57:57 [SETI@home] Computation for result 10ap04aa.3795.19072.853420.99 finished
2004-12-02 00:57:57 [SETI@home] Starting result 10ap04aa.3795.19072.853420.95_1 using setiathome version 4.07

2004-12-02 00:58:53 [---] Can't resolve hostname setiboincdata.ssl.berkeley.edu (host not found or server failure)
2004-12-02 00:58:53 [SETI@home] Couldn't start upload for http://setiboincdata.ssl.berkeley.edu/sah_cgi/file_upload_handler: error -113
2004-12-02 00:58:53 [SETI@home] Backing off 1 minutes and 0 seconds on transfer of file 10ap04aa.3795.19072.853420.99_0_0
2004-12-02 00:58:54 [SETI@home] Result 10ap04aa.3795.19072.853420.95_1 exited with zero status but no 'finished' file
2004-12-02 00:58:54 [SETI@home] If this happens repeatedly you may need to reset the project.
2004-12-02 00:58:54 [SETI@home] Restarting result 10ap04aa.3795.19072.853420.95_1 using setiathome version 4.07

As you can see there are three sets of messages here with the first and third reporting the "...exited with zero status..." message. The first message in each of these two sets is complaining about the lack of host name resolution, followed by two messages reporting the upload failures... then I get the "...exited with zero status..." message and the result restart notice.

Note however that in the second block of messages the first WU has completed and it started the next WU in the queue.

Later, when I connect to my network at home, all is well in the boinc directory environment: it can do its DNS hostname resolution, connects, and then proceeds to upload the results for the day... after waiting out the last recorded "backoff time" for each result set, of course.

I'd check for other messages in the log, or compare timestamps in your other system logs to track down what was going on at the time. Hope this is of some help.
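The log check suggested here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `zero_status_context` is made up, the sample text is copied from the excerpt above, and how you get your own client messages into a text file or string is left to you.

```python
from collections import deque

# The message text to look for, as it appears in the client log.
MARKER = "exited with zero status but no 'finished' file"


def zero_status_context(lines, before=3):
    """Return (hit_line, preceding_lines) for each zero-status message.

    `before` is how many non-blank lines of context to keep ahead of a hit.
    """
    recent = deque(maxlen=before)  # rolling window of the last few lines
    hits = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # ignore blank lines between message groups
        if MARKER in line:
            hits.append((line, list(recent)))
        recent.append(line)
    return hits


sample = """\
2004-12-02 00:46:31 [---] Can't resolve hostname setiboincdata.ssl.berkeley.edu (host not found or server failure)
2004-12-02 00:46:31 [SETI@home] Couldn't start upload for http://setiboincdata.ssl.berkeley.edu/sah_cgi/file_upload_handler: error -113
2004-12-02 00:46:31 [SETI@home] Backing off 34 minutes and 29 seconds on transfer of file 10ap04aa.3795.19072.853420.101_0_0
2004-12-02 00:46:32 [SETI@home] Result 10ap04aa.3795.19072.853420.99_0 exited with zero status but no 'finished' file
2004-12-02 00:46:32 [SETI@home] If this happens repeatedly you may need to reset the project.
"""

hits = zero_status_context(sample.splitlines())
for hit, context in hits:
    print("HIT:", hit)
    for earlier in context:
        print("  before:", earlier)
```

Lines that show up in the context every time are worth a closer look, though appearing next to each other in the log does not by itself show that one caused the other.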

==> dave

ID: 50640
jkmobrien

Joined: 8 Dec 00
Posts: 7
Credit: 260,674
RAC: 0
Ireland
Message 50670 - Posted: 2 Dec 2004, 12:14:53 UTC - in response to Message 50640.  

Hi Dave,

Thanks for the reply. No, I don't see any other messages prior to the zero status. It's curious that you see those two error messages one after the other, but as the work units are different, this might be down to task scheduling rather than the connection problem causing the "zero status" error.

In my case it just seems that the results file gets wiped somehow. The client just redoes it, so it's really an efficiency issue rather than anything getting really lost. In your case, I think you're being told about two issues at the same time rather than one thing causing the other.

Thanks again,

John
ID: 50670
jkmobrien

Joined: 8 Dec 00
Posts: 7
Credit: 260,674
RAC: 0
Ireland
Message 50671 - Posted: 2 Dec 2004, 12:17:39 UTC - in response to Message 50426.  

> This message can have several causes, and it's on the list. On my Windows
> boxes it happens every time I start a defrag run. I think the client
> can't write to the disk and/or cache during the defrag.


Thanks,

How do you know it's on the list? Are you involved in writing/maintaining the code?

Kind regards,

John
ID: 50671



 