What's the consensus solution for "finish file present too long"?
Author | Message |
---|---|
Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
What's the consensus solution for "finish file present too long"? I'm pretty sure I've read that the volunteer developers are aware of the issue. Do you just have to pay a lot more attention to any finished tasks that haven't updated their stderr.txt files and reported yet, just as you are shutting down the system? I already manually shut down each BOINC client before shutting off the system. I wait about 5 seconds after the client exits before initiating system shutdown. I am still getting task errors of "finish file present too long" and hate wasting compute time for naught. Seti@Home classic workunits: 20,676 CPU time: 74,226 hours. A proud member of the OFA (Old Farts Association) |
Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0 |
Consensus (of 2, lol) between Juha and myself, at least, was that the client is being fussy about things it shouldn't be, and that multiple workarounds were effective (though workarounds aren't solutions). The client is putting time expectations/assumptions/demands on something that may not even get a chance to run, because the applications run at below-normal priority and the moment when tasks complete/upload and new ones start is a period of high system contention. Where the applications have such a setting, raise the priority to 'normal' or 'above normal'. Where they don't have a setting (e.g. stock CPU apps), you can use TThrottle (or Process Lasso, or similar) or the BOINC client's <no_priority_change> cc_config.xml option. In the case of the [Windows] Cuda MB application, I made a test build with its modified boincapi tweaked further, so as to minimise the likelihood of BOINC intercepting during the exit sequence. Probably some more refined variant of that will make it into the next stock builds in time. I've made that available on my download page. A more ideal fix would involve changes to the client such that it gives sufficient time (perhaps adjustable via cc_config.xml), restores processes at low priorities to normalish ones for exit, and perhaps checks a few more things more carefully before starting new tasks (e.g. are some apps still finishing cleanup? is the system currently shutting down? [see other thread]), with applications being as prompt as they can anyway (even though they shouldn't have to be). "Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions. |
Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
*What's the consensus solution for "finish file present too long"? I'm pretty sure I've read the volunteer developers are aware of the issue. Do you just have to spend a lot more attention on any finished tasks that haven't updated their stderr.txt files and reported yet just as you are shutting down the system? I already manually shut down each BOINC client before shutting off the system. I wait about 5 seconds after the client exits before initiating system shutdown. I still am getting task errors of "finish file present too long" and hate wasting compute time for naught.* This is a BOINC API/client issue, not something an application developer should have to think about. So the right place to raise this issue would be the BOINC forums, where the chances of catching some of the BOINC devs are higher. What I saw from the OpenCL stderr is that the error happens after the app has called boinc_finish(), the last function it calls in the application source code, which never returns to the app. So it's out of the app's control. |
Joined: 11 Feb 00 Posts: 1441 Credit: 148,764,870 RAC: 0 |
It's really the BOINC shutdown rather than the system shutdown that tends to be one of the apparent causes of "finish file present too long". If BOINC happens to shut down within a few seconds after a task finishes and writes that file, it gets hosed when BOINC restarts. So, separating BOINC shutdown from system shutdown probably won't have any effect on that particular error. Can you specifically tie the timing of your errors to shutdowns? I actually think, though, that the timing of task finish / BOINC shutdown causing the problem is not as often the cause as other system loads just slowing the whole process down between the time the task writes the "finish file" and the time that BOINC tries to read it. I've seen it happen on several occasions far removed from a BOINC shutdown. Sometimes I think I can attribute it to a task finishing while whatever AV I have running on a specific machine is downloading its latest definition files. Other times...who knows? I think almost any other CPU-intensive process could dribble molasses on everything else that's running. |
Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
Thanks for chiming in guys. I already am running my apps in high priority mode so there shouldn't be an issue where the app is getting starved for attention. I decided to just drop the CUDA app after the 8.12 SoG OpenCL app became available as it is so much more efficient than the old CUDA50 app. So only running OpenCL apps in SETI and MilkyWay. The only CUDA app running anymore is from Einstein. Also am running ProcessLasso and normally the only time I see a process restraint in the logfile is when running a MilkyWay task. Also thanks for pointing out that the problem lies with the BOINC API and NOT with any of the apps I run. I helped out with the "truncated stderr.txt" issues with the MilkyWay project tasks a year ago and have to say whatever the BOINC developers did to fix that problem is still working very well. I have not had the truncated stderr.txt issue over at MilkyWay since then. If I remember somewhat correctly, the BOINC developers inserted a 5 second delay in something or other before writing the file after task finish is called. Just guessing if I remember that correctly. If that was done to the BOINC client and it works for MilkyWay .... I wonder why it doesn't seem to work with SETI tasks? I have not ever had any kind of AV process going on when shutting down the systems. With regard to timing of shutdown, the first thing I do before shutting the client is to see what tasks are close to finishing in the next minute or so, and just wait for the task to finish and then upload successfully. I don't wait around for the next 305 seconds for the client to update and report finished tasks though, unless the update is going to happen in the next 15 or so seconds. I always thought that as long as the task result uploaded successfully then it doesn't matter when the next system update happens. All that affects is the current RAC. 
I don't seem to ever have issues with update Tuesday when I report several hundred tasks after the system gets turned back on and the project is back up and running. |
Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
*I wonder why it doesn't seem to work with SETI tasks?* Truncated stderr is a different issue from a broken/absent finish file. They are different files. Again, in the sample I saw for the OpenCL app, stderr was completely formed, yet the result errored out. |
Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
Thanks for the clarification, Raistmer. I didn't think that through well enough to realize they were completely different files. So the issue, as I understand it, is that the actual task result science file is getting mangled or is missing? |
Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
*Thanks for clarification Raistmer. I didn't think that out very well to realize they were completely different files. The issue as I understand it is that the actual task result science file is getting mangled or is missing then?* No. The actual result file is a third, different file, and most probably it's completely OK (which makes this issue even more annoying, because correct processing is lost for nothing). It's a flag file that the BOINC API writes and the BOINC client reads to learn that processing has completed. Just as with the complete wipe just because of a single error in app_info, such savage behavior looks a little excessive for the case, IMO. |
Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
Thanks again for the further clarification. Who knows .... I might actually understand the working goings-on of BOINC in a decade or so. So what kind of information can I post over in the BOINC forums that would help the BOINC developers fix this issue? |
Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
*Thanks again for the further clarification. Who knows .... I might actually understand the working goings-on of BOINC in a decade or so. So what kind of information can I post over in the BOINC forums that would help the BOINC developers fix this issue?* Just the same description of the conditions that lead to the error as in the posts on these forums. |
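
For readers trying to picture the mechanism discussed in this thread, here is a rough Python sketch of the handshake as the posts describe it: the app writes a flag ("finish") file, the client notices it, and if the process is still alive some time later the task is errored out. This is not the BOINC client's actual code. The names `TaskState`, `poll_task` and `simulate`, and the timeout values of 10 and 300 seconds, are all assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TaskState:
    """What a hypothetical client remembers about one running task."""
    finish_file_seen_at: Optional[float] = None  # time the flag file was first noticed


def poll_task(state: TaskState, finish_file_exists: bool, process_alive: bool,
              now: float, timeout: float) -> str:
    """One polling step of the hypothetical client; returns a status string."""
    if not process_alive:
        # The app has exited; the flag file tells us whether it finished cleanly.
        return "completed" if finish_file_exists else "exited without finish file"
    if not finish_file_exists:
        return "running"
    if state.finish_file_seen_at is None:
        state.finish_file_seen_at = now  # start the clock on first sight of the flag
    if now - state.finish_file_seen_at > timeout:
        return "error: finish file present too long"
    return "waiting for exit"


def simulate(exit_delay: float, timeout: float, step: float = 1.0) -> str:
    """Flag file is written at t=0; the process needs exit_delay seconds to exit."""
    state = TaskState()
    now = 0.0
    while True:
        status = poll_task(state, True, now < exit_delay, now, timeout)
        if status != "waiting for exit":
            return status
        now += step


# A starved low-priority app that needs 12 s to exit after writing the flag file.
print(simulate(exit_delay=12.0, timeout=10.0))   # error: finish file present too long
print(simulate(exit_delay=12.0, timeout=300.0))  # completed
```

In this toy model the result file plays no part in the decision, which matches the point made above that a perfectly good result can be thrown away. It also shows why the two kinds of workaround mentioned in the thread help: raising the app's priority shortens `exit_delay`, while a more generous (or configurable) wait on the client side raises `timeout`.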
©2025 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.