194 (0xc2) EXIT_ABORTED_BY_CLIENT - "finish file present too long"

Jeff Buck Volunteer tester Send message Joined: 11 Feb 00 Posts: 1441 Credit: 148,764,870 RAC: 0	Message 1459246 - Posted: 31 Dec 2013, 20:00:01 UTC Yesterday, for reasons I haven't been able to identify, BOINC apparently crashed on one of my machines (7057115). When I discovered it, about 8 hours later, I simply restarted BOINC. All of the tasks that had been running at the time of the crash appeared to restart fine, but 22 seconds later, the 2 restarted AP tasks (one on each GPU) both failed, with 194 (0xc2) EXIT_ABORTED_BY_CLIENT. The STDERR for both show: <message> finish file present too long </message> As luck would have it, one AP restarted at 98.20% and the other at 97.30%. The STDERR for both appears to have been basically complete before the BOINC crash, and they appear to have restarted in their termination phase. This is actually the second time (that I'm aware of) that I've gotten the "194" error under similar circumstances. The previous occurrence was about 6 weeks ago. I only found one other reference to this sort of problem, in Message 1416127. Jason mentioned trying to document some of these instances, so I guess these are a couple more that can add to the list. It seems unfortunate that a task can be trashed by the client for a timing issue, when it has basically completed all its useful work. Although, in this case, the situation was caused by a BOINC crash, it seems as though it could also occur at any time that BOINC is shut down and restarted. Not everybody runs 24/7. For me, 5 of my 7 machines shut down every weekday at noon and don't restart until 6 P.M. (to avoid peak period electric rates). I suspect it's only a matter of time before this "bug" hits one or more of them on a restart. ID: 1459246 ·

skildude Send message Joined: 4 Oct 00 Posts: 9541 Credit: 50,759,529 RAC: 60	Message 1459305 - Posted: 31 Dec 2013, 22:22:54 UTC FFA thread block override value:12288 FFA thread fetchblock override value:4096 That might be the problem. FFA thread block override value:12288 FFA thread fetchblock override value:6144 those are my settings on my R9 290X which is a much bigger card than yours. In a rich man's house there is no place to spit but his face. Diogenes Of Sinope ID: 1459305 ·

Jeff Buck Volunteer tester Send message Joined: 11 Feb 00 Posts: 1441 Credit: 148,764,870 RAC: 0	Message 1459328 - Posted: 31 Dec 2013, 23:10:53 UTC - in response to Message 1459305. FFA thread block override value:12288 FFA thread fetchblock override value:4096 That might be the problem. FFA thread block override value:12288 FFA thread fetchblock override value:6144 those are my settings on my R9 290X which is a much bigger card than yours. Well, I've been running with those settings since early-September and successfully completed nearly 1,000 AP GPU tasks, with only 6 errors (2 of which were my fault and 4 of which were these "194" errors). I'll admit I don't really understand what those settings are doing (the documentation is rather weak), but I think the problem is actually what was identified by Joe Segur in the earlier message I referenced, that: One of the last things an app does before exiting is write an empty "finish file", and that error indicates the app didn't exit within ten seconds after writing that file. I think the error was likely caused by shutting BOINC down after the file was written but before the app had gotten to its usual exit That would seem to mean that there's nothing we can do at the host end to avoid getting nailed occasionally. ID: 1459328 ·
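To make the timing rule described above concrete, here is a minimal sketch of a watcher that flags a task once the finish file has been visible for more than ten seconds while the app is still alive. It is not BOINC's client code: the class name `FinishFileWatch`, the `poll` method, and the choice to start the clock when the file is first seen are all assumptions made for illustration. Only the ten-second figure and the error text come from the thread.

```python
from dataclasses import dataclass
from typing import Optional

# The "ten seconds" quoted above; assumed here, not read from BOINC source.
GRACE_SECONDS = 10.0


@dataclass
class FinishFileWatch:
    """Hypothetical model of a client watching one task's finish file."""

    first_seen: Optional[float] = None  # when the file was first noticed

    def poll(self, now: float, file_present: bool, app_running: bool) -> str:
        if not file_present:
            self.first_seen = None
            return "running"
        if not app_running:
            # File written and process gone: the normal, successful ending.
            return "finished"
        if self.first_seen is None:
            self.first_seen = now
        if now - self.first_seen > GRACE_SECONDS:
            return "abort: finish file present too long"
        return "waiting"


watch = FinishFileWatch()
for t in (0.0, 1.0, 5.0, 12.0):
    # The file appears at t=1 and the app never exits.
    print(t, watch.poll(t, file_present=(t >= 1.0), app_running=True))
```

Under this model, a task restarted with a leftover finish file would sit in "waiting" and then be aborted shortly after the grace period. That is consistent with the failures reported some 20 seconds after restart, though the real client may time this differently.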

Fred E. Volunteer tester Send message Joined: 22 Jul 99 Posts: 768 Credit: 24,140,697 RAC: 0	Message 1459332 - Posted: 31 Dec 2013, 23:12:12 UTC I'm not sure about those settings as the cause. I had the same thing happen yesterday. Boinc froze up, I rebooted, BOINC started normally but errored out an Astropulse gpu job after 15-20 seconds. I'm running a different app (r1843) and have different settings for my 670: FFA thread block override value:6144 FFA thread fetchblock override value:1536 My stderr has very little in it: http://setiathome.berkeley.edu/result.php?resultid=3306975364 Another Fred Support SETI@home when you search the Web with GoodSearch or shop online with GoodShop. ID: 1459332 ·

Jeff Buck Volunteer tester Send message Joined: 11 Feb 00 Posts: 1441 Credit: 148,764,870 RAC: 0	Message 1459340 - Posted: 31 Dec 2013, 23:25:43 UTC - in response to Message 1459332. I'm not sure about those settings as the cause. I had the same thing happen yesterday. Boinc froze up, I rebooted, BOINC started normally but errored out an Astropulse gpu job after 15-20 seconds. I'm running a different app (r1843) and have different settings for my 670: FFA thread block override value:6144 FFA thread fetchblock override value:1536 My stderr has very little in it: http://setiathome.berkeley.edu/result.php?resultid=3306975364 I'd almost bet that your empty STDERR might indicate that your task restarted at an even higher completion rate than my 2 tasks. The reason I say that is that your Run time and CPU time are both 0.00. With my tasks, the one that restarted at 97.30% showed fairly normal Run time, but the one that restarted at 98.20% only showed a Run time of 21.33, the time from the restart. That would indicate to me that the program's termination housekeeping had cleared those timers in the task that was restarted later. Yours may have had everything totally wiped out. ID: 1459340 ·

Jeff Buck Volunteer tester Send message Joined: 11 Feb 00 Posts: 1441 Credit: 148,764,870 RAC: 0	Message 1474737 - Posted: 9 Feb 2014, 17:21:14 UTC Got up this morning and found that BOINC had crashed on my main cruncher, 7057115, so I decided to do some research before restarting BOINC. Here's what I found (all times are local, U.S. Pacific Standard Time): The last entry in the stdoutdae.txt is: 09-Feb-2014 01:12:42 [SETI@home] Starting task ap_10ap13aa_B5_P1_00191_20140208_30452.wu_1 using astropulse_v6 version 604 (opencl_nvidia_100) in slot 7 Along with that AP task, there were 2 other AP tasks running, each on a different GPU. In checking the active slot directories, the ones for the 3 AP tasks all have a boinc_finish_called file present. Here are the last several lines from the stderr.txt file for each of the 3 AP slot directories: SLOT 0 Found 30 single pulses and 30 repeating pulses, exiting. percent blanked: 3.08 class T_remove_radar: total=2.89e+009, N=1, <>=2.89e+009, min=2.89e+009, max=2.89e+009 class T_main_loop_L1: total=4.65e+012, N=83, <>=5.61e+010, min=4.32e+010, max=7.62e+010 class T_FFT_forward: total=2.13e+010, N=137113, <>=1.55e+005, min=1.24e+005, max=7.76e+006 class T_remove_radar_randomize: total=3.81e+011, N=1369126, <>=2.78e+005, min=6.80e+002, max=1.15e+007 class T_build_chirp_table: total=0.00e+000, N=0, <>=0.00e+000, min=1.84e+019, max=0.00e+000 class T_DataWrite: total=5.27e+008, N=5352, <>=9.84e+004, min=3.76e+004, max=1.10e+006 class T_DataWrite_ns: total=0, N=0, <>=0, min=0 max=0 class T_oclReadBuf: total=5.21e+006, N=137113, <>=3.70e+001, min=3.20e+001, max=8.56e+002 class T_ChirpWrite: total=0.00e+000, N=0, <>=0.00e+000, min=1.84e+019, max=0.00e+000 class T_ChirpWrite_ns: total=0, N=0, <>=0, min=0 max=0 class T_dechirp: total=3.69e+010, N=137113, <>=2.69e+005, min=2.42e+005, max=6.79e+006 class Dechirp_ns: total=0, N=0, <>=0, min=0 max=0 class Half_ns: total=0, N=0, <>=0, min=0 max=0 class T_PC_single_pulse_kernel_FFA_update: 
total=3.18e+012, N=137113, <>=2.32e+007, min=2.24e+007, max=4.08e+007 class PC_ns: total=0, N=0, <>=0, min=0 max=0 class T_oclReadBuf: total=5.21e+006, N=137113, <>=3.70e+001, min=3.20e+001, max=8.56e+002 class T_oclWriteBuf: total=5.29e+008, N=5352, <>=9.89e+004, min=3.77e+004, max=1.10e+006 class T_FFT_inverse: total=2.21e+010, N=137113, <>=1.61e+005, min=1.40e+005, max=6.66e+006 class T_ffa: total=9.85e+011, N=662, <>=1.49e+009, min=4.88e+008, max=8.76e+009 class T_GPU_buffer_read_backs: total=50, N=50, <>=1, min=1 max=1 USE_OPENCL OPENCL_WRITE USE_INCREASED_PRECISION SMALL_CHIRP_TABLE COMBINED_DECHIRP_KERNEL rev 1316 01:30:25 (3732): called boinc_finish SLOT 6 single pulses: 4 repetitive pulses: 4 percent blanked: 6.35 class T_remove_radar: total=2.87e+009, N=1, <>=2.87e+009, min=2.87e+009, max=2.87e+009 class T_main_loop_L1: total=6.79e+012, N=111, <>=6.12e+010, min=5.57e+010, max=8.14e+010 class T_FFT_forward: total=3.20e+010, N=182040, <>=1.76e+005, min=1.25e+005, max=6.22e+007 class T_remove_radar_randomize: total=1.05e+012, N=1817736, <>=5.75e+005, min=6.80e+002, max=1.72e+007 class T_build_chirp_table: total=0.00e+000, N=0, <>=0.00e+000, min=1.84e+019, max=0.00e+000 class T_DataWrite: total=1.65e+009, N=13320, <>=1.24e+005, min=3.76e+004, max=1.75e+006 class T_DataWrite_ns: total=0, N=0, <>=0, min=0 max=0 class T_oclReadBuf: total=7.93e+006, N=182040, <>=4.30e+001, min=3.20e+001, max=2.74e+005 class T_ChirpWrite: total=0.00e+000, N=0, <>=0.00e+000, min=1.84e+019, max=0.00e+000 class T_ChirpWrite_ns: total=0, N=0, <>=0, min=0 max=0 class T_dechirp: total=5.06e+010, N=182040, <>=2.78e+005, min=2.41e+005, max=1.90e+006 class Dechirp_ns: total=0, N=0, <>=0, min=0 max=0 class Half_ns: total=0, N=0, <>=0, min=0 max=0 class T_PC_single_pulse_kernel_FFA_update: total=3.19e+012, N=182040, <>=1.75e+007, min=1.61e+007, max=1.10e+008 class PC_ns: total=0, N=0, <>=0, min=0 max=0 class T_oclReadBuf: total=7.93e+006, N=182040, <>=4.30e+001, min=3.20e+001, max=2.74e+005 
class T_oclWriteBuf: total=1.66e+009, N=13320, <>=1.24e+005, min=3.77e+004, max=1.75e+006 class T_FFT_inverse: total=3.01e+010, N=182040, <>=1.66e+005, min=1.40e+005, max=1.05e+006 class T_ffa: total=2.32e+012, N=1998, <>=1.16e+009, min=4.03e+008, max=1.17e+010 class T_GPU_buffer_read_backs: total=11, N=11, <>=1, min=1 max=1 USE_OPENCL OPENCL_WRITE USE_INCREASED_PRECISION SMALL_CHIRP_TABLE COMBINED_DECHIRP_KERNEL rev 1316 01:36:19 (1904): called boinc_finish SLOT 7 single pulses: 20 repetitive pulses: 30 percent blanked: 0.00 class T_remove_radar: total=2.91e+009, N=1, <>=2.91e+009, min=2.91e+009, max=2.91e+009 class T_main_loop_L1: total=4.29e+012, N=111, <>=3.86e+010, min=3.82e+010, max=6.89e+010 class T_FFT_forward: total=2.44e+010, N=182040, <>=1.34e+005, min=1.24e+005, max=8.15e+006 class T_remove_radar_randomize: total=1.68e+009, N=1817736, <>=9.25e+002, min=6.80e+002, max=1.04e+006 class T_build_chirp_table: total=0.00e+000, N=0, <>=0.00e+000, min=1.84e+019, max=0.00e+000 class T_DataWrite: total=0.00e+000, N=0, <>=0.00e+000, min=1.84e+019, max=0.00e+000 class T_DataWrite_ns: total=0, N=0, <>=0, min=0 max=0 class T_oclReadBuf: total=6.85e+006, N=182040, <>=3.70e+001, min=3.20e+001, max=1.68e+004 class T_ChirpWrite: total=0.00e+000, N=0, <>=0.00e+000, min=1.84e+019, max=0.00e+000 class T_ChirpWrite_ns: total=0, N=0, <>=0, min=0 max=0 class T_dechirp: total=4.60e+010, N=182040, <>=2.53e+005, min=2.44e+005, max=2.54e+006 class Dechirp_ns: total=0, N=0, <>=0, min=0 max=0 class Half_ns: total=0, N=0, <>=0, min=0 max=0 class T_PC_single_pulse_kernel_FFA_update: total=4.16e+012, N=182040, <>=2.28e+007, min=2.25e+007, max=1.80e+008 class PC_ns: total=0, N=0, <>=0, min=0 max=0 class T_oclReadBuf: total=6.85e+006, N=182040, <>=3.70e+001, min=3.20e+001, max=1.68e+004 class T_oclWriteBuf: total=0.00e+000, N=0, <>=0.00e+000, min=1.84e+019, max=0.00e+000 class T_FFT_inverse: total=2.76e+010, N=182040, <>=1.52e+005, min=1.46e+005, max=2.42e+006 class T_ffa: 
total=2.71e+010, N=1, <>=2.71e+010, min=2.71e+010, max=2.71e+010 class T_GPU_buffer_read_backs: total=31, N=31, <>=1, min=1 max=1 USE_OPENCL OPENCL_WRITE USE_INCREASED_PRECISION SMALL_CHIRP_TABLE COMBINED_DECHIRP_KERNEL rev 1316 01:39:35 (2928): called boinc_finish Note that the "finish" times are all much later than the last entry in the event log. However, I noted that the timestamps for the boinc_task_state.xml files are all 1:12 AM, the same as the last entry in the log. It seems likely that when I restart BOINC on that machine the 3 AP tasks will fail with the "finish file present too long", since it's now been almost 8 hours since they were written. However, I'm wondering what would happen if I delete those "finish" files before restarting BOINC. Does anybody have any sense of what might happen if I do that? I'll wait perhaps another hour before restarting, in case anybody wants to weigh in with any suggestions. (By the way, I've made copies of all the files in all three slot directories, in case that might help with later analysis.) ID: 1474737 ·
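As a quick check of the timestamps quoted above, the sketch below computes how long each AP task kept running after the last event-log entry. It assumes all four times fall on the same date (9 February) and come from the same clock. The variable and function names are illustrative only.

```python
from datetime import datetime

FMT = "%H:%M:%S"

# Last entry in stdoutdae.txt, as quoted in the post.
last_log_entry = datetime.strptime("01:12:42", FMT)

# "called boinc_finish" timestamps from each slot's stderr.txt.
finish_calls = {"slot 0": "01:30:25", "slot 6": "01:36:19", "slot 7": "01:39:35"}


def seconds_after_crash(stamp: str) -> int:
    """Seconds between the last log entry and the given HH:MM:SS stamp."""
    return int((datetime.strptime(stamp, FMT) - last_log_entry).total_seconds())


overrun = {slot: seconds_after_crash(t) for slot, t in finish_calls.items()}
for slot, secs in overrun.items():
    print(f"{slot}: {secs // 60} min {secs % 60} s after the last log entry")
```

By this arithmetic the three apps called boinc_finish roughly 18, 24, and 27 minutes after the client's last logged activity.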

juan BFP Volunteer tester Send message Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799	Message 1474738 - Posted: 9 Feb 2014, 17:24:42 UTC Last modified: 9 Feb 2014, 17:26:25 UTC I believe it's the same error I reported in this message: http://setiathome.berkeley.edu/forum_thread.php?id=73970&postid=1473277 Going by the answers I received, it seems we have a "hell of a coincidence" that triggers the error. ID: 1474738 ·

Jeff Buck Volunteer tester Send message Joined: 11 Feb 00 Posts: 1441 Credit: 148,764,870 RAC: 0	Message 1474739 - Posted: 9 Feb 2014, 17:28:34 UTC - in response to Message 1474738. Yes, I think it's exactly the same error. What led to the restart of your AP task? Was it a BOINC crash or some other cause? ID: 1474739 ·

juan BFP Volunteer tester Send message Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799	Message 1474746 - Posted: 9 Feb 2014, 17:35:06 UTC - in response to Message 1474739. Last modified: 9 Feb 2014, 17:35:24 UTC My only clue is that it's an AV/Windows update or something similar (maybe Java or Above, who knows?) that runs automatically in the background, since the host was running alone at that hour (the room was empty; I checked with the security camera to be sure). BOINC just marked the WU with an error and continued to crunch normally (no crash). After that, no other similar error appeared on the host, which is why I call it a "hell of a coincidence bug" that only appears in some very specific situations, as explained by Jason in one of his posts. ID: 1474746 ·

Jeff Buck Volunteer tester Send message Joined: 11 Feb 00 Posts: 1441 Credit: 148,764,870 RAC: 0	Message 1474760 - Posted: 9 Feb 2014, 18:00:49 UTC - in response to Message 1474746. Okay, thanks, Juan. At the time BOINC crashed, there were also 11 MB tasks running on the machine, 5 on CPUs and 6 on GPUs (2 on each, along with the 1 AP on each). Looking at the timestamps in the slot directories for those tasks, I don't see anything beyond 1:13 AM, which would appear to indicate that all of those stopped at the same time BOINC did. However, the 3 AP tasks appeared to keep running! At least until they called boinc_finish, but by then there was no BOINC available to answer the call. I find that VERY interesting. ID: 1474760 ·

Richard Haselgrove Volunteer tester Send message Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874	Message 1474762 - Posted: 9 Feb 2014, 18:04:57 UTC - in response to Message 1474760. Okay, thanks, Juan. At the time BOINC crashed, there were also 11 MB tasks running on the machine, 5 on CPUs and 6 on GPUs (2 on each, along with the 1 AP on each). Looking at the timestamps in the slot directories for those tasks, I don't see anything beyond 1:13 AM., which would appear to indicate that all of those stopped at the same time BOINC did. However, the 3 AP tasks appeared to keep running! At least until they called boinc_finish, but by then there was no BOINC available to answer the call. I find that VERY interesting. I was just thinking the same thing. Most application testing is done 'standalone', so that precise case of BOINC crashing and the science application not noticing is perhaps hard to test. But the application should notice that BOINC is no longer running, and shut itself down so that everything is consistent on restart. ID: 1474762 ·

Jeff Buck Volunteer tester Send message Joined: 11 Feb 00 Posts: 1441 Credit: 148,764,870 RAC: 0	Message 1474773 - Posted: 9 Feb 2014, 18:19:04 UTC - in response to Message 1474762. This is actually the 4th time I've had this happen on that machine. Last Nov 20, Dec 29, and Jan 5. In the first two instances, there were 2 GPU APs running (the machine only had 2 GPUs at the time), while the last one had only a single one. Two of the instances occurred at various times during the night (as did the most recent one), while the other happened in late morning. I don't allow Windows (8.1) to do any automatic updating, even Windows Defender, so I don't think that could be triggering the actual BOINC crash. However, I don't think I've ever had BOINC crash on that machine except when at least one AP GPU task was running. And I don't recall BOINC crashing on any of my other machines. So perhaps there's some food for thought. ID: 1474773 ·

juan BFP Volunteer tester Send message Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799	Message 1474776 - Posted: 9 Feb 2014, 18:22:27 UTC Last modified: 9 Feb 2014, 18:24:20 UTC Forgot to mention: at the time of the error there were 3 AP tasks running on the 780FTW of this host and 2 on the CPU (50% of the i5); all the others completed normally. I use below-normal priority and Win 7/64 Ultimate on this host. I know it's hard to imagine a background task that uses more than the 2 cores already freed, especially because I was using r2083 at that time. My host has 8GB of memory and uses no virtual memory (no swap file). ID: 1474776 ·

Jeff Buck Volunteer tester Send message Joined: 11 Feb 00 Posts: 1441 Credit: 148,764,870 RAC: 0	Message 1474780 - Posted: 9 Feb 2014, 18:30:09 UTC Okay, I think that, in the absence of any other suggestions thus far, I'm going to restart BOINC on that machine. First, though, I'll try deleting the "boinc_finish_called" file from two of the slot directories and leave it alone in the third, just for comparison purposes. Should be interesting! ID: 1474780 ·

Jeff Buck Volunteer tester Send message Joined: 11 Feb 00 Posts: 1441 Credit: 148,764,870 RAC: 0	Message 1474791 - Posted: 9 Feb 2014, 18:48:24 UTC - in response to Message 1474780. I'll try deleting the "boinc_finish_called" file from two of the slot directories and leave it alone in the third, just for comparison purposes. That approach appears to have worked like a charm! The task where I left the "boinc_finish_called" file in place, 3378152671, quickly failed with the expected computation error. The two tasks where I deleted the file, 3378140958 and 3378159065, appear to have finished normally, making a "second" call to boinc_finish after the restart. Of course I won't know for sure if they'll actually validate until a wingman reports on each of those, but it looks promising so far. ID: 1474791 ·
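The manual workaround described above could be scripted roughly as follows. This is only a sketch, not a supported BOINC tool. The file name `boinc_finish_called` comes from the thread, while the function names, the `dry_run` flag, and the layout of one sub-directory per slot are assumptions. It should only be run while BOINC is stopped, and, as noted in the post, there is no guarantee the resulting tasks will validate.

```python
from pathlib import Path

FINISH_FILE = "boinc_finish_called"  # file name as reported in the slot directories


def find_finish_files(slots_dir):
    """Return leftover finish files found one level below slots_dir."""
    return sorted(
        p for p in Path(slots_dir).glob(f"*/{FINISH_FILE}") if p.is_file()
    )


def remove_finish_files(slots_dir, dry_run=True):
    """List leftover finish files and, unless dry_run is set, delete them."""
    found = find_finish_files(slots_dir)
    if not dry_run:
        for path in found:
            path.unlink()
    return found
```

A cautious way to use it is to call `remove_finish_files(path_to_slots)` first with the default `dry_run=True`, review the list, copy the slot directories as was done in the thread, and only then repeat the call with `dry_run=False`. Here `path_to_slots` stands for wherever the BOINC data directory keeps its slots on the machine in question.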

Richard Haselgrove Volunteer tester Send message Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874	Message 1474799 - Posted: 9 Feb 2014, 19:15:29 UTC - in response to Message 1474791. I agree - interesting observation, and should be reproducible in testing. It leaves us with two separate questions: 1) Why did BOINC crash, and could it be doing so more often than we've suspected? 2) Why didn't the AP app notice? BTW, my understanding is that if BOINC crashes, but BOINC Manager stays running, then the Manager will attempt to restart the Client. But in the case that I reported recently, it was Windows Explorer that crashed, taking out both client and manager at the same time, so no automatic restart was possible. ID: 1474799 ·

jason_gee Volunteer developer Volunteer tester Send message Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0	Message 1474811 - Posted: 9 Feb 2014, 19:35:03 UTC - in response to Message 1474799. Last modified: 9 Feb 2014, 19:35:38 UTC I agree - interesting observation, and should be reproducible in testing. It leaves us with two separate questions: 1) Why did BOINC crash, and could it be doing so more often than we've suspected? 2) Why didn't the AP app notice? BTW, my understanding is that if BOINC crashes, but BOINC Manager stays running, then the Manager will attempt to restart the Client. But in the case that I reported recently, it was Windows Explorer that crashed, taking out both client and manager at the same time, so no automatic restart was possible. One thing I came across while building MB Cuda for Linux some time back, was that some changes were in progress to replace the heartbeat mechanism. At the time, in Boincapi code, whatever the replacement was wasn't working at all, and would cause the app to exit. So my guess is whatever mechanism was devised there isn't quite working yet. Noticing multiple other breakages, for the case of the private Linux Cuda build I reverted to 7.0.65 boincapi (windows builds of course use much older modified boincapi, which I had fully intended to update if those multiple breakages and unresolved legacy problems weren't there. but they are there and so I didn't. ) "Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions. ID: 1474811 ·

Jeff Buck Volunteer tester Send message Joined: 11 Feb 00 Posts: 1441 Credit: 148,764,870 RAC: 0	Message 1474814 - Posted: 9 Feb 2014, 19:51:29 UTC - in response to Message 1474799. But in the case that I reported recently, it was Windows Explorer that crashed, taking out both client and manager at the same time, so no automatic restart was possible. Looks like Explorer was at the root of my crash, too. FWIW, here's a Windows log entry which appears to match the time of the last BOINC log entry: Log Name: Application Source: Microsoft-Windows-Winlogon Date: 2/9/2014 1:12:43 AM Event ID: 1002 Task Category: None Level: Information Keywords: Classic User: N/A Computer: T7400 Description: The shell stopped unexpectedly and explorer.exe was restarted. Event Xml: <Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event"> <System> <Provider Name="Microsoft-Windows-Winlogon" Guid="{DBE9B383-7CF3-4331-91CC-A3CB16A3B538}" EventSourceName="Winlogon" /> <EventID Qualifiers="16384">1002</EventID> <Version>0</Version> <Level>4</Level> <Task>0</Task> <Opcode>0</Opcode> <Keywords>0x80000000000000</Keywords> <TimeCreated SystemTime="2014-02-09T09:12:43.000000000Z" /> <EventRecordID>6305</EventRecordID> <Correlation /> <Execution ProcessID="0" ThreadID="0" /> <Channel>Application</Channel> <Computer>T7400</Computer> <Security /> </System> <EventData> <Data>explorer.exe</Data> </EventData> </Event> Don't know if that might be helpful to anybody, or not. ID: 1474814 ·
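The match between the two logs can be checked with a short calculation. The Windows event carries a UTC `SystemTime`, while the BOINC log times were stated earlier in the thread to be U.S. Pacific Standard Time. The fixed UTC-8 offset below is an assumption based on that statement.

```python
from datetime import datetime, timedelta, timezone

# Fixed offset for Pacific Standard Time (no daylight saving in February).
PST = timezone(timedelta(hours=-8), "PST")

# SystemTime from the Winlogon event XML (UTC).
event_utc = datetime(2014, 2, 9, 9, 12, 43, tzinfo=timezone.utc)

# Last entry in the BOINC event log, given in local time.
boinc_last_entry = datetime(2014, 2, 9, 1, 12, 42, tzinfo=PST)

event_local = event_utc.astimezone(PST)
gap_seconds = (event_utc - boinc_last_entry).total_seconds()

print("explorer.exe restart (local):", event_local.isoformat())
print("seconds after last BOINC log entry:", gap_seconds)
```

Under that assumption the shell restart was logged one second after the last BOINC entry.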

Richard Haselgrove Volunteer tester Send message Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874	Message 1474816 - Posted: 9 Feb 2014, 20:01:22 UTC - in response to Message 1474811. I agree - interesting observation, and should be reproducible in testing. It leaves us with two separate questions: 1) Why did BOINC crash, and could it be doing so more often than we've suspected? 2) Why didn't the AP app notice? BTW, my understanding is that if BOINC crashes, but BOINC Manager stays running, then the Manager will attempt to restart the Client. But in the case that I reported recently, it was Windows Explorer that crashed, taking out both client and manager at the same time, so no automatic restart was possible. One thing I came across while building MB Cuda for Linux some time back, was that some changes were in progress to replace the heartbeat mechanism. At the time, in Boincapi code, whatever the replacement was wasn't working at all, and would cause the app to exit. So my guess is whatever mechanism was devised there isn't quite working yet. Noticing multiple other breakages, for the case of the private Linux Cuda build I reverted to 7.0.65 boincapi (windows builds of course use much older modified boincapi, which I had fully intended to update if those multiple breakages and unresolved legacy problems weren't there. but they are there and so I didn't. ) David himself reported the heartbeat mechanism as flawed some seven years ago, and flagged it for replacement - the reasoning is in ticket [trac]#336[/trac]. If there are problems with the PID/app_init.xml replacement, it would be helpful to feed them back in via boinc_alpha. ID: 1474816 ·
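For readers unfamiliar with the idea, a heartbeat check can be reduced to a timeout on the last message received from the client. The sketch below illustrates that general technique only. It is not the boincapi implementation, and the class name, method names, and 30-second timeout are hypothetical.

```python
from typing import Optional


class HeartbeatWatchdog:
    """Hypothetical app-side check: has the client gone silent for too long?"""

    def __init__(self, timeout_s: float = 30.0):  # illustrative timeout value
        self.timeout_s = timeout_s
        self.last_beat: Optional[float] = None

    def beat(self, now: float) -> None:
        """Record that a heartbeat from the client arrived at time `now`."""
        self.last_beat = now

    def client_lost(self, now: float) -> bool:
        """True once no heartbeat has been seen for longer than the timeout."""
        if self.last_beat is None:
            return False  # nothing received yet; do not give up immediately
        return now - self.last_beat > self.timeout_s


dog = HeartbeatWatchdog(timeout_s=30.0)
dog.beat(now=0.0)
print(dog.client_lost(now=10.0))  # still within the timeout
print(dog.client_lost(now=45.0))  # client presumed gone; app should stop cleanly
```

An app that polled such a check in its main loop could checkpoint and exit once the client disappeared, instead of running on to boinc_finish with nobody listening, which is the behaviour discussed above.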

jason_gee Volunteer developer Volunteer tester Send message Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0	Message 1474841 - Posted: 9 Feb 2014, 21:49:57 UTC - in response to Message 1474816. Last modified: 9 Feb 2014, 21:50:15 UTC David himself reported the heartbeat mechanism as flawed some seven years ago, and flagged it for replacement - the reasoning is in ticket [trac]#336[/trac]. If there are problems with the PID/app_init.xml replacement, it would be helpful to feed them back in via boinc_alpha. Boinc alpha being a mechanism for those participating in Boinc alpha testing to report issues ? I'm not a Boinc alpha tester, and nor do I particularly want to be, thanks anyway. The Boinc development documentation instructs to email the assigned department head, which I'll do so again once the problems I already reported are fixed. "Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions. ID: 1474841 ·

