Is this a new error message

Questions and Answers : Windows : Is this a new error message

Bob Giel
Volunteer tester

Joined: 11 Jan 04
Posts: 76
Credit: 5,419,128
RAC: 0
United States
Message 1830372 - Posted: 13 Nov 2016, 20:25:15 UTC

I've been getting an error in the event log that I haven't seen before. Any idea what it's referring to? I've rebooted the machine, as suggested, and still get the error.

SETI@home: Notice from BOINC
Task postponed: Suspicious pulse results, host needs reboot or maintenance
11/13/2016 2:12:49 PM

Jord
Volunteer tester

Joined: 9 Jun 99
Posts: 15184
Credit: 4,362,181
RAC: 3
Netherlands
Message 1830398 - Posted: 13 Nov 2016, 21:47:43 UTC - in response to Message 1830372.  

No, it's not a new message; it's been around for a while already. The best explanation is by Joe Segur, quoted here:
Thanks for seeing that. A sanity check within the app has seen a pulse too strong to be possible, so something went wrong. The Lunatics CPU and OpenCL GPU apps for MB all have those checks now for each of the 5 signal types. The apps generally do a BOINC temporary exit with a message like the one you saw, and a time value so BOINC will restart the task from the last checkpoint 5 minutes later. That restart from checkpoint rereads the WU file, etc., so it starts fresh from a point in time before the problem. That may often be all that's needed, and the task can go on and finish normally (as appears to have happened in your case). If not, another temporary exit will happen, and after 100 of those BOINC will abort the task. We hope the user will notice before those 8 hours and 20 minutes (100 restarts at 5 minutes each) are up, at least try the suggested reboot, and consider whether some cleaning or other maintenance is due.
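
The restart budget described above can be checked with a few lines of Python. This is only a sketch of the arithmetic and the retry loop as described in the post, not the actual BOINC or Lunatics code. The 5-minute delay and the limit of 100 temporary exits come from the post; the function names and the `check_passes_on_attempt` parameter are hypothetical, made up for illustration.

```python
from datetime import timedelta

# Values taken from the explanation above.
RESTART_DELAY = timedelta(minutes=5)   # wait before BOINC restarts from checkpoint
MAX_TEMPORARY_EXITS = 100              # after this many temporary exits the task is aborted


def time_until_abort(delay=RESTART_DELAY, max_exits=MAX_TEMPORARY_EXITS):
    """Total waiting time if every restart trips the sanity check again."""
    return delay * max_exits


def run_task(check_passes_on_attempt):
    """Hypothetical model of the retry loop.

    check_passes_on_attempt is the 1-based attempt on which the sanity
    check stops failing, or None if it never does.
    Returns (outcome, number_of_temporary_exits).
    """
    exits = 0
    while True:
        attempt = exits + 1
        if check_passes_on_attempt is not None and attempt >= check_passes_on_attempt:
            return ("finished", exits)
        exits += 1  # sanity check failed: temporary exit, restart after the delay
        if exits >= MAX_TEMPORARY_EXITS:
            return ("aborted", exits)


if __name__ == "__main__":
    print(time_until_abort())   # total delay accumulated over 100 temporary exits
    print(run_task(2))          # one bad run, then the restart from checkpoint succeeds
    print(run_task(None))       # a host that keeps producing bad results
```

The model ignores the computing time between restarts, so the real time before an abort would be somewhat longer than the delay total alone.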

There's one exception which doesn't use that temporary exit approach. For the OpenCL GPU apps the sanity check for Autocorr signals needed to be at a lower level. Therefore that sanity check also requires that a result_overflow happen, and since it has run all the way to that outcome the last checkpoint isn't a reliable restart point. Hence, for that combination of unusually strong Autocorrs with overflow the OpenCL GPU application will do an error exit.
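
As a rough illustration of the two exit paths, the decision could be modelled like this in Python. It is a sketch based on one reading of the description above, not the applications' real logic: the `Action` enum, the `exit_action` function, and the string labels for app and signal types are all hypothetical. In particular, treating a suspicious Autocorr without overflow as "continue" on the OpenCL GPU apps is an assumption drawn from the statement that the check requires a result_overflow.

```python
from enum import Enum


class Action(Enum):
    CONTINUE = "keep processing"
    TEMPORARY_EXIT = "temporary exit, restart from last checkpoint"
    ERROR_EXIT = "error exit"


def exit_action(app, signal_type, suspicious, overflow):
    """Pick the exit path for a sanity-check result (hypothetical model).

    app:         "cpu" or "opencl_gpu" (labels invented for this sketch)
    signal_type: e.g. "pulse" or "autocorr"
    suspicious:  True if the signal is too strong to be possible
    overflow:    True if the task ended in a result_overflow
    """
    if not suspicious:
        return Action.CONTINUE
    if app == "opencl_gpu" and signal_type == "autocorr":
        # Assumption: this lower-level check only fires together with an
        # overflow, and by then the last checkpoint is not a reliable
        # restart point, so the app errors out instead of retrying.
        return Action.ERROR_EXIT if overflow else Action.CONTINUE
    # Every other combination uses the temporary-exit approach.
    return Action.TEMPORARY_EXIT


if __name__ == "__main__":
    print(exit_action("cpu", "pulse", suspicious=True, overflow=False))
    print(exit_action("opencl_gpu", "autocorr", suspicious=True, overflow=True))
```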

It should be understood that these sanity checks have not been fully tested. They're really meant to catch things caused by a memory bit flip or other random occurrence which is nearly impossible to reproduce. So we had to simply do normal testing of the apps and thereby gain confidence the checks would not kick in when they shouldn't.




 