completed and can't validate - Too many errors (may have bug)
Author | Message |
---|---|
RottenMutt Joined: 15 Mar 01 Posts: 1011 Credit: 230,314,058 RAC: 0 |
http://setiathome.berkeley.edu/workunit.php?wuid=905557298 I was able to complete it, but six attempts by other computers running the stock GPU application failed. |
RottenMutt Joined: 15 Mar 01 Posts: 1011 Credit: 230,314,058 RAC: 0 |
http://setiathome.berkeley.edu/workunit.php?wuid=906814527 This one is just the opposite: Lunatics failed. |
Granite T. Rock Joined: 9 Jun 99 Posts: 17 Credit: 1,248,634 RAC: 1 |
I have a Jan 14th and an 18th one that have both failed on GPU, despite my doing 150 or so WUs over the last few days that were perfectly fine. The second person to try them on a GPU also failed. (One currently has about 5 errors and one success. The other has just had a few more sent out after the second person failed.) http://setiathome.berkeley.edu/results.php?hostid=6270551&offset=0&show_names=0&state=5&appid= |
HAL9000 Joined: 11 Sep 99 Posts: 6534 Credit: 196,805,888 RAC: 57 |
http://setiathome.berkeley.edu/workunit.php?wuid=906814527 None of those actually "failed". They exited with -12, which is kind of like the -9 exit status, in that a threshold trigger was passed that said "I'm done with this thing, make it go away". I think it was said that the current version of the Lunatics Nvidia app generates fewer -12s than the previous one. There was an explanation of some kind about that. IIRC it was something Jason said in the post about the newest installer. Hopefully after another few years with GPGPU apps we won't be seeing these issues anymore. SETI@home classic workunits: 93,865 CPU time: 863,447 hours Join the [url=http://tinyurl.com/8y46zvu]BP6/VP6 User Group[ |
LadyL Joined: 14 Sep 11 Posts: 1679 Credit: 5,230,097 RAC: 0 |
http://setiathome.berkeley.edu/workunit.php?wuid=906814527 -12 is a long-standing bug in the original (stock) CUDA app: 'too many triplets found'. Something about too many of them for the array, or too close together. Jason managed to eliminate the majority of those very early on (x32f) and is currently (among other things) working on getting rid of the rest of them. The next installer release is planned for when AP V6 or MB V7 is released to main, because those require updated applications and, more crucially, updated app_info.xml entries. |
Granite T. Rock Joined: 9 Jun 99 Posts: 17 Credit: 1,248,634 RAC: 1 |
I wonder if there would be a way to flag WUs to replicate to different platforms when an error occurs, i.e. if a CUDA app causes trouble, stick to the CPU platforms. |
Josef W. Segur Joined: 30 Oct 99 Posts: 4504 Credit: 1,414,761 RAC: 0 |
"I wonder if there would be a way to flag WUs to replicate to different platforms when an error occurs, i.e. if a CUDA app causes trouble, stick to the CPU platforms." BOINC doesn't have that exact option, but it does have an option to have reissues sent only to "reliable" hosts. See ProjectOptions#Acceleratingretries. Joe |
HAL9000 Joined: 11 Sep 99 Posts: 6534 Credit: 196,805,888 RAC: 57 |
"I wonder if there would be a way to flag WUs to replicate to different platforms when an error occurs, i.e. if a CUDA app causes trouble, stick to the CPU platforms." I wonder how effective that is on large GPU hosts. With those hosts processing several thousand tasks a day, a 1% error rate would make them "reliable" but would still generate a large volume of errors. |
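The arithmetic behind that last point can be sketched in a few lines of Python. This is only an illustration: the 1% error rate comes from the post above, but the task counts (3,000 a day for a GPU host as a stand-in for "several thousand", 20 a day for a CPU host) and the function name are assumptions, not measured values or anything from BOINC itself.

```python
def expected_daily_errors(tasks_per_day: int, error_rate: float) -> float:
    """Expected number of errored tasks per day for a single host."""
    return tasks_per_day * error_rate


# Hypothetical hosts: the task counts are illustrative, not measured.
gpu_host = expected_daily_errors(tasks_per_day=3000, error_rate=0.01)
cpu_host = expected_daily_errors(tasks_per_day=20, error_rate=0.01)

print(f"GPU host: about {gpu_host:.0f} errors per day")
print(f"CPU host: about {cpu_host:.1f} errors per day")
```

Under these assumed numbers, both hosts have the same error rate, yet the GPU host returns on the order of a hundred times more errored tasks per day, which is why a rate-based "reliable" label may not limit the absolute volume of errors.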
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.