SETI@home v8.12 Windows GPU applications support thread

Author	Message
Raistmer Volunteer developer Volunteer tester Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121	Message 1812140 - Posted: 24 Aug 2016, 11:29:55 UTC Last modified: 24 Aug 2016, 11:32:45 UTC Please keep overflowed results out of this thread until solid evidence of a false positive or an omitted real signal has been acquired. I refuse to spend time on any discussion of reported partial subsets. FYI, for those enthusiasts who still don't know what "overflow" is: the line "SETI@Home Informational message -9 result_overflow" in stderr means overflow. SETI apps news We're not gonna fight them. We're gonna transcend them.
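The overflow marker named in the post above can be checked mechanically. This is a minimal sketch, assuming the stderr text of each task has already been saved locally; the function names and the task-name keys are hypothetical, and only the `result_overflow` string comes from the post.

```python
def is_overflow(stderr_text: str) -> bool:
    """Return True if the stderr text contains the result_overflow marker."""
    return "result_overflow" in stderr_text


def split_overflows(stderr_by_task: dict) -> tuple:
    """Split task names into (overflowed, normal) lists, keeping input order.

    stderr_by_task maps a task name (hypothetical key) to its stderr text.
    """
    overflowed, normal = [], []
    for name, text in stderr_by_task.items():
        (overflowed if is_overflow(text) else normal).append(name)
    return overflowed, normal
```

Filtering this way keeps overflowed tasks out of any routine comparison, as the post asks.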

Raistmer Volunteer developer Volunteer tester Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121	Message 1812142 - Posted: 24 Aug 2016, 11:35:33 UTC - in response to Message 1812102. Quote: "The interesting thing about this one is that NV SoG r3500 was invalid, while ATi SoG r3430 was valid." What exactly is interesting here? Evidence that a GPU is indeed a multiprocessor device without strong ordering? That fact is covered in any GPGPU review.

Raistmer Volunteer developer Volunteer tester Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121	Message 1812144 - Posted: 24 Aug 2016, 11:48:12 UTC Last modified: 24 Aug 2016, 11:51:08 UTC How to acquire meaningful data from a suspicious overflow:
1) Download the corresponding task for offline testing.
2) Edit the task to remove or increase the signal restrictions (a big increase!).
3) Re-run the edited task with the reference CPU app.
4) Compare all results from that run with the subset reported by the app under investigation.
5) Report any false positives with considerable excess power over threshold in that subset (that is, a subset result that has no match in the full list of signals reported for the task).
Now details: 3, 5 :
<analysis_cfg>
  <spike_thresh>24</spike_thresh>
  <spikes_per_spectrum>1</spikes_per_spectrum>
  <autocorr_thresh>17.8</autocorr_thresh>
  <autocorr_per_spectrum>1</autocorr_per_spectrum>
  <autocorr_fftlen>131072</autocorr_fftlen>
  <gauss_null_chi_sq_thresh>2.43685937</gauss_null_chi_sq_thresh>
  <gauss_chi_sq_thresh>1.41999996</gauss_chi_sq_thresh>
  <gauss_power_thresh>3</gauss_power_thresh>
  <gauss_peak_power_thresh>3.20000005</gauss_peak_power_thresh>
  <gauss_pot_length>64</gauss_pot_length>
  <pulse_thresh>19.7340908</pulse_thresh>
  <pulse_display_thresh>0.5</pulse_display_thresh>
  <pulse_max>40960</pulse_max>
  <pulse_min>16</pulse_min>
  <pulse_fft_max>8192</pulse_fft_max>
  <pulse_pot_length>256</pulse_pot_length>
  <triplet_thresh>9.73841</triplet_thresh>
  <triplet_max>131072</triplet_max>
  <triplet_min>16</triplet_min>
  <triplet_pot_length>256</triplet_pot_length>
  <pot_overlap_factor>0.5</pot_overlap_factor>
  <pot_t_offset>1</pot_t_offset>
  <pot_min_slew>0.00209999993</pot_min_slew>
  <pot_max_slew>0.0104999999</pot_max_slew>
  <chirp_resolution>0.333</chirp_resolution>
  <analysis_fft_lengths>262136</analysis_fft_lengths>
  <bsmooth_boxcar_length>8192</bsmooth_boxcar_length>
  <bsmooth_chunk_size>32768</bsmooth_chunk_size>
  <chirps>
    <chirp_parameter_t>
      <chirp_limit>3</chirp_limit>
      <fft_len_flags>262136</fft_len_flags>
    </chirp_parameter_t>
    <chirp_parameter_t>
      <chirp_limit>10</chirp_limit>
      <fft_len_flags>65528</fft_len_flags>
    </chirp_parameter_t>
  </chirps>
  <pulse_beams>1</pulse_beams>
  <max_signals>30</max_signals>
  <max_spikes>8</max_spikes>
  <max_gaussians>0</max_gaussians>
  <max_pulses>0</max_pulses>
  <max_triplets>0</max_triplets>
  <keyuniq>-7344129</keyuniq>
  <credit_rate>2.8499999</credit_rate>
</analysis_cfg>
Points of interest in bold. Also, my builds can log extended signal information to stderr. Per the ReadMe: levels from 2 to 5 are reserved for increasing verbosity, higher levels are reserved for specific usage. -v 2 enables all signals output. So -v 2 lets you follow the formation of the "bests" through the whole task processing.
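Steps 2, 4 and 5 of the procedure above can be sketched in a few lines. This is only an illustration under stated assumptions, not the project's tooling: it assumes the `<analysis_cfg>` fragment has been extracted from the task file as text, that `max_signals` is one of the limits to raise (which elements the post highlighted is not recoverable here), and that each signal is held as a hypothetical `(kind, power, frequency)` tuple. Both function names are made up for the example.

```python
import math
import xml.etree.ElementTree as ET


def raise_max_signals(cfg_xml: str, new_limit: int) -> str:
    """Return the <analysis_cfg> fragment with <max_signals> set to new_limit.

    Assumption: max_signals is the restriction to lift; other limits in the
    fragment are left untouched.
    """
    root = ET.fromstring(cfg_xml)
    node = root.find("max_signals")
    if node is None:
        raise ValueError("no <max_signals> element in config")
    node.text = str(new_limit)
    return ET.tostring(root, encoding="unicode")


def unmatched_signals(subset, full, rel_tol=1e-3):
    """Return signals from `subset` that have no counterpart in `full`.

    Signals are hypothetical (kind, power, frequency) tuples. Two signals
    match when the kind is equal and both numbers agree within rel_tol.
    """
    def same(a, b):
        return (a[0] == b[0]
                and math.isclose(a[1], b[1], rel_tol=rel_tol)
                and math.isclose(a[2], b[2], rel_tol=rel_tol))

    return [s for s in subset if not any(same(s, f) for f in full)]
```

An empty list from `unmatched_signals` means every signal in the overflowed subset also appears in the full reference run, so there is nothing to report. The tolerance is a placeholder and would need to follow whatever the validator actually accepts.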

Mike Volunteer tester Joined: 17 Feb 01 Posts: 34258 Credit: 79,922,639 RAC: 80	Message 1812148 - Posted: 24 Aug 2016, 12:03:01 UTC As I understand it, the interesting part in this (overflow) case is -tt 60, not the revision number. Signal logging differs because of the longer kernel time: the kernel now runs for 60 ms instead of 15 ms. That is one of the reasons we think the validator needs adjustment for overflows. With each crime and every kindness we birth our future.

Raistmer Volunteer developer Volunteer tester Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121	Message 1812153 - Posted: 24 Aug 2016, 12:06:27 UTC - in response to Message 1812148. Quote: "In my understanding interesting part is -tt 60 in this case (overflow) not revision number. Because of longer kernel time signal logging is different because kernel now 60ms instead of 15ms. One of the reasons we think validator needs adjustment for overflows." And, of course, "ms" is a unit of time from the real world. The amount of work different GPU models can do in the same time interval differs. As I said, there is no sense in comparing subsets! Acquire the full set first, then do the comparison. Otherwise it is just a waste of time.

Richard Haselgrove Volunteer tester Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874	Message 1812169 - Posted: 24 Aug 2016, 12:22:55 UTC - in response to Message 1812153. Sure - we tried to obtain the data file for that example, but it had already been deleted from the server before the invalid result for NV_SoG was drawn to our attention. That's why I'm placing the emphasis on identifying the WUs at the earlier inconclusive stage, when the data may be held locally, and can certainly still be retrieved from the server. I'm probably processing over 500 guppies a day currently - possibly well over. I can't possibly check every one, which is why I'm trying to filter out the potentially interesting anomalies, in the hope that they can contribute to even better first-time validation rates in the future.

Richard Haselgrove Volunteer tester Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874	Message 1812766 - Posted: 26 Aug 2016, 13:42:28 UTC Happened to visit a computer I don't normally watch closely, and found that this had just happened: 26/08/2016 14:04:06 | SETI@home | Task postponed: Suspicious pulse results, host needs reboot or maintenance - only symptom was that the oldest r3500 task was 'waiting to run', and the second-oldest was running instead. Anyway, the task restarted normally, and validated at the first attempt - even against a stock apple-darwin. WU 2246185168 The only clue in my stderr is the repeated lines starting with "Priority of worker thread raised successfully" (again) and later "Restarted at 41.25 percent". Would it be a good idea to log these interruptions, with the data found to be 'suspicious'? And I second the suggestion a few days ago of pushing the task name into the Event Log - I actually had seven guppies running at the time, so it wouldn't have been easy to track down the offender, if I hadn't happened to notice the task waiting. Anyway, I've got both data and result, if they're worth looking at.
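Until the task name is pushed into the Event Log, the restart clue described above can be searched for in saved stderr files. This is a minimal sketch, assuming the line has exactly the wording quoted in the post ("Restarted at 41.25 percent"); the regular expression and the function name are hypothetical.

```python
import re

# Pattern assumed from the single line quoted in the post.
RESTART_RE = re.compile(r"Restarted at (\d+(?:\.\d+)?) percent")


def restart_points(stderr_text: str) -> list:
    """Return the progress percentages at which a task was restarted."""
    return [float(p) for p in RESTART_RE.findall(stderr_text)]
```

Running this over the stderr of every task that was active at the time narrows down which one was postponed: only that task should return a non-empty list.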

Raistmer Volunteer developer Volunteer tester Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121	Message 1812802 - Posted: 26 Aug 2016, 17:27:00 UTC - in response to Message 1812766. I would say that the "offender" is not the task but the device that processes it. Of course, on a multi-device host it would be good to add which device triggered the task restart, though this information can already be derived from stderr if needed.

robertmiles Volunteer tester Joined: 16 Jan 12 Posts: 213 Credit: 4,117,756 RAC: 6	Message 1812871 - Posted: 27 Aug 2016, 0:55:53 UTC - in response to Message 1812802. Quote: "I would say that the 'offender' is not the task but the device that processes it. Of course, on a multi-device host it would be good to add which device triggered the task restart, though this information can already be derived from stderr if needed." If so, more information is needed about what the device did wrong. I've seen that warning a few times, with no other indication of problems with the device. Should I eliminate the warning by setting SETI@home to No new tasks, since I currently have no other indication that anything is wrong? Currently, there is no indication of which task had this warning, which makes it hard to determine which stderr to inspect.

Raistmer Volunteer developer Volunteer tester Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121	Message 1812933 - Posted: 27 Aug 2016, 7:11:26 UTC - in response to Message 1812871. Quote: "If so, more information is needed about what the device did wrong. I've seen that warning a few times, with no other indication of problems with the device. Should I eliminate the warning by setting SETI@home to No new tasks, since I currently have no other indication that anything is wrong? Currently, there is no indication of which task had this warning, which makes it hard to determine which stderr to inspect." The device did a wrong computation. That's all the app knows. It's up to the human owner to find the reason.

Raistmer Volunteer developer Volunteer tester Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121	Message 1812934 - Posted: 27 Aug 2016, 7:21:55 UTC - in response to Message 1812871. Last modified: 27 Aug 2016, 7:24:41 UTC Quote: "Should I eliminate the warning by setting SETI@home to No new tasks, since I currently have no other indication that anything is wrong?" You also had an excess number of invalids. What other indication of a broken setup does one need to start troubleshooting? http://setiathome.berkeley.edu/forum_thread.php?id=79760&postid=1810663

robertmiles Volunteer tester Joined: 16 Jan 12 Posts: 213 Credit: 4,117,756 RAC: 6	Message 1813066 - Posted: 27 Aug 2016, 22:08:03 UTC - in response to Message 1812934. Quote: "You also had an excess number of invalids. What other indication of a broken setup does one need to start troubleshooting? http://setiathome.berkeley.edu/forum_thread.php?id=79760&postid=1810663" What excess number of invalids? My task list shows one task marked as invalid, eight days ago, and none before or since. Why should I assume that you are looking at a task list for the correct user, and why should I assume that the application can tell whether a problem is due to a specific graphics board instead of due to a problem with the way the application handles older graphics boards?

Mike Volunteer tester Joined: 17 Feb 01 Posts: 34258 Credit: 79,922,639 RAC: 80	Message 1813067 - Posted: 27 Aug 2016, 22:11:27 UTC - in response to Message 1813066. Quote: "What excess number of invalids? My task list shows one task marked as invalid, eight days ago, and none before or since. Why should I assume that you are looking at a task list for the correct user, and why should I assume that the application can tell whether a problem is due to a specific graphics board instead of due to a problem with the way the application handles older graphics boards?" I have found 3 invalids on your card. And I am his alpha tester.

robertmiles Volunteer tester Joined: 16 Jan 12 Posts: 213 Credit: 4,117,756 RAC: 6	Message 1813122 - Posted: 28 Aug 2016, 3:01:53 UTC - in response to Message 1813067. Quote: "I have found 3 invalids on your card. And I am his alpha tester." Then why aren't the two others still listed as invalid? Were they with the stock SoG application and therefore not what I'm currently testing? And how can you tell that they're a problem with the card instead of a problem with the way r3500 handles a GTX 560?

Wiggo Joined: 24 Jan 00 Posts: 34744 Credit: 261,360,520 RAC: 489	Message 1813131 - Posted: 28 Aug 2016, 4:16:43 UTC Quote: "Then why aren't the two others still listed as invalid?" Because once the tasks validate they're deleted 24hrs later. ;-) Cheers.

Grant (SSSF) Volunteer tester Joined: 19 Aug 99 Posts: 13736 Credit: 208,696,464 RAC: 304	Message 1813133 - Posted: 28 Aug 2016, 5:00:55 UTC - in response to Message 1813122. Quote: "And how can you tell that they're a problem with the card instead of a problem with the way r3500 handles a GTX 560?" If your card is the only one that's having the issue, then the card (or the system it's in) is most likely the cause. Grant Darwin NT

robertmiles Volunteer tester Joined: 16 Jan 12 Posts: 213 Credit: 4,117,756 RAC: 6	Message 1813137 - Posted: 28 Aug 2016, 6:16:07 UTC - in response to Message 1813131. Quote: "Because once the tasks validate they're deleted 24hrs later. ;-) Cheers." That's not what I am seeing. The only one of my tasks marked invalid has been marked that way for several days. Or are you implying that some of the tasks the validator marks as valid are in fact invalid, but still deleted 24 hours later since they were marked valid?

Raistmer Volunteer developer Volunteer tester Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121	Message 1813138 - Posted: 28 Aug 2016, 6:17:02 UTC - in response to Message 1813122. Quote: "And how can you tell that they're a problem with the card instead of a problem with the way r3500 handles a GTX 560?" Try to find other hosts with a GTX 560 running OpenCL NV MB, preferably on beta (because results are kept longer there). How do they behave? Did you use any tuning other than what the ReadMe proposes? What tuning line?

Raistmer Volunteer developer Volunteer tester Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121	Message 1813139 - Posted: 28 Aug 2016, 6:19:02 UTC - in response to Message 1813137. Last modified: 28 Aug 2016, 6:19:25 UTC Quote: "Or are you implying that some of the tasks the validator marks as valid are in fact invalid, but still deleted 24 hours later since they were marked valid?" When a workunit validates, all of its results (including invalid ones and those with computation errors) are purged from the BOINC database. This usually happens 24 hours after validation. Sometimes a task can hang around for much longer, but that is an issue with the BOINC backend in Berkeley, not the rule.

robertmiles Volunteer tester Joined: 16 Jan 12 Posts: 213 Credit: 4,117,756 RAC: 6	Message 1813140 - Posted: 28 Aug 2016, 6:19:24 UTC - in response to Message 1813133. Quote: "If your card is the only one that's having the issue, then the card (or the system it's in) is most likely the cause." Where have you found tasks run on a different GTX 560 running on another host running Windows Vista so you can pin the cause to something in this host?

©2024 University of California

SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.