SETI@home v8.12 Windows GPU applications support thread
Author | Message |
---|---|
Raistmer Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
Please keep overflowed results out of this thread until there is solid evidence of a false positive or of a real signal being omitted. I refuse to spend time on any discussion of reported partial subsets. FYI, for those enthusiasts who still don't know what "overflow" is: "SETI@Home Informational message -9 result_overflow" in stderr means overflow. SETI apps news We're not gonna fight them. We're gonna transcend them. |
Raistmer Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
What exactly is interesting here? The evidence that a GPU is indeed a multiprocessor device without strong ordering? That fact is covered in any GPGPU review. SETI apps news We're not gonna fight them. We're gonna transcend them. |
Raistmer Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
How to acquire meaningful data from a suspicious overflow: 1) download the corresponding task for offline testing; 2) edit the task to remove or increase the signal restriction (a big increase!); 3) re-run the edited task with the reference CPU app; 4) compare all results from that run with the subset reported by the app under investigation; 5) report any false positives with considerable excess power over threshold in that subset (that is, subset results that have no match in the full list of signals reported for the task). Now details: 3*, 5* : <analysis_cfg> <spike_thresh>24</spike_thresh> <spikes_per_spectrum>1</spikes_per_spectrum> <autocorr_thresh>17.8</autocorr_thresh> <autocorr_per_spectrum>1</autocorr_per_spectrum> <autocorr_fftlen>131072</autocorr_fftlen> <gauss_null_chi_sq_thresh>2.43685937</gauss_null_chi_sq_thresh> <gauss_chi_sq_thresh>1.41999996</gauss_chi_sq_thresh> <gauss_power_thresh>3</gauss_power_thresh> <gauss_peak_power_thresh>3.20000005</gauss_peak_power_thresh> <gauss_pot_length>64</gauss_pot_length> <pulse_thresh>19.7340908</pulse_thresh> <pulse_display_thresh>0.5</pulse_display_thresh> <pulse_max>40960</pulse_max> <pulse_min>16</pulse_min> <pulse_fft_max>8192</pulse_fft_max> <pulse_pot_length>256</pulse_pot_length> <triplet_thresh>9.73841</triplet_thresh> <triplet_max>131072</triplet_max> <triplet_min>16</triplet_min> <triplet_pot_length>256</triplet_pot_length> <pot_overlap_factor>0.5</pot_overlap_factor> <pot_t_offset>1</pot_t_offset> <pot_min_slew>0.00209999993</pot_min_slew> <pot_max_slew>0.0104999999</pot_max_slew> <chirp_resolution>0.333</chirp_resolution> <analysis_fft_lengths>262136</analysis_fft_lengths> <bsmooth_boxcar_length>8192</bsmooth_boxcar_length> <bsmooth_chunk_size>32768</bsmooth_chunk_size> <chirps> <chirp_parameter_t> <chirp_limit>3</chirp_limit> <fft_len_flags>262136</fft_len_flags> </chirp_parameter_t> <chirp_parameter_t> <chirp_limit>10</chirp_limit> <fft_len_flags>65528</fft_len_flags> </chirp_parameter_t> </chirps> <pulse_beams>1</pulse_beams> <max_signals>30</max_signals> <max_spikes>8</max_spikes> <max_gaussians>0</max_gaussians> <max_pulses>0</max_pulses> <max_triplets>0</max_triplets> <keyuniq>-7344129</keyuniq> <credit_rate>2.8499999</credit_rate> </analysis_cfg> Points of interest in bold. Also, my builds have an extended ability to report signal info in stderr. Per ReadMe: Levels from 2 to 5 reserved for increasing verbosity, higher levels reserved for specific usage. -v 2 enables all signals output. So -v 2 makes it possible to follow how the "bests" form throughout the whole task processing. SETI apps news We're not gonna fight them. We're gonna transcend them. |
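Step 2 of the procedure above (raising the signal restriction) could look roughly like the following Python sketch. It works on a standalone `<analysis_cfg>` fragment only. It assumes that `<max_signals>` is the cap to raise, which the thread does not state explicitly, and the helper name `raise_signal_limit` is made up for illustration. Splicing the edited block back into a real task file is left out.

```python
import xml.etree.ElementTree as ET

# Trimmed-down <analysis_cfg> fragment; the values are copied from the post above.
SAMPLE_CFG = """<analysis_cfg>
  <spike_thresh>24</spike_thresh>
  <max_signals>30</max_signals>
  <max_spikes>8</max_spikes>
  <max_gaussians>0</max_gaussians>
  <max_pulses>0</max_pulses>
  <max_triplets>0</max_triplets>
</analysis_cfg>"""


def raise_signal_limit(cfg_xml: str, new_limit: int) -> str:
    """Return the config text with <max_signals> set to new_limit.

    Assumption: <max_signals> is the restriction that cuts a task short
    on overflow. Other fields may also need editing; the thread does not say.
    """
    root = ET.fromstring(cfg_xml)
    node = root.find("max_signals")
    if node is None:
        raise ValueError("no <max_signals> element in analysis_cfg")
    node.text = str(new_limit)
    # encoding="unicode" makes tostring() return str instead of bytes.
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(raise_signal_limit(SAMPLE_CFG, 100000))
```

All other elements are passed through unchanged, so the thresholds stay identical between the original and the edited task.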
Mike Send message Joined: 17 Feb 01 Posts: 34258 Credit: 79,922,639 RAC: 80 |
As I understand it, the interesting part in this case (overflow) is -tt 60, not the revision number. Signal logging is different because the kernel time is now 60 ms instead of 15 ms. That is one of the reasons we think the validator needs adjustment for overflows. With each crime and every kindness we birth our future. |
Raistmer Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
As I understand it, the interesting part in this case (overflow) is -tt 60, not the revision number. And, of course, "ms" is a real-world unit of time. The amount of work different GPU models can do in the same time interval differs. As I said, there is no sense in comparing subsets! Acquire the full set first, then do the comparison. Otherwise it is just a waste of time. SETI apps news We're not gonna fight them. We're gonna transcend them. |
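The comparison described here (check every signal in the reported subset against the full list from the reference run) could be sketched in Python as below. The `Signal` fields, the tolerance, and the matching rule are all hypothetical illustrations, not the project validator's actual criteria.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Signal:
    # Hypothetical fields; real result files carry more detail.
    kind: str      # e.g. "spike" or "pulse"
    power: float   # reported peak power
    chirp: float   # chirp rate
    fft_len: int   # FFT length at which the signal was found


def matches(a: Signal, b: Signal, rel_tol: float = 0.01) -> bool:
    """True if two signals agree in type, FFT length, power and chirp.

    The 1% tolerance is an assumed value chosen for illustration only.
    """
    return (
        a.kind == b.kind
        and a.fft_len == b.fft_len
        and math.isclose(a.power, b.power, rel_tol=rel_tol)
        and math.isclose(a.chirp, b.chirp, rel_tol=rel_tol, abs_tol=1e-9)
    )


def unmatched_signals(subset, full_list, rel_tol: float = 0.01):
    """Return the subset signals that have no counterpart in full_list.

    These are the candidates for false positives in the subset.
    """
    return [
        s for s in subset
        if not any(matches(s, f, rel_tol) for f in full_list)
    ]
```

An empty return value means every signal in the overflowed subset also appears in the full reference list, so the subset by itself shows no false positive.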
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874 |
Sure - we tried to obtain the data file for that example, but it had already been deleted from the server before the invalid result for NV_SoG was drawn to our attention. That's why I'm placing the emphasis on identifying the WUs at the earlier inconclusive stage, when the data may be held locally, and can certainly still be retrieved from the server. I'm probably processing over 500 guppies a day currently - possibly well over. I can't possibly check every one, which is why I'm trying to filter out the potentially interesting anomalies, in the hope that they can contribute to even better first-time validation rates in the future. |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874 |
Happened to visit a computer I don't normally watch closely, and found that this had just happened: 26/08/2016 14:04:06 | SETI@home | Task postponed: Suspicious pulse results, host needs reboot or maintenance - only symptom was that the oldest r3500 task was 'waiting to run', and the second-oldest was running instead. Anyway, the task restarted normally, and validated at the first attempt - even against a stock apple-darwin. WU 2246185168 The only clue in my stderr is the repeated lines starting with Priority of worker thread raised successfully (again) and later Restarted at 41.25 percent. Would it be a good idea to log these interruptions, with the data found to be 'suspicious'? And I second the suggestion a few days ago of pushing the task name into the Event Log - I actually had seven guppies running at the time, so it wouldn't have been easy to track down the offender, if I hadn't happened to notice the task waiting. Anyway, I've got both data and result, if they're worth looking at. |
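Until the task name reaches the Event Log, one way to track down the restarted task is to scan each task's stderr text for the markers quoted in this thread. This is a minimal Python sketch. The strings "Restarted at ... percent" and "result_overflow" come from the posts above, but the exact stderr layout is assumed, and the function names and the `stderr_by_task` mapping are hypothetical.

```python
import re

# Matches lines such as "Restarted at 41.25 percent" (layout assumed).
RESTART_RE = re.compile(r"Restarted at\s+([0-9]+(?:\.[0-9]+)?)\s+percent")


def summarize_stderr(text: str) -> dict:
    """Collect restart points and the overflow flag from one stderr text."""
    restarts = [float(p) for p in RESTART_RE.findall(text)]
    return {"restarts": restarts, "overflow": "result_overflow" in text}


def restarted_tasks(stderr_by_task: dict) -> list:
    """Return the names of tasks whose stderr shows at least one restart.

    stderr_by_task maps a task name to its stderr text; how that mapping
    is gathered from disk or from the web pages is outside this sketch.
    """
    return sorted(
        name for name, text in stderr_by_task.items()
        if summarize_stderr(text)["restarts"]
    )
```

With seven tasks running at once, calling `restarted_tasks` on their collected stderr texts would narrow the search to the ones that actually restarted.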
Raistmer Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
I would say that the "offender" is not the task but the device that processes it. Of course, on a multi-device host it would be good to add which device triggered the task restart. Though this info can already be derived from stderr if needed. SETI apps news We're not gonna fight them. We're gonna transcend them. |
robertmiles Send message Joined: 16 Jan 12 Posts: 213 Credit: 4,117,756 RAC: 6 |
I would say that the "offender" is not the task but the device that processes it. If so, more information is needed about what the device did wrong. I've seen that warning a few times, with no other indication of problems with the device. Should I eliminate the warning by setting SETI@home to No new tasks, since I currently have no other indication that anything is wrong? Currently, there is no indication of which task had this warning, which makes it hard to determine which stderr to inspect. |
Raistmer Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
I would say that the "offender" is not the task but the device that processes it. The device did a wrong computation. That's all the app knows. It's up to the human owner to find the reason. SETI apps news We're not gonna fight them. We're gonna transcend them. |
Raistmer Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
Should I eliminate the warning by setting SETI@home to No new tasks, since I currently have no other indication that anything is wrong? You also had an excess number of invalids. What other indication of a broken setup does one need to start troubleshooting? http://setiathome.berkeley.edu/forum_thread.php?id=79760&postid=1810663 SETI apps news We're not gonna fight them. We're gonna transcend them. |
robertmiles Send message Joined: 16 Jan 12 Posts: 213 Credit: 4,117,756 RAC: 6 |
Should I eliminate the warning by setting SETI@home to No new tasks, since I currently have no other indication that anything is wrong? What excess number of invalids? My task list shows one task marked as invalid, eight days ago, and none before or since. Why should I assume that you are looking at a task list for the correct user, and why should I assume that the application can tell whether a problem is due to a specific graphics board instead of due to a problem with the way the application handles older graphics boards? |
Mike Send message Joined: 17 Feb 01 Posts: 34258 Credit: 79,922,639 RAC: 80 |
Should I eliminate the warning by setting SETI@home to No new tasks, since I currently have no other indication that anything is wrong? I have found 3 invalids on your card. And I am his alpha tester. With each crime and every kindness we birth our future. |
robertmiles Send message Joined: 16 Jan 12 Posts: 213 Credit: 4,117,756 RAC: 6 |
Should I eliminate the warning by setting SETI@home to No new tasks, since I currently have no other indication that anything is wrong? Then why aren't the two others still listed as invalid? Were they with the stock SoG application and therefore not what I'm currently testing? And how can you tell that they're a problem with the card instead of a problem with the way r3500 handles a GTX 560? |
Wiggo Send message Joined: 24 Jan 00 Posts: 34744 Credit: 261,360,520 RAC: 489 |
Then why aren't the two others still listed as invalid? Because once the tasks validate they're deleted 24hrs later. ;-) Cheers. |
Grant (SSSF) Send message Joined: 19 Aug 99 Posts: 13736 Credit: 208,696,464 RAC: 304 |
And how can you tell that they're a problem with the card instead of a problem with the way r3500 handles a GTX 560? If your card is the only one that's having the issue, then the card (or the system it's in) is most likely the cause. Grant Darwin NT |
robertmiles Send message Joined: 16 Jan 12 Posts: 213 Credit: 4,117,756 RAC: 6 |
Then why aren't the two others still listed as invalid? That's not what I am seeing. The only one of my tasks marked invalid has been marked that way for several days. Or are you implying that some of the tasks the validator marks as valid are in fact invalid, but still deleted 24 hours later since they were marked valid? |
Raistmer Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
Try to find other hosts with a GTX 560 running OpenCL NV MB, preferably on beta (because results are kept longer there). How do they behave? Did you use any tuning other than what the ReadMe proposes? What tuning line? SETI apps news We're not gonna fight them. We're gonna transcend them. |
Raistmer Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
When a workunit validates, all of its results (including invalid ones and computation errors) are purged from the BOINC database. Usually this happens 24 h after validation. Sometimes a task can hang around for much longer, but that is an issue with the BOINC backend in Berkeley, not the rule. SETI apps news We're not gonna fight them. We're gonna transcend them. |
robertmiles Send message Joined: 16 Jan 12 Posts: 213 Credit: 4,117,756 RAC: 6 |
And how can you tell that they're a problem with the card instead of a problem with the way r3500 handles a GTX 560? Where have you found tasks run on a different GTX 560 running on another host running Windows Vista so you can pin the cause to something in this host? |