OpenCL SoG builds wrong best Gaussian issue
Author | Message |
---|---|
Raistmer Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
Please provide a few test cases (tasks that experience this issue) for offline debugging. SETI apps news We're not gonna fight them. We're gonna transcend them. |
Jeff Buck Joined: 11 Feb 00 Posts: 1441 Credit: 148,764,870 RAC: 0 |
Try these for starters. I think they fit the profile.

Workunit 2751991216 (23fe07ab.783.7025.8.35.123)
Task 6178334846 (S=0, A=0, P=6, T=1, G=1, BS=23.29306, BG=3.708802) x41p_zi3t2b, Cuda 8.00 special
Task 6178334847 (S=0, A=0, P=6, T=1, G=1, BS=23.29307, BG=3.324614) v8.22 (opencl_ati5_SoG_cat132) x86_64-pc-linux-gnu

Workunit 2752150999 (10fe07ah.22962.20113.5.32.215)
Task 6178672034 (S=0, A=0, P=5, T=0, G=1, BS=23.76112, BG=2.942735) x41p_zi3t2b, Cuda 8.00 special
Task 6179298340 (S=0, A=0, P=5, T=0, G=1, BS=23.76112, BG=3.602647) v8.22 (opencl_nvidia_SoG) windows_intelx86

Workunit 2755940925 (23mr07ab.25461.19704.15.42.211)
Task 6186604686 (S=3, A=0, P=7, T=0, G=0, BS=24.22033, BG=3.577149) x41p_zi3v, Cuda 9.00 special
Task 6186604687 (S=3, A=0, P=7, T=0, G=0, BS=24.22038, BG=-0.123222) v8.22 (opencl_nvidia_SoG) windows_intelx86

Workunit 2756703314 (27mr07ad.26362.169896.15.42.136)
Task 6188197822 (S=1, A=2, P=1, T=3, G=0, BS=24.18174, BG=3.814489) x41p_zi3t2b, Cuda 8.00 special
Task 6188197823 (S=1, A=2, P=1, T=3, G=0, BS=24.18172, BG=3.205334) v8.22 (opencl_nvidia_SoG) windows_intelx86

This one doesn't involve SoG, but still looks like a Best Gaussian issue:

Workunit 2755605345 (09fe07aa.23084.19600.14.41.85)
Task 6185900355 (S=9, A=1, P=0, T=4, G=0, BS=25.14206, BG=7.303794) v8.08 (alt) windows_x86_64
Task 6185900356 (S=9, A=1, P=0, T=4, G=0, BS=25.14206, BG=7.303806) v8.00 (opencl_intel_gpu_sah) x86_64-apple-darwin
Task 6189224114 (S=9, A=1, P=0, T=4, G=0, BS=25.14203, BG=7.837228) x41p_zi3t2b, Cuda 8.00 special

These are all from yesterday, or earlier, so if they validate before you can grab the WUs, let me know and I'll zip and upload them. I haven't looked yet to see if any new ones cropped up overnight. I should be able to get to that a little later this morning. |
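For anyone who wants to screen summaries like the ones above mechanically, here is a minimal Python sketch. It assumes the `Task <id> (S=…, …, BS=…, BG=…)` layout used in this post; the helper names and the 1% relative tolerance are illustrative assumptions, not the project validator's actual rules.

```python
import math
import re

# Matches the per-task summary layout quoted in the post above.
TASK_RE = re.compile(r"Task (\d+) \(.*?BS=(-?[\d.]+), BG=(-?[\d.]+)\)")

def parse_tasks(text):
    """Return (task_id, best_spike, best_gaussian) tuples found in text."""
    return [(int(t), float(bs), float(bg)) for t, bs, bg in TASK_RE.findall(text)]

def disagreements(tasks, field, rel_tol=0.01):
    """List task-id pairs whose BS or BG differ by more than rel_tol.

    rel_tol is an assumed screening threshold, not the validator's.
    """
    idx = {"BS": 1, "BG": 2}[field]
    pairs = []
    for i in range(len(tasks)):
        for j in range(i + 1, len(tasks)):
            if not math.isclose(tasks[i][idx], tasks[j][idx], rel_tol=rel_tol):
                pairs.append((tasks[i][0], tasks[j][0]))
    return pairs

SAMPLE = (
    "Task 6178334846 (S=0, A=0, P=6, T=1, G=1, BS=23.29306, BG=3.708802) "
    "x41p_zi3t2b, Cuda 8.00 special\n"
    "Task 6178334847 (S=0, A=0, P=6, T=1, G=1, BS=23.29307, BG=3.324614) "
    "v8.22 (opencl_ati5_SoG_cat132) x86_64-pc-linux-gnu\n"
)

tasks = parse_tasks(SAMPLE)
print("BS disagreements:", disagreements(tasks, "BS"))
print("BG disagreements:", disagreements(tasks, "BG"))
```

On the first workunit this flags the pair on BG only, which is the pattern being reported: signal counts and Best Spike agree while Best Gaussian does not.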
Richard Haselgrove Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874 |
I've harvested all five data files. When I get a moment, I'll bench-run them against a variety of apps - deliberately including stock CPU, opti CPU, stock CUDA, and SoG for NVidia. The thing which stands out immediately is that there is a difference between SoG and the CUDA specials - but it doesn't say which of the two is at fault. I'll withhold comment until I've done the full tests. |
Jeff Buck Joined: 11 Feb 00 Posts: 1441 Credit: 148,764,870 RAC: 0 |
Here's one I didn't post, because it had already validated by the time I saw Raistmer's request. http://setiathome.berkeley.edu/workunit.php?wuid=2757159047 In this case, the initial Best Gaussian dispute was between v8.22 SoG and v8.08 (alt). The tiebreaker was one of my hosts running the Cuda8 zi3v Special, which agreed with the v8.08 (alt). I think that's almost always been the case, where SoG ends up being the odd app out. |
Jeff Buck Joined: 11 Feb 00 Posts: 1441 Credit: 148,764,870 RAC: 0 |
Looks like there's only one new one in my Inconclusives list this morning:

Workunit 2758934343 (22au08ac.27666.15205.14.41.26)
Task 6192838506 (S=2, A=0, P=0, T=0, G=1, BS=24.36489, BG=4.388716) x41p_zi3v, Cuda 9.00 special
Task 6192838507 (S=2, A=0, P=0, T=0, G=1, BS=24.36489, BG=4.008747) SSE3xj Win32 Build 3584

If you still need more test cases, let me know, and I'll check for new ones daily (usually in the evening here, though). |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874 |
OK, harvested that one too. I think six should be enough to point the finger. Edit - started a cpu-only run to get stock reference results, plus AVX and v8.08 (alt) for comparison. That can run overnight, and I'll do the GPUs tomorrow. |
Raistmer Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
Thanks, got them.
That's for another day and another app :) SETI apps news We're not gonna fight them. We're gonna transcend them. |
TBar Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
Here's a couple; https://setiathome.berkeley.edu/workunit.php?wuid=2758814753 at http://boinc2.ssl.berkeley.edu/sah/download_fanout/c6/22au08ac.30381.13160.13.40.190 https://setiathome.berkeley.edu/workunit.php?wuid=2759293007 at http://boinc2.ssl.berkeley.edu/sah/download_fanout/2d8/22fe07ab.15958.18477.4.31.143 I don't see any others right now, just dozens of Bad WingPeople. Mostly Intel and Old Cuda Apps. It's discouraging to see this many Bad machines in one location. I have to keep reminding myself that most people aren't being sent this many....and wonder Why. |
Jeff Buck Joined: 11 Feb 00 Posts: 1441 Credit: 148,764,870 RAC: 0 |
I noted several more candidates in my Inconclusives list this evening, but I'll just post this one, because the initial disagreement does not involve the Special App. It's actually just a SoG versus non-SoG OpenCL Best Gaussian dispute. The Special App will come into play when one of my hosts runs the tiebreaker, probably overnight. (I've grabbed the WU, just in case.)

Workunit 2758653522 (01mr07ae.27039.6207.10.37.17)
Task 6192256568 (S=0, A=1, P=0, T=4, G=0, BS=23.44804, BG=2.865992) v8.22 (opencl_nvidia_SoG) windows_intelx86
Task 6192256569 (S=0, A=1, P=0, T=4, G=0, BS=23.44798, BG=3.928634) v8.20 (opencl_ati5_mac) x86_64-apple-darwin

EDIT: Corrected my punctuation just in case Professor Jord stops by. ;^) |
TBar Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
Here's an ATI SoG with a different Best gaussian than a CPU & CUDA 9,
https://setiathome.berkeley.edu/workunit.php?wuid=2759313797
http://boinc2.ssl.berkeley.edu/sah/download_fanout/384/22fe07ab.15958.24203.4.31.179
The Tie-breaker was sent to another SoG App, my bet is the two SoG Apps will validate each other.

ATI SoG = Best gaussian: peak=3.498996, mean=0.5477514, ChiSq=1.396472, time=7.55, d_freq=1419244285.04, score=0.8965543, null_hyp=2.296438, chirp=-78.421, fft_len=16k
CUDA 9 = Best gaussian: peak=3.000421, mean=0.4881187, ChiSq=1.301829, time=64.59, d_freq=1419246384.14, score=1.297129, null_hyp=2.262592, chirp=45.543, fft_len=16k
C P U === Best gaussian: peak=3.000417, mean=0.4881188, ChiSq=1.301833, time=64.59, d_freq=1419246384.14, score=1.297125, null_hyp=2.262594, chirp=45.543, fft_len=16k
N V SoG = ???

Will the other SoG App also have a Negative Chirp? What is it with the Negative Numbers anyway? Look at the two results in the previous post. |
TBar Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
Another machine with just One Inconclusive, this one;
https://setiathome.berkeley.edu/workunit.php?wuid=2759814148
http://boinc2.ssl.berkeley.edu/sah/download_fanout/1ba/22fe07ab.27642.18068.13.40.23

Another,
https://setiathome.berkeley.edu/workunit.php?wuid=2759806573
http://boinc2.ssl.berkeley.edu/sah/download_fanout/17d/22fe07ab.12124.7025.15.42.72

Already against a CPU,
https://setiathome.berkeley.edu/workunit.php?wuid=2730052774
http://boinc2.ssl.berkeley.edu/sah/download_fanout/269/20fe07af.23104.1708.4.31.205
https://setiathome.berkeley.edu/workunit.php?wuid=2757602040
http://boinc2.ssl.berkeley.edu/sah/download_fanout/2bf/20fe07ah.14089.4162.13.40.110
https://setiathome.berkeley.edu/workunit.php?wuid=2716444548
http://boinc2.ssl.berkeley.edu/sah/download_fanout/2d0/14ap08aa.21398.377850.10.37.224
https://setiathome.berkeley.edu/workunit.php?wuid=2755684307
http://boinc2.ssl.berkeley.edu/sah/download_fanout/386/09fe07aa.31211.17146.16.43.250 |
Raistmer Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
Not sure I understand the question. Negative and positive chirps are there to account for acceleration and deceleration in the relative motion of sender and receiver. SETI apps news We're not gonna fight them. We're gonna transcend them. |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874 |
Well, my 'triple CPU app' back-to-back test is still running - so far with high Q (lowest so far 99.84%, others up to 99.97%) - but I'm beginning to see some possible clues. One of the apps I'm running is setiathome_8.08_windows_x86_64__alt.exe -verb, which gives a handy readout of how BG gets updated as the run progresses. Here are some examples of the very end of the run, including the 'bests' at the end, from Jeff's samples.

Workunit 2752150999 (10fe07ah.22962.20113.5.32.215)
Best gaussian updated: score=-2.045143, fft_len=16384, PoT=2281, Offset=40, Peak=2.972347, TrueMean=0.5410559,ChiSq=1.354808,null_hyp=2.109536,PoTMaxPower=6.463551,icfft=19951
<snip>
Pulse: peak=7.895826, time=34.21, period=2.958, d_freq=1419599382.31, score=1.01, chirp=-33.396, fft_len=64
Best pulse updated: score=1.052,power=8.223,fftlen=64,freq_bin=7,time_bin=5220,icfft=127118
Pulse: peak=8.223042, time=34.21, period=2.958, d_freq=1419599407.94, score=1.052, chirp=-37.107, fft_len=64
Best gaussian updated: score=1.007423, fft_len=16384, PoT=11475, Offset=46, Peak=2.942749, TrueMean=0.4834971,ChiSq=1.200529,null_hyp=2.192883,PoTMaxPower=5.892717,icfft=151616
Gaussian: peak=3.602648, mean=0.5467374, ChiSq=1.415486, time=84.72, d_freq=1419595265.97, score=0.14674, null_hyp=2.266679, chirp=-63.738, fft_len=16k
Best spike: peak=23.7611, time=90.6, d_freq=1419598455.71, chirp=-15.446, fft_len=64k
Best autocorr: peak=17.04326, time=87.24, delay=3.4986, d_freq=1419600635.6, chirp=11.763, fft_len=128k
Best gaussian: peak=2.942749, mean=0.4834971, ChiSq=1.200529, time=78.01, d_freq=1419601121.25, score=1.007423, null_hyp=2.192883, chirp=56.885, fft_len=16k
Best pulse: peak=8.223042, time=34.21, period=2.958, d_freq=1419599407.94, score=1.052, chirp=-37.107, fft_len=64
Best triplet: peak=0, time=-2.12e+11, period=0, d_freq=0, chirp=0, fft_len=0
-------------------------------------------------------------------------------------------------
Workunit 2758934343 (22au08ac.27666.15205.14.41.26)
Best gaussian updated: score=-2.259846, fft_len=16384, PoT=11009, Offset=48, Peak=3.815549, TrueMean=0.5766529,ChiSq=1.244269,null_hyp=2.025229,PoTMaxPower=8.296173,icfft=116699
Best gaussian updated: score=-1.988101, fft_len=16384, PoT=11009, Offset=50, Peak=3.771862, TrueMean=0.5722938,ChiSq=1.369264,null_hyp=2.114343,PoTMaxPower=8.296173,icfft=116699
Best gaussian updated: score=0.2706466, fft_len=16384, PoT=5590, Offset=20, Peak=4.361972, TrueMean=0.556384,ChiSq=1.238112,null_hyp=2.164129,PoTMaxPower=10.61659,icfft=175496
Best gaussian updated: score=1.117292, fft_len=16384, PoT=5590, Offset=21, Peak=4.388737, TrueMean=0.5493843,ChiSq=1.228467,null_hyp=2.205332,PoTMaxPower=10.61659,icfft=175496
Gaussian: peak=4.008764, mean=0.55906, ChiSq=1.402518, time=37.75, d_freq=1420254014.8, score=0.5862517, null_hyp=2.274849, chirp=-85.39, fft_len=16k
Best pulse updated: score=0.9579,power=1.2981,fftlen=128,freq_bin=98,time_bin=5967,icfft=180294
Best spike: peak=24.36486, time=100.7, d_freq=1420253872.35, chirp=-15.39, fft_len=128k
Best autocorr: peak=16.88297, time=33.55, delay=0.75131, d_freq=1420253976.18, chirp=2.0842, fft_len=128k
Best gaussian: peak=4.388737, mean=0.5493843, ChiSq=1.228467, time=36.07, d_freq=1420254158.06, score=1.117292, null_hyp=2.205332, chirp=-85.39, fft_len=16k
Best pulse: peak=1.298145, time=78.22, period=0.2703, d_freq=1420258637.02, score=0.9579, chirp=89.745, fft_len=128
Best triplet: peak=0, time=-2.121e+11, period=0, d_freq=0, chirp=0, fft_len=0
-------------------------------------------------------------------------------------------------
Workunit 2751991216 (23fe07ab.783.7025.8.35.123)
New best spike:score:-0.23483, power: 23.293, index=54641, fft_len=131072, ifft=0,icfft=104362
Pulse: peak=1.493503, time=75.64, period=0.3186, d_freq=1421201480.52, score=1.026, chirp=-44.334, fft_len=16
Pulse: peak=1.457052, time=75.64, period=0.3186, d_freq=1421201531.93, score=1.001, chirp=-51.723, fft_len=16
Best triplet updated:score=9.829; power=9.829; freq_bin=2.372e-322; icfft=176475
Triplet: peak=9.829372, time=60.85, period=5.095, d_freq=1421203395.36, chirp=76.662, fft_len=64
Pulse: peak=1.582683, time=62.04, period=0.3572, d_freq=1421205752.28, score=1.022, chirp=-79.894, fft_len=256
Best gaussian updated: score=0.5934811, fft_len=16384, PoT=5691, Offset=26, Peak=3.708802, TrueMean=0.5267774,ChiSq=1.296268,null_hyp=2.219926,PoTMaxPower=8.990918,icfft=196100
Gaussian: peak=3.324607, mean=0.529859, ChiSq=1.394297, time=46.14, d_freq=1421200299.6, score=0.5609522, null_hyp=2.275451, chirp=-92.428, fft_len=16k
Best spike: peak=23.29306, time=6.711, d_freq=1421205065.57, chirp=-26.431, fft_len=128k
Best autocorr: peak=17.08286, time=87.24, delay=5.1935, d_freq=1421201665.11, chirp=5.6537, fft_len=128k
Best gaussian: peak=3.708802, mean=0.5267774, ChiSq=1.296268, time=44.46, d_freq=1421200454.67, score=0.5934811, null_hyp=2.219926, chirp=-92.428, fft_len=16k
Best pulse: peak=10.4754, time=20.64, period=3.618, d_freq=1421199913.27, score=1.053, chirp=-24.014, fft_len=64
Best triplet: peak=9.829372, time=60.85, period=5.095, d_freq=1421203395.36, chirp=76.662, fft_len=64
-------------------------------------------------------------------------------------------------

Cyan marks what seems to be the accepted value from my test run. Red marks the stray one reported in Jeff's summary. In each case, the 'stray' value seems to come from an extra, lower-scoring, gaussian found after the last 'official' 'Best' update.

Edit - looking at the original reports on the web (from Jeff's links), in every Cyan case, the 'Best Gaussian' is not reported as a standalone gaussian: in every Red case, the 'Best Gaussian' is reported separately too. Seems to be some confusion over the terms 'best', 'reportable', and perhaps 'score'. |
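The check described above (is there a reported gaussian after the last 'Best gaussian updated' line?) can be sketched in Python. The regular expressions assume the `-verb` stderr layout quoted in this post; the function name `summarize` is made up for illustration and is not part of any SETI@home tool.

```python
import re

# Patterns follow the -verb stderr lines quoted in the post above.
BEST_UPDATE = re.compile(
    r"Best gaussian updated: score=(?P<score>-?[\d.]+).*?Peak=(?P<peak>-?[\d.]+)"
)
REPORTED = re.compile(
    r"^Gaussian: peak=(?P<peak>-?[\d.]+).*?score=(?P<score>-?[\d.]+)"
)

def summarize(lines):
    """Return the last best update and any gaussians reported after it.

    Both results are (peak, score) tuples; the second is a list.
    """
    last_best = None
    reported_after = []
    for raw in lines:
        line = raw.strip()
        m = BEST_UPDATE.search(line)
        if m:
            last_best = (float(m["peak"]), float(m["score"]))
            reported_after = []  # only gaussians after this update matter
            continue
        m = REPORTED.match(line)
        if m:
            reported_after.append((float(m["peak"]), float(m["score"])))
    return last_best, reported_after

SAMPLE = [
    "Best gaussian updated: score=1.007423, fft_len=16384, PoT=11475, Offset=46, "
    "Peak=2.942749, TrueMean=0.4834971,ChiSq=1.200529,null_hyp=2.192883,"
    "PoTMaxPower=5.892717,icfft=151616",
    "Gaussian: peak=3.602648, mean=0.5467374, ChiSq=1.415486, time=84.72, "
    "d_freq=1419595265.97, score=0.14674, null_hyp=2.266679, chirp=-63.738, fft_len=16k",
    "Best gaussian: peak=2.942749, mean=0.4834971, ChiSq=1.200529, time=78.01, "
    "d_freq=1419601121.25, score=1.007423, null_hyp=2.192883, chirp=56.885, fft_len=16k",
]

best, late = summarize(SAMPLE)
print("last best (peak, score):", best)
print("reported afterwards:", late)
```

For workunit 2752150999 this returns the 2.94 peak as the last best update and the 3.60 peak as a lower-scoring gaussian reported afterwards, matching the two BG values in Jeff's summary.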
Raistmer Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
Thanks for the analysis! It almost points to the lines of code that need checking, very handy. Yes, it's quite weird when a "best" non-reportable is better than a "reportable", but it seems CPU apps prefer it this way. It would be good to check against the other tasks in this thread whether this is the only situation in which the bug appears. //Actually, after the first reportable is found this weirdness goes away, because only new reportables are checked for "bestness". This change is accepted by stock too now. So the situation where a reportable is worse than "best" can happen only before the first "best" signal update from the reportable signals pool. Theoretically there can be any number of such reportable but "not-good-enough-to-become-best" signals, but the probability of such a situation is extremely low. Also, I'm not sure it's correct at all to keep best un-updated when the first reportable is found. But it should be possible (if non-SoG already processes such a case OK) to bring SoG in line and handle the weirdness in a common way. We should hardly update the whole CPU/GPU apps pool just to remove this inconsistency. It would be good to make the needed changes in the base code (and not forget them) when other reasons require an application version update as well. SETI apps news We're not gonna fight them. We're gonna transcend them. |
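To make the disagreement concrete, here is a toy Python model of two selection rules. It is only an illustration of the behaviour inferred in this thread, not code from any SETI@home app: `Candidate`, `best_by_score` and `best_reportable_wins` are hypothetical names, and the second rule is a guess at what would produce the stray value.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    peak: float
    score: float
    reportable: bool

def best_by_score(candidates):
    """Rule the CPU apps appear to follow: highest score wins, reportable or not."""
    best = None
    for c in candidates:
        if best is None or c.score > best.score:
            best = c
    return best

def best_reportable_wins(candidates):
    """Assumed stray rule: a reportable gaussian replaces best regardless of score."""
    best = None
    for c in candidates:
        if best is None or c.reportable or c.score > best.score:
            best = c
    return best

# Gaussian candidates for workunit 2752150999, in the order shown in the -verb log.
RUN = [
    Candidate(peak=2.972347, score=-2.045143, reportable=False),
    Candidate(peak=2.942749, score=1.007423, reportable=False),
    Candidate(peak=3.602648, score=0.14674, reportable=True),
]

print("score rule      ->", best_by_score(RUN).peak)
print("reportable rule ->", best_reportable_wins(RUN).peak)
```

Under these assumptions the score rule yields the 2.94 peak and the reportable-wins rule yields the 3.60 peak, the same split seen between the two tasks of that workunit.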
Jeff Buck Joined: 11 Feb 00 Posts: 1441 Credit: 148,764,870 RAC: 0 |
Jason had a brief exchange with Eric back in June, after we had been discussing this issue in earlier posts. Nothing ever really got settled though, and then Jason seemed to go AWOL. In the meantime, multiple response tweets from Eric (lmao). Analysing & compiling the info. |
Raistmer Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
If the intent is "always to have best gaussian" then yes... and no. Re-selecting best FROM already found gaussians would also meet this intent, but would give more consistent output from tasks WITH reportable gaussians. SETI apps news We're not gonna fight them. We're gonna transcend them. |
Raistmer Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
If someone has an offline result (a correct one) for 10fe07ah.22962.20113.5.32.215, please upload it to speed up the correctness check. SETI apps news We're not gonna fight them. We're gonna transcend them. |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874 |
But which one is correct? :P |
Raistmer Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
But which one is correct? :P In the current situation, the one that matches CPU apps. As I said, there will hardly be any changes in the basic code before the next major version update. In a few minutes I'll have the stderr of the changed build .... And got it. 2.94 peak for best and 3.xx for reported. Seems as it should be. A more formal comparison will follow, but it seems this bug is fixed. SETI apps news We're not gonna fight them. We're gonna transcend them. |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874 |
Good work. I'd suggest it would be wise to run the whole scenario past Eric again, reminding him of those tweets, before coming to a final release decision. (Eric is prepping up for Parkes data - expect an announcement on Wednesday - and keeps talking about a SETI v10) My CPU test run will have finished now, so I'll set up for GPU - NVidia in my case (though I'll toss the intel_gpu into the mix as well - test machine has both). Um, on second thoughts, iGPU may have to wait until a later run - I've forgotten how to manage the device assignment! :-( If you have a preview copy of the fix, I can toss that in at the same time - you have my email. |