Message boards :
Number crunching :
Linux CUDA 'Special' App finally available, featuring Low CPU use
TBar · Joined: 22 May 99 · Posts: 5204 · Credit: 840,779,836 · RAC: 6,279

Looking at some results, it appears the AMD CPUs still don't like the CPU App. It seems better than the ssse3 version, but it's still giving errors. I'm going to add a folder to the download with an App compiled just for AMD CPUs and see if that works better. If you have an AMD CPU, just move the CPU App from the folder to the root level, replacing the existing CPU App. Since it's flagged for AMD it won't work on my Intels, so you're on your own. The current SSE41 App works fine on my Intel CPUs.

The only difference with the zi3v GPU App is that it uses the gaussfit & autocorr from zi3x, so it is a little different from just zi3v compiled with CUDA 9 and might show some slight differences, hopefully for the better. It seems to work better on my machines anyway. The new file has been uploaded; if you have an AMD CPU, download Linux_zi3v-CUDA90_Special.7z again.
Jeff Buck · Joined: 11 Feb 00 · Posts: 1441 · Credit: 148,764,870 · RAC: 0

So, compiling zi3v as Cuda 9 gives a more accurate result than the Cuda 8 version. That's good to know. It might be good to know why that happens, too, since I assume the underlying code is still the same, just linked against different libraries.

I guess I'll take the risk of another tirade, but I think a discrepancy between the Cuda 9 zi3v and the Cuda 8 zi3v (on a non-overflow WU) is worth looking at.
Stephen "Heretic" · Joined: 20 Sep 12 · Posts: 5384 · Credit: 192,787,363 · RAC: 1,426

> Well, I guess it's time to post the CUDA 9 version. It does seem to produce better results.

. . Hi TBar,
. . Are there any further hardware restrictions with the Cuda90 version? Is it OK with Maxwell as well as Pascal cards?

Stephen ??
TBar · Joined: 22 May 99 · Posts: 5204 · Credit: 840,779,836 · RAC: 6,279

> Well, I guess it's time to post the CUDA 9 version. It does seem to produce better results.

> I guess I'll take the risk of another tirade, but I think a discrepancy between the Cuda 9 zi3v and the Cuda 8 zi3v (on a non-overflow WU) is worth looking at.

Do everyone a favor and update to the New and Improved version of x41p_zi3v; it's in the same location: Linux_zi3v-CUDA90_Special App ...for fewer Inconclusive Results.

BTW, the Default CPU App is now SSE41 with FFTW 3.3.7. On my machine it's just a couple percent better than the old ssse3 App, YMMV.
TBar · Joined: 22 May 99 · Posts: 5204 · Credit: 840,779,836 · RAC: 6,279

> I got another one, https://setiathome.berkeley.edu/workunit.php?wuid=2752595674 It looked the same, I decided to run it.

Look at it now, https://setiathome.berkeley.edu/workunit.php?wuid=2752595674 The tie-breaker was sent to another SoG App, and Yes, they validated each other. So... my Correct result, which matched a CPU, was overruled by two SoG Apps known to produce Bad Best Gaussians. Who'd a thunk it? Well, anyone not in severe denial would have seen that coming. I wonder how many times that has happened in the past year or so...
Raistmer · Joined: 16 Jun 01 · Posts: 6242 · Credit: 106,370,077 · RAC: 275

> I guess I'll take the risk of another tirade, but I think a discrepancy between the Cuda 9 zi3v and the Cuda 8 zi3v (on a non-overflow WU) is worth looking at.

The CPU gave:
Spike count: 0
Autocorr count: 1
Pulse count: 11
Triplet count: 8
Gaussian count: 0

So it seems the Cuda 8 build is missing the 11th pulse rather than the Cuda 9 build reporting an extra one. In the CPU result it is:
Pulse: peak=1.497699, time=67.72, period=0.3382, d_freq=1418807571.33, score=1.009, chirp=75.027, fft_len=8

SETI apps news
We're not gonna fight them. We're gonna transcend them.
Jeff Buck · Joined: 11 Feb 00 · Posts: 1441 · Credit: 148,764,870 · RAC: 0

> I guess I'll take the risk of another tirade, but I think a discrepancy between the Cuda 9 zi3v and the Cuda 8 zi3v (on a non-overflow WU) is worth looking at.

Workunit 2752775513 (28mr07am.15114.4162.8.35.134)
Task 6179979325 (S=0, A=1, P=11, T=8, G=0, BS=23.41541, BG=3.899168) x41p_zi3v, Cuda 9.00 special
Task 6179979326 (S=0, A=1, P=10, T=8, G=0, BS=23.41541, BG=3.89917) x41p_zi3v, Cuda 8.00 special

The extra signal reported by the Cuda 9 version is:
Pulse: peak=1.497699, time=67.72, period=0.3382, d_freq=1418807571.33, score=1.009, chirp=75.027, fft_len=8
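Jeff's per-class signal counts can be diffed mechanically to isolate which class accounts for a mismatch. A minimal sketch (the dict layout and helper name are mine, not anything the project provides):

```python
def count_mismatches(a: dict, b: dict) -> dict:
    """Compare per-class signal counts from two tasks and return
    {class: (count_a, count_b)} for every class that differs."""
    return {k: (a[k], b[k]) for k in a if a[k] != b[k]}

# Counts quoted above: Spike, Autocorr, Pulse, Triplet, Gaussian
cuda9 = {"S": 0, "A": 1, "P": 11, "T": 8, "G": 0}
cuda8 = {"S": 0, "A": 1, "P": 10, "T": 8, "G": 0}

count_mismatches(cuda9, cuda8)  # → {'P': (11, 10)}, the one-pulse discrepancy
```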
Raistmer · Joined: 16 Jun 01 · Posts: 6242 · Credit: 106,370,077 · RAC: 275

Exactly. So there is no sense in posting a 99.7% figure as a response to the missing-spike or wrong-best issues. Those issues are not about processing accuracy. They are still issues, though; the app's base accuracy is well within the expected range.

Hm. Accusations? Perhaps we read this thread differently.

Well, AFAIK Nebula uses only signals, not best signals (so far). That could change in the future, of course.

EDIT: I personally was very concerned about the wrong Pulse selection in early builds. And because there is no guarantee that the current wrong best-pulse selection and the old wrong pulse selection stem from different bugs, I feel strongly that this issue should be resolved too before any further movement. The NaNs in spikes are such obvious flaws in the resume logic that there is really nothing to add here.

SETI apps news
We're not gonna fight them. We're gonna transcend them.
TBar · Joined: 22 May 99 · Posts: 5204 · Credit: 840,779,836 · RAC: 6,279

> Your reaction is understandable if you perceive any found bug in Petri's app as personal offence. But why so?

When people start posting such carp as "will these problems cause us to miss ET", it's time for a little reality check. Yes, Petri's App occasionally misses the Best Pulse. However, other Apps have similar problems with missing Best Signals. Other Apps have horrendous Inconclusive rates, and I don't see people asking about those other problems. There is absolutely Nothing wrong with Petri's App that isn't shared by other Apps. In fact, since Petri's App is quite a bit faster, it might actually Help us find ET. Isn't the Project requesting More Computing power? Well, increasing the speed of the current hardware is in fact the same as adding more users. I keep hearing accusations that the App isn't "accurate" when actually it is Very accurate. Here's the Bench of the task with 21 signals:
---------------------------------------------------
Running app with command : setiathome_x41p_zi3v_x86_64-apple-darwin_cuda90 -device 0
207.60 real 46.56 user 37.22 sys
Elapsed Time : ……………………………… 208 seconds
Speed compared to default : 3705 %
-----------------
Comparing results
Result : Strongly similar, Q= 99.95%
---------------------------------------------------
Done with 22fe07aj.17322.13569.13.40.218.wu

The App was 99.95% similar to the CPU on 21 signals. It doesn't get much more accurate. The problem with selecting the correct best pulse has nothing to do with "accuracy"; it's a simple matter of programming to select the correct result from a number of candidates. Now that people understand that the most widely used Apps at SETI also have occasional problems with Bad Best Signals, maybe the wild accusations made recently in this thread will stop? We can only Hope. This is merely a reality check. If someone is really concerned about "missing ET" I suggest you also avoid the SoG Apps, as they have a similar problem to Petri's App... and definitely avoid the Intel Apps.
Raistmer · Joined: 16 Jun 01 · Posts: 6242 · Credit: 106,370,077 · RAC: 275

> just found Two examples of the Bad Best Results....and I wasn't even trying.

I just hope you kept the tasks, because after my previous netbook's death I'm not sure I still have the previously collected test cases. The last benches it did were in the summer.

Your reaction is understandable if you perceive any found bug in Petri's app as a personal offence. But why so?

And regarding the Best Gaussian SoG app issue (it was already stated in this very thread, but if we are going in circles here...):
- yes, the issue exists
- yes, I already confirmed it on an ATi host (and there it was specific to the SoG path; non-SoG OpenCL doesn't have it)
- no, it is still not fixed
- yes, it will be fixed when I have a working dev environment at my disposal again. Currently that's not the case. I could add that someone else might fix the issue (it's open source, freely available: https://setisvn.ssl.berkeley.edu/svn/branches/sah_v7_opt), but the chances are about the same as someone other than Petri fixing his app (approaching zero).
- yes, it's sad that this issue was found only after deployment on main. It just shows another flaw of beta-site testing. From this point of view one should be happy that the somewhat similar issue in Petri's CUDA app was found at a much earlier stage. But, as they say, two wrongs don't make a right, yeah?

SETI apps news
We're not gonna fight them. We're gonna transcend them.
Grant (SSSF) · Joined: 19 Aug 99 · Posts: 12990 · Credit: 208,696,464 · RAC: 690

> So you see, it's not just the CUDA App that sometimes produces a Bad Best Result, the other Apps do as well. But, if you keep harping on the One App,

I haven't been harping on it. My initial response was to a post that claimed it wasn't an issue; the fact is it is an issue, and one that needs to be addressed. Since then I have just been responding to (mostly) your posts.

Grant
Darwin NT
Grant (SSSF) · Joined: 19 Aug 99 · Posts: 12990 · Credit: 208,696,464 · RAC: 690

> > The fact that your host processes more work per hour increases the likelihood of it picking up bad wingmen...
> It sounds as though you just admitted your method of calculating Inconclusive rates is basically Flawed.

Please don't cherry-pick; read everything, and read it in context. If you had actually read everything that I posted, you would have seen that the statement you quoted continued as follows:

> The fact that your host processes more work per hour also increases the likelihood of it picking up good wingmen. Each negates the other.

> BTW, I think you will find a 99.7% Accuracy rate is well within the guidelines, and very high when compared against other GPU Apps.

Never said it wasn't. But for all of its accuracy in the benchmarks, that hasn't translated to here on main, where, if it were as accurate as it's meant to be, it wouldn't have such a (relatively) high percentage of inconclusives.

Grant
Darwin NT
Jeff Buck · Joined: 11 Feb 00 · Posts: 1441 · Credit: 148,764,870 · RAC: 0

Seeing as how this thread has now apparently been expanded to encompass topics beyond the Linux Special App, I might as well just make my entire Inconclusive list available to all. Crowdsourcing it, I guess. That way, those who wish can pick out and report on any interesting Inconclusive that catches their eye, rather than the narrow focus I was trying to maintain. Special App of course (though no Cuda 9 this evening), stock Cuda, SoG, Intel GPU, Mac, runaway hosts, overflows, non-overflows, etc. There should be something interesting for almost everybody among this evening's 214 WUs (except Astropulse, which I did weed out).

Download today's list here: https://www.dropbox.com/s/l1rp2kr5lv9u9vc/Inconclusives_20171120.7z?dl=0

The archive includes the list itself, in HTML format, as well as a .bbc file which contains the same info but with BBCode tags instead of HTML tags, to allow easy copy-and-paste into a forum post.
TBar · Joined: 22 May 99 · Posts: 5204 · Credit: 840,779,836 · RAC: 6,279

> I got another one, https://setiathome.berkeley.edu/workunit.php?wuid=2752595674 It looked the same, I decided to run it.

CUDA Special 9.0:
Best spike: peak=25.70546, time=85.56, d_freq=1420565431.23, chirp=90.947, fft_len=32k
Best autocorr: peak=18.00695, time=100.7, delay=4.8311, d_freq=1420567715.68, chirp=13.008, fft_len=128k
Best gaussian: peak=3.818873, mean=0.5320869, ChiSq=1.266398, time=26, d_freq=1420563643.87, score=0.9975023, null_hyp=2.222546, chirp=-36.341, fft_len=16k
Best pulse: peak=5.220406, time=46.27, period=1.73, d_freq=1420565622.82, score=1.009, chirp=-68.462, fft_len=512
Best triplet: peak=0, time=-2.12e+11, period=0, d_freq=0, chirp=0, fft_len=0

Windows SoG 3584:
Best spike: peak=25.70546, time=85.56, d_freq=1420565431.23, chirp=90.947, fft_len=32k
Best autocorr: peak=18.00693, time=100.7, delay=4.8311, d_freq=1420567715.68, chirp=13.008, fft_len=128k
Best gaussian: peak=3.664716, mean=0.5392017, ChiSq=1.408692, time=27.68, d_freq=1420563582.9, score=0.614264, null_hyp=2.284244, chirp=-36.341, fft_len=16k
Best pulse: peak=5.22041, time=46.27, period=1.73, d_freq=1420565622.82, score=1.009, chirp=-68.462, fft_len=512
Best triplet: peak=0, time=-2.12e+011, period=0, d_freq=0, chirp=0, fft_len=0

CPU 3711:
Best spike: peak=25.70545, time=85.56, d_freq=1420565431.23, chirp=90.947, fft_len=32k
Best autocorr: peak=18.00695, time=100.7, delay=4.8311, d_freq=1420567715.68, chirp=13.008, fft_len=128k
Best gaussian: peak=3.818873, mean=0.5320868, ChiSq=1.2664, time=26, d_freq=1420563643.87, score=0.9975224, null_hyp=2.222548, chirp=-36.341, fft_len=16k
Best pulse: peak=5.220407, time=46.27, period=1.73, d_freq=1420565622.82, score=1.009, chirp=-68.462, fft_len=512
Best triplet: peak=0, time=-2.12e+11, period=0, d_freq=0, chirp=0, fft_len=0

This looks to be a pretty common problem.

This Host had ZERO Inconclusive results before, https://setiathome.berkeley.edu/results.php?hostid=8282042 I'll bet he has more Bad Gaussians; they are just not showing up because the other SoG Hosts are validating with the same Bad Gaussians. He obviously isn't being introduced to Mister Bad Wingman either... I wonder Why?

I just found Two examples of the Bad Best Results.... and I wasn't even trying.
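The side-by-side listings above amount to checking which "Best" peaks agree to within rounding noise and which diverge outright. A rough sketch of that comparison (the tolerance and data layout are my assumptions, not the validator's actual logic):

```python
import math

def diverging_fields(a: dict, b: dict, rel_tol: float = 1e-4) -> list:
    """Return the names of best-signal peaks that differ between two
    results by more than rounding noise."""
    return [name for name in a
            if not math.isclose(a[name], b[name], rel_tol=rel_tol)]

# Peak values quoted in the post above:
cuda_special = {"spike": 25.70546, "autocorr": 18.00695,
                "gaussian": 3.818873, "pulse": 5.220406}
windows_sog = {"spike": 25.70546, "autocorr": 18.00693,
               "gaussian": 3.664716, "pulse": 5.22041}

diverging_fields(cuda_special, windows_sog)  # → ['gaussian']
```

Only the best gaussian stands out; the other peaks agree to roughly six significant digits.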
-= Vyper =- · Joined: 5 Sep 99 · Posts: 1630 · Credit: 1,065,191,981 · RAC: 5,753

You can also check my host here to add it to the mix. I don't really see the point for anybody here to be upset over a high number of inconclusives. As long as there are a lot of other applications that the particular host gets matched against, then the faster the host, the higher its inconclusive rate will be until the others catch up and the workunit is either declared invalid, errored out, or validated. Of course we need to monitor the values, etc., but as stated earlier the main story of the inconclusive rate is still the out-of-order sorting of the Linux Cuda app (if I haven't accidentally missed somewhere that it has been fixed) and how fast the particular host is.

For instance, TBar's host has a lot of 1050/1060s, and those are faster than, in my case, 750 Tis: https://setiathome.berkeley.edu/results.php?hostid=8053171

My inconclusive rate is 3.5%, TBar's is 4.7%, and Petri's host has for the moment 4.6% inconclusives. TBar is running x41p_zi3v, I am running x41p_zi3x-32, and Petri is running x41p_zi3xs3. So we can't really compare apples and oranges there, but there seems to be an indication that Petri's app "seems" to be even more on the spot and conclusive for the moment than TBar's. And as I said, the reason I believe my inconclusive rate is lower is that my GPUs are slower, so they don't return so much work in so little time, and thus don't build up as large a pending queue by comparison.

TBar's pending/inconclusive ratio is 26.9, mine is 19.7, and Petri's is 22.1. I think there is a correlation between the pending/inconclusive ratio, when compared with the same application, and the inconclusive percentage. I suspect that if the host is faster, the P/I ratio should rise, and thus the inconclusive percentage likewise, when compared! For example, if my host had even slower GPUs, the P/I ratio should go down somewhat further, to perhaps 17, and I believe the inconclusive percentage would drop, perhaps to 3.35%. So in my mind we really can't compare the inconclusive percentages to judge which app is better, at least not to the extent this discussion is leaning towards.

What we all should be concerned with is making sure the app matches the original stock CPU application as closely as possible, and keeping the invalid/error rates down; those are what really show that a particular application run is not on par with the rest. Period! Inconclusives are just a hint/guideline, but I believe that host speed must be taken into consideration as well, rather than only looking at the inconclusive percentage. Insert a lot of unnecessary wait states to slow down the application, and the median inconclusive value most certainly would decrease as well.

Most importantly now: is my thinking and assumption wrong here regarding adding host speed to the mix? Give me your thoughts, and please keep this thread less hostile than it has turned out to be, lads.

Addicted to SETI crunching!
Founder of GPU Users Group
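Vyper's suspected correlation can be eyeballed with a few lines over the figures quoted in the post (only the numbers come from the post; the structure and names are mine, and three hosts are far too few points to prove anything):

```python
# (P/I ratio, inconclusive %) per host, as quoted above
hosts = {
    "TBar (x41p_zi3v)": (26.9, 4.7),
    "Vyper (x41p_zi3x-32)": (19.7, 3.5),
    "Petri (x41p_zi3xs3)": (22.1, 4.6),
}

# Rank hosts by pending/inconclusive ratio...
by_ratio = sorted(hosts, key=lambda h: hosts[h][0])

# ...and the inconclusive percentages come out in the same order,
# which is the correlation Vyper suspects.
pcts = [hosts[h][1] for h in by_ratio]
```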
TBar · Joined: 22 May 99 · Posts: 5204 · Credit: 840,779,836 · RAC: 6,279

So, I grabbed one of the remaining 2 tasks on that page of inconclusives, mainly because it was a difference in Best Gaussians with a SoG task and I pretty much knew how it would turn out. Been there, got the T-shirt. This one, https://setiathome.berkeley.edu/workunit.php?wuid=2751780910

CUDA Special 9.0:
Best spike: peak=27.62942, time=73.82, d_freq=1419628836.76, chirp=-11.973, fft_len=128k
Best autocorr: peak=18.17568, time=100.7, delay=1.3053, d_freq=1419627102.04, chirp=-17.923, fft_len=128k
Best gaussian: peak=3.550708, mean=0.5256593, ChiSq=1.295515, time=34.39, d_freq=1419629788.24, score=1.28786, null_hyp=2.256684, chirp=-15.498, fft_len=16k
Best pulse: peak=2.482953, time=48.47, period=0.7122, d_freq=1419631092.84, score=1.009, chirp=47.473, fft_len=512
Best triplet: peak=0, time=-2.12e+11, period=0, d_freq=0, chirp=0, fft_len=0

Windows SoG 3584:
Best spike: peak=27.62945, time=73.82, d_freq=1419628836.76, chirp=-11.973, fft_len=128k
Best autocorr: peak=18.17569, time=100.7, delay=1.3053, d_freq=1419627102.04, chirp=-17.923, fft_len=128k
Best gaussian: peak=3.397433, mean=0.5405703, ChiSq=1.389597, time=37.75, d_freq=1419629736.23, score=0.7114021, null_hyp=2.280532, chirp=-15.498, fft_len=16k
Best pulse: peak=2.482952, time=48.47, period=0.7122, d_freq=1419631092.84, score=1.009, chirp=47.473, fft_len=512
Best triplet: peak=0, time=-2.12e+011, period=0, d_freq=0, chirp=0, fft_len=0

CPU 3711:
Best spike: peak=27.62945, time=73.82, d_freq=1419628836.76, chirp=-11.973, fft_len=128k
Best autocorr: peak=18.17567, time=100.7, delay=1.3053, d_freq=1419627102.04, chirp=-17.923, fft_len=128k
Best gaussian: peak=3.550708, mean=0.5256588, ChiSq=1.29552, time=34.39, d_freq=1419629788.24, score=1.287918, null_hyp=2.25669, chirp=-15.498, fft_len=16k
Best pulse: peak=2.482954, time=48.47, period=0.7122, d_freq=1419631092.84, score=1.009, chirp=47.473, fft_len=512
Best triplet: peak=0, time=-2.12e+11, period=0, d_freq=0, chirp=0, fft_len=0

How did I know? Because I've tested quite a few of them and know the SoG Apps are at times producing Bad Best Gaussians. So you see, it's not just the CUDA App that sometimes produces a Bad Best Result; the other Apps do as well. But if you keep harping on the One App, people will think it's the only App that has that problem and ask Stupid Sh..tuff like, "Gee, do you think these problems will cause us to miss ET?" No more than the problems with the other Apps will... is the correct answer. Anyone who thinks the SoG Apps aren't cross-validating with Bad Best Gaussians is in severe denial. There are many, Many more SoG tasks validating each other than there are CUDA tasks. This affects All SoG Apps: nVidia, ATI, Windows, and Linux. There are currently No Mac SoG Apps.
Stephen "Heretic" · Joined: 20 Sep 12 · Posts: 5384 · Credit: 192,787,363 · RAC: 1,426

> Wrong! The inconclusive numbers you are looking at are a false impression created by the delay in 3rd wingmen clearing tasks, and represent inconclusives not from one day but from many. If you look at only the inconclusives from any one day's returns there are only a few...

. . OK, I'll stop you there. The rate of inconclusives is measured as a subset of tasks that have been through validation, not of pending tasks which have yet to be checked at all.
. . The right formula is 'tasks judged inconclusive within a given time period' multiplied by 100 and divided by 'the total number of tasks checked in that same time period'. In other words, the percentage of tasks within a set period of time which do not successfully validate, out of the total number of tasks checked for validation.

Stephen ..
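Stephen's formula, written out as a one-function sketch (the function is purely illustrative; nothing like it exists in the project code as far as I know):

```python
def inconclusive_rate(inconclusive: int, total_checked: int) -> float:
    """Stephen's formula: tasks judged inconclusive in a period,
    times 100, divided by all tasks checked in that same period.
    Pending tasks are excluded because they haven't been checked yet."""
    if total_checked <= 0:
        raise ValueError("no tasks were checked in this period")
    return 100.0 * inconclusive / total_checked

inconclusive_rate(47, 1000)  # → 4.7 (percent)
```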
Stephen "Heretic" · Joined: 20 Sep 12 · Posts: 5384 · Credit: 192,787,363 · RAC: 1,426

> Here Grant, tell me how many Real Inconclusives there are on this page, https://setiathome.berkeley.edu/results.php?hostid=6796479&state=3 These are some of the Obvious Bad Wingpeople on that One page;

. . Have you sent X-File a friendly message?

Stephen ??
TBar · Joined: 22 May 99 · Posts: 5204 · Credit: 840,779,836 · RAC: 6,279

> The fact that your host processes more work per hour increases the likelihood of it picking up bad wingmen...

It sounds as though you just admitted your method of calculating Inconclusive rates is basically Flawed. I would go further and suggest the Server actually plays favorites. The numbers some receive just can't be attributed to random chance; I've seen some Hosts that seem to never be sent Bad Wingpeople while others are sent constant streams. The only way to equalize the calculation is to remove them, and the Aborted tasks. That leaves you with Net Inconclusives, which is a much more accurate method. BTW, I think you will find a 99.7% Accuracy rate is well within the guidelines, and very high when compared against other GPU Apps.
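TBar's "Net Inconclusives" idea boils down to filtering the raw inconclusive list before computing the rate. A sketch under assumed task-state labels (the field names are invented for illustration; nothing in the scheduler actually exposes a `bad_wingman` flag):

```python
def net_inconclusive_rate(tasks: list) -> float:
    """Rate of inconclusives after dropping aborted tasks entirely and
    discounting inconclusives caused by known-bad wingmen."""
    checked = [t for t in tasks if t["state"] != "aborted"]
    net = [t for t in checked
           if t["state"] == "inconclusive" and not t.get("bad_wingman")]
    return 100.0 * len(net) / len(checked) if checked else 0.0

tasks = ([{"state": "valid"}] * 8
         + [{"state": "inconclusive", "bad_wingman": True},
            {"state": "inconclusive"},
            {"state": "aborted"}])
net_inconclusive_rate(tasks)  # → 10.0: 1 net inconclusive out of 10 checked
```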
TBar · Joined: 22 May 99 · Posts: 5204 · Credit: 840,779,836 · RAC: 6,279

Your case is only valid if the Apps were only matched against each other. They aren't. Most of the tasks are validated against other Apps, not the Special App. In fact, I receive a number of inconclusives against older versions of the Special App. I ran a few tasks at Beta where No One Else was running the Special App. There were only a couple of inconclusives against the ever-present Intel App, https://setiweb.ssl.berkeley.edu/beta/results.php?hostid=63959 The nice thing about Beta is you don't receive very many Obviously Bad Wingpeople; the Intel App seems to be unavoidable. If they ever get something besides BLC4s at Beta I might run a few more.

BTW, there is yet another Intel iGPU attempt at C.A. This one doesn't have JSPF, which makes it different from the one currently at Beta, https://setiweb.ssl.berkeley.edu/beta/result.php?resultid=29132652 Whether it will be any better is anyone's guess.
©2020 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.