Linux CUDA 'Special' App finally available, featuring Low CPU use

Jeff Buck (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 11 Feb 00
Posts: 1441
Credit: 148,764,870
RAC: 0
United States
Message 1874208 - Posted: 21 Jun 2017, 4:33:03 UTC - in response to Message 1874199.  

Here's how it plays out on my Mac, I'll run it in Linux shortly. Of course, it needs to be run a couple of times to determine if it's consistent.
...
...
CUDA Wins 4 to 2
So...that's twice as good.
LOL...or...you could look at it as "CUDA gets it right 2/3 of the time". ;^)

Seriously, though, do you see any sort of pattern in those WUs, as to what characteristics favor the CUDA8.0 result vs. the SoG result? I know it's a small sample, but they look kind of random to me.
ID: 1874208
jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1874209 - Posted: 21 Jun 2017, 4:33:15 UTC - in response to Message 1874206.  

Oh yeah, no illusions there. Have prepared for probable extended downtime by stocking up on the mobile data. Switching over to a completely new infrastructure will have teething problems.
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1874209
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13731
Credit: 208,696,464
RAC: 304
Australia
Message 1874210 - Posted: 21 Jun 2017, 4:39:45 UTC - in response to Message 1874207.  

a couple of hours later I had 40/100Mbs NBN.
Sigh, and the best I can get is 620k/5Mb :(
And I am literally sitting about 600 feet away from a split in the TransCanada Fibre!

You're still doing better than my parents.
Approx 3km from the exchange on ADSL2+ it's not unusual for their download sync rate to be only 300kb/s. If they're lucky they can get 3-4Mb/s. When I had ADSL2+, at almost 4km from the exchange I was able to sync at 17Mb/s.
Grant
Darwin NT
ID: 1874210
Jeff Buck (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 11 Feb 00
Posts: 1441
Credit: 148,764,870
RAC: 0
United States
Message 1874211 - Posted: 21 Jun 2017, 4:52:08 UTC - in response to Message 1874207.  

a couple of hours later I had 40/100Mbs NBN.
Sigh, and the best I can get is 620k/5Mb :(
Just ran a speed test on my line and got:
Download Speed: 3.77 Mbps (471.3 KB/sec transfer rate)
Upload Speed: 0.5 Mbps (62.5 KB/sec transfer rate)
Latency: 37 ms

Considering that my nominal D/L speed is only supposed to be 3.0 Mbps, anything over about 2.4 is turbocharged for me!
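
The two figures on each line of that speed test are the same rate in different units. A minimal Python sketch of the conversion, assuming the test uses decimal prefixes (1 Mbps = 1,000,000 bits per second, 1 KB = 1,000 bytes); the function name is only for illustration:

```python
def mbps_to_kbytes_per_sec(mbps: float) -> float:
    """Convert megabits per second to kilobytes per second (decimal units)."""
    bits_per_sec = mbps * 1_000_000
    bytes_per_sec = bits_per_sec / 8  # 8 bits per byte
    return bytes_per_sec / 1_000


for label, mbps in (("Download", 3.77), ("Upload", 0.5)):
    print(f"{label}: {mbps} Mbps is about {mbps_to_kbytes_per_sec(mbps):.2f} KB/sec")
```

Under that assumption the download works out to roughly 471 KB/sec and the upload to 62.5 KB/sec, in line with what the speed test reported.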
ID: 1874211
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13731
Credit: 208,696,464
RAC: 304
Australia
Message 1874213 - Posted: 21 Jun 2017, 5:15:48 UTC - in response to Message 1874211.  
Last modified: 21 Jun 2017, 5:23:36 UTC

Latency: 37 ms

Pretty high level of latency there. Way better than satellite, but still rather high.
Is that a wired/fibre connection, or a wireless connection?

EDIT-
My ISP's speed test page appears to be working again.
They've dropped the latency test, now it's just a basic Ping check.

Download 89.5Mb/s
Upload 32.4Mb/s
Ping 4ms
Dropped off a bit from what it was, but still pretty good.

Server for this test is 2,653km away (as the crow flies; 4,400km by road, or 4,300km if you take the shorter route).
Grant
Darwin NT
ID: 1874213
Jeff Buck (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 11 Feb 00
Posts: 1441
Credit: 148,764,870
RAC: 0
United States
Message 1874215 - Posted: 21 Jun 2017, 5:29:28 UTC - in response to Message 1874213.  

Latency: 37 ms

Pretty high level of latency there. Way better than satellite, but still rather high.
Is that a wired/fibre connection, or a wireless connection?
Wired. Just DSL through an old AT&T telephone line that was originally installed/buried in about 1979, in a semi-rural area. I'm lucky I don't have to use semaphore flags. ;^)
ID: 1874215
rob smith (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer moderator
Volunteer tester
Joined: 7 Mar 03
Posts: 22189
Credit: 416,307,556
RAC: 380
United Kingdom
Message 1874223 - Posted: 21 Jun 2017, 6:34:38 UTC

A few days later (was expecting it to be weeks again) and the second contractor calls me back to make an appointment. This time he's got a bloke with trenching & tunneling equipment- a couple of hours later I had 40/100Mbs NBN.


Great news Grant - no more reliance on that old bit of damp string :-)
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 1874223
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13731
Credit: 208,696,464
RAC: 304
Australia
Message 1874227 - Posted: 21 Jun 2017, 6:59:24 UTC - in response to Message 1874223.  

Great news Grant - no more reliance on that old bit of damp string :-)

And one less source for surges from lightning strikes.
The last one through the phone line took out the modem & the network port on one of the computers.
Grant
Darwin NT
ID: 1874227
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1874306 - Posted: 21 Jun 2017, 19:05:51 UTC - in response to Message 1874202.  

Thanks TBar for zi3v, running it now. Petri gave me zi3w which is 'basically the same' which I ran through maintenance, so some tasks are from that app.

Computers you can watch:
1080+1080+980
1070+750Ti
The first machine looks impressive, but I've noticed most of the Inconclusives are being produced by Device #3. The one-less Pulse count is not normally seen. Is there anything significant about Device #3? On the machine I'm running at Beta, Device #3 is using a PCIe extender cable connected to a slot wired for x4 lanes.

The other Linux machine is finishing the last Test WU on the CPU, in a couple hours I'll run the 6 Test WUs on the GTX 1050 and see how they compare to the Mac.
ID: 1874306
Brent Norman (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 1 Dec 99
Posts: 2786
Credit: 685,657,289
RAC: 835
Canada
Message 1874307 - Posted: 21 Jun 2017, 19:12:53 UTC - in response to Message 1874306.  
Last modified: 21 Jun 2017, 19:16:39 UTC

No, there isn't. I have been running low resource share (100 seti, 2 beta) and it has only been using 1 GPU for beta with that setting. So basically everything is Device #3.

PS, The errors are me forgetting I had an app_config for 2 tasks prior to this, the 'sauce' didn't like that at all.
ID: 1874307
petri33
Volunteer tester
Joined: 6 Jun 02
Posts: 1668
Credit: 623,086,772
RAC: 156
Finland
Message 1874317 - Posted: 21 Jun 2017, 21:21:37 UTC - in response to Message 1874307.  

No, there isn't. I have been running low resource share (100 seti, 2 beta) and it has only been using 1 GPU for beta with that setting. So basically everything is Device #3.

PS, The errors are me forgetting I had an app_config for 2 tasks prior to this, the 'sauce' didn't like that at all.

The magic of the sauce is to drain all the juice. Just once. :))
P.
To overcome Heisenbergs:
"You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones
ID: 1874317
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1874369 - Posted: 22 Jun 2017, 1:31:40 UTC
Last modified: 22 Jun 2017, 1:46:16 UTC

I ran the six problem WUs in Linux and the results are a tie. I ran it again on the Mac and as before 23se08ac.6875.22968.6.33.135 was a success on the Mac making it 4 - 2.

KWSN-Linux-MBbench v2.1.08
Running on TBar-iSETI at Thu 22 Jun 2017 12:47:33 AM UTC
----------------------------------------------------------------
Starting benchmark run...
----------------------------------------------------------------
Listing wu-file(s) in /testWUs :
03my17ab.4903.11519.16.43.91.wu
04oc08ab.31484.890.13.47.11.wu
23se08ac.6117.29512.7.34.110.wu
23se08ac.6875.22968.6.33.135.wu
26mr17aa.25216.7429.14.41.247.wu
28mr17ac.1412.331287.5.32.153.wu

Listing executable(s) in /APPS :
setiathome_x41p_zi3v_x86_64-pc-linux-gnu_cuda80

Listing executable in /REF_APPS :
MBv8_8.0r3305_ssse3_x86_64-pc-linux-gnu
----------------------------------------------------------------
Current WU: 03my17ab.4903.11519.16.43.91.wu
----------------------------------------------------------------
Skipping default app MBv8_8.0r3305_ssse3_x86_64-pc-linux-gnu, displaying saved result(s)
Elapsed Time: ....................... 6448 seconds
----------------------------------------------------------------
Running app with command : .......... setiathome_x41p_zi3v_x86_64-pc-linux-gnu_cuda80 -device 1
./setiathome_x41p_zi3v_x86_64-pc-linux-gnu_cuda80 -device 1 180.99 sec 37.26 sec 9.35 sec
Elapsed Time : ...................... 181 seconds
Speed compared to default : ......... 3562 %
-----------------
Comparing results
                ------------- R1:R2 ------------     ------------- R2:R1 ------------
                Exact  Super  Tight  Good    Bad     Exact  Super  Tight  Good    Bad
        Spike      0      0      0      0      0        0      0      0      0      0
     Autocorr      0      2      2      2      0        0      2      2      2      0
     Gaussian      0      0      0      0      0        0      0      0      0      0
        Pulse      0      0      0      0      0        0      0      0      0      0
      Triplet      0      7      7      7      0        0      7      7      7      0
   Best Spike      0      1      1      1      0        0      1      1      1      0
Best Autocorr      0      1      1      1      0        0      1      1      1      0
Best Gaussian      0      0      0      0      1        0      0      0      0      1
   Best Pulse      0      1      1      1      0        0      1      1      1      0
 Best Triplet      0      1      1      1      0        0      1      1      1      0
                ----   ----   ----   ----   ----     ----   ----   ----   ----   ----
                   0     13     13     13      1        0     13     13     13      1

Unmatched signal(s) in R1 at line(s) 528
Unmatched signal(s) in R2 at line(s) 528
For R1:R2 matched signals only, Q= 99.98%
Result      : Weakly similar.
----------------------------------------------------------------
Done with 03my17ab.4903.11519.16.43.91.wu
====================================================================
Current WU: 04oc08ab.31484.890.13.47.11.wu
----------------------------------------------------------------
Skipping default app MBv8_8.0r3305_ssse3_x86_64-pc-linux-gnu, displaying saved result(s)
Elapsed Time: ....................... 8067 seconds
----------------------------------------------------------------
Running app with command : .......... setiathome_x41p_zi3v_x86_64-pc-linux-gnu_cuda80 -device 1
./setiathome_x41p_zi3v_x86_64-pc-linux-gnu_cuda80 -device 1 276.77 sec 57.19 sec 14.30 sec
Elapsed Time : ...................... 277 seconds
Speed compared to default : ......... 2912 %
-----------------
Comparing results
Result      : Strongly similar,  Q= 99.93%
----------------------------------------------------------------
Done with 04oc08ab.31484.890.13.47.11.wu
====================================================================
Current WU: 23se08ac.6117.29512.7.34.110.wu
----------------------------------------------------------------
Skipping default app MBv8_8.0r3305_ssse3_x86_64-pc-linux-gnu, displaying saved result(s)
Elapsed Time: ....................... 7142 seconds
----------------------------------------------------------------
Running app with command : .......... setiathome_x41p_zi3v_x86_64-pc-linux-gnu_cuda80 -device 1
./setiathome_x41p_zi3v_x86_64-pc-linux-gnu_cuda80 -device 1 263.55 sec 54.62 sec 13.51 sec
Elapsed Time : ...................... 264 seconds
Speed compared to default : ......... 2705 %
-----------------
Comparing results
Result      : Strongly similar,  Q= 99.95%
----------------------------------------------------------------
Done with 23se08ac.6117.29512.7.34.110.wu
====================================================================
Current WU: 23se08ac.6875.22968.6.33.135.wu
----------------------------------------------------------------
Skipping default app MBv8_8.0r3305_ssse3_x86_64-pc-linux-gnu, displaying saved result(s)
Elapsed Time: ....................... 8171 seconds
----------------------------------------------------------------
Running app with command : .......... setiathome_x41p_zi3v_x86_64-pc-linux-gnu_cuda80 -device 1
./setiathome_x41p_zi3v_x86_64-pc-linux-gnu_cuda80 -device 1 275.45 sec 56.85 sec 14.00 sec
Elapsed Time : ...................... 275 seconds
Speed compared to default : ......... 2971 %
-----------------
Comparing results
                ------------- R1:R2 ------------     ------------- R2:R1 ------------
                Exact  Super  Tight  Good    Bad     Exact  Super  Tight  Good    Bad
        Spike      1      3      3      3      0        1      3      3      3      0
     Autocorr      0      0      0      0      0        0      0      0      0      0
     Gaussian      0      0      0      0      0        0      0      0      0      0
        Pulse      0      1      1      1      0        0      1      1      1      0
      Triplet      0      3      3      3      0        0      3      3      3      0
   Best Spike      1      1      1      1      0        1      1      1      1      0
Best Autocorr      0      1      1      1      0        0      1      1      1      0
Best Gaussian      0      0      0      0      1        0      0      0      0      1
   Best Pulse      0      1      1      1      0        0      1      1      1      0
 Best Triplet      0      1      1      1      0        0      1      1      1      0
                ----   ----   ----   ----   ----     ----   ----   ----   ----   ----
                   2     11     11     11      1        2     11     11     11      1

Unmatched signal(s) in R1 at line(s) 500
Unmatched signal(s) in R2 at line(s) 500
For R1:R2 matched signals only, Q= 99.99%
Result      : Weakly similar.
----------------------------------------------------------------
Done with 23se08ac.6875.22968.6.33.135.wu
====================================================================
Current WU: 26mr17aa.25216.7429.14.41.247.wu
----------------------------------------------------------------
Skipping default app MBv8_8.0r3305_ssse3_x86_64-pc-linux-gnu, displaying saved result(s)
Elapsed Time: ....................... 8348 seconds
----------------------------------------------------------------
Running app with command : .......... setiathome_x41p_zi3v_x86_64-pc-linux-gnu_cuda80 -device 1
./setiathome_x41p_zi3v_x86_64-pc-linux-gnu_cuda80 -device 1 285.10 sec 57.68 sec 14.74 sec
Elapsed Time : ...................... 285 seconds
Speed compared to default : ......... 2929 %
-----------------
Comparing results
                ------------- R1:R2 ------------     ------------- R2:R1 ------------
                Exact  Super  Tight  Good    Bad     Exact  Super  Tight  Good    Bad
        Spike      0      0      0      0      0        0      0      0      0      0
     Autocorr      0      1      1      1      0        0      1      1      1      0
     Gaussian      0      0      0      0      0        0      0      0      0      0
        Pulse      0      2      2      2      0        0      2      2      2      0
      Triplet      0      0      0      0      0        0      0      0      0      0
   Best Spike      0      1      1      1      0        0      1      1      1      0
Best Autocorr      0      1      1      1      0        0      1      1      1      0
Best Gaussian      0      0      0      0      1        0      0      0      0      1
   Best Pulse      0      1      1      1      0        0      1      1      1      0
 Best Triplet      0      0      0      0      0        0      0      0      0      0
                ----   ----   ----   ----   ----     ----   ----   ----   ----   ----
                   0      6      6      6      1        0      6      6      6      1

Unmatched signal(s) in R1 at line(s) 448
Unmatched signal(s) in R2 at line(s) 447
For R1:R2 matched signals only, Q= 99.97%
Result      : Weakly similar.
----------------------------------------------------------------
Done with 26mr17aa.25216.7429.14.41.247.wu
====================================================================
Current WU: 28mr17ac.1412.331287.5.32.153.wu
----------------------------------------------------------------
Skipping default app MBv8_8.0r3305_ssse3_x86_64-pc-linux-gnu, displaying saved result(s)
Elapsed Time: ....................... 7690 seconds
----------------------------------------------------------------
Running app with command : .......... setiathome_x41p_zi3v_x86_64-pc-linux-gnu_cuda80 -device 1
./setiathome_x41p_zi3v_x86_64-pc-linux-gnu_cuda80 -device 1 279.65 sec 56.84 sec 14.26 sec
Elapsed Time : ...................... 280 seconds
Speed compared to default : ......... 2746 %
-----------------
Comparing results
Result      : Strongly similar,  Q= 99.76%
----------------------------------------------------------------
Done with 28mr17ac.1412.331287.5.32.153.wu
====================================================================
Hosts CPU data ...
model name	: Intel(R) Xeon(R) CPU           X3330  @ 2.66GHz
cpu cores	: 4
cpu MHz		: 1998.000
cache size	: 3072 KB
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good nopl aperfmperf pni dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm sse4_1 xsave lahf_lm dtherm tpr_shadow vnmi flexpriority

Done with Benchmark run! Removing temporary files!
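
The "Speed compared to default" lines in that log appear to be the saved reference CPU time divided by the app's elapsed time, shown as a whole-number percentage. That is an inference from the numbers, not something taken from the benchmark script, and the truncation (rather than rounding) is likewise a guess. A short Python sketch that re-derives the figures under that assumption:

```python
# (reference CPU seconds, zi3v CUDA seconds, percentage printed in the log)
RUNS = {
    "03my17ab.4903.11519.16.43.91": (6448, 181, 3562),
    "04oc08ab.31484.890.13.47.11": (8067, 277, 2912),
    "23se08ac.6117.29512.7.34.110": (7142, 264, 2705),
    "23se08ac.6875.22968.6.33.135": (8171, 275, 2971),
    "26mr17aa.25216.7429.14.41.247": (8348, 285, 2929),
    "28mr17ac.1412.331287.5.32.153": (7690, 280, 2746),
}


def speed_percent(ref_seconds: int, app_seconds: int) -> int:
    """Reference time over app time as a truncated whole-number percentage."""
    return (100 * ref_seconds) // app_seconds


for wu, (ref, app, printed) in RUNS.items():
    print(f"{wu}: computed {speed_percent(ref, app)} %, log shows {printed} %")
```

All six computed values agree with the log, so on this host the CUDA app is finishing these WUs somewhere around 27 to 36 times faster than the CPU reference.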

Currently my Linux machine at Beta is showing 5 Inconclusives, all from the usual suspects: 2 from iGPUs and 3 from two ATI machines that have a high Inconclusive rate.

I also ran this task on the Mac, where at Beta the CPU is showing one more Pulse than the CUDA App (https://setiweb.ssl.berkeley.edu/beta/workunit.php?wuid=9826838). On the Mac, zi3v shows the same count as the CPU at Beta:
setiathome v8 enhanced x41p_zi3v, Cuda 8.00 special
Modifications done by petri33, compiled by TBar
Detected setiathome_enhanced_v8 task. Autocorrelations enabled, size 128k elements.
Work Unit Info:
...............
WU true angle range is :  0.446598
Sigma 3
Thread call stack limit is: 1k
Autocorr: peak=18.39234, time=73.82, delay=2.2898, d_freq=1420878392.42, chirp=-6.9606, fft_len=128k
Spike: peak=24.69245, time=73.82, d_freq=1420876117.17, chirp=21.692, fft_len=128k
Spike: peak=24.98193, time=73.82, d_freq=1420876117.17, chirp=21.693, fft_len=128k
Pulse: peak=1.293585, time=48.05, period=0.2329, d_freq=1420881956.06, score=1.031, chirp=38.065, fft_len=32 
Pulse: peak=0.8152785, time=48.05, period=0.1164, d_freq=1420881955.42, score=1.013, chirp=50.753, fft_len=32 
Pulse: peak=1.300616, time=48.05, period=0.2329, d_freq=1420882056.71, score=1.037, chirp=59.212, fft_len=32 
Pulse: peak=0.8186681, time=48.05, period=0.1164, d_freq=1420882056.06, score=1.017, chirp=71.9, fft_len=32 
Pulse: peak=7.530814, time=11.98, period=2.796, d_freq=1420880936.48, score=1.046, chirp=93.047, fft_len=512 

Best spike: peak=24.98193, time=73.82, d_freq=1420876117.17, chirp=21.693, fft_len=128k
Best autocorr: peak=18.39234, time=73.82, delay=2.2898, d_freq=1420878392.42, chirp=-6.9606, fft_len=128k
Best gaussian: peak=3.808729, mean=0.532015, ChiSq=1.242625, time=24.33, d_freq=1420880404.63,
	score=1.23826, null_hyp=2.219082, chirp=-38.519, fft_len=16k
Best pulse: peak=0.8489983, time=48.05, period=0.1164, d_freq=1420882056.71, score=1.054, chirp=59.212, fft_len=32 
Best triplet: peak=0, time=-2.121e+11, period=0, d_freq=0, chirp=0, fft_len=0 

Spike count:    2
Autocorr count: 1
Pulse count:    5
Triplet count:  0
Gaussian count: 0
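
When comparing many results, it can help to boil the count lines at the end of the stderr down to the compact (S=, A=, P=, T=, G=) form used elsewhere in this thread. A rough Python sketch, assuming the count lines always look like the ones above; the function names are made up for the example:

```python
import re

STDERR_TAIL = """\
Spike count:    2
Autocorr count: 1
Pulse count:    5
Triplet count:  0
Gaussian count: 0
"""

# Order of the compact notation: Spike, Autocorr, Pulse, Triplet, Gaussian
ORDER = ("Spike", "Autocorr", "Pulse", "Triplet", "Gaussian")


def signal_counts(stderr_text: str) -> dict:
    """Return {signal type: count} from the '<Type> count: N' lines."""
    found = re.findall(r"^(\w+) count:\s+(\d+)\s*$", stderr_text, flags=re.MULTILINE)
    return {name: int(value) for name, value in found}


def compact(counts: dict) -> str:
    """Format the counts as (S=.., A=.., P=.., T=.., G=..)."""
    return "(" + ", ".join(f"{name[0]}={counts.get(name, 0)}" for name in ORDER) + ")"


print(compact(signal_counts(STDERR_TAIL)))
```

Two results whose compact strings differ have a count mismatch; results with identical strings can still disagree on a best signal, so this is only a first filter.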

So, not sure why the 1080 is giving one count less. You can force Beta to use a different GPU; just exclude Device #3 from Beta by adding this to the <options> section of cc_config.xml:

<cc_config>
  <options>
    <exclude_gpu>
      <url>http://setiweb.ssl.berkeley.edu/beta/</url>
      <device_num>3</device_num>
      <type>NVIDIA</type>
      <app>setiathome_v8</app>
    </exclude_gpu>
  </options>
</cc_config>

That way you can test another GPU at Beta.
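
If you would rather script that change than hand-edit the file, here is a rough Python sketch using only the standard library. It assumes the usual cc_config.xml layout, with exclude_gpu entries sitting inside the options element; the function name and the empty starting document are illustrative only.

```python
import xml.etree.ElementTree as ET


def add_exclude_gpu(root, url, device_num, gpu_type, app):
    """Append an <exclude_gpu> block to <options>, creating <options> if missing."""
    options = root.find("options")
    if options is None:
        options = ET.SubElement(root, "options")
    block = ET.SubElement(options, "exclude_gpu")
    ET.SubElement(block, "url").text = url
    ET.SubElement(block, "device_num").text = str(device_num)
    ET.SubElement(block, "type").text = gpu_type
    ET.SubElement(block, "app").text = app
    return block


# Start from an empty config for illustration; a real script would parse the existing file.
root = ET.fromstring("<cc_config><options/></cc_config>")
add_exclude_gpu(root, "http://setiweb.ssl.berkeley.edu/beta/", 3, "NVIDIA", "setiathome_v8")
ET.indent(root)  # pretty-print; needs Python 3.9 or later
print(ET.tostring(root, encoding="unicode"))
```

Back up the existing file first, and expect to restart the BOINC client before the exclusion takes effect.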
ID: 1874369
Jeff Buck (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 11 Feb 00
Posts: 1441
Credit: 148,764,870
RAC: 0
United States
Message 1874372 - Posted: 22 Jun 2017, 2:11:55 UTC - in response to Message 1874369.  

I ran the six problem WUs in Linux and the results are a tie. I ran it again on the Mac and as before 23se08ac.6875.22968.6.33.135 was a success on the Mac making it 4 - 2.
So, does that provide any clues as to what might be going on with the Best gaussians, or is that going to require a larger sample size? Or are you indicating that with zi3v the issue might be solved? I can generate a new list of my Inconclusives each evening and/or switch my Linux boxes over to zi3v. (I also hope to have Linux running on my other xw9400 in a day or so.)
ID: 1874372
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1874378 - Posted: 22 Jun 2017, 2:51:06 UTC - in response to Message 1874372.  

It could be as Jason theorized here; https://setiathome.berkeley.edu/forum_thread.php?id=80636&postid=1874104#1874104
Those seem like pretty low peaks to start with for best Gaussian. 1 in 300 with contention deep in the noise floor 'Feels' as though we're pushing technology limits (once again), but it will warrant more definite understanding either way. Plotting the PoT data from the results and visually comparing if they look anything alike might say something. My suspicion is they won't look very 'Gaussiany' at all. If so, pushing further into the noisefloor, while possible, may be fruitless. Eric's ruled out that we need double-precision or bit-Identical results below reportable thresholds (in the case of Gaussians, iirc score derived from the ChiSq Fit and null hypothesis).

Or it could be something else. I'm currently running the same WUs with the older OpenCL App MBv8r3567, which doesn't use the nVidia SoG path, and the newer MBv8r3602 from Lunatics. So far r3602 is batting .333 while r3567 is batting 1.000. r3567 seems a little slower, but it is producing the correct Best Gaussians.
Interesting.
oops, r3602 just failed another one...

In any event, seeing as how it takes Hundreds of tasks to find one bad Best Gaussian it's well within the Project's Goal of less than 5% Inconclusive. 5% would allow 5 Inconclusives per 100 tasks.
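
To put numbers on that, here is a tiny Python sketch comparing an inconclusive rate against the 5% goal. The "1 in 300" figure is the one from Jason's quoted comment; the function name is just for the example.

```python
GOAL_PERCENT = 5.0  # project goal mentioned above


def inconclusive_percent(inconclusive: int, total: int) -> float:
    """Inconclusive results as a percentage of all tasks."""
    return 100.0 * inconclusive / total


for bad, total in ((1, 300), (5, 100)):
    rate = inconclusive_percent(bad, total)
    print(f"{bad} in {total}: {rate:.2f}% (goal is under {GOAL_PERCENT:.0f}%)")
```

One bad Best Gaussian in 300 tasks is about a third of a percent, so that failure mode alone uses up only a small slice of the 5% budget.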
ID: 1874378
Jeff Buck (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 11 Feb 00
Posts: 1441
Credit: 148,764,870
RAC: 0
United States
Message 1874379 - Posted: 22 Jun 2017, 3:13:18 UTC - in response to Message 1874378.  

Yeah, certainly seems like they're fairly rare, and since they ultimately all seem to validate, probably not a show-stopper for the app.

Anyway, FWIW, I did go ahead and generate a new list from my Inconclusives and skimmed through it looking for the proverbial low-hanging fruit (tasks that were non-overflow, with matching counts for all signals, no iGPU involvement, etc.) and just came up with two new candidates. There might be a few more among the 63 total Inconclusives I currently have for the Cuda 8.0 special, but it didn't seem worth digging any further. You can add them to your testing stash or ignore them, as you wish.

Workunit 2581227344 (09no16aa.18442.2116.6.33.31)
Task 5821860064 (S=3, A=2, P=1, T=0, G=3) x41p_zi3t2b, Cuda 8.00 special
Task 5821860065 (S=3, A=2, P=1, T=0, G=3) v8.22 (opencl_nvidia_SoG) windows_intelx86

Cuda 8.00 special - Best gaussian: peak=6.385563, mean=0.5961245, ChiSq=1.203035, time=67.95, d_freq=1420305212.47,
score=2.122812, null_hyp=2.212901, chirp=-62.462, fft_len=16k
v8.22 SoG - Best gaussian: peak=5.745589, mean=0.583622, ChiSq=1.414207, time=59.56, d_freq=1420299457.13,
score=2.057846, null_hyp=2.326043, chirp=90.136, fft_len=16k

Workunit 2581784900 (11no16aa.15419.21379.7.34.219)
Task 5823048772 (S=7, A=0, P=0, T=0, G=0) v8.22 (opencl_nvidia_SoG) windows_intelx86
Task 5823048773 (S=7, A=0, P=0, T=0, G=0) x41p_zi3t2b, Cuda 8.00 special

Cuda 8.00 special - Best gaussian: peak=5.887481, mean=0.6405677, ChiSq=1.381727, time=29.36, d_freq=1419641581.37,
score=-0.6105146, null_hyp=2.163218, chirp=52.627, fft_len=16k
v8.22 SoG - Best gaussian: peak=6.274475, mean=0.6576355, ChiSq=1.411415, time=10.91, d_freq=1419640238.56,
score=-0.6067085, null_hyp=2.1823, chirp=-69.28, fft_len=16k
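
For anyone who wants to line these up field by field, a small Python sketch that parses the two Best gaussian lines from the first workunit above and prints the fields that differ. The parsing assumes the "key=value, key=value" layout shown in the stderr; the names are hypothetical.

```python
import re

CUDA = ("peak=6.385563, mean=0.5961245, ChiSq=1.203035, time=67.95, d_freq=1420305212.47, "
        "score=2.122812, null_hyp=2.212901, chirp=-62.462, fft_len=16k")
SOG = ("peak=5.745589, mean=0.583622, ChiSq=1.414207, time=59.56, d_freq=1420299457.13, "
       "score=2.057846, null_hyp=2.326043, chirp=90.136, fft_len=16k")


def parse_signal(text: str) -> dict:
    """Turn 'key=value, key=value' into a dict, converting numeric values to float."""
    fields = {}
    for key, value in re.findall(r"(\w+)=([^,\s]+)", text):
        try:
            fields[key] = float(value)
        except ValueError:
            fields[key] = value  # e.g. fft_len=16k stays a string
    return fields


cuda, sog = parse_signal(CUDA), parse_signal(SOG)
for key in cuda:
    if cuda[key] != sog[key]:
        print(f"{key:>9}: CUDA={cuda[key]}  SoG={sog[key]}")
```

Everything except fft_len differs, including time and chirp, which suggests the two apps settled on different candidates as the best Gaussian rather than reporting the same signal with slightly different numbers.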
ID: 1874379
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1874381 - Posted: 22 Jun 2017, 3:45:31 UTC

The Linux machine finished testing the OpenCL Apps. The results show 4 out of 6 failures using the SoG path, while the Non-SoG path was a complete Success. Actually, r3567 is using the iGPU compile path;

tbar@TBar-iSETI:~$ cd '/home/tbar/KWSN-Bench-Linux-MBv7' 
tbar@TBar-iSETI:~/KWSN-Bench-Linux-MBv7$ ./benchmark
KWSN-Linux-MBbench v2.1.08
Running on TBar-iSETI at Thu 22 Jun 2017 01:33:39 AM UTC
----------------------------------------------------------------
Starting benchmark run...
----------------------------------------------------------------
Listing wu-file(s) in /testWUs :
03my17ab.4903.11519.16.43.91.wu
04oc08ab.31484.890.13.47.11.wu
23se08ac.6117.29512.7.34.110.wu
23se08ac.6875.22968.6.33.135.wu
26mr17aa.25216.7429.14.41.247.wu
28mr17ac.1412.331287.5.32.153.wu

Listing executable(s) in /APPS :
MBv8_8.21r3567_NV_ssse3_x86_64-pc-linux-gnu
MBv8_8.23r3602_sse2_clNV_SoG_x86_64-pc-linux-gnu

Listing executable in /REF_APPS :
MBv8_8.0r3305_ssse3_x86_64-pc-linux-gnu
----------------------------------------------------------------
Current WU: 03my17ab.4903.11519.16.43.91.wu
----------------------------------------------------------------
Skipping default app MBv8_8.0r3305_ssse3_x86_64-pc-linux-gnu, displaying saved result(s)
Elapsed Time: ....................... 6448 seconds
----------------------------------------------------------------
Running app with command : .......... MBv8_8.21r3567_NV_ssse3_x86_64-pc-linux-gnu -sbs 256 -spike_fft_thresh 2048 -tune 1 64 1 4 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 512 -period_iterations_num 10 -device 1
Elapsed Time : ...................... 464 seconds
Speed compared to default : ......... 1389 %
-----------------
Comparing results
Result      : Strongly similar,  Q= 99.94%
----------------------------------------------------------------
Running app with command : .......... MBv8_8.23r3602_sse2_clNV_SoG_x86_64-pc-linux-gnu -sbs 256 -spike_fft_thresh 2048 -tune 1 64 1 4 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 512 -period_iterations_num 10 -device 1
Elapsed Time : ...................... 398 seconds
Speed compared to default : ......... 1620 %
-----------------
Comparing results
Result      : Strongly similar,  Q= 99.94%
----------------------------------------------------------------
Done with 03my17ab.4903.11519.16.43.91.wu
====================================================================
Current WU: 04oc08ab.31484.890.13.47.11.wu
----------------------------------------------------------------
Skipping default app MBv8_8.0r3305_ssse3_x86_64-pc-linux-gnu, displaying saved result(s)
Elapsed Time: ....................... 8067 seconds
----------------------------------------------------------------
Running app with command : .......... MBv8_8.21r3567_NV_ssse3_x86_64-pc-linux-gnu -sbs 256 -spike_fft_thresh 2048 -tune 1 64 1 4 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 512 -period_iterations_num 10 -device 1
Elapsed Time : ...................... 679 seconds
Speed compared to default : ......... 1188 %
-----------------
Comparing results
Result      : Strongly similar,  Q= 99.93%
----------------------------------------------------------------
Running app with command : .......... MBv8_8.23r3602_sse2_clNV_SoG_x86_64-pc-linux-gnu -sbs 256 -spike_fft_thresh 2048 -tune 1 64 1 4 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 512 -period_iterations_num 10 -device 1
Elapsed Time : ...................... 586 seconds
Speed compared to default : ......... 1376 %
-----------------
Comparing results
                ------------- R1:R2 ------------     ------------- R2:R1 ------------
                Exact  Super  Tight  Good    Bad     Exact  Super  Tight  Good    Bad
        Spike      0      4      4      4      0        0      4      4      4      0
     Autocorr      0      6      6      6      0        0      6      6      6      0
     Gaussian      0      1      1      1      0        0      1      1      1      0
        Pulse      0      0      0      0      0        0      0      0      0      0
      Triplet      0      1      1      1      0        0      1      1      1      0
   Best Spike      0      1      1      1      0        0      1      1      1      0
Best Autocorr      0      1      1      1      0        0      1      1      1      0
Best Gaussian      0      0      0      0      1        0      0      0      0      1
   Best Pulse      0      1      1      1      0        0      1      1      1      0
 Best Triplet      0      1      1      1      0        0      1      1      1      0
                ----   ----   ----   ----   ----     ----   ----   ----   ----   ----
                   0     16     16     16      1        0     16     16     16      1

Unmatched signal(s) in R1 at line(s) 585
Unmatched signal(s) in R2 at line(s) 585
For R1:R2 matched signals only, Q= 99.95%
Result      : Weakly similar.
----------------------------------------------------------------
Done with 04oc08ab.31484.890.13.47.11.wu
====================================================================
Current WU: 23se08ac.6117.29512.7.34.110.wu
----------------------------------------------------------------
Skipping default app MBv8_8.0r3305_ssse3_x86_64-pc-linux-gnu, displaying saved result(s)
Elapsed Time: ....................... 7142 seconds
----------------------------------------------------------------
Running app with command : .......... MBv8_8.21r3567_NV_ssse3_x86_64-pc-linux-gnu -sbs 256 -spike_fft_thresh 2048 -tune 1 64 1 4 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 512 -period_iterations_num 10 -device 1
Elapsed Time : ...................... 609 seconds
Speed compared to default : ......... 1172 %
-----------------
Comparing results
Result      : Strongly similar,  Q= 99.95%
----------------------------------------------------------------
Running app with command : .......... MBv8_8.23r3602_sse2_clNV_SoG_x86_64-pc-linux-gnu -sbs 256 -spike_fft_thresh 2048 -tune 1 64 1 4 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 512 -period_iterations_num 10 -device 1
Elapsed Time : ...................... 519 seconds
Speed compared to default : ......... 1376 %
-----------------
Comparing results
                ------------- R1:R2 ------------     ------------- R2:R1 ------------
                Exact  Super  Tight  Good    Bad     Exact  Super  Tight  Good    Bad
        Spike      0      9      9      9      0        0      9      9      9      0
     Autocorr      1      1      1      1      0        1      1      1      1      0
     Gaussian      0      2      2      2      0        0      2      2      2      0
        Pulse      0      0      0      0      0        0      0      0      0      0
      Triplet      0      1      1      1      0        0      1      1      1      0
   Best Spike      0      1      1      1      0        0      1      1      1      0
Best Autocorr      1      1      1      1      0        1      1      1      1      0
Best Gaussian      0      0      0      0      1        0      0      0      0      1
   Best Pulse      0      1      1      1      0        0      1      1      1      0
 Best Triplet      0      1      1      1      0        0      1      1      1      0
                ----   ----   ----   ----   ----     ----   ----   ----   ----   ----
                   2     17     17     17      1        2     17     17     17      1

Unmatched signal(s) in R1 at line(s) 607
Unmatched signal(s) in R2 at line(s) 607
For R1:R2 matched signals only, Q= 99.96%
Result      : Weakly similar.
----------------------------------------------------------------
Done with 23se08ac.6117.29512.7.34.110.wu
====================================================================
Current WU: 23se08ac.6875.22968.6.33.135.wu
----------------------------------------------------------------
Skipping default app MBv8_8.0r3305_ssse3_x86_64-pc-linux-gnu, displaying saved result(s)
Elapsed Time: ....................... 8171 seconds
----------------------------------------------------------------
Running app with command : .......... MBv8_8.21r3567_NV_ssse3_x86_64-pc-linux-gnu -sbs 256 -spike_fft_thresh 2048 -tune 1 64 1 4 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 512 -period_iterations_num 10 -device 1
Elapsed Time : ...................... 657 seconds
Speed compared to default : ......... 1243 %
-----------------
Comparing results
Result      : Strongly similar,  Q= 99.93%
----------------------------------------------------------------
Running app with command : .......... MBv8_8.23r3602_sse2_clNV_SoG_x86_64-pc-linux-gnu -sbs 256 -spike_fft_thresh 2048 -tune 1 64 1 4 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 512 -period_iterations_num 10 -device 1
Elapsed Time : ...................... 544 seconds
Speed compared to default : ......... 1502 %
-----------------
Comparing results
                ------------- R1:R2 ------------     ------------- R2:R1 ------------
                Exact  Super  Tight  Good    Bad     Exact  Super  Tight  Good    Bad
        Spike      0      3      3      3      0        0      3      3      3      0
     Autocorr      0      0      0      0      0        0      0      0      0      0
     Gaussian      0      0      0      0      0        0      0      0      0      0
        Pulse      0      1      1      1      0        0      1      1      1      0
      Triplet      0      3      3      3      0        0      3      3      3      0
   Best Spike      0      1      1      1      0        0      1      1      1      0
Best Autocorr      0      1      1      1      0        0      1      1      1      0
Best Gaussian      0      0      0      0      1        0      0      0      0      1
   Best Pulse      0      1      1      1      0        0      1      1      1      0
 Best Triplet      0      1      1      1      0        0      1      1      1      0
                ----   ----   ----   ----   ----     ----   ----   ----   ----   ----
                   0     11     11     11      1        0     11     11     11      1

Unmatched signal(s) in R1 at line(s) 500
Unmatched signal(s) in R2 at line(s) 500
For R1:R2 matched signals only, Q= 99.97%
Result      : Weakly similar.
----------------------------------------------------------------
Done with 23se08ac.6875.22968.6.33.135.wu
====================================================================
Current WU: 26mr17aa.25216.7429.14.41.247.wu
----------------------------------------------------------------
Skipping default app MBv8_8.0r3305_ssse3_x86_64-pc-linux-gnu, displaying saved result(s)
Elapsed Time: ....................... 8348 seconds
----------------------------------------------------------------
Running app with command : .......... MBv8_8.21r3567_NV_ssse3_x86_64-pc-linux-gnu -sbs 256 -spike_fft_thresh 2048 -tune 1 64 1 4 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 512 -period_iterations_num 10 -device 1
Elapsed Time : ...................... 720 seconds
Speed compared to default : ......... 1159 %
-----------------
Comparing results
Result      : Strongly similar,  Q= 99.93%
----------------------------------------------------------------
Running app with command : .......... MBv8_8.23r3602_sse2_clNV_SoG_x86_64-pc-linux-gnu -sbs 256 -spike_fft_thresh 2048 -tune 1 64 1 4 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 512 -period_iterations_num 10 -device 1
Elapsed Time : ...................... 570 seconds
Speed compared to default : ......... 1464 %
-----------------
Comparing results
Result      : Strongly similar,  Q= 99.95%
----------------------------------------------------------------
Done with 26mr17aa.25216.7429.14.41.247.wu
====================================================================
Current WU: 28mr17ac.1412.331287.5.32.153.wu
----------------------------------------------------------------
Skipping default app MBv8_8.0r3305_ssse3_x86_64-pc-linux-gnu, displaying saved result(s)
Elapsed Time: ....................... 7690 seconds
----------------------------------------------------------------
Running app with command : .......... MBv8_8.21r3567_NV_ssse3_x86_64-pc-linux-gnu -sbs 256 -spike_fft_thresh 2048 -tune 1 64 1 4 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 512 -period_iterations_num 10 -device 1
Elapsed Time : ...................... 643 seconds
Speed compared to default : ......... 1195 %
-----------------
Comparing results
Result      : Strongly similar,  Q= 99.90%
----------------------------------------------------------------
Running app with command : .......... MBv8_8.23r3602_sse2_clNV_SoG_x86_64-pc-linux-gnu -sbs 256 -spike_fft_thresh 2048 -tune 1 64 1 4 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 512 -period_iterations_num 10 -device 1
Elapsed Time : ...................... 538 seconds
Speed compared to default : ......... 1429 %
-----------------
Comparing results
                ------------- R1:R2 ------------     ------------- R2:R1 ------------
                Exact  Super  Tight  Good    Bad     Exact  Super  Tight  Good    Bad
        Spike      0      0      0      0      0        0      0      0      0      0
     Autocorr      0      0      0      0      0        0      0      0      0      0
     Gaussian      0      1      1      1      0        0      1      1      1      0
        Pulse      0      2      2      2      0        0      2      2      2      0
      Triplet      0      1      1      1      0        0      1      1      1      0
   Best Spike      0      1      1      1      0        0      1      1      1      0
Best Autocorr      0      1      1      1      0        0      1      1      1      0
Best Gaussian      0      0      0      0      1        0      0      0      0      1
   Best Pulse      0      1      1      1      0        0      1      1      1      0
 Best Triplet      0      1      1      1      0        0      1      1      1      0
                ----   ----   ----   ----   ----     ----   ----   ----   ----   ----
                   0      8      8      8      1        0      8      8      8      1

Unmatched signal(s) in R1 at line(s) 471
Unmatched signal(s) in R2 at line(s) 471
For R1:R2 matched signals only, Q= 99.90%
Result      : Weakly similar.
----------------------------------------------------------------
Done with 28mr17ac.1412.331287.5.32.153.wu
====================================================================
Done with Benchmark run! Removing temporary files!
tbar@TBar-iSETI:~/KWSN-Bench-Linux-MBv7$

Yes, a 'Weakly similar' result will validate, but it will not be used in the search for ET.
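
A note on the "Speed compared to default" lines in the log above: the figures look like the saved CPU elapsed time divided by the app's elapsed time, shown as a percentage with the fraction dropped. That is inferred from the numbers in this run, not taken from the benchmark script itself, and `speed_percent` is a made-up helper name. A minimal sketch:

```python
# Sketch only: reproduces the "Speed compared to default" figures from the
# elapsed times printed in the log. The truncating formula is an assumption
# inferred from the numbers, not taken from the benchmark script.

def speed_percent(default_seconds: int, app_seconds: int) -> int:
    """Elapsed-time ratio as a whole-number percentage, fraction dropped."""
    return default_seconds * 100 // app_seconds

# (default CPU seconds, app seconds, percentage reported in the log)
runs = [
    (7142, 609, 1172),  # 23se08ac.6117, r3567
    (7142, 519, 1376),  # 23se08ac.6117, r3602 SoG
    (8171, 657, 1243),  # 23se08ac.6875, r3567
    (8171, 544, 1502),  # 23se08ac.6875, r3602 SoG
    (8348, 720, 1159),  # 26mr17aa.25216, r3567
    (8348, 570, 1464),  # 26mr17aa.25216, r3602 SoG
    (7690, 643, 1195),  # 28mr17ac.1412, r3567
    (7690, 538, 1429),  # 28mr17ac.1412, r3602 SoG
]

for default_s, app_s, reported in runs:
    computed = speed_percent(default_s, app_s)
    print(f"{default_s:>5} s / {app_s:>3} s -> {computed} % (log says {reported} %)")
```

So "1376 %" reads as roughly 13.8 times faster than the saved CPU run for that WU.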
ID: 1874381 · Report as offensive
Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1874382 - Posted: 22 Jun 2017, 3:48:30 UTC - in response to Message 1874378.  
Last modified: 22 Jun 2017, 3:53:00 UTC

It could be as Jason theorized here; https://setiathome.berkeley.edu/forum_thread.php?id=80636&postid=1874104#1874104
Those seem like pretty low peaks to start with for best Gaussian. 1 in 300 with contention deep in the noise floor 'Feels' as though we're pushing technology limits (once again), but it will warrant more definite understanding either way. Plotting the PoT data from the results and visually comparing if they look anything alike might say something. My suspicion is they won't look very 'Gaussiany' at all. If so, pushing further into the noisefloor, while possible, may be fruitless. Eric's ruled out that we need double-precision or bit-Identical results below reportable thresholds (in the case of Gaussians, iirc score derived from the ChiSq Fit and null hypothesis).

Or it could be something else. I'm currently running the same WUs with the older OpenCL App MBv8r3567, which doesn't use the nVidia SoG path, and the newer MBv8r3602 from Lunatics. So far r3602 is batting .333 while r3567 is batting 1.000. r3567 seems a little slower, but it is producing the correct Best Gaussians.
Interesting.
oops, r3602 just failed another one...

In any event, seeing as how it takes Hundreds of tasks to find one bad Best Gaussian it's well within the Project's Goal of less than 5% Inconclusive. 5% would allow 5 Inconclusives per 100 tasks.


Some history, just clarifying the origins of that 5% target. Back in ~v5 days, inconclusives were upwards of 20%. With ~v6 they were ~10% as GPU apps came in, and now they are <5% with v7.

That was due to a combination of stock CPU apps (then 32-bit only) using only the x87 FPU, which is 80-bit internally, and other KWSN and Alex Kan (Mac) builds using SIMD (MMX, SSE through SSSE3). With v6, Joe Segur injected KWSN and AK via Lunatics into stock CPU. For v7 I performed several numerical analyses of the algorithms in Matlab, mostly while attempting to devise a GPU form of the autocorrelation in v7 (which didn't previously exist).

The Cuda numbers actually came out more accurate, due to the way certain sums were calculated, but differed enough that something needed to be done to make stock 64-bit and cross-platform builds (e.g. the Android science app, which didn't exist yet either) more viable in terms of cross-platform match, with less error growth as the workunit analysis parameters were widened and GBT was later added. So with Eric's permission I changed some stock sums to block sums (which are similar to Cuda blocked and AKv8 SSEx striped summations), pulling the results about 3-6 decimal places closer together.
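
To illustrate the block-sum idea, here is a minimal sketch. It is not the stock, Cuda or AKv8 code: the block size of 256, the test data, and the function names are arbitrary choices, and single precision is emulated by rounding through `struct`. It compares a plain left-to-right sum with a blocked one against a double-precision reference.

```python
# Sketch only: shows why summation order matters in single precision.
# Not the stock, Cuda or AKv8 code; block size and data are illustrative.
import math
import struct

def f32(x: float) -> float:
    """Round a Python float (double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def naive_sum32(values):
    """Accumulate left to right, rounding to float32 after every add."""
    total = 0.0
    for v in values:
        total = f32(total + v)
    return total

def block_sum32(values, block=256):
    """Sum each block separately, then sum the per-block partial sums."""
    partials = [naive_sum32(values[i:i + block])
                for i in range(0, len(values), block)]
    return naive_sum32(partials)

values = [f32(0.1)] * 100_000
reference = math.fsum(values)  # correctly rounded double-precision sum

naive_err = abs(naive_sum32(values) - reference)
block_err = abs(block_sum32(values) - reference)
print(f"naive error: {naive_err:.6f}")
print(f"block error: {block_err:.6f}")
```

On this input the blocked version should land closer to the reference, because each partial sum stays small relative to the values being added to it, so less is lost to rounding on every add.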

Bearing in mind the platforms/devices all have different compilers, use different algorithms, and the vagaries inherent in floating point computation, the 5% is chosen as target (by me) because that's where Eric tends to set thresholds for the analysis, such that ~5% of results return overflow. That's where the analysis+(telescope)recording noise floor would be, so any better than 5% cross platform match we're pretty much digging into technological limitations outside our control (with existing Fourier method anyway).

At some point, I forget when, the improved cross platform matches and application reliability across the board allowed the workunit initial replication to be reduced from 3 to 2, reducing server load by a third.

On the basis of all that, if other classes of builds/devices see much worse than 5% inconclusives, then they're not up to par, while at the same time Eric's assertions that additional precision shouldn't be needed suggest <5% is 'good enough' statistically. (Which I'm fine with, because much more tightening toward bit-exact cross-platform results would be extremely expensive (in development time, money and computation) and lead to de-optimisation.)

I'm mostly writing this out, because at some point I'll need to add explanations to stock documentation, because the temptation to optimise out the stock summing refinements may be a trap for future developers [Saved for when I revisit the stock codebase, at some point Astropulse should undergo a similar analysis, though it's beyond my resources at present].
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1874382 · Report as offensive
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1874417 - Posted: 22 Jun 2017, 11:39:14 UTC - in response to Message 1874379.  
Last modified: 22 Jun 2017, 11:48:47 UTC

Yeah, certainly seems like they're fairly rare, and since they ultimately all seem to validate, probably not a show-stopper for the app.

Anyway, FWIW, I did go ahead and generate a new list from my Inconclusives and skimmed through it looking for the proverbial low-hanging fruit (tasks that were non-overflow, with matching counts for all signals, no iGPU involvement, etc.) and just came up with two new candidates. There might be a few more among the 63 total Inconclusives I currently have for the Cuda 8.0 special, but it didn't seem worth digging any further. You can add them to your testing stash or ignore them, as you wish.

Workunit 2581227344 (09no16aa.18442.2116.6.33.31)
Task 5821860064 (S=3, A=2, P=1, T=0, G=3) x41p_zi3t2b, Cuda 8.00 special
Task 5821860065 (S=3, A=2, P=1, T=0, G=3) v8.22 (opencl_nvidia_SoG) windows_intelx86

Cuda 8.00 special - Best gaussian: peak=6.385563, mean=0.5961245, ChiSq=1.203035, time=67.95, d_freq=1420305212.47,
score=2.122812, null_hyp=2.212901, chirp=-62.462, fft_len=16k
v8.22 SoG - Best gaussian: peak=5.745589, mean=0.583622, ChiSq=1.414207, time=59.56, d_freq=1420299457.13,
score=2.057846, null_hyp=2.326043, chirp=90.136, fft_len=16k

Workunit 2581784900 (11no16aa.15419.21379.7.34.219)
Task 5823048772 (S=7, A=0, P=0, T=0, G=0) v8.22 (opencl_nvidia_SoG) windows_intelx86
Task 5823048773 (S=7, A=0, P=0, T=0, G=0) x41p_zi3t2b, Cuda 8.00 special

Cuda 8.00 special - Best gaussian: peak=5.887481, mean=0.6405677, ChiSq=1.381727, time=29.36, d_freq=1419641581.37,
score=-0.6105146, null_hyp=2.163218, chirp=52.627, fft_len=16k
v8.22 SoG - Best gaussian: peak=6.274475, mean=0.6576355, ChiSq=1.411415, time=10.91, d_freq=1419640238.56,
score=-0.6067085, null_hyp=2.1823, chirp=-69.28, fft_len=16k
The task 11no16aa.15419.21379.7.34.219 disappeared before I could download it. I was able to get 09no16aa.18442.2116.6.33.31. Just as with many others it Fails on the Best gaussian when run with the SoG App. I was able to download the Stock 8.22 Apps and they are the same as with the last test, the SoG App Fails where the Non-SoG App Succeeds. The CPU agrees with the CUDA App;
Cuda 8.00 special - Best gaussian: peak=6.385563, mean=0.5961245, ChiSq=1.203035, time=67.95, d_freq=1420305212.47,
score=2.122812, null_hyp=2.212901, chirp=-62.462, fft_len=16k
v8.22 SoG - Best gaussian: peak=5.745589, mean=0.583622, ChiSq=1.414207, time=59.56, d_freq=1420299457.13,
score=2.057846, null_hyp=2.326043, chirp=90.136, fft_len=16k
SSSE3xjf Linux64 Build 3305 - Best gaussian: peak=6.385561, mean=0.5961246, ChiSq=1.20304, time=67.95, d_freq=1420305212.47,
score=2.122838, null_hyp=2.212904, chirp=-62.462, fft_len=16k
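
For anyone who wants to check the three reports above mechanically, here is a minimal sketch of a field-by-field comparison. The values are copied from the reports; the 1e-4 relative tolerance and the name `differing_fields` are arbitrary illustrations, not the project validator's actual rules.

```python
# Sketch only: field-by-field comparison of the three Best gaussian reports
# quoted above. The 1e-4 relative tolerance is an arbitrary illustration,
# not the project validator's actual threshold.
import math

cuda = {"peak": 6.385563, "mean": 0.5961245, "ChiSq": 1.203035, "time": 67.95,
        "d_freq": 1420305212.47, "score": 2.122812, "null_hyp": 2.212901,
        "chirp": -62.462}
sog = {"peak": 5.745589, "mean": 0.583622, "ChiSq": 1.414207, "time": 59.56,
       "d_freq": 1420299457.13, "score": 2.057846, "null_hyp": 2.326043,
       "chirp": 90.136}
cpu = {"peak": 6.385561, "mean": 0.5961246, "ChiSq": 1.20304, "time": 67.95,
       "d_freq": 1420305212.47, "score": 2.122838, "null_hyp": 2.212904,
       "chirp": -62.462}

def differing_fields(a, b, rel_tol=1e-4):
    """Names of the fields whose values differ by more than rel_tol."""
    return [k for k in a if not math.isclose(a[k], b[k], rel_tol=rel_tol)]

print("CUDA vs CPU:", differing_fields(cuda, cpu))
print("CUDA vs SoG:", differing_fields(cuda, sog))
```

One caveat with this sketch: a relative tolerance is too coarse for d_freq, since the values are around 1.42e9 and a difference of a few thousand is still inside 1e-4. A real comparison would need to treat that field separately.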

The Benchmark App shows the Non-SoG App producing the correct result;

====================================================================
Current WU: 09no16aa.18442.2116.6.33.31.wu
----------------------------------------------------------------
Skipping default app MBv8_8.0r3305_ssse3_x86_64-pc-linux-gnu, displaying saved result(s)
Elapsed Time: ....................... 6283 seconds
----------------------------------------------------------------
Running app with command : .......... setiathome_8.22_x86_64-pc-linux-gnu__opencl_nvidia_sah -sbs 256 -spike_fft_thresh 2048 -tune 1 64 1 4 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 512 -period_iterations_num 10 -device 1
Elapsed Time : ...................... 447 seconds
Speed compared to default : ......... 1405 %
-----------------
Comparing results
Result      : Strongly similar,  Q= 99.91%
----------------------------------------------------------------
Running app with command : .......... setiathome_8.22_x86_64-pc-linux-gnu__opencl_nvidia_SoG -sbs 256 -spike_fft_thresh 2048 -tune 1 64 1 4 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 512 -period_iterations_num 10 -device 1
Elapsed Time : ...................... 408 seconds
Speed compared to default : ......... 1539 %
-----------------
Comparing results
                ------------- R1:R2 ------------     ------------- R2:R1 ------------
                Exact  Super  Tight  Good    Bad     Exact  Super  Tight  Good    Bad
        Spike      0      3      3      3      0        0      3      3      3      0
     Autocorr      0      2      2      2      0        0      2      2      2      0
     Gaussian      0      3      3      3      0        0      3      3      3      0
        Pulse      0      1      1      1      0        0      1      1      1      0
      Triplet      0      0      0      0      0        0      0      0      0      0
   Best Spike      0      1      1      1      0        0      1      1      1      0
Best Autocorr      0      1      1      1      0        0      1      1      1      0
Best Gaussian      0      0      0      0      1        0      0      0      0      1
   Best Pulse      0      1      1      1      0        0      1      1      1      0
 Best Triplet      0      0      0      0      0        0      0      0      0      0
                ----   ----   ----   ----   ----     ----   ----   ----   ----   ----
                   0     12     12     12      1        0     12     12     12      1

Unmatched signal(s) in R1 at line(s) 563
Unmatched signal(s) in R2 at line(s) 563
For R1:R2 matched signals only, Q= 99.91%
Result      : Weakly similar.
----------------------------------------------------------------
Done with 09no16aa.18442.2116.6.33.31.wu


The rest of the tasks are producing the same results as the last test, the Stock SoG 8.22 App is Mostly Wrong when compared to the CPU's Best Gaussian. So, I'd just list the SoG App as just another Usual Suspect with the Best Gaussian Inconclusives and not bother with it as there is a better chance the CUDA App is correct. It's a shame the Wrong result is being entered into the database, but at least you aren't being Robbed as with some of the AstroPulse tasks. There is a horde of ATI Windows Hosts trashing APs, Cross-Validating, and Robbing Innocent Hosts, yet no one seems to be concerned, https://setiathome.berkeley.edu/forum_thread.php?id=77586&postid=1873191#1873191 So, it could be worse ;-)

Anyone with a Windows machine might want to run the test WUs;
04oc08ab.31484.890.13.47.11.wu
09no16aa.18442.2116.6.33.31.wu
23se08ac.6117.29512.7.34.110.wu
23se08ac.6875.22968.6.33.135.wu
28mr17ac.1412.331287.5.32.153.wu
With a CPU App against the SoG App and see what you get...you might be surprised.
ID: 1874417 · Report as offensive
Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1874436 - Posted: 22 Jun 2017, 14:54:50 UTC - in response to Message 1874417.  

Ugh, well that's a starker demonstration than I intended, while engineering in the truth. We've lost people recently.
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1874436 · Report as offensive
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1874505 - Posted: 22 Jun 2017, 21:11:44 UTC - in response to Message 1874417.  

----------------------------------------------------------------
Done with 09no16aa.18442.2116.6.33.31.wu

Anyone with a Windows machine might want to run the test WUs;
04oc08ab.31484.890.13.47.11.wu
09no16aa.18442.2116.6.33.31.wu
23se08ac.6117.29512.7.34.110.wu
23se08ac.6875.22968.6.33.135.wu
28mr17ac.1412.331287.5.32.153.wu
With a CPU App against the SoG App and see what you get...you might be surprised.

Any chance to get some of those in binary form with direct link?
SETI apps news
We're not gonna fight them. We're gonna transcend them.
ID: 1874505 · Report as offensive


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.