Linux CUDA 'Special' App finally available, featuring Low CPU use

Profile Raistmer
Volunteer developer
Volunteer tester
Avatar

Send message
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1897357 - Posted: 25 Oct 2017, 10:34:59 UTC
Last modified: 25 Oct 2017, 11:04:08 UTC

It seems this thread has become "hot" once again.
Let's cool it down a little.
Indeed, an overflow means that only a fraction of all signals is reported. And indeed, an arbitrarily selected 30 out of all of them isn't "scientific" enough to treat that choice as testimony.
But there are a few considerations for still putting some effort into proper result ordering (and I think Petri, the only person who can currently change anything on this side, has already agreed):
1) In the CPU app's ordering we go from small to big relative motions. Usually, the smaller the value to correct, the more adequate the result of the correction (simply put, the highest chirps could be more distorted even after correction, but that's IMHO; maybe I'm wrong here).
2) Any deviation in reporting order means a result has to be resent. That is a performance drop at the project level. Hence, proper ordering is just another project-wide level of optimization.
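
A minimal Python sketch of point 2, not taken from any SETI@home app: once the 30-signal overflow cap is hit, the set of reported signals depends on the order in which chirps are processed. The `(chirp, power)` layout, the `report_until_overflow` helper and the sample data are all assumptions made up for illustration.

```python
# Hypothetical illustration: the first 30 signals reported depend on the
# order in which chirp rates are searched.
OVERFLOW_LIMIT = 30  # the 30-signal cap discussed in the thread


def report_until_overflow(signals, key):
    """Return the signals reported before the cap is reached.

    signals: list of (chirp, power) tuples (made-up data layout).
    key:     sort key that defines the processing order.
    """
    ordered = sorted(signals, key=key)
    return ordered[:OVERFLOW_LIMIT]


# 40 fake signals spread over negative and positive chirps.
signals = [(chirp, 25.0) for chirp in range(-20, 20)]

# CPU-style order as described above: smallest |chirp| first.
cpu_order = report_until_overflow(signals, key=lambda s: abs(s[0]))
# A different (hypothetical) order: most negative chirp first.
other_order = report_until_overflow(signals, key=lambda s: s[0])

# Signals that one run reports and the other does not.
mismatch = set(cpu_order) ^ set(other_order)
print(len(mismatch), "signals differ between the two orderings")
```

Both runs see the same 40 signals, yet the two truncated reports disagree, which is the kind of mismatch that would force a resend.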

Guys, you are all doing an important job that is needed and valued. No need to alienate each other!

https://www.youtube.com/watch?v=bBZAccgvp8M :)
SETI apps news
We're not gonna fight them. We're gonna transcend them.
ID: 1897357 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Avatar

Send message
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1897378 - Posted: 25 Oct 2017, 13:31:16 UTC - in response to Message 1897336.  
Last modified: 25 Oct 2017, 13:57:55 UTC

I see, EVERY example I provided the host just happened to be running doubles...in your world.
It's easy Stephen. If the run-times are equal to the time between reports then he is running singles. If the machine were running doubles there would be twice as many reports, there weren't. Especially on the last example where the only reports were from the GPU and the difference in report times matched the run-times. I can provide many examples, anyone can. Why don't you provide an example to support your claim?


. . 2nd try, system trashed my first response, so much for that hour ..............

. . Again, since you insist on it.

. .It takes a while on a slow machine to gather sufficient evidence to convince someone like you. Especially when my attempt to resurrect a windows setup that hadn't been used for nearly 9 months failed to recognize the GPU, and a reboot proved to be unwise as M$ kicked off one of their irresistible fupdates that trashed the setup and necessitated a hasty repair/rebuild. Sadly the system now is not achieving the results it was back then. While last year it was achieving 85% utilization of the GPU on single tasks and 95+% on doubles it is now only getting 75% and 85% respectively. I have no intention of wasting time trying to recover those extra 2 or 3 mins per task as this is a one off short term project called IYF-D.

http://setiathome.berkeley.edu/result.php?resultid=6115330054
http://setiathome.berkeley.edu/result.php?resultid=6115329573
http://setiathome.berkeley.edu/result.php?resultid=6115329851
http://setiathome.berkeley.edu/result.php?resultid=6115329857
http://setiathome.berkeley.edu/result.php?resultid=6115329859
http://setiathome.berkeley.edu/result.php?resultid=6115330124
http://setiathome.berkeley.edu/result.php?resultid=6115330161
http://setiathome.berkeley.edu/result.php?resultid=6115329954
http://setiathome.berkeley.edu/result.php?resultid=6115330221
http://setiathome.berkeley.edu/result.php?resultid=6115330223
http://setiathome.berkeley.edu/result.php?resultid=6115330228
http://setiathome.berkeley.edu/result.php?resultid=6113411994
http://setiathome.berkeley.edu/result.php?resultid=6113392134
http://setiathome.berkeley.edu/result.php?resultid=6113392653
http://setiathome.berkeley.edu/result.php?resultid=6113392679
http://setiathome.berkeley.edu/result.php?resultid=6113392011
http://setiathome.berkeley.edu/result.php?resultid=6113392533







. . I expect you're feeling pretty smug and self satisfied right now. The bad news for you, is that EVERY ONE of those results are from running doubles.

http://setiathome.berkeley.edu/result.php?resultid=6115329870
http://setiathome.berkeley.edu/result.php?resultid=6115329925
http://setiathome.berkeley.edu/result.php?resultid=6115330183
http://setiathome.berkeley.edu/result.php?resultid=6115329932
http://setiathome.berkeley.edu/result.php?resultid=6113380483
http://setiathome.berkeley.edu/result.php?resultid=6113361743
http://setiathome.berkeley.edu/result.php?resultid=6115330080
http://setiathome.berkeley.edu/result.php?resultid=6115330226
http://setiathome.berkeley.edu/result.php?resultid=6115329777
http://setiathome.berkeley.edu/result.php?resultid=6113524809
http://setiathome.berkeley.edu/result.php?resultid=6113525371
http://setiathome.berkeley.edu/result.php?resultid=6113525373
http://setiathome.berkeley.edu/result.php?resultid=6113525174
http://setiathome.berkeley.edu/result.php?resultid=6113525430
http://setiathome.berkeley.edu/result.php?resultid=6113525436

. . But the pieces de resistance are ...

http://setiathome.berkeley.edu/result.php?resultid=6113525433
http://setiathome.berkeley.edu/result.php?resultid=6113524696

. . Of course a clever man like you might have thought to actually look at the host (not hidden) from which you are demanding results. Evidently not.

. . It's really a shame that when people try to take part and help you have to be a douche and call them a liar.

. . Enjoy your reading ...........

Stephen

:(
ID: 1897378 · Report as offensive
Profile Jeff Buck Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester

Send message
Joined: 11 Feb 00
Posts: 1441
Credit: 148,764,870
RAC: 0
United States
Message 1897439 - Posted: 25 Oct 2017, 16:46:22 UTC - in response to Message 1897355.  


So, for this WU, having 3 Special Apps end up patting each other on the back worked out just fine. That doesn't mean that it's a good situation. There are still enough nagging little inconsistencies and inaccuracies in the Special App that this kind of cross-validation could manage to squeeze out legitimate results, and as use of the Special App becomes more widespread, that sort of scenario becomes more likely.

And the correct way to solve this would be to send the tiebreaker to a device in a distinct plan class, preferably a CPU device.
In the case of anonymous platform this would mean "send to stock CPU".
I'll attempt to raise this issue with Eric.
Thanks, Raistmer. That would be very helpful in reducing the cross-validation risk for experimental apps like Petri's, and others that might come along in the future. Of course, some cross-validation risk still exists even for the initial two hosts on a WU, as we've seen occasionally, but I don't know how that could be addressed. There's such an overwhelming preponderance of nVIDIA devices in the mix that it would seem impossible to always have each one paired with a different device class for that first pass.
ID: 1897439 · Report as offensive
Profile Jeff Buck Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester

Send message
Joined: 11 Feb 00
Posts: 1441
Credit: 148,764,870
RAC: 0
United States
Message 1897483 - Posted: 25 Oct 2017, 21:54:44 UTC - in response to Message 1897356.  


Best spike: peak=-nan, time=1.678, d_freq=1418984342.92, chirp=-19.121, fft_len=32k

This appears to have resulted from a restart of the task, even though the 2 legitimate Spikes appear in the Stderr both before and after the restart.

Hm... such not-a-number values would be better caught by the app's sanity checks.
AFAIK Petri implemented the same sanity checks that the OpenCL and CPU apps currently have. So, probably none of them check for the "not a number" condition... It's a TODO to check...
I just did a bit of research this afternoon and found that my own Linux boxes occasionally return "Best spike: peak=nan", (though not "peak=-nan") always associated with a restarted task. I found 15 of them in my October archives, spread across all 3 boxes and both flavors of the Special App (x41p_zi3t2b and x41p_zi3v). The most recent appears to be Task 6115510571. I believe that all of these were initially marked Inconclusive, though all but two (which also had bogus Spikes or Triplets) seemed to get validated in the end.

The negative "-nan" does, however, show up on those nagging restarted tasks that throw a bunch of bogus Triplets after the restart (Message 1875324), such as Task 6115692460 where 30 non-existent Triplets similar to the following were reported.

Triplet: peak=-nan, time=84.31, period=3.355, d_freq=1420832615.81, chirp=-18.47, fft_len=8k
Triplet: peak=-nan, time=84.72, period=3.775, d_freq=1420832608.06, chirp=-18.47, fft_len=8k
...
...
Triplet: peak=-nan, time=88.92, period=3.775, d_freq=1420832530.59, chirp=-18.47, fft_len=8k
Triplet: peak=-nan, time=88.08, period=2.097, d_freq=1420832546.08, chirp=-18.47, fft_len=8k

So far, I've only seen the bogus Triplets occur with the x41p_zi3v version.
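
For anyone who wants to scan their own archives for these, here is a minimal Python sketch that flags Stderr signal lines whose peak value is NaN. It assumes the `kind: key=value, ...` layout shown in the lines quoted above; the helper names `parse_signal_line` and `has_nan_peak` are made up for this example.

```python
import math
import re

# Matches "key=value" pairs such as "peak=-nan" or "fft_len=8k".
FIELD_RE = re.compile(r"(\w+)=([^,\s]+)")


def parse_signal_line(line):
    """Split 'Triplet: peak=-nan, time=84.31, ...' into (kind, fields)."""
    kind, _, rest = line.partition(":")
    return kind.strip(), dict(FIELD_RE.findall(rest))


def has_nan_peak(line):
    """True if the line's peak value parses as NaN ('nan' or '-nan')."""
    _, fields = parse_signal_line(line)
    peak = fields.get("peak")
    if peak is None:
        return False
    try:
        return math.isnan(float(peak))
    except ValueError:
        return False


# Two of the lines quoted in this thread.
stderr_lines = [
    "Triplet: peak=-nan, time=84.31, period=3.355, d_freq=1420832615.81, chirp=-18.47, fft_len=8k",
    "Best spike: peak=-nan, time=1.678, d_freq=1418984342.92, chirp=-19.121, fft_len=32k",
]
bad = [line for line in stderr_lines if has_nan_peak(line)]
print(len(bad), "line(s) with a NaN peak")
```

Python's `float()` accepts both `nan` and `-nan`, so the same check covers the "peak=nan" and "peak=-nan" variants.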
ID: 1897483 · Report as offensive
Profile Raistmer
Volunteer developer
Volunteer tester
Avatar

Send message
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1897495 - Posted: 26 Oct 2017, 0:25:19 UTC - in response to Message 1897483.  
Last modified: 26 Oct 2017, 0:27:48 UTC

I just did a bit of research this afternoon and found that my own Linux boxes occasionally return "Best spike: peak=nan", (though not "peak=-nan") always associated with a restarted task. I found 15 of them in my October archives, spread across all 3 boxes and both flavors of the Special App (x41p_zi3t2b and x41p_zi3v). The most recent appears to be Task 6115510571. I believe that all of these were initially marked Inconclusive, though all but two (which also had bogus Spikes or Triplets) seemed to get validated in the end.

This means the "Special" app needs to check its resume logic so that the best-signal values are properly initialized.
Nevertheless, even when improperly initialized, such values would be better caught by a sanity check (that way the error would be much more obvious).

EDIT: AFAIK the GPU uses finite arithmetic by default (it's faster), so a NaN value can't occur due to GPU operations. That leaves a single option for the NaN source: uninitialized memory (though it can be host-based, device-based, or both).
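
A minimal Python sketch of the two ideas above, explicit initialization plus a NaN-aware sanity check. It is not the code of any SETI@home app; the record layout and the names `peak_is_sane`, `init_best_signal` and `update_best` are assumptions for illustration. One general IEEE 754 fact is relevant here: every ordered comparison against NaN is false, so a "best" value that starts out as NaN would never be replaced by a plain greater-than test.

```python
import math


def peak_is_sane(peak):
    """Reject values that cannot be a real signal power: NaN, +/-inf, or <= 0."""
    return math.isfinite(peak) and peak > 0.0


def init_best_signal():
    """Explicit initial state for a 'best' signal record (hypothetical layout)."""
    return {"peak": 0.0, "time": 0.0, "chirp": 0.0}


def update_best(best, candidate):
    """Keep the stronger of two records, ignoring insane candidates."""
    if not peak_is_sane(candidate["peak"]):
        return best
    return candidate if candidate["peak"] > best["peak"] else best


# A NaN "best" is sticky under a naive comparison: nothing compares greater.
print(25.0 > float("nan"))  # False

best = init_best_signal()
best = update_best(best, {"peak": float("nan"), "time": 1.678, "chirp": -19.121})
best = update_best(best, {"peak": 25.89224, "time": 6.711, "chirp": 0.0})
print(best["peak"])
```

Whether this is what actually happens in the Special App after a restart is not established in this thread; the sketch only shows why such a value would survive if it got in.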
SETI apps news
We're not gonna fight them. We're gonna transcend them.
ID: 1897495 · Report as offensive
Profile Jeff Buck Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester

Send message
Joined: 11 Feb 00
Posts: 1441
Credit: 148,764,870
RAC: 0
United States
Message 1897507 - Posted: 26 Oct 2017, 4:07:14 UTC - in response to Message 1897495.  

Yep, certainly seems likely. I wonder why it only seems to affect the individual Triplet reporting and the Best Spike value. I would think that all those fields would get initialized at the same time.

Oh well, hopefully Petri will have the time to track it down soon.
ID: 1897507 · Report as offensive
Profile Jeff Buck Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester

Send message
Joined: 11 Feb 00
Posts: 1441
Credit: 148,764,870
RAC: 0
United States
Message 1897512 - Posted: 26 Oct 2017, 4:39:33 UTC - in response to Message 1897483.  

The most recent appears to be Task 6115510571.
I just noticed that my initial wingman for this task (in WU 2721792884) got a "Maximum elapsed time exceeded" error. That host belongs to Mr. Kevvy and it looks like perhaps it's hit that 20x APR threshold due to excessive rescheduling. Almost all the CPU tasks on that machine are failing after a couple hours of run time, and appear to have been doing so for quite a while.
ID: 1897512 · Report as offensive
TBar
Volunteer tester

Send message
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1897526 - Posted: 26 Oct 2017, 6:49:47 UTC - in response to Message 1897378.  
Last modified: 26 Oct 2017, 6:55:42 UTC

I see, EVERY example I provided the host just happened to be running doubles...in your world.
It's easy Stephen. If the run-times are equal to the time between reports then he is running singles. If the machine were running doubles there would be twice as many reports, there weren't. Especially on the last example where the only reports were from the GPU and the difference in report times matched the run-times. I can provide many examples, anyone can. Why don't you provide an example to support your claim?


. . 2nd try, system trashed my first response, so much for that hour ..............

. . Again, since you insist on it...
Whoa Stephen, you didn't need to go through all that. I would have settled for links to machines at SETI showing similar times as you claimed, sorta what I did. Oh wait...you couldn't, because you can't find any. I certainly haven't found any, all of the ones I can find show about an hour for an AR 0.40 WU. So, I suppose it is as I suggested...something peculiar to your machine? I did notice your clockrate is a bit higher than the rest, 1005 where the stock clock is 901? I trust you have the memory clock spiked as well? Yes, it does look better than all the ones I've been able to find. Shame it seems to have a problem with the Special App, most all the other GPUs are around twice as fast using the Special App. I think that's why You, and a number of others are now running Linux?

I ran my own little test using the Linux CUDA 42 app against the zi3v Apps. I think you'll find the Linux CUDA 42 App is just as fast as the Windows one, actually I think it's a little faster. So this is what I got on my 750Ti;
Current WU: 23se08ac.6875.22968.6.33.135.wu
true_angle_range>0.44813078486041
----------------------------------------------------------------
Running default app with command :... setiathome_x41zi_x86_64-pc-linux-gnu_cuda42 -device 0
Elapsed Time: ....................... 725 seconds

----------------------------------------------------------------
Running app with command : .......... setiathome_x41p_zi3v_x86_64-pc-linux-gnu_cuda60 -device 0
Elapsed Time : ...................... 389 seconds
Speed compared to default : ......... 186 %
-----------------
Comparing results
Result      : Strongly similar,  Q= 99.95%

----------------------------------------------------------------
Running app with command : .......... setiathome_x41p_zi3v_x86_64-pc-linux-gnu_cuda80 -device 0
Elapsed Time : ...................... 371 seconds
Speed compared to default : ......... 195 %
-----------------
Comparing results
Result      : Strongly similar,  Q= 99.96%

So, just as with the other Special Apps zi3v CUDA60 is almost twice as fast as the baseline CUDA app on the Arecibo tasks, which means it's also about twice as fast as the OpenCL App, but, people already knew that...that's why they are running the Special App. I don't know why it's different on your particular card, not really something to brag about IMHO.
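
The "Speed compared to default" figures in the log above can be reproduced from the elapsed times. This is a small Python sketch assuming the benchmark script reports baseline time divided by candidate time as a whole percent; the function name `speed_percent` is made up here.

```python
def speed_percent(baseline_seconds, candidate_seconds):
    """Speed of the candidate relative to the baseline, as a whole percent."""
    return round(100.0 * baseline_seconds / candidate_seconds)


baseline = 725  # setiathome_x41zi_x86_64-pc-linux-gnu_cuda42
cuda60 = speed_percent(baseline, 389)  # x41p_zi3v cuda60
cuda80 = speed_percent(baseline, 371)  # x41p_zi3v cuda80
print(cuda60, cuda80)
```

With the times from the log this gives 186 and 195, matching the reported percentages.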

What's that old line...with friends like that who needs...
Actually, I don't gain a thing by going through all the trouble to build this App, I don't even have a Kepler card, and I'm certainly not being paid for it. So any help would be directed toward those that may find this App useful, not me. The question would be, did you help any of the people that may use this App? I find that questionable. Some may actually believe you saying it's just a little faster than CUDA 50 and choose not to bother using it. That certainly won't bother me, but, they may lose out. The main reason I built this App was for the people using 780s & Titans who were asking for a version that worked with their GPUs. Well, they now have one, whether or not they use it is up to them, I gain nothing either way.
Should I add an asterisk to the ReadMe? Something like, "works about Twice as fast as the stock Apps, *except on Stephen's 730". I dunno, I suppose that's possible.
ID: 1897526 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Avatar

Send message
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1897535 - Posted: 26 Oct 2017, 8:10:24 UTC - in response to Message 1897512.  

The most recent appears to be Task 6115510571.
I just noticed that my initial wingman for this task (in WU 2721792884) got a "Maximum elapsed time exceeded" error. That host belongs to Mr. Kevvy and it looks like perhaps it's hit that 20x APR threshold due to excessive rescheduling. Almost all the CPU tasks on that machine are failing after a couple hours of run time, and appear to have been doing so for quite a while.


. . Mr Kevvy seems to be quiet these days, I wonder if he is still checking his rigs?

Stephen

??
ID: 1897535 · Report as offensive
Profile Jeff Buck Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester

Send message
Joined: 11 Feb 00
Posts: 1441
Credit: 148,764,870
RAC: 0
United States
Message 1897613 - Posted: 26 Oct 2017, 18:42:05 UTC - in response to Message 1897535.  

. . Mr Kevvy seems to be quiet these days, I wonder if he is still checking his rigs?

Stephen

??
He's definitely still around, since he's currently one of the mods. He just hasn't posted in Number Crunching for awhile. I suppose if he doesn't pick up on this thread in a day or so, a PM would be in order.
ID: 1897613 · Report as offensive
Profile Jeff Buck Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester

Send message
Joined: 11 Feb 00
Posts: 1441
Credit: 148,764,870
RAC: 0
United States
Message 1898212 - Posted: 30 Oct 2017, 3:38:58 UTC
Last modified: 30 Oct 2017, 3:40:44 UTC

Here's a non-overflow Inconclusive that should probably be watched.

Workunit 2727927944 (02ap07ad.14827.10706.5.32.128)
Task 6128258808 (S=13, A=0, P=5, T=5, G=0, BS=25.89224, BG=3.375406) x41p_zi3xs3, Cuda 9.00 special
Task 6128258809 (S=9, A=0, P=5, T=5, G=0, BS=25.89224, BG=3.375409) x41p_zi3v, Cuda 8.00 special

The 4 additional Spikes that the zi3xs3 reported are the first 4 listed in the Stderr, all with "time=6.711". All other signals and "Best" values appear to match up just fine. (Note that I've now added a BS value to my summaries, for "Best Spike", since that also seems to be a possible area of concern on some WUs, where a NaN value might be reported.)

As I write this, the tiebreaker task has yet to be sent out.
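
A minimal Python sketch for comparing two such summaries field by field. The `S=..., A=..., ...` format is the one used in the summaries above; the helper names and the relative tolerance are assumptions for illustration and are not the project validator's actual rules.

```python
import math
import re

# Matches "S=13", "BS=25.89224", "BS=nan", "BS=-nan", ...
PAIR_RE = re.compile(r"(\w+)=([-\w.]+)")


def parse_summary(text):
    """Turn 'S=13, A=0, ...' into a dict of floats."""
    return {key: float(value) for key, value in PAIR_RE.findall(text)}


def diff_summaries(a, b, rel_tol=1e-5):
    """Return the keys whose values differ by more than rel_tol (relative).

    rel_tol is an arbitrary illustrative tolerance. NaN never compares
    close to anything, so a NaN field is always flagged.
    """
    out = {}
    for key in a.keys() | b.keys():
        va, vb = a.get(key), b.get(key)
        if va is None or vb is None or not math.isclose(va, vb, rel_tol=rel_tol):
            out[key] = (va, vb)
    return out


zi3xs3 = parse_summary("S=13, A=0, P=5, T=5, G=0, BS=25.89224, BG=3.375406")
zi3v = parse_summary("S=9, A=0, P=5, T=5, G=0, BS=25.89224, BG=3.375409")
diff = diff_summaries(zi3xs3, zi3v)
print(diff)
```

For the two tasks above only the Spike count is flagged; the tiny difference in BG falls inside the illustrative tolerance.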
ID: 1898212 · Report as offensive
Profile Raistmer
Volunteer developer
Volunteer tester
Avatar

Send message
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1898296 - Posted: 30 Oct 2017, 17:48:51 UTC - in response to Message 1898212.  

Here's a non-overflow Inconclusive that should probably be watched.

Workunit 2727927944 (02ap07ad.14827.10706.5.32.128)
Task 6128258808 (S=13, A=0, P=5, T=5, G=0, BS=25.89224, BG=3.375406) x41p_zi3xs3, Cuda 9.00 special
Task 6128258809 (S=9, A=0, P=5, T=5, G=0, BS=25.89224, BG=3.375409) x41p_zi3v, Cuda 8.00 special

The 4 additional Spikes that the zi3xs3 reported are the first 4 listed in the Stderr, all with "time=6.711". All other signals and "Best" values appear to match up just fine. (Note that I've now added a BS value to my summaries, for "Best Spike", since that also seems to be a possible area of concern on some WUs, where a NaN value might be reported.)

As I write this, the tiebreaker task has yet to be sent out.

Did Petri say that "3v" is disallowed for production use?
SETI apps news
We're not gonna fight them. We're gonna transcend them.
ID: 1898296 · Report as offensive
Profile Jeff Buck Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester

Send message
Joined: 11 Feb 00
Posts: 1441
Credit: 148,764,870
RAC: 0
United States
Message 1898301 - Posted: 30 Oct 2017, 18:15:00 UTC - in response to Message 1898296.  

Did Petri say that "3v" is disallowed for production use?
Not as far as I know. A few weeks ago he posted a message to try to discourage people from using the zi3t2, due to Pulse reporting issues. He recommended moving to the "latest" Cuda8, which I believe is the zi3v and was supposed to improve on the Pulse reporting. His message was somewhat confusing, however, and referenced the "s2" a lot.

Anyway, I've been running the zi3v on that one box since I first converted it to Linux. While there are still several nagging problems there, the biggest one I run into is the bogus Spikes and Triplets that I often see following restarts.
ID: 1898301 · Report as offensive
TBar
Volunteer tester

Send message
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1898350 - Posted: 30 Oct 2017, 22:31:10 UTC - in response to Message 1898296.  

Here's a non-overflow Inconclusive that should probably be watched.
Well, here's an Overflow issue that needs to be watched.
Raistmer, any idea why the Spikes have a much higher reading on the other Overflows than the Special App? Other than the full zi3xs3 only working on Pascal GPUs, this is the only problem I see with the Overflows with zi3xs3. Of course, it still has the occasional bad Best Pulse problem as well.
https://setiathome.berkeley.edu/workunit.php?wuid=2728259294
It's the same with the older App,
https://setiathome.berkeley.edu/workunit.php?wuid=2728384537
https://setiathome.berkeley.edu/workunit.php?wuid=2727684970
ID: 1898350 · Report as offensive
Profile Jeff Buck Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester

Send message
Joined: 11 Feb 00
Posts: 1441
Credit: 148,764,870
RAC: 0
United States
Message 1898357 - Posted: 30 Oct 2017, 23:02:44 UTC - in response to Message 1898350.  

That looks about the same as the one I reported in Message 1896776, but with zi3x vs. zi3t2b. When I benched it (Message 1896912), the stock Windows CPU result matched zi3t2b, so it would appear something may have gone sideways starting with zi3x.
ID: 1898357 · Report as offensive
TBar
Volunteer tester

Send message
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1898359 - Posted: 30 Oct 2017, 23:22:58 UTC - in response to Message 1898357.  

Well, Petri says it's because his newer Apps are finding signals in the first chirp whereas the other Apps aren't. It is something in the newer Apps, he's just not sure what.
Other than that, the full zi3xs3 runs nicely on the Pascal GPUs.
ID: 1898359 · Report as offensive
Profile Jeff Buck Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester

Send message
Joined: 11 Feb 00
Posts: 1441
Credit: 148,764,870
RAC: 0
United States
Message 1898361 - Posted: 30 Oct 2017, 23:37:27 UTC - in response to Message 1898212.  

Here's a non-overflow Inconclusive that should probably be watched.

Workunit 2727927944 (02ap07ad.14827.10706.5.32.128)
Task 6128258808 (S=13, A=0, P=5, T=5, G=0, BS=25.89224, BG=3.375406) x41p_zi3xs3, Cuda 9.00 special
Task 6128258809 (S=9, A=0, P=5, T=5, G=0, BS=25.89224, BG=3.375409) x41p_zi3v, Cuda 8.00 special

The 4 additional Spikes that the zi3xs3 reported are the first 4 listed in the Stderr, all with "time=6.711". All other signals and "Best" values appear to match up just fine.
The tiebreaker on this one ran as a Linux CPU task and appears to have matched with the zi3v result. It did not report those first 4 Spikes that the zi3xs3 did, so perhaps it is the same issue as in the 30-Spike overflow. All 3 hosts got credit on this one, however.
ID: 1898361 · Report as offensive
Profile Raistmer
Volunteer developer
Volunteer tester
Avatar

Send message
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1898428 - Posted: 31 Oct 2017, 11:53:24 UTC - in response to Message 1898350.  
Last modified: 31 Oct 2017, 11:54:08 UTC


Raistmer, any idea why the Spikes have a much higher reading on the other Overflows than the Special App?

Because power values come from some summation, the simplest guess would be that those summations are distributed and some part is missing or not reduced in the overflow case.
SETI apps news
We're not gonna fight them. We're gonna transcend them.
ID: 1898428 · Report as offensive
Profile Raistmer
Volunteer developer
Volunteer tester
Avatar

Send message
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1898429 - Posted: 31 Oct 2017, 11:58:25 UTC - in response to Message 1898359.  
Last modified: 31 Oct 2017, 12:06:17 UTC

Well, Petri says it's because his newer Apps are finding signals in the first chirp whereas the other Apps aren't. It is something in the newer Apps, he's just not sure what.
Other than that, the full zi3xs3 runs nicely on the Pascal GPUs.

To be correct: the "first chirp" is the zero chirp. And the algorithm definitely looks for signals there (it means no relative motion between source and receiver).
What is omitted, and for a reason, is the 0th slot in PoT analysis (for all chirps). The zero slot means static signal strength and obviously should be ignored.
If Petri's app really accepts anything from that slot, it's a serious bug.
EDIT: indeed, handling the 0th slot differently from all the others means divergence and a performance drop in the CU that processes it along with the others. But that's life; correct algorithm functioning requires omitting results from that slot.
If I recall correctly, I implemented it in such a way that all processing is performed without deviation, but the results reduction omits anything from that slot. That way the GPU performance drop is minimal.
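
A minimal Python sketch of that idea: every slot is computed the same way, and slot 0 is dropped only at the reduction step. This is plain host-side Python rather than GPU code, and the name `best_slot` and the sample values are made up for illustration.

```python
def best_slot(powers):
    """Pick the strongest slot, never reporting slot 0.

    powers: per-slot power values; every slot is computed the same way
    and slot 0 is excluded only here, at the reduction step.
    """
    best_index, best_power = None, float("-inf")
    for index, power in enumerate(powers):
        if index == 0:
            continue  # static signal strength, ignored by design
        if power > best_power:
            best_index, best_power = index, power
    return best_index, best_power


# Slot 0 holds the largest value but must not be reported.
result = best_slot([100.0, 3.0, 7.5, 2.0])
print(result)
```

The design point is that the special case lives in one place, the reduction, so the per-slot work stays uniform.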
SETI apps news
We're not gonna fight them. We're gonna transcend them.
ID: 1898429 · Report as offensive
Profile Raistmer
Volunteer developer
Volunteer tester
Avatar

Send message
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1898439 - Posted: 31 Oct 2017, 23:41:19 UTC

Interesting, that 1050Ti has only OpenCL 1.2 support.
What NV card would have OpenCL 2.0 then?...
SETI apps news
We're not gonna fight them. We're gonna transcend them.
ID: 1898439 · Report as offensive


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.