Linux CUDA 'Special' App finally available, featuring Low CPU use

Message boards : Number crunching : Linux CUDA 'Special' App finally available, featuring Low CPU use

jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1875279 - Posted: 26 Jun 2017, 20:45:27 UTC - in response to Message 1875274.  

I know there is a problem with my code reporting over 20 pulses at identical time with a small difference in frequency. That is an extremely rare event. And it always happens at 46.something.
That sounds like the problem I was running into with my GTX 780 (now replaced by a GTX 980), which I detailed in Message 1864874. In fact, with the Cuda8.0 Special App, it was happening quite frequently. Dialing back to the Cuda6.5 version, it became rare, but didn't go away entirely. It has never (yet) shown up on any of my other cards (GTX 750Ti, GTX 960, GTX 980). You'd need to find somebody else running a 780 to see if the problem is common to that model or unique to my card.


I'll put my 780 back in the Mac Pro on the weekend. Its Hyper-Q feature might be in play; the implementation differs in subtle ways by OS and Cuda version.
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1875279
petri33
Volunteer tester
Joined: 6 Jun 02
Posts: 1605
Credit: 367,049,020
RAC: 381,795
Finland
Message 1875294 - Posted: 26 Jun 2017, 22:37:47 UTC - in response to Message 1875275.  
Last modified: 26 Jun 2017, 22:49:42 UTC


I know there is a problem with my code reporting over 20 pulses at identical time with a small difference in frequency. That is an extremely rare event. And it always happens at 46.something.

Could it be solved in the same fashion - by re-processing after discovery?


Another re-re-processing could be done. But I really would like to know why it happens in the first place.
I could also just stop reporting any pulses at the exact time/fft/..., just pretend they did not happen. But I do not want to.

The bug is bugging me. Peak, Time, Period, Score and fft_len are always the same. Freq and chirp vary.
The first pulsefind in the code (8k len). Not the l2m version.

My suspects are a memory overflow, misbehaving pointer/memory management, overheating, bad VRAM, an uninitialized variable/mem area, an error in the chirp code/fft lib/my code, or else it is an alien trying to hide its existence.

Pulse: peak=41.66666, time=46.17, period=23.26, d_freq=2323286407.06, score=4.562, chirp=-83.442, fft_len=8k
Pulse: peak=41.66666, time=46.17, period=23.26, d_freq=2323286418.24, score=4.562, chirp=-83.442, fft_len=8k
Pulse: peak=41.66666, time=46.17, period=23.26, d_freq=2323286429.41, score=4.562, chirp=-83.442, fft_len=8k
Pulse: peak=41.66666, time=46.17, period=23.26, d_freq=2323286440.59, score=4.562, chirp=-83.442, fft_len=8k
Pulse: peak=41.66666, time=46.17, period=23.26, d_freq=2323286451.76, score=4.562, chirp=-83.442, fft_len=8k
Pulse: peak=41.66666, time=46.17, period=23.26, d_freq=2323286462.94, score=4.562, chirp=-83.442, fft_len=8k
Pulse: peak=41.66666, time=46.17, period=23.26, d_freq=2323286474.12, score=4.562, chirp=-83.442, fft_len=8k
Pulse: peak=41.66666, time=46.17, period=23.26, d_freq=2323286485.29, score=4.562, chirp=-83.442, fft_len=8k
Pulse: peak=41.66666, time=46.17, period=23.26, d_freq=2323286496.47, score=4.562, chirp=-83.442, fft_len=8k
Pulse: peak=41.66666, time=46.17, period=23.26, d_freq=2323286507.64, score=4.562, chirp=-83.442, fft_len=8k
Pulse: peak=41.66666, time=46.17, period=23.26, d_freq=2323286518.82, score=4.562, chirp=-83.442, fft_len=8k
Pulse: peak=41.66666, time=46.17, period=23.26, d_freq=2323286529.99, score=4.562, chirp=-83.442, fft_len=8k
Pulse: peak=41.66666, time=46.17, period=23.26, d_freq=2323286541.17, score=4.562, chirp=-83.442, fft_len=8k
Pulse: peak=41.66666, time=46.17, period=23.26, d_freq=2323286552.35, score=4.562, chirp=-83.442, fft_len=8k
Pulse: peak=41.66666, time=46.17, period=23.26, d_freq=2323286563.52, score=4.562, chirp=-83.442, fft_len=8k
Pulse: peak=41.66666, time=46.17, period=23.26, d_freq=2323286574.7, score=4.562, chirp=-83.442, fft_len=8k
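
One way to catch this pattern automatically is to group reported pulses by every field except d_freq and flag any group with more than one member. This is only a rough Python sketch, assuming the result lines look like the ones above; the function names and the min_group threshold are made up for illustration and are not part of the app.

```python
import re
from collections import defaultdict

# Matches "key=value" pairs such as "peak=41.66666" or "fft_len=8k".
FIELD_RE = re.compile(r"(\w+)=([^,\s]+)")


def parse_pulse(line):
    """Turn one 'Pulse: key=value, ...' line into a dict of strings."""
    return dict(FIELD_RE.findall(line))


def find_clones(lines, min_group=2):
    """Group pulses that agree on every field except d_freq.

    Returns {key: sorted d_freq list} for groups with at least min_group members.
    """
    groups = defaultdict(list)
    for line in lines:
        if not line.startswith("Pulse:"):
            continue
        fields = parse_pulse(line)
        key = tuple(sorted((k, v) for k, v in fields.items() if k != "d_freq"))
        groups[key].append(float(fields["d_freq"]))
    return {k: sorted(v) for k, v in groups.items() if len(v) >= min_group}


# First three lines of the listing above.
sample = [
    "Pulse: peak=41.66666, time=46.17, period=23.26, d_freq=2323286407.06, score=4.562, chirp=-83.442, fft_len=8k",
    "Pulse: peak=41.66666, time=46.17, period=23.26, d_freq=2323286418.24, score=4.562, chirp=-83.442, fft_len=8k",
    "Pulse: peak=41.66666, time=46.17, period=23.26, d_freq=2323286429.41, score=4.562, chirp=-83.442, fft_len=8k",
]

for freqs in find_clones(sample).values():
    steps = [round(b - a, 2) for a, b in zip(freqs, freqs[1:])]
    print(len(freqs), "clones, d_freq steps:", steps)
```

In the full listing the d_freq values are spaced about 11.18 Hz apart, so a regular step like that is another thing a check of this kind could report.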

To overcome Heisenbergs:
"You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones
ID: 1875294
Jeff Buck Special Project $250 donor
Volunteer tester
Joined: 11 Feb 00
Posts: 1441
Credit: 148,764,870
RAC: 0
United States
Message 1875300 - Posted: 26 Jun 2017, 23:02:23 UTC - in response to Message 1875294.  

Hmmm....that looks very different from the ones I was getting, although the "time" value is quite close.
ID: 1875300
jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1875313 - Posted: 27 Jun 2017, 1:40:14 UTC - in response to Message 1875294.  

..
Another re-re-processing could be done. But I really would like to know why it happens in the first place...


Agreed. The pulse race before was challenging to visualise and describe, but serialised reprocessing was one correct way to handle it. For this other odd thing I don't have a similarly clear idea yet.
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1875313
Jeff Buck Special Project $250 donor
Volunteer tester
Joined: 11 Feb 00
Posts: 1441
Credit: 148,764,870
RAC: 0
United States
Message 1875314 - Posted: 27 Jun 2017, 1:54:43 UTC - in response to Message 1875107.  

The only way to tell for sure is to run the task with your CPU and compare the results. You should give that a try, you can run a CPU task in the benchmark App while running BOINC. Just reduce the CPU usage by One in BOINC and remove any Apps from the APPS folder in the Benchmark package. The CPU App in the REF_APPS folder will search the WU folder and run any task it doesn't have results for. The Benchmark tool is here, KWSN Linux MB Bench v2.01.08. Extract the KWSN-Bench-Linux-MBv7_v2.01.08.7z to your Home folder and run it from there.
Okay, I tried running it with the Windows CPU app that I use here on my daily driver. It almost perfectly matches the v8.22 (opencl_ati_cat132) result.

Workunit 2567983999 (20oc08aa.4777.254820.12.39.5)
Task 5794100079 (S=10, A=3, P=0, T=0, G=0, BG=0) v8.22 (opencl_ati_cat132) windows_intelx86
Task 5829376759 (S=10, A=3, P=0, T=0, G=0, BG=0) x41p_zi3v, Cuda 8.00 special

v8.22 (opencl_ati_cat132) windows_intelx86 - Best pulse: peak=0.4685673, time=98.45, period=0.01441, d_freq=1420048834.69, score=0.9218, chirp=-61.928, fft_len=8
x41p_zi3v, Cuda 8.00 special - Best pulse: peak=0.3951461, time=68.92, period=0.0147, d_freq=1420052490.23, score=0.7774, chirp=0, fft_len=8.
MB8_win_x86_SSE3_VS2008_r3330 - Best pulse: peak=0.4685681, time=98.45, period=0.01441, d_freq=1420048834.69, score=0.9218, chirp=-61.928, fft_len=8
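
To make the "almost perfectly matches" comparison concrete, here is a small Python sketch that checks the three best-pulse results above field by field. The relative tolerance of 1e-4 is an arbitrary value picked for illustration; it is not the tolerance the project's validator uses, and signals_match is a made-up helper name.

```python
import math


def signals_match(a, b, rel_tol=1e-4):
    """True if every field present in both results agrees within rel_tol.

    rel_tol is an assumed, illustrative tolerance, not the validator's.
    """
    shared = a.keys() & b.keys()
    return all(math.isclose(a[k], b[k], rel_tol=rel_tol, abs_tol=1e-9) for k in shared)


# Best pulse values copied from the three result lines above.
opencl_ati = {"peak": 0.4685673, "time": 98.45, "period": 0.01441,
              "d_freq": 1420048834.69, "score": 0.9218, "chirp": -61.928, "fft_len": 8}
cuda_special = {"peak": 0.3951461, "time": 68.92, "period": 0.0147,
                "d_freq": 1420052490.23, "score": 0.7774, "chirp": 0.0, "fft_len": 8}
cpu_r3330 = {"peak": 0.4685681, "time": 98.45, "period": 0.01441,
             "d_freq": 1420048834.69, "score": 0.9218, "chirp": -61.928, "fft_len": 8}

print("opencl_ati vs r3330 CPU:", signals_match(opencl_ati, cpu_r3330))
print("cuda special vs r3330 CPU:", signals_match(cuda_special, cpu_r3330))
```

With these numbers the OpenCL and CPU results agree on every field, while the Cuda special result differs in peak, time, period, d_freq, score and chirp.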

Are you able to cross compare that with Cuda Baseline? It'll narrow down where to look once I get to the special code.

[Edit:] Which branch is that MB8 derived from ? Stock seti_boinc master ? or AKv8 ? The difference may be important here.
[which one(s) differ to reference Windows/x86 8.00 may point in the right directions]
Still don't know how to do a Cuda bench run, but I did run what I think is the Windows stock CPU app today, setiathome_8.00_windows_intelx86. The numbers in the results file (<peak_power>0.46856832504272</peak_power>, etc.) appear to match the opencl_ati_cat132 and r3330 results (allowing for the fact that I don't know how to convert that "time" value, and the "score" has a value of 0).

FWIW, today's run reminded me of why I haven't run stock CPU in a long time. It took about 6 hours and 45 minutes, versus 3 hours and 13 minutes for the r3330 I ran yesterday!
ID: 1875314
jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1875316 - Posted: 27 Jun 2017, 2:18:51 UTC - in response to Message 1875314.  

Still don't know how to do a Cuda bench run, but I did run what I think is the Windows stock CPU app today, setiathome_8.00_windows_intelx86. The numbers in the results file (<peak_power>0.46856832504272</peak_power>, etc.) appear to match the opencl_ati_cat132 and r3330 results (allowing for the fact that I don't know how to convert that "time" value, and the "score" has a value of 0).

FWIW, today's run reminded me of why I haven't run stock CPU in a long time. It took about 6 hours and 45 minutes, versus 3 hours and 13 minutes for the r3330 I ran yesterday!


Cheers! The observations will help narrow things down. Yep, win32 8.00 CPU isn't quick ;D. Cuda bench on Windows is just a matter of throwing the exe, two suitable cu DLLs, and optionally an mbcuda.cfg into the science_apps folder before running the bench. If you do get it working, luckily the CPU reference result should be cached from the prior run and skipped, so it just runs any other app comparison against that.

Myself, I'll probably attempt GPU passthrough of the 780 and/or from the OSX+Linux host to a Win10 VM, scheduled for the weekend. If I do get that operational, I may script a rough automation to distribute and accumulate results from the 3 platforms, letting each OS have a batch of normal test and suspect tasks with various apps. If that works out as hoped, I'll enable some kind of facility to dump in suspects remotely for cross-platform matching, but that of course is further down the line.
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1875316
Jeff Buck Special Project $250 donor
Volunteer tester
Joined: 11 Feb 00
Posts: 1441
Credit: 148,764,870
RAC: 0
United States
Message 1875318 - Posted: 27 Jun 2017, 2:42:59 UTC - in response to Message 1875316.  

Cuda bench on Windows is just a matter of throwing the exe, two suitable cu DLLs, and optionally an mbcuda.cfg into the science_apps folder before running the bench. If you do get it working, luckily the CPU reference result should be cached from the prior run and skipped, so it just runs any other app comparison against that.
Welllll....I actually did try doing that last evening, though from the Reference folder rather than the Science_apps folder. Used Lunatics_x41zi_win32_cuda50.exe, cudart32_50_35.dll, cufft32_50_35.dll, and mbcuda.cfg. In short, everything that normally runs on this machine. That led to the most spectacular Windows meltdown I've ever seen! The script suspended BOINC just fine. But then.......the windows for all my open apps went haywire, rapidly cycling from one to another, then a black screen, then back to Windows but now with an XP theme, then a disappearing task bar, followed by an empty task bar, followed by one task bar button after another reappearing (still with the old XP look), and then a convincing imitation of a Cheshire cat as one Window after another rapidly disappeared, followed by the task bar again, leaving only the desktop wallpaper. Several seconds later, even that disappeared, but eventually the Windows welcome/login screen showed up, which then simply allowed me to click the icon and start a new Windows session, with apparently no permanent harm done (other than the time it took me to re-launch the applications that had been running pre-meltdown). Sooooo.....I kinda figured I must've missed some key element there.
ID: 1875318
Jeff Buck Special Project $250 donor
Volunteer tester
Joined: 11 Feb 00
Posts: 1441
Credit: 148,764,870
RAC: 0
United States
Message 1875321 - Posted: 27 Jun 2017, 3:11:43 UTC

I've got several new Inconclusives this evening featuring Best Gaussians. None of them have reportable Gaussian signals, however.

Workunit 2585507075 (23no16ab.17660.365297.10.37.52)
Task 5830849142 (S=1, A=0, P=0, T=0, G=0, BG=5.342847) x41p_zi3v, Cuda 8.00 special
Task 5830849143 (S=1, A=0, P=0, T=0, G=0, BG=6.369327) v8.22 (opencl_nvidia_SoG) windows_intelx86

Workunit 2585604974 (24no16ab.21342.5384.7.34.159)
Task 5831052801 (S=6, A=0, P=8, T=0, G=0, BG=5.469572) x41p_zi3v, Cuda 8.00 special
Task 5831052802 (S=6, A=0, P=8, T=0, G=0, BG=4.944084) v8.22 (opencl_nvidia_SoG) windows_intelx86

Workunit 2586487892 (30oc16ab.3155.106788.15.42.241)
Task 5832884399 (S=19, A=0, P=1, T=0, G=0, BG=3.664569) v8.22 (opencl_nvidia_SoG) windows_intelx86
Task 5832884400 (S=19, A=0, P=1, T=0, G=0, BG=3.816209) x41p_zi3t2b, Cuda 8.00 special

Workunit 2586487990 (30oc16ab.3216.106890.16.43.244)
Task 5832884260 (S=1, A=0, P=0, T=2, G=0, BG=3.187616) x41p_zi3t2b, Cuda 8.00 special
Task 5832884261 (S=1, A=0, P=0, T=2, G=0, BG=3.505734) v8.22 (opencl_nvidia_SoG) windows_intelx86
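
In every pair above the signal counts agree and only the BG (Best Gaussian) value differs. A short Python sketch of how that could be checked from the summary strings; parse_summary and compare_tasks are made-up helper names, and the summary format is assumed to be exactly as pasted here.

```python
import re

# Matches the "(S=.., A=.., P=.., T=.., G=.., BG=..)" summary used in this post.
SUMMARY_RE = re.compile(r"\(S=(\d+), A=(\d+), P=(\d+), T=(\d+), G=(\d+), BG=([\d.]+)\)")


def parse_summary(line):
    """Return ((S, A, P, T, G), BG) from one task summary line."""
    m = SUMMARY_RE.search(line)
    if m is None:
        raise ValueError("no task summary found in: " + line)
    *counts, bg = m.groups()
    return tuple(int(c) for c in counts), float(bg)


def compare_tasks(line_a, line_b):
    """Return (counts_equal, absolute BG difference) for two task lines."""
    counts_a, bg_a = parse_summary(line_a)
    counts_b, bg_b = parse_summary(line_b)
    return counts_a == counts_b, abs(bg_a - bg_b)


cuda = "Task 5830849142 (S=1, A=0, P=0, T=0, G=0, BG=5.342847) x41p_zi3v, Cuda 8.00 special"
sog = "Task 5830849143 (S=1, A=0, P=0, T=0, G=0, BG=6.369327) v8.22 (opencl_nvidia_SoG) windows_intelx86"

same_counts, bg_gap = compare_tasks(cuda, sog)
print("counts equal:", same_counts, "BG gap:", round(bg_gap, 6))
```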
ID: 1875321
Jeff Buck Special Project $250 donor
Volunteer tester
Joined: 11 Feb 00
Posts: 1441
Credit: 148,764,870
RAC: 0
United States
Message 1875324 - Posted: 27 Jun 2017, 3:28:30 UTC - in response to Message 1874886.  

So far, I've only noted one task with a problem using zi3v, and I'm pretty sure that's an isolated incident. Task 5828724704 originally was started on a GTX 750 Ti but, following a reboot, restarted on the GTX 960. Before the restart, it looks like it was running fine, but afterwards it went haywire, identifying 25 bogus Triplets with non-numeric peaks (i.e., "peak=-nan"). I imagine that it's just some sort of restart timing issue, though perhaps on a restart like that the memory usage spikes in some way. Only if it happens again will I really be concerned. Anyway, that task is currently in an Inconclusive state but I expect it to go Invalid once the tie-breaker reports in.
Unfortunately, this was not an isolated incident. I've now had two more tasks, 5832726585 and 5832726581, which went haywire with zi3v following a restart. The first one originally ran on a GTX 750 Ti and restarted on the GTX 960, at which point it reported 15 bogus Triplets with "peak=-nan", similar to the one from Friday. The second one also started on a GTX 750 Ti but restarted on a different 750 Ti, this time quickly reporting 17 bogus Spikes. Both tasks seemed to be running fine before the shutdown and restart.
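
A quick way to flag results like these is to scan the reported signal lines for a peak that is not a finite number. A minimal Python sketch follows; has_bad_peak is a made-up helper, and the two sample lines are invented for illustration (only the "peak=-nan" part comes from the actual results).

```python
import math
import re

PEAK_RE = re.compile(r"peak=([^,\s]+)")


def has_bad_peak(line):
    """True if the line reports a peak that is NaN, infinite, or not a number at all."""
    m = PEAK_RE.search(line)
    if m is None:
        return False  # no peak field on this line
    try:
        value = float(m.group(1))  # float() accepts "nan", "-nan", "inf"
    except ValueError:
        return True
    return not math.isfinite(value)


# Invented sample lines; a real result file will contain more fields.
lines = [
    "Triplet: peak=-nan, time=12.3",
    "Triplet: peak=10.5, time=12.3",
]
bogus = [line for line in lines if has_bad_peak(line)]
print(len(bogus), "of", len(lines), "lines have a non-finite peak")
```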
ID: 1875324
jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1875334 - Posted: 27 Jun 2017, 3:56:08 UTC - in response to Message 1875324.  

Some definite headscratchers in the last few posts :D Will think about those while planning the attack.
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1875334
TBar
Volunteer tester

Joined: 22 May 99
Posts: 4264
Credit: 261,880,570
RAC: 354,492
United States
Message 1875350 - Posted: 27 Jun 2017, 7:42:27 UTC - in response to Message 1875321.  

I've got several new Inconclusives this evening featuring Best Gaussians. None of them have reportable Gaussian signals, however.

Workunit 2585507075 (23no16ab.17660.365297.10.37.52)
Task 5830849142 (S=1, A=0, P=0, T=0, G=0, BG=5.342847) x41p_zi3v, Cuda 8.00 special
Task 5830849143 (S=1, A=0, P=0, T=0, G=0, BG=6.369327) v8.22 (opencl_nvidia_SoG) windows_intelx86

Workunit 2585604974 (24no16ab.21342.5384.7.34.159)
Task 5831052801 (S=6, A=0, P=8, T=0, G=0, BG=5.469572) x41p_zi3v, Cuda 8.00 special
Task 5831052802 (S=6, A=0, P=8, T=0, G=0, BG=4.944084) v8.22 (opencl_nvidia_SoG) windows_intelx86

Workunit 2586487892 (30oc16ab.3155.106788.15.42.241)
Task 5832884399 (S=19, A=0, P=1, T=0, G=0, BG=3.664569) v8.22 (opencl_nvidia_SoG) windows_intelx86
Task 5832884400 (S=19, A=0, P=1, T=0, G=0, BG=3.816209) x41p_zi3t2b, Cuda 8.00 special

Workunit 2586487990 (30oc16ab.3216.106890.16.43.244)
Task 5832884260 (S=1, A=0, P=0, T=2, G=0, BG=3.187616) x41p_zi3t2b, Cuda 8.00 special
Task 5832884261 (S=1, A=0, P=0, T=2, G=0, BG=3.505734) v8.22 (opencl_nvidia_SoG) windows_intelx86
I just tested a Windows SoG App on my Mac against zi3v. Since the other machine didn't have any other Inconclusives, I decided to give it a try. My CPU says zi3v is correct on this task, http://setiathome.berkeley.edu/workunit.php?wuid=2586601005 So, since you are able to test those tasks with a Windows CPU, I'd say do that. One of the tasks I've been waiting on at Beta finally finished; it is basically the same as the task I just tested, https://setiweb.ssl.berkeley.edu/beta/workunit.php?wuid=9809103 The CPU App says the SoG App is wrong. I've been watching another task at Beta, https://setiweb.ssl.berkeley.edu/beta/workunit.php?wuid=9796907 Since it's still not validated, I downloaded it and am now testing it on my CPU, since it's the New CPU App destined for Main. If you look at it, you can see the Best Gaussian is different from the Reported Gaussian, which is how my CPU and zi3v behave.
Anyway, after this next test I'm not going to worry about Gaussians anymore, unless some change is made that needs to be investigated.
ID: 1875350
Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6005
Credit: 83,851,834
RAC: 28,525
Russia
Message 1875351 - Posted: 27 Jun 2017, 7:43:20 UTC - in response to Message 1875294.  

Another re-re-processing could be done. But I really would like to know why it happens in the first place.

Sure, it would be a workaround, not a fix, but it would let you move on to the next stage faster.


I could also just stop reporting any pulses at the exact time/fft/..., just pretend they did not happen. But I do not want to.

No, that's not a good way to go at all. It would be just like manually introducing RFI into a particular area of parameter space.
If there is a signal there, inconclusives will arise anyway. But the correct results will be swamped by the GPU ones because of better performance, not because of validity.



The bug is bugging me. Peak, Time, Period, Score and fft_len are always the same. Freq and chirp vary.
The first pulsefind in the code (8k len).

Please explain in more detail here.
The first pulsefind is done at zero chirp and its length is 8, not 8k. What did you mean by "first" and "8k" here?


Not the l2m version.

??sorry?


My suspects are a memory overflow, misbehaving pointer/memory management, overheating, bad VRAM, an uninitialized variable/mem area, an error in the chirp code/fft lib/my code, or else it is an alien trying to hide its existence.

Of all these, I would pursue the uninitialized variable/mem area further.
The same value in a particular field could just be an attempt to interpret NaN or 0xDEAD.


Pulse: peak=41.66666, time=46.17, period=23.26, d_freq=2323286407.06, score=4.562, chirp=-83.442, fft_len=8k
Pulse: peak=41.66666, time=46.17, period=23.26, d_freq=2323286574.7, score=4.562, chirp=-83.442, fft_len=8k

So, not the very first, but at 8k.
It's strange indeed, because I can't think of any peculiarity of 8k. 16k would mean a "square" matrix for the Gaussian fit if I recall correctly (not quite square, but w/o scaling), but 8k, and for Pulse... no idea right now.

I would propose "heavy debugging" here, that is, printing the data arrays. If it is indeed 0xDEAD, you will notice it immediately.
If not, the arrays should be compared with those from a version that works OK.

And most important: can this be reproduced in an offline run?
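
As an illustration of the "print the data arrays" idea, here is a Python sketch that scans a raw dump of float32 values for a fill pattern or for NaN/Inf. Everything specific in it is an assumption: the 32-bit pattern 0xDEADBEEF is just a common debug fill value standing in for "0xDEAD", the dump is assumed to be little-endian float32, and scan_floats is a made-up name.

```python
import math
import struct


def scan_floats(raw, sentinel=0xDEADBEEF):
    """Return indices of float32 slots holding the sentinel bits or a non-finite value.

    raw      -- bytes dumped from a buffer of little-endian float32 values
    sentinel -- assumed debug fill pattern (hypothetical, pick your own)
    """
    count = len(raw) // 4
    data = raw[:count * 4]
    words = struct.unpack("<%dI" % count, data)   # same bytes viewed as uint32
    floats = struct.unpack("<%df" % count, data)  # and as float32
    suspicious = []
    for i, (word, value) in enumerate(zip(words, floats)):
        if word == sentinel or not math.isfinite(value):
            suspicious.append(i)
    return suspicious


# Two ordinary values, one NaN, and one slot still holding the fill pattern.
buf = struct.pack("<3f", 1.0, 2.5, float("nan")) + struct.pack("<I", 0xDEADBEEF)
print("suspicious slots:", scan_floats(buf))
```

Checking the bit pattern as well as the float value matters because a fill pattern can decode to an ordinary-looking finite number.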
SETI apps news
We're not gonna fight them. We're gonna transcend them.
ID: 1875351
Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6005
Credit: 83,851,834
RAC: 28,525
Russia
Message 1875352 - Posted: 27 Jun 2017, 7:50:26 UTC - in response to Message 1875324.  
Last modified: 27 Jun 2017, 7:52:09 UTC

The first one originally ran on a GTX 750 Ti and restarted on the GTX 960, at which point it reported 15 bogus Triplets with "peak=-nan", similar to the one from Friday. The second one also started on a GTX 750 Ti but restarted on a different 750 Ti, this time quickly reporting 17 bogus Spikes. Both tasks seemed to be running fine before the shutdown and restart.

Damage in checkpointing. Most probably, some re-initialization of the GPU-side signal buffers is missing after restart.
The "proper" fix would be to add those re-initializations, but if the app targets only really fast GPUs, an adequate solution would be to just skip checkpointing altogether. This adds overhead on restart (reprocessing from the beginning) but simplifies the code and avoids this issue completely (a positive side effect could be some small speedup on all non-checkpointing tasks).
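
The two options can be pictured with a toy Python model. It is not the app's code: SearchState, next_chirp_index and resume are invented names, and the list called scratch only stands in for GPU-side signal buffers. The point is that the working buffers are never part of the checkpoint and are always rebuilt and zeroed on restart.

```python
class SearchState:
    """Toy stand-in for the app's state (hypothetical names throughout)."""

    def __init__(self, n_bins):
        self.n_bins = n_bins
        self.next_chirp_index = 0
        self.scratch = [0.0] * n_bins  # stands in for GPU signal buffers, starts zeroed

    def checkpoint(self):
        # Only the progress marker is saved; scratch buffers are left out on purpose.
        return {"n_bins": self.n_bins, "next_chirp_index": self.next_chirp_index}


def resume(saved, use_checkpoint=True):
    """Rebuild state after a restart with freshly zeroed scratch buffers.

    use_checkpoint=False models the "skip checkpointing" option: start over.
    """
    state = SearchState(saved["n_bins"])
    if use_checkpoint:
        state.next_chirp_index = saved["next_chirp_index"]
    return state


before = SearchState(n_bins=8)
before.next_chirp_index = 42
before.scratch[3] = 41.66666  # leftover working data from before the restart
saved = before.checkpoint()

after = resume(saved)
print(after.next_chirp_index, after.scratch)
```

On a real GPU the equivalent step would be an explicit clear of each device buffer after allocation, since freshly allocated device memory is not guaranteed to be zeroed.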
SETI apps news
We're not gonna fight them. We're gonna transcend them.
ID: 1875352
Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6005
Credit: 83,851,834
RAC: 28,525
Russia
Message 1875354 - Posted: 27 Jun 2017, 7:58:55 UTC - in response to Message 1875318.  

Sooooo.....I kinda figured I must've missed some key element there.

Better to try from the science dir or edit the script.
The ref run has the -verb option by default, and I'm not sure how the CUDA binary will react to it
(or just try removing the additional switches from the beginning of the script).
SETI apps news
We're not gonna fight them. We're gonna transcend them.
ID: 1875354
Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6005
Credit: 83,851,834
RAC: 28,525
Russia
Message 1875355 - Posted: 27 Jun 2017, 8:02:51 UTC - in response to Message 1875321.  

I've got several new Inconclusives this evening featuring Best Gaussians. None of them have reportable Gaussian signals, however.

That is important.
SoG (actually, all OpenCL MB) has different processing for the case where a reportable Gaussian has already been found and for the case before such an event.
So your run indicates that the issue is on the no-reportable-Gaussian path.
Another possibility: just a boundary precision issue.

Please try to grab 1-2 tasks and reprocess them offline with SoG vs. non-SoG binaries.
If they disagree, then report back and provide the task as a test case; that requires bugfixing.
SETI apps news
We're not gonna fight them. We're gonna transcend them.
ID: 1875355
Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6005
Credit: 83,851,834
RAC: 28,525
Russia
Message 1875356 - Posted: 27 Jun 2017, 8:10:05 UTC

BTW, can Petri's app run on GPUs like these (mobile ones)?

NVIDIA GeForce 940MX
and
NVIDIA GeForce 820M
SETI apps news
We're not gonna fight them. We're gonna transcend them.
ID: 1875356
-= Vyper =-
Volunteer tester
Joined: 5 Sep 99
Posts: 1366
Credit: 546,148,498
RAC: 141,831
Sweden
Message 1875358 - Posted: 27 Jun 2017, 9:15:33 UTC - in response to Message 1875356.  
Last modified: 27 Jun 2017, 9:24:06 UTC

BTW, can Petri's app run on GPUs like these (mobile ones)?

NVIDIA GeForce 940MX
and
NVIDIA GeForce 820M


This is what I've found out:

"The executable is version zi3t2b and it can be run on sm_35, 50, 52, and 61. (750,780,980,1080 and likes).
With 1 Mb of GPU ram you need -unroll 1. Other can use -unroll autotune.
Use -bs to reduce CPU usage.
Set -pfb to 8, 16 or 32."

https://en.wikipedia.org/wiki/CUDA

EDIT: The 820M seems out of luck (CC 2.1 only), but the 940MX seems to work (CC 5.0).
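
The check boils down to comparing a card's compute capability against the list in the quoted readme (sm_35, 50, 52, 61). A tiny Python sketch, assuming an exact match is required (it does not model any forward compatibility between architectures) and using the CC values given in this post for the two mobile cards; can_run is a made-up name.

```python
# Targets listed in the quoted readme: sm_35, sm_50, sm_52, sm_61.
SUPPORTED_SM = {(3, 5), (5, 0), (5, 2), (6, 1)}


def can_run(compute_capability):
    """True if the (major, minor) compute capability is one of the listed targets.

    Exact-match check only; this is a simplifying assumption.
    """
    return tuple(compute_capability) in SUPPORTED_SM


# CC values as stated in this post, not independently verified here.
cards = {"GeForce 820M": (2, 1), "GeForce 940MX": (5, 0)}
for name, cc in cards.items():
    print(name, "CC %d.%d" % cc, "->", "supported" if can_run(cc) else "not supported")
```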

_________________________________________________________________________
Addicted to SETI crunching!
Founder of GPU Users Group
ID: 1875358
-= Vyper =-
Volunteer tester
Joined: 5 Sep 99
Posts: 1366
Credit: 546,148,498
RAC: 141,831
Sweden
Message 1875361 - Posted: 27 Jun 2017, 9:36:45 UTC

How do we know that the CPU portion of the latest code isn't affected by sporadic errors when running with Hyperthreading enabled?!
Has anybody run tests with and without HT on Skylake and Kaby Lake computers?!

https://setiathome.berkeley.edu/forum_thread.php?id=81641

_________________________________________________________________________
Addicted to SETI crunching!
Founder of GPU Users Group
ID: 1875361
TBar
Volunteer tester

Joined: 22 May 99
Posts: 4264
Credit: 261,880,570
RAC: 354,492
United States
Message 1875364 - Posted: 27 Jun 2017, 10:20:45 UTC

And the results are...
My OSX CPU Agrees with the CPU SETI@home v8 v8.06 (alt) windows_x86_64 here, https://setiweb.ssl.berkeley.edu/beta/workunit.php?wuid=9796907
Pretty close match, the important parts are;
8.06 (alt) windows
Gaussian: peak=2.961493, mean=0.5020308, ChiSq=1.415568, time=17.62, d_freq=1420573507.32, score=0.3145776, null_hyp=2.268677, chirp=-98.677, fft_len=16k
Best gaussian: peak=3.659641, mean=0.5301717, ChiSq=1.25771, time=67.95, d_freq=1420577155.89, score=0.794832, null_hyp=2.202944, chirp=-65.055, fft_len=16k
SSE4.1xjf OS X r 3344
Gaussian: peak=2.961488, mean=0.5020312, ChiSq=1.415569, time=17.62, d_freq=1420573507.32, score=0.3145256, null_hyp=2.268674, chirp=-98.677, fft_len=16k
Best gaussian: peak=3.659646, mean=0.5301709, ChiSq=1.257715, time=67.95, d_freq=1420577155.89, score=0.7949486, null_hyp=2.202953, chirp=-65.055, fft_len=16k

Of course this means the SoG App is Wrong...again. Seems the SoGs have a propensity to report the reported Gaussian as Best. One of them is wrong.
All those people testing these Apps at Beta and no one picked this up? Nevermind.
ID: 1875364
Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6005
Credit: 83,851,834
RAC: 28,525
Russia
Message 1875371 - Posted: 27 Jun 2017, 11:59:29 UTC - in response to Message 1875358.  

Thanks!
SETI apps news
We're not gonna fight them. We're gonna transcend them.
ID: 1875371


 
©2018 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.