Posts by Raistmer

1) Message boards : Number crunching : New binary to test on beta (Message 1875462)
Posted 5 hours ago by Raistmer
Post:
but with the graphics part fixed (and it is fixed)

Actually, that's all we need to know about 8.07 before release.

IMO Eric can release it any day now.
2) Message boards : Number crunching : Linux CUDA 'Special' App finally available, featuring Low CPU use (Message 1875461)
Posted 5 hours ago by Raistmer
Post:

How there can be a "best" signal that isn't worth reporting (when there are apparently 3 inferior signals that are) is beyond me, but that's apparently the standard. :^)


That's the right question.
3) Message boards : Number crunching : Linux CUDA 'Special' App finally available, featuring Low CPU use (Message 1875460)
Posted 5 hours ago by Raistmer
Post:
All those people testing these Apps at Beta and no one picked this up? Nevermind.

And you were just one of them, if I recall correctly :D

The right question to ask is how a non-reportable signal could be better than a reportable one.

And in this test the reportable one comes later in the processing chain, so SoG operated on the "no reportable so far" path again.
4) Message boards : Number crunching : "BOINC portable" for Windows hosts (Message 1875457)
Posted 5 hours ago by Raistmer
Post:
Characterised the GPU detection issue a little more: http://boinc.berkeley.edu/dev/forum_thread.php?id=11690

So, in my case "command-line BOINC" doesn't see the GPU, while "GUI BOINC" does.
It's weird, because my own GPU apps are built for the console subsystem and detect GPUs just fine.

(I tried copying the BOINC dirs to a similar PC: no luck. Then I tried on the original one, which definitely uses its GPUs with the GUI manager: same issue. BOINC.exe started from a command prompt doesn't detect GPUs.)
5) Message boards : Number crunching : "BOINC portable" for Windows hosts (Message 1875456)
Posted 6 hours ago by Raistmer
Post:
On the offline boinc, did you try adding the --detectgpus flag when starting the client?

Not so far, thanks, will try.

That flag is not defined for the BOINC client I use (7.6.33 x64).
6) Message boards : Number crunching : Linux CUDA 'Special' App finally available, featuring Low CPU use (Message 1875371)
Posted 1 day ago by Raistmer
Post:
Thanks!
7) Message boards : Number crunching : Linux CUDA 'Special' App finally available, featuring Low CPU use (Message 1875356)
Posted 1 day ago by Raistmer
Post:
BTW, can Petri's app run on mobile GPUs like these?

NVIDIA GeForce 940MX
and
NVIDIA GeForce 820M
8) Message boards : Number crunching : Linux CUDA 'Special' App finally available, featuring Low CPU use (Message 1875355)
Posted 1 day ago by Raistmer
Post:
I've got several new Inconclusives this evening featuring Best Gaussians. None of them have reportable Gaussian signals, however.

That's important.
SoG (actually, all OpenCL MB builds) processes Gaussians differently before and after the first reportable Gaussian has been found.
So your run indicates that the issue is on the "no reportable Gaussian" path.
Another possibility is just a boundary precision issue.

Please try to grab 1-2 tasks and reprocess them offline with the SoG vs non-SoG binaries.
If they disagree, report back and provide the task as a test case: that would require a bug fix.
9) Message boards : Number crunching : Linux CUDA 'Special' App finally available, featuring Low CPU use (Message 1875354)
Posted 1 day ago by Raistmer
Post:
Sooooo.....I kinda figured I must've missed some key element there.

Better to try from the science dir or edit the script.
ref has the -verb option by default, and I'm not sure how the CUDA binary will react to it.
(Or just try removing the additional switches from the beginning of the script.)
10) Message boards : Number crunching : Linux CUDA 'Special' App finally available, featuring Low CPU use (Message 1875352)
Posted 1 day ago by Raistmer
Post:
The first one originally ran on a GTX 750 Ti and restarted on the GTX 960, at which point it reported 15 bogus Triplets with "peak=-nan", similar to the one from Friday. The second one also started on a GTX 750 Ti but restarted on a different 750 Ti, this time quickly reporting 17 bogus Spikes. Both tasks seemed to be running fine before the shutdown and restart.

Damage in checkpointing. Most probably, some re-initialization of the GPU-side signal buffers is missing after restart.
The "proper" fix would be to add those re-initializations, but if the app targets only really fast GPUs, an adequate solution would be to skip checkpointing altogether. This adds overhead on restart (reprocessing from the beginning) but simplifies the code and avoids this issue completely (a positive side effect could be a small speedup on all non-checkpointing tasks).
11) Message boards : Number crunching : Linux CUDA 'Special' App finally available, featuring Low CPU use (Message 1875351)
Posted 1 day ago by Raistmer
Post:
Another re-re-processing could be done. But I really would like to know why it happens in the first place.

Sure, it would be a workaround, not a fix, but it would allow moving on to the next stage faster.


I could also just stop reporting any pulses at the exact time/fft/..., just pretend they did not happen. But I do not want to.

No, that's not a good way to go at all. It would be just like introducing RFI manually in a particular area of parameter space.
If there is a signal there, inconclusives will arise anyway. But the correct results will be swamped by the GPU ones because of better performance, not because of validity.



The bug is bugging me. Peak, Time, Period and Score + fft_len always the same. Freq and chirp vary.
The first pulsefind in the code (8k len).

Please explain in more detail here.
The first pulsefind is done at zero chirp and its length is 8, not 8k. What did you mean by "first" and "8k" here?


Not the l2m version.

??sorry?


My suspect is a memory overflow, a misbehaved pointer/memory management, overheating, bad VRAM, uninitialized variable/mem area, an error in the chirp code/fft lib/my code or then it is an alien trying to hide its existence.

Of all these, I would look further into the uninitialized variable/mem area.
The same value for a particular field could be just an attempt to interpret NaN or 0xDEAD.


Pulse: peak=41.66666, time=46.17, period=23.26, d_freq=2323286407.06, score=4.562, chirp=-83.442, fft_len=8k
Pulse: peak=41.66666, time=46.17, period=23.26, d_freq=2323286574.7, score=4.562, chirp=-83.442, fft_len=8k

So, not the very first, but at 8k.
It's strange indeed, because I can't think of any peculiarity of 8k. 16k would mean a "square" matrix for gaussfit if I recall correctly (not quite square, but w/o scaling), but 8k, and for Pulse... no idea right now.

I would propose "heavy debugging" here. That is, print the data arrays. If it's 0xDEAD indeed, you will notice it immediately.
If not, they should be compared with a version that works OK.
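
A rough sketch of what such a "heavy debugging" pass could look like, assuming the suspect buffer has already been copied back to host memory as floats. The 0xDEADBEEF fill pattern and all names here (ScanReport, scan_buffer, print_report) are made up for illustration; they are not taken from any of the real apps.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <cstring>
#include <vector>

// Marker for "no suspicious element found".
const std::size_t kNone = static_cast<std::size_t>(-1);

struct ScanReport {
    std::size_t nan_count = 0;      // elements that are NaN
    std::size_t pattern_count = 0;  // elements whose raw bits equal the fill pattern
    std::size_t first_bad = kNone;  // index of the first suspicious element
};

// Scans a host-side copy of a buffer for NaNs and for a 32-bit fill pattern.
// The pattern (e.g. 0xDEADBEEF) is an assumption; use whatever the allocator
// or debug build really writes into fresh memory.
ScanReport scan_buffer(const std::vector<float>& data, std::uint32_t pattern) {
    ScanReport rep;
    for (std::size_t i = 0; i < data.size(); ++i) {
        std::uint32_t bits;
        std::memcpy(&bits, &data[i], sizeof bits);  // read the raw bit pattern
        const bool is_nan = std::isnan(data[i]);
        const bool is_pattern = (bits == pattern);
        if (is_nan) ++rep.nan_count;
        if (is_pattern) ++rep.pattern_count;
        if ((is_nan || is_pattern) && rep.first_bad == kNone) rep.first_bad = i;
    }
    return rep;
}

// Prints a one-line summary to stderr.
void print_report(const ScanReport& rep) {
    std::fprintf(stderr, "NaNs=%zu, pattern hits=%zu, first bad index=%zu\n",
                 rep.nan_count, rep.pattern_count, rep.first_bad);
}
```

Running the same scan on the output of a version that works OK gives a baseline to compare against.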

And most important: can this be reproduced in an offline run?
12) Message boards : Number crunching : "BOINC portable" for Windows hosts (Message 1875346)
Posted 1 day ago by Raistmer
Post:
I see, you don't copy the data directory at all.
But then you would need to do item 6 (connect to projects) again and again.
Keeping in mind that there are many PCs to handle, it will be time-consuming.
I did copy both the program _and_ data folders. That's why duplicates arise.
Can the issues I encountered come from that too?

EDIT:
To handle N PCs I would need something like this on the flash drive:

E:\BOINC
E:\BOINCdata01
E:\BOINCdata02
....
E:\BOINCdataN

As I understand, the initial project attach will be required only when each directory structure is first created... but N times.
I would like to optimize this initial overhead too.
EDIT2: also, there should be some gather/scatter script too, because obviously I need a local copy on the PC while it crunches...

Smth like:
Scatter:
Copy E:\BOINC to C:
Move E:\BOINCdata_i_ to C:\BOINCdata (where _i_ should be determined automatically based on existing directory structure)
Launch BOINC from local drive

Gather:
Stop BOINC on the local drive
Delete C:\BOINC
Move C:\BOINCdata to E:\BOINCdata_j_

In the general case _i_ != _j_, which is not good if the PCs are not identical.
So additionally I need to ensure some persistence in how data directories are distributed over the set of N hosts...
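
One possible way to get that persistence, sketched under the assumption that a small host table is kept on the flash drive next to the data directories. The table is shown here as an in-memory vector; the host names and the functions slot_for_host and data_dir_for_slot are hypothetical. Only the E:\BOINCdataNN naming follows the layout above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Returns the 1-based slot for `host`, appending the host to the table the
// first time it is seen. `hosts` stands for the contents of a hypothetical
// table file stored on the flash drive (one host name per line).
int slot_for_host(std::vector<std::string>& hosts, const std::string& host) {
    for (std::size_t i = 0; i < hosts.size(); ++i) {
        if (hosts[i] == host) return static_cast<int>(i) + 1;
    }
    hosts.push_back(host);
    return static_cast<int>(hosts.size());
}

// Builds the data directory name for a slot, e.g. E:\BOINCdata01 for slot 1.
std::string data_dir_for_slot(const std::string& flash_root, int slot) {
    char suffix[16];
    std::snprintf(suffix, sizeof suffix, "%02d", slot);
    return flash_root + "\\BOINCdata" + suffix;
}
```

With such a table both scatter and gather look up the same slot for a given PC, so _i_ == _j_ by construction.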
13) Message boards : Number crunching : "BOINC portable" for Windows hosts (Message 1875287)
Posted 1 day ago by Raistmer
Post:
On the offline boinc, did you try adding the --detectgpus flag when starting the client?

Not so far, thanks, will try.
14) Message boards : Number crunching : Linux CUDA 'Special' App finally available, featuring Low CPU use (Message 1875275)
Posted 1 day ago by Raistmer
Post:

I know there is a problem with my code reporting over 20 pulses at identical time with a small difference in frequency. That is an extremely rare event. And it always happens at 46.something.

Could it be solved in the same fashion, by re-processing after discovery?
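
The "discovery" step could be as simple as counting pulses that share every field except frequency and chirp. A minimal sketch follows; the Pulse struct, the function names and the `limit` parameter are invented for illustration and are not part of any existing app.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <tuple>
#include <vector>

// Simplified pulse record holding only the fields printed in the result log.
struct Pulse {
    double peak, time, period, d_freq, score, chirp;
    int fft_len;
};

// Size of the largest group of pulses with exactly equal peak, time, period,
// score and fft_len (frequency and chirp are allowed to differ).
// Note: NaN fields would break the map ordering, so filter those out first.
std::size_t largest_clone_group(const std::vector<Pulse>& pulses) {
    typedef std::tuple<double, double, double, double, int> Key;
    std::map<Key, std::size_t> groups;
    std::size_t largest = 0;
    for (const Pulse& p : pulses) {
        const std::size_t n =
            ++groups[Key(p.peak, p.time, p.period, p.score, p.fft_len)];
        if (n > largest) largest = n;
    }
    return largest;
}

// True if the result looks suspicious enough to be re-processed.
bool needs_reprocessing(const std::vector<Pulse>& pulses, std::size_t limit) {
    return largest_clone_group(pulses) > limit;
}
```

This only flags the task; whether the second pass comes out clean is a separate question.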
15) Message boards : Number crunching : "BOINC portable" for Windows hosts (Message 1875273)
Posted 1 day ago by Raistmer
Post:
why dont you simply make a portable windows stick and add boinc to its programms ?
http://lifehacker.com/how-to-run-a-portable-version-of-windows-from-a-usb-dri-1565509124


Because the PC should be used for what it was bought for, not only for BOINC.
And the OS is already installed, and it's Windows. So no bootable flash drives, only portable BOINC.
16) Message boards : Number crunching : Anything relating to AstroPulse tasks (Message 1875272)
Posted 1 day ago by Raistmer
Post:
@Raistmer.

I will shortly send you a message with the links to the results files I have made.

Please tell me if you have any problems retrieving them.

Got them, thanks
17) Message boards : Number crunching : Linux CUDA 'Special' App finally available, featuring Low CPU use (Message 1875253)
Posted 1 day ago by Raistmer
Post:
SoG has its own parallelized reduction for Gaussians (it should implement the same logic, though).
And what worries me is the difference between SoG and non-SoG OpenCL results. That's definitely worth checking when I have easy access to hardware for that.
18) Message boards : Number crunching : Linux CUDA 'Special' App finally available, featuring Low CPU use (Message 1875239)
Posted 1 day ago by Raistmer
Post:
Hmmm, AK8 branch *might be missing Joe's fix from ~2011 ? (svn posted earlier in thread):

gaussfit.cpp (stock seti_boinc branch):
report = chisqOK // chisqOK is (ChiSq <= swi.analysis_cfg.gauss_chi_sq_thresh)
    && (gi.g.peak_power >= gi.g.mean_power * swi.analysis_cfg.gauss_peak_power_thresh)
    && (gi.g.null_chisqr >= swi.analysis_cfg.gauss_null_chi_sq_thresh);
if (gaussian_count==0||report) {
    gi.score = score_offset
        +lcgf(0.5*gauss_dof,std::max(gi.g.chisqr*0.5*gauss_bins,0.5*gauss_dof+1))
        -lcgf(0.5*null_dof,std::max(gi.g.null_chisqr*0.5*gauss_bins,0.5*null_dof+1));
}
// Only include "real" Gaussians (those meeting the chisqr threshold)
// in the best Gaussian display.
if (gi.score > best_gauss->score && chisqOK) {
    *best_gauss = gi;

....


The special appears to have it, as does Cuda baseline.


The opt build is more complex in this area:
BOOLEAN chisq = (ChiSq <= swi.analysis_cfg.gauss_chi_sq_thresh);
...
if (chisq) {
#endif
    BOOLEAN newbest=false, report;
    //R: same optimization as for GPU build: if there is reportable Gaussian already -
    //R: skip score calculation for all except new reportable Gaussians
    //R: TODO: carefully check if it's valid assumption!
    report = chisq && (PeakPower >= TrueMean * PoTInfo.GaussPeakPowerThresh) &&
        (null_ChiSq >= swi.analysis_cfg.gauss_null_chi_sq_thresh);
    if(gaussian_count==0 || report){
        score = calc_GaussFit_score(ChiSq,null_ChiSq);
        newbest = chisq && (score > best_gauss->score);
    }
#if USE_COUNTERS
    Counter<Gaussian_skip6_low_power>::update(!(PeakPower >= TrueMean * PoTInfo.GaussPeakPowerThresh));
    //fprintf(stderr,"best_score=%.7g, score=%.7g\n",best_gauss->score,score);
#endif
#ifdef BOINC_APP_GRAPHICS
    if (newbest || report || graphics) {
#else
#if USE_COUNTERS
    if(! (newbest||report) ){
        Counter<Gaussian_miss>::update(1);
    }
#endif
    if (newbest || report) {
#endif
....

But the chi-square check seems to be present.
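
To make the "best but not reportable" case concrete, here is a simplified sketch of the gating in the stock snippet quoted above. Only the three threshold conditions and the `gaussian_count==0||report` guard follow that snippet. The Candidate struct, the precomputed scores, the threshold values and the assumption that an unscored candidate can never become "best" are illustrative guesses.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified candidate; the real code computes score via lcgf().
struct Candidate {
    double chisqr;       // chi-square of the Gaussian fit
    double null_chisqr;  // chi-square of the null hypothesis
    double peak_power;
    double mean_power;
    double score;        // precomputed here for illustration only
};

// Illustrative thresholds, not the values read from analysis_cfg.
const double kChiSqThresh = 1.42;
const double kNullChiSqThresh = 2.22;
const double kPeakPowerThresh = 3.2;

struct Result {
    int reported = 0;             // number of reportable Gaussians
    int best_index = -1;          // index of the "best" Gaussian, -1 if none
    bool best_reportable = false;
};

Result scan(const std::vector<Candidate>& in) {
    Result r;
    double best_score = -1e30;
    for (int i = 0; i < static_cast<int>(in.size()); ++i) {
        const Candidate& c = in[i];
        const bool chisqOK = c.chisqr <= kChiSqThresh;
        const bool report = chisqOK
            && c.peak_power >= c.mean_power * kPeakPowerThresh
            && c.null_chisqr >= kNullChiSqThresh;
        // The score is evaluated only before the first reportable Gaussian,
        // or for reportable candidates afterwards.
        const bool scored = (r.reported == 0) || report;
        // "Best" needs only the chi-square check, not the full report test.
        if (scored && chisqOK && c.score > best_score) {
            best_score = c.score;
            r.best_index = i;
            r.best_reportable = report;
        }
        if (report) ++r.reported;
    }
    return r;
}
```

In this sketch a candidate that passes the chi-square check but fails the peak power check can still become "best" if it is seen before any reportable Gaussian. Feed the same two candidates in the opposite order and it is never scored, so the outcome depends on which path the run was on.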
19) Message boards : Number crunching : Anything relating to AstroPulse tasks (Message 1875230)
Posted 1 day ago by Raistmer
Post:
OK, TMI could complicate the process, so let's concentrate on a single task: single_pulses.wu

All subsequent runs should use only this task.

Please send me the full contents of the Testdatas directory (upload it to some cloud storage).

And after that, please re-run single_pulses.wu on the other GPU on the same host (and upload the data again).
20) Message boards : Number crunching : Anything relating to AstroPulse tasks (Message 1875227)
Posted 1 day ago by Raistmer
Post:

Anything special I could do to run on both GPU's in the same run?

Yes. Add --device 0 and --device 1 to the command lines in BenchCfg.txt




 
©2017 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.