Posts by petri33

1) Message boards : Number crunching : Linux CUDA 'Special' App finally available, featuring Low CPU use (Message 1875474)
Posted 1 day ago by Profile petri33Project Donor
Post:
Hi,




The bug is bugging me. Peak, Time, Period and Score + fft_len are always the same. Freq and chirp vary.
The first pulsefind in the code (8k len).

Please explain in more detail here.
The first pulsefind is done on zero chirp and its length is 8, not 8k. What did you mean by first and 8k here?



The first pulse find in the source code file cudaAcc_pulsefind.cu. There are 2 versions of pulse find: the first one is used for fft lengths 1k-16k and the second, l2m, is used for fft len < 1k.
So I was referring not to the order of pulse find runs but to the place of the kernel in the source code file.



Not the l2m version.

??sorry?


See above.

About the rare errors: I'm running my cards at high clock speeds for memory and GPU. They have the fans at 100% and temperatures near 70C. That may be one cause for the errors. In summer the cards seem to have more lockups too.

Petri
2) Message boards : Number crunching : Panic Mode On (106) Server Problems? (Message 1875299)
Posted 2 days ago by Profile petri33Project Donor
Post:
WOW doesn't start for a month and a half....................


But.. I suspect the caching of results and hoarding of WUs will start earlier each year.

And I'm not hoarding. I have set my cache limit to 2200 WUs for my 16 virtual GTX 1080 Tuesday cores, to ride out those weekly Server Problems, and those Tuesdays, a bit better.
3) Message boards : Number crunching : Linux CUDA 'Special' App finally available, featuring Low CPU use (Message 1875294)
Posted 2 days ago by Profile petri33Project Donor
Post:

I know there is a problem with my code reporting over 20 pulses at an identical time with small differences in frequency. That is an extremely rare event. And it always happens at 46.something.

Could it be solved in the same fashion - by re-processing after discovery?


Another re-re-processing could be done. But I really would like to know why it happens in the first place.
I could also just stop reporting any pulses at the exact same time/fft/... and just pretend they did not happen. But I do not want to.

The bug is bugging me. Peak, Time, Period and Score + fft_len are always the same. Freq and chirp vary.
The first pulsefind in the code (8k len). Not the l2m version.

My suspects are a memory overflow, misbehaved pointer/memory management, overheating, bad VRAM, an uninitialized variable/memory area, an error in the chirp code/FFT lib/my code, or then it is an alien trying to hide its existence.
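If it is one of the memory suspects, the kind of thing I would sprinkle in first is a launch/sync check after each pulse-find kernel. A minimal sketch, with a hypothetical macro and an illustrative kernel name (not the actual build):

```cpp
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical debugging macro, not from the actual build: report launch
// and execution errors right after a suspect kernel, so a bad pointer or
// out-of-bounds write shows up where it happens, not several kernels later.
#define CUDA_CHECK(msg)                                                    \
    do {                                                                   \
        cudaError_t err = cudaGetLastError();         /* launch errors */  \
        if (err == cudaSuccess)                                            \
            err = cudaDeviceSynchronize();            /* execution     */  \
        if (err != cudaSuccess)                                            \
            fprintf(stderr, "%s: %s\n", (msg), cudaGetErrorString(err));   \
    } while (0)

// Usage, with an illustrative kernel name only:
//   find_pulse_kernel<<<grid, block>>>(args);
//   CUDA_CHECK("find_pulse_kernel");
```

Running a failing WU once under cuda-memcheck would be the next step for catching an out-of-bounds access or an uninitialized read. The duplicated reports in question look like this: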

Pulse: peak=41.66666, time=46.17, period=23.26, d_freq=2323286407.06, score=4.562, chirp=-83.442, fft_len=8k
Pulse: peak=41.66666, time=46.17, period=23.26, d_freq=2323286418.24, score=4.562, chirp=-83.442, fft_len=8k
Pulse: peak=41.66666, time=46.17, period=23.26, d_freq=2323286429.41, score=4.562, chirp=-83.442, fft_len=8k
Pulse: peak=41.66666, time=46.17, period=23.26, d_freq=2323286440.59, score=4.562, chirp=-83.442, fft_len=8k
Pulse: peak=41.66666, time=46.17, period=23.26, d_freq=2323286451.76, score=4.562, chirp=-83.442, fft_len=8k
Pulse: peak=41.66666, time=46.17, period=23.26, d_freq=2323286462.94, score=4.562, chirp=-83.442, fft_len=8k
Pulse: peak=41.66666, time=46.17, period=23.26, d_freq=2323286474.12, score=4.562, chirp=-83.442, fft_len=8k
Pulse: peak=41.66666, time=46.17, period=23.26, d_freq=2323286485.29, score=4.562, chirp=-83.442, fft_len=8k
Pulse: peak=41.66666, time=46.17, period=23.26, d_freq=2323286496.47, score=4.562, chirp=-83.442, fft_len=8k
Pulse: peak=41.66666, time=46.17, period=23.26, d_freq=2323286507.64, score=4.562, chirp=-83.442, fft_len=8k
Pulse: peak=41.66666, time=46.17, period=23.26, d_freq=2323286518.82, score=4.562, chirp=-83.442, fft_len=8k
Pulse: peak=41.66666, time=46.17, period=23.26, d_freq=2323286529.99, score=4.562, chirp=-83.442, fft_len=8k
Pulse: peak=41.66666, time=46.17, period=23.26, d_freq=2323286541.17, score=4.562, chirp=-83.442, fft_len=8k
Pulse: peak=41.66666, time=46.17, period=23.26, d_freq=2323286552.35, score=4.562, chirp=-83.442, fft_len=8k
Pulse: peak=41.66666, time=46.17, period=23.26, d_freq=2323286563.52, score=4.562, chirp=-83.442, fft_len=8k
Pulse: peak=41.66666, time=46.17, period=23.26, d_freq=2323286574.7, score=4.562, chirp=-83.442, fft_len=8k
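For the record, if I did decide to just swallow those duplicates instead of fixing the root cause, a host-side filter would be a few lines. A minimal sketch, assuming a hypothetical PulseReport struct and illustrative tolerances (the real reporting path in the app looks different):

```cpp
#include <cmath>
#include <vector>

// Hypothetical, simplified pulse record; the real structure in the app differs.
struct PulseReport {
    float  peak, time, period, score, chirp;
    double freq;
    int    fft_len;
};

// Treat a candidate as a duplicate if a kept pulse already has the same
// peak/time/period/score/chirp/fft_len, i.e. it differs only in d_freq.
// Tolerances are illustrative only.
bool is_duplicate(const PulseReport& p, const std::vector<PulseReport>& kept)
{
    for (const PulseReport& k : kept) {
        if (k.fft_len == p.fft_len &&
            std::fabs(k.peak   - p.peak)   < 1e-4f &&
            std::fabs(k.time   - p.time)   < 1e-3f &&
            std::fabs(k.period - p.period) < 1e-3f &&
            std::fabs(k.score  - p.score)  < 1e-4f &&
            std::fabs(k.chirp  - p.chirp)  < 1e-3f)
            return true;
    }
    return false;
}
```

That would only hide the symptom, though, which is exactly why I would rather find the real cause.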
4) Message boards : Number crunching : Linux CUDA 'Special' App finally available, featuring Low CPU use (Message 1875266)
Posted 2 days ago by Profile petri33Project Donor
Post:
Hi,
I'm here just for a quick peek.
The special pulse find does a scan with unroll N (from autotune or the user-set limit) for each CPU-code icfft round, and if the scan finds a suspected pulse that round is run again with unroll 1 for that icfft. That (finding a pulse, or an even better not-yet-reported best) is a rare event: for real pulses always, and for best pulses it happens in the first rounds and then more and more infrequently, since the bar for a best not yet reported rises after each one found.
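Roughly, the flow is the sketch below. The function name and arguments are made up for illustration; the real kernels live in cudaAcc_pulsefind.cu and look different:

```cpp
#include <cstdio>

// Hypothetical stand-in for the real kernel launches in cudaAcc_pulsefind.cu;
// it scans one icfft round with the given unroll factor and returns true if
// a suspected pulse, or a better-than-reported "best", was flagged.
static bool run_pulse_scan(int icfft, int unroll)
{
    (void)icfft;
    (void)unroll;
    // ... launch the pulse-find kernel(s) here ...
    return false;
}

// The two-pass flow: a fast scan with unroll N (autotune or user limit);
// only a flagged round is redone with unroll 1, the slow but safe path.
void pulse_find_round(int icfft, int unroll_n)
{
    if (run_pulse_scan(icfft, unroll_n)) {
        // Rare event, so the accurate re-run costs almost nothing overall.
        run_pulse_scan(icfft, /*unroll=*/1);
        printf("icfft %d re-processed with unroll 1\n", icfft);
    }
}
```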

To my mind the gauss-find has not been touched in the special code for a long time. My memory may be short. I assume it (the gauss-find code) is working as it should; if it does not, that is a separate problem.

I know there is a problem with my code reporting over 20 pulses at an identical time with small differences in frequency. That is an extremely rare event. And it always happens at 46.something.

Petri.
I'm following, and will provide help when needed later this summer.
5) Message boards : Number crunching : Long CPU times (Message 1874525)
Posted 6 days ago by Profile petri33Project Donor
Post:
That is a time estimate; wait for some tasks to complete first.

+1
6) Message boards : Number crunching : Seti Not Using Cuda, Einstein Does (Message 1874514)
Posted 6 days ago by Profile petri33Project Donor
Post:
Problem solved. After reading the GPU FAQ I went to SETI@home preferences and saw that 'Use NVIDIA GPU' was NOT checked. Too many settings in different places. The above comments helped me get a good mix of Einstein and Seti, and Seti is now using the GPU.


Nice to hear that the communicaty can help with a bit of self-investigation and an innovative mind. Welcome to our ranks!

To make that easier ...
EDIT: THE NONWORD COMMUNICATY: All of those who have helped.
7) Message boards : Number crunching : Linux CUDA 'Special' App finally available, featuring Low CPU use (Message 1874317)
Posted 7 days ago by Profile petri33Project Donor
Post:
No there isn't. I have been running a low resource share (100 SETI, 2 Beta) and it has only been using 1 GPU for Beta with that setting. So basically everything is Device #3.

PS: The errors are from me forgetting I had an app_config for 2 tasks prior to this; the 'sauce' didn't like that at all.

The magic of the sauce is to drain all the juice. Just once. :))
P.
8) Message boards : Number crunching : No GPU-Tasks offered (LinuxMint 18.1) (Message 1873980)
Posted 10 days ago by Profile petri33Project Donor
Post:
Hi,

I see a line saying "Mon 19 Jun 2017 07:54:21 AM CEST | | don't use GPU while active"
That can be changed in the BOINC Manager GUI computing preferences to allow GPU computing at all times, not just when the computer is idle.
Petri
9) Message boards : Number crunching : Linux CUDA 'Special' App finally available, featuring Low CPU use (Message 1873882)
Posted 10 days ago by Profile petri33Project Donor
Post:
@Petri33 please email latest and I'll update the alpha folder ASAP.

Emailed.
10) Message boards : Number crunching : GPU FLOPS: Theory vs Reality (Message 1873204)
Posted 13 days ago by Profile petri33Project Donor
Post:
Hey Petri,
I was about to PM you. I was wondering if you thought it would be a good idea to submit the Linux zi3v to Beta. Along with zi3t2b it appears to be well within the 5% Inconclusive rate requested by the project. Looking over a few Hosts it would appear the Gross rate is around 3.5% with the Net Inconclusive rate a bit lower. So far the only problem is that zi3v uses a little more vRam and I'm seeing problems on my Mac again with the 2 GB card.
Any Ideas?


The t2b is kind of an original. It is my code and it does not try to recheck the pulses.
The zi3v scans the WU and if it finds any suspects it runs that part of the WU again with unroll 1. That idea came from jason_gee. I tried it and coded it, and that is what I'm running now. It may be more accurate and a bit slower. Just keep testing.

To tell you all,

I'd like to stay a developer/experimenter/propeller hat/tin foil hat escapee/a man, and let others make the political decisions. I release my code and you can do whatever you want with it.

This is a hobby for me. I'd like to keep it that way. I was a SW/DB engineer for 20 years. Now I'm a teacher, teaching children and adults with special needs.

So TBar, it is entirely up to you. A <5% level is good enough. You decide. I'll provide updates when I feel like it.

Thank you TBar for all the testing.
p.s. I read that there is a V9 MB coming. I'll wait for that and do whatever is needed.

Petri
11) Message boards : Number crunching : Building a 32-Thread Xeon Monster PC for Less Than the Price of a Flagship Core i7 (Message 1873203)
Posted 13 days ago by Profile petri33Project Donor
Post:
Hi,
for the top 50 CPU/GPU users I'd say it is fairly easy to overcome the limit.
I've done that for demonstration purposes only on my computer/GPUs (I do not have 16 GPUs). I have 4.

For those of you running over 40 cores/3xGPUs and with a little coding experience, I'd suggest looking at the BOINC client code.

Petri
12) Message boards : Number crunching : New GTX 1070 and SOG vs CUDA 42.... (Message 1873200)
Posted 13 days ago by Profile petri33Project Donor
Post:
Hi,
If you are running a dedicated cruncher you may consider Linux. You'd see an enormous RAC increase with the right app.

Petri
13) Message boards : Number crunching : GPU FLOPS: Theory vs Reality (Message 1873199)
Posted 13 days ago by Profile petri33Project Donor
Post:
Hey Shaggie, is there by chance anyway to pull Linux cuda results out of your dataset for a chart?

I'm guessing you mean Petri's special app and want to know just how much faster it is.

As I've said before including the anonymous platform would defeat the purpose of this comparison; I deliberately filter for only the stock app running one job at a time so that you can make meaningful comparisons and get a sense of the relative performance and power consumption for each.

The other problem with the anonymous platform is that it's not clear how many jobs are being run concurrently per card; the regular CUDA app only really performs if you double- or triple-job it, but the data I have to work with can't see the concurrency, so I can't tell if it's 'really slow' (because concurrent), really fast (because Petri's app), or just normal (a Lunatics build). People running stock tend not to mess around with multiple jobs, so those that do are eliminated as outliers by the median-window (plus there's a clue in the output from the OpenCL app that I can use to sometimes detect when they're doubling up so I can reject them).

I'm also opposed to encouraging what I see as basically cheating -- if Petri's app isn't accurate enough for everybody to use then the extra credit it awards those that use it comes at the extra validation cost of those of us running stock who have to double (and possibly triple) check the work that it does.

When it's part of the stock app set I'll be happy to report on the relative performance of the OpenCL SoG vs CUDA apps (as I've done before).


And as to the cheating.. I'm doing that. I do not have 16 1080 Tu graphics cards. I have only 4: 3x1080 + 1x1080Ti.
But it would still be interesting to know, since the top 10 hosts are full of Linux anonymous apps, how they perform...
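As an aside, for anyone wondering what the median-window outlier rejection Shaggie describes above could look like, here is a minimal sketch of the general idea (my guess at the technique, not his actual analysis code):

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Keep only runtimes close to the median of a small window around them;
// anything far off (e.g. a host quietly running 2-3 jobs per card) is
// rejected as an outlier. Window size and tolerance are illustrative.
std::vector<double> median_window_filter(const std::vector<double>& runtimes,
                                         std::size_t half_window,
                                         double tolerance)
{
    std::vector<double> kept;
    for (std::size_t i = 0; i < runtimes.size(); ++i) {
        const std::size_t lo = (i > half_window) ? i - half_window : 0;
        const std::size_t hi = std::min(runtimes.size(), i + half_window + 1);
        std::vector<double> window(runtimes.begin() + static_cast<long>(lo),
                                   runtimes.begin() + static_cast<long>(hi));
        std::nth_element(window.begin(),
                         window.begin() + static_cast<long>(window.size() / 2),
                         window.end());
        const double median = window[window.size() / 2];
        if (std::fabs(runtimes[i] - median) <= tolerance * median)
            kept.push_back(runtimes[i]);
    }
    return kept;
}
```

A host running several jobs per card would report runtimes far from the median of its neighbours and would simply be dropped.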
14) Message boards : Number crunching : GUPPI Rescheduler for Linux and Windows - Move GUPPI work to CPU and non-GUPPI to GPU (Message 1872609)
Posted 16 days ago by Profile petri33Project Donor
Post:
p.s. I recommend a cache that is somehow bound to the daily output/RAC.

I agree, even if that is a <EDIT> 15 12 hour cache, many of us would be happy!

+1
15) Message boards : Number crunching : GUPPI Rescheduler for Linux and Windows - Move GUPPI work to CPU and non-GUPPI to GPU (Message 1872606)
Posted 16 days ago by Profile petri33Project Donor
Post:
I'm not rescheduling. I'm cheating. I fake that I have a GTX 1080 Tu. That is a Tuesday edition. I get 400 tasks per GPU.
Since the day I started that, my daily score has levelled out to a flat line. My GPUs were task-limited during the week. The Tuesday will still hurt me. I'm not planning to do more harm. With shorties I'll run out of cache; with Guppis I can manage -- the Tuesdays.

I'm sorry if I have offended anyone with my settings.

I feel (and I think that many Linux users are going to feel) that something must be done about the 100 WU limit.

I will not release the one/two-line fix for this problem. It is the Administration's work to do.
--
p33

p.s. I recommend a cache that is somehow bound to the daily output/RAC.
16) Message boards : Number crunching : Restarting BOINC Manager (Message 1872230)
Posted 19 days ago by Profile petri33Project Donor
Post:
Can you try a big (capital) A in ps?
ps -A|grep boinc
17) Message boards : Number crunching : Linux CUDA 'Special' App finally available, featuring Low CPU use (Message 1872140)
Posted 19 days ago by Profile petri33Project Donor
Post:
I see what you are saying now Petri Re: 4xGPU reporting. You are currently showing:
[16] NVIDIA GeForce GTX 1080 Tu (4095MB) driver: 381.09 OpenCL: 1.2

But then it is Saturday for you right now, strange ....

Yup, that is a test.
I'll try to make it happen only on Tuesdays.
:)
18) Message boards : Number crunching : Linux CUDA 'Special' App finally available, featuring Low CPU use (Message 1872089)
Posted 19 days ago by Profile petri33Project Donor
Post:
p.s. Have you seen the Tu'esday edition of the GTX1080 series yet?
That must be the 1080Ti KingPin @2025 MHz.

My 750 PSU just screamed at me saying "Not Bloody Likely!"


No, no, no, ...

A normal 50 W channel blower unit is able to make four of NVIDIA's basic blower-design models run at 2045 MHz @ 67 degC.

What I meant with the Tuesday edition is that the GPUs are behaving badly. Right now the four of them manifest themselves as if there were 16, but in the upcoming weeks they will do that (report 4 x the actual count) only on Tuesdays.

HUH
19) Message boards : Number crunching : I have a new system, expected runtimes? (Message 1872080)
Posted 19 days ago by Profile petri33Project Donor
Post:
What Hal said about populating all 4 channels has me thinking whether my i7-3930K would be the same. I currently have 4x4GB installed, and I have 4 more new ones I have never used/tried. I'm not having luck finding info, and Intel's Gen3 datasheet is unavailable, grrr.

The OS certainly doesn't need it with 2.5 of 16GB used, but does the CPU want it, hmmm .......


I had a 3930K and a pair of mobos (ASUS WS and similar). There was a recommended order for populating the RAM slots. If you can enable the BIOS option saying 'two channel' or something, you may have the correct settings and placement for the RAM modules. I was able to run 1T, but only when the modules were in the right places. Otherwise it needed 2T.

Petri
20) Message boards : Number crunching : Reload app_info.xml ? (Message 1872076)
Posted 19 days ago by Profile petri33Project Donor
Post:
The unroll option is not implemented in any of the configuration file interfaces.

In the upcoming version (available from TBar when released) the -unroll autotune will be a default, and the use of blocking sync will be another default.

If you do not want any BS you must state that on the command line with -nobs. BS means blocking sync; it releases your CPU to do CPU processing but it will slow down your GPU work. I choose no BS.
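For anyone wondering what blocking sync does at the CUDA level, here is a minimal sketch using the standard CUDA runtime scheduling flags (not the app's actual startup code):

```cpp
#include <cstdio>
#include <cuda_runtime.h>

// Minimal sketch, not the app's startup code: pick how the host thread
// waits for the GPU. Blocking sync lets the thread sleep while waiting,
// freeing the CPU core for CPU tasks at some cost in GPU throughput.
// Spin waiting burns a CPU core but keeps GPU latency to a minimum,
// which is the behaviour -nobs is after.
void set_sync_mode(bool blocking_sync)
{
    cudaError_t err = cudaSetDeviceFlags(blocking_sync
                                             ? cudaDeviceScheduleBlockingSync
                                             : cudaDeviceScheduleSpin);
    if (err != cudaSuccess)
        fprintf(stderr, "cudaSetDeviceFlags: %s\n", cudaGetErrorString(err));
}
```

Something like this would be called once at startup, before any kernels or allocations touch the device.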

--
Petri

