Posts by Raistmer


1) Message boards : Number crunching : GUPPI Rescheduler for Linux and Windows - Automatically move GUPPI work to CPU and non-GUPPI to GPU (Message 1805564)
Posted 1 hour ago by Profile Raistmer


If you decide at some point to make another version of GUPPIRescheduler, you may find some open-source C++ XML libraries that would ease parsing and changing client_state.xml

Some basic XML-parsing ability is also provided with the BOINC/SETI code itself.
We use it to parse SETI data files (WUs), which are XML too.
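
To illustrate the kind of parse-and-change step discussed above, here is a minimal sketch in Python (chosen for brevity; the post itself is about C++ libraries). It is not the GUPPI Rescheduler's actual logic. The XML fragment is a made-up, heavily simplified stand-in for client_state.xml, and the element names, task names, and plan class value are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified fragment; a real client_state.xml holds far more.
SAMPLE = """<client_state>
  <result>
    <name>example_task_0</name>
    <plan_class>example_gpu_class</plan_class>
  </result>
  <result>
    <name>example_task_1</name>
    <plan_class></plan_class>
  </result>
</client_state>"""


def set_plan_class(xml_text, task_name, new_plan_class):
    """Return (updated_xml, number_of_results_changed)."""
    root = ET.fromstring(xml_text)
    changed = 0
    for result in root.iter("result"):
        if result.findtext("name") == task_name:
            node = result.find("plan_class")
            if node is None:
                # Create the element when the result does not carry one yet.
                node = ET.SubElement(result, "plan_class")
            node.text = new_plan_class
            changed += 1
    return ET.tostring(root, encoding="unicode"), changed


updated, changed = set_plan_class(SAMPLE, "example_task_1", "example_gpu_class")
print(changed)
```

Working on a parsed tree rather than on raw text lines means the edit does not depend on how the file happens to be indented or wrapped. Moving a task between devices for real would likely involve more fields than this, so treat the function as a shape of the approach only.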
2) Message boards : Number crunching : What are acceptable acronymes/terminology for the S@h main forum? (Message 1805563)
Posted 1 hour ago by Profile Raistmer

Interesting: I wonder whether the general signal finding at mid-AR has become more optimal, or something has slowed down (relatively speaking) the pulsefinding at VLAR?

For our opt builds (and I measured mostly SSSE3 on a Q9450 host) VLAR was always harder. And this in spite of Alex Kan's and Ko swarm of pulsefind codelets in the CPU code.
Maybe it is worth building more modern, v8-based curves for CPU-only builds, opt vs stock. We know that opt is faster on the whole, but this still allows some inefficiencies in some parts...
[though we usually test versus the PG set, and there is a separate VLAR there, and VLAR is mostly PulseFind]... Well, maybe it is worth testing opt vs stock CPU on the set of PulseFind test modules and FFT-size test modules Joe supplied to us some time ago, to get a more precise picture of whether opt is always faster or there is some hidden inefficiency for some WUs.
3) Message boards : Number crunching : What are acceptable acronymes/terminology for the S@h main forum? (Message 1805561)
Posted 2 hours ago by Profile Raistmer
And a lot more in Estimates and Deadlines revisited. That thread dates from before general purpose GPUs had been invented, and from a much simpler SETI search (about v6, I think).

Interesting that midAR took more time than VLAR on Joe's hardware in those times.
The picture has changed quite a bit in this area now.
4) Message boards : Number crunching : SETI applications for NVIDIA GPU improvement - how you can help (Message 1805560)
Posted 2 hours ago by Profile Raistmer
The test with freeing a CPU core for feeding the GPUs resulted in the already known conclusion: one free CPU core is not enough for two GPUs to maximize the GPU load. For best result you should free one CPU core for each GPU (while running one WU/GPU at a time). This is for my host, but YMMV. Now I'm back to using my old app_config and commandline parameters.


I would also recommend trying to add -hp -cpu_lock to your tuning line to improve GPU load with a busy CPU.
5) Message boards : Number crunching : What are acceptable acronymes/terminology for the S@h main forum? (Message 1805470)
Posted 12 hours ago by Profile Raistmer

I believe one of the lunatics has generated a time/AR chart several years ago. Which showed a pretty clear change in runtime based on the AR.

Yep. Here, for example: http://lunatics.kwsn.info/index.php/topic,1806.0.html
6) Message boards : Number crunching : SETI applications for NVIDIA GPU improvement - how you can help (Message 1805431)
Posted 14 hours ago by Profile Raistmer
Thanks for the report.
Good that there are no usability problems at defaults.
7) Message boards : Number crunching : Is the S@H server biased at sending more guppis to NV GPUs? (Message 1805293)
Posted 1 day ago by Profile Raistmer
Currently the distributed files size is the same. There have been some trials on Beta recently with "double size" files - those I currently have are "Arecibo type" names, and have taken about twice as long to run as standard length files. I guess Eric has been trying something on the splitters over there.

If they run twice as long, check your host setup. Execution times did not change, because the length of telescope data inside the file remained the same. Each point just takes twice as many bits. (This change applies to both Arecibo and GBT data.)
8) Message boards : Number crunching : SETI applications for NVIDIA GPU improvement - how you can help (Message 1805214)
Posted 1 day ago by Profile Raistmer
And what about running with defaults?
9) Message boards : Number crunching : SETI applications for NVIDIA GPU improvement - how you can help (Message 1805166)
Posted 1 day ago by Profile Raistmer
Another test to do by volunteers:

On beta (or in anonymous platform mode on main) take the 8.16 OpenCL app or later and add the -tt F parameter to the command line.
Increase the F value (the default is 15, which means a 15 ms target time for the partial PulseFind kernel) and watch host usability (that is, GUI lags, missed letters when typing, and so on). At what F value do lags appear?
From a performance point of view it's better to have longer kernels, but this can result in GUI lags. This testing is needed to establish the best possible default value for unattended runs.

EDIT: please describe your host config (preferably with a link to the host on beta) along with the report.


I would like to remind you that without feedback about usability no corrections will be made, and the chance to get the app working nicely out of the box for your particular host will be missed.
Please check the usability of 8.16+ builds on your particular host and report back.
10) Message boards : Number crunching : GTX 1060 or RX 480? Best bang for Buck seti-wise? (Message 1804986)
Posted 2 days ago by Profile Raistmer

Is there some app dev going on for recent AMD GPUs?

Of course, because the OpenCL app is a multiplatform one.
11) Message boards : Number crunching : The big picture; any pieces missing? (Message 1804977)
Posted 3 days ago by Profile Raistmer
Hello all!
2. the Lunatics v0.44 CPU apps to be distributed by the S@h server as stock (if possible) since SoG seems good-enough-for-now on the GPUs until Petri's apps;

It has been for more than a month already.


4. monthly formal updates from the S@h volunteer app developers;

??
Maybe daily reports on volunteer devs' task progress instead? I understand that some reading is entertaining, but some writing would help more :P
There is a special pinned thread to subscribe to, BTW.
Because there is no sense in writing every month what needs to be done just for the sake of writing it. Those who really can help have already found ways to do that; for others it's entertainment only...


6. an alternative metric for baselines other than RAC & credits per day;

Formulated already, just put in WiKi.

And regarding benchmark availability: Lunatics, download section. All that you need for offline benchmarking in SETI is there.
12) Message boards : Number crunching : SETI applications for NVIDIA GPU improvement - how you can help (Message 1803729)
Posted 8 days ago by Profile Raistmer
Yes, was using my commandlines on both machines for this comparison

Which command lines? Please list them.
13) Message boards : Number crunching : SETI applications for NVIDIA GPU improvement - how you can help (Message 1803691)
Posted 8 days ago by Profile Raistmer
As a side r3430 is still about 60-90sec faster than r3486 when doing 3 at a time

What tuning lines were used in this comparison?
14) Message boards : Number crunching : SETI applications for NVIDIA GPU improvement - how you can help (Message 1803649)
Posted 9 days ago by Profile Raistmer
Send them to me.
P.S. Ah, and before doing that, make sure the missed tasks are not overflowed ones. The script recognizes and excludes overflows automatically, because they have distorted timings.
Here are the files for a situation where I have 8GPU and 2 CPU tasks being blocked from Uploading.
Only 1 CPU file is in Times.txt
and there is no Overflow in client_state.xml
https://www.dropbox.com/sh/5jbnvivo9mtcq5g/AADIpSRrHx_rnPzaDyecZEqsa?dl=0
I'm tired but I double-checked so I don't know if I forgot something.
Cheers,
Rob zzzz soon

Task names to check?

All the ones with a <stderr> that isn't task: 28jn10ac.17896.6206.12.39.68


<stderr_txt>
result map took 35.93ms, num_iter_wo_sync=12, mean_per_iter=2.994ms
icfft=194302, FFTLength=4096, sync before triplet
result map took 6.859ms, num_iter_wo_sync=3, mean_per_iter=2.286ms
icfft=194315, FFTLength=2048, sync before triplet
result map took 35.8ms, num_iter_wo_sync=12, mean_per_iter=2.983ms
icfft=194319, FFTLength=2048, sync before triplet
result map took 6.871ms, num_iter_wo_sync=3, mean_per_iter=2.29ms
icfft=194333, FFTLength=4096, sync before triplet
result map took 34.14ms, num_iter_wo_sync=13, mean_per_iter=2.626ms
icfft=194336, FFTLength=4096, sync before triplet
result map took 6.721ms, num_iter_wo_sync=3, mean_per_iter=2.24ms
icfft=194349, FFTLength=512, sync before triplet
result map took 33.37ms, num_iter_wo_sync=13, mean_per_iter=2.567ms
icfft=194355, FFTLength=512, sync before triplet
result map took 6.464ms, num_iter_wo_sync=3, mean_per_iter=2.155ms
icfft=194371, FFTLength=4096, sync before triplet
result map took 34.14ms, num_iter_wo_sync=12, mean_per_iter=2.845ms
icfft=194374, FFTLength=4096, sync before triplet
result map took 6.75ms, num_iter_wo_sync=3, mean_per_iter=2.25ms
icfft=194387, FFTLength=2048, sync before triplet
result map took 35.11ms, num_iter_wo_sync=12, mean_per_iter=2.926ms
icfft=194391, FFTLength=2048, sync before triplet
result map took 7.229ms, num_iter_wo_sync=3, mean_per_iter=2.41ms
icfft=194405, FFTLength=4096, sync before triplet
result map took 36.23ms, num_iter_wo_sync=12, mean_per_iter=3.019ms
icfft=194408, FFTLength=4096, sync before triplet
result map took 7.57ms, num_iter_wo_sync=3, mean_per_iter=2.523ms
icfft=194421, FFTLength=1024, sync before triplet
result map took 35.89ms, num_iter_wo_sync=12, mean_per_iter=2.991ms


There is no stderr header available with AR info, so such records are skipped.
Verbose builds are not suitable for performance analysis, BTW. They are special-purpose ones.
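
For anyone who wants to see how the fields in the verbose stderr quoted above relate to each other, here is a small Python sketch. It is not the analysis script mentioned in this thread; the regular expression is an assumption based only on the record layout visible in the quote. It extracts each "result map" record and shows that mean_per_iter is the total time divided by num_iter_wo_sync.

```python
import re

# Pattern assumed from the quoted stderr text; other builds may print differently.
RECORD = re.compile(
    r"result map took (?P<took>[\d.]+)ms, "
    r"num_iter_wo_sync=(?P<iters>\d+), "
    r"mean_per_iter=(?P<mean>[\d.]+)ms"
)


def parse_result_map(text):
    """Return a list of (took_ms, iterations, mean_per_iter_ms) tuples."""
    return [
        (float(m["took"]), int(m["iters"]), float(m["mean"]))
        for m in RECORD.finditer(text)
    ]


# First two records copied from the stderr quoted above.
sample = (
    "result map took 35.93ms, num_iter_wo_sync=12, mean_per_iter=2.994ms "
    "icfft=194302, FFTLength=4096, sync before triplet "
    "result map took 6.859ms, num_iter_wo_sync=3, mean_per_iter=2.286ms"
)

records = parse_result_map(sample)
for took, iters, mean in records:
    # The recomputed mean should match the logged one up to rounding.
    print(f"{took} ms / {iters} = {took / iters:.3f} ms (logged {mean} ms)")
```

As noted in the post, timings from verbose builds are not suitable for performance comparison, so this is only useful for reading the log, not for benchmarking.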
15) Message boards : Number crunching : SETI applications for NVIDIA GPU improvement - how you can help (Message 1803627)
Posted 9 days ago by Profile Raistmer
Thanks for such a well-structured and detailed report.
Nothing else with this build. I'll post another one soon; it would be very good if you could do a similar run with it too.
16) Message boards : Number crunching : SETI applications for NVIDIA GPU improvement - how you can help (Message 1803533)
Posted 9 days ago by Profile Raistmer
Please test in single task per GPU mode with and w/o -use_sleep for comparison:
https://cloud.mail.ru/public/6wp7/cgfuAXmnc


Is this a new version or still part of the older r3486Final?


The revision is the same, but the binary is new.
17) Message boards : Number crunching : SETI applications for NVIDIA GPU improvement - how you can help (Message 1803510)
Posted 9 days ago by Profile Raistmer
Please test in single task per GPU mode with and w/o -use_sleep for comparison:
https://cloud.mail.ru/public/6wp7/cgfuAXmnc
18) Message boards : Number crunching : SETI applications for NVIDIA GPU improvement - how you can help (Message 1803476)
Posted 10 days ago by Profile Raistmer
Another test to do by volunteers:

On beta (or in anonymous platform mode on main) take the 8.16 OpenCL app or later and add the -tt F parameter to the command line.
Increase the F value (the default is 15, which means a 15 ms target time for the partial PulseFind kernel) and watch host usability (that is, GUI lags, missed letters when typing, and so on). At what F value do lags appear?
From a performance point of view it's better to have longer kernels, but this can result in GUI lags. This testing is needed to establish the best possible default value for unattended runs.

EDIT: please describe your host config (preferably with a link to the host on beta) along with the report.
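
One way to organise the test described above is to prepare the command lines for each step in advance. The Python sketch below only builds strings; it does not run the app. The -tt parameter and its default of 15 come from the post, while the step size, the number of steps, and the helper name tt_sweep are arbitrary choices for illustration.

```python
def tt_sweep(base_cmdline, start_ms=15, step_ms=15, count=5):
    """Build command lines with increasing -tt values.

    Any -tt already present in base_cmdline is dropped together with its
    value, so each generated line carries exactly one -tt setting.
    """
    cleaned = []
    skip_next = False
    for token in base_cmdline.split():
        if skip_next:
            skip_next = False  # this token was the old -tt value
            continue
        if token == "-tt":
            skip_next = True
            continue
        cleaned.append(token)
    return [
        " ".join(cleaned + ["-tt", str(start_ms + i * step_ms)])
        for i in range(count)
    ]


# "-use_sleep" is used here only as an example of an existing tuning line.
for line in tt_sweep("-use_sleep -tt 15", count=3):
    print(line)
```

Run the app with each line in turn, use the host normally for a while, and note the first value at which lags become noticeable.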
19) Message boards : Number crunching : Using an iGPU from Intel to crunch ? Now watch this... (Message 1803473)
Posted 10 days ago by Profile Raistmer
With Intel's APU devices it is always worth checking total device throughput, not just the GPU or CPU part. The reason: they are very strongly interrelated. Crunching on the iGPU can decrease the CPU part's output to a large degree, and vice versa.
So, check your device output when CPU-only, then with the iGPU enabled (and/or overclocked). Quite possibly the best throughput will be achieved somewhere between idle CPU cores and fully loaded cores + iGPU...
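
The comparison suggested above comes down to summing the output of everything that runs at the same time. A minimal Python sketch, assuming you have measured an average runtime per task for each concurrently running slot (CPU core or iGPU); the configuration names and all runtimes below are placeholders, not measurements from any host.

```python
SECONDS_PER_DAY = 86400


def tasks_per_day(slot_runtimes_s):
    """Total throughput of one configuration.

    Each entry is the average task runtime, in seconds, of one slot that
    runs concurrently with the others.
    """
    return sum(SECONDS_PER_DAY / t for t in slot_runtimes_s)


def best_config(configs):
    """Return the name of the configuration with the highest throughput."""
    return max(configs, key=lambda name: tasks_per_day(configs[name]))


# Placeholder numbers only: replace with runtimes measured on your own host.
configs = {
    "cpu_only_4_cores": [7200, 7200, 7200, 7200],
    "3_cores_plus_igpu": [7200, 7200, 7200, 3600],
    "4_cores_plus_igpu": [9600, 9600, 9600, 9600, 4800],
}

for name, runtimes in configs.items():
    print(name, round(tasks_per_day(runtimes), 1))
print("best:", best_config(configs))
```

Comparing the totals rather than the iGPU runtime alone is what exposes the case where enabling the iGPU slows the CPU tasks enough to cancel the gain.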
20) Message boards : Number crunching : Nvidia GT640 takes 10,000 seconds to crunch 1 opencl_sah workunit (Message 1803204)
Posted 11 days ago by Profile Raistmer


http://setiathome.berkeley.edu/result.php?resultid=5046002781

LowPerformanceGPU path: yes

For low-performance GPU path use_sleep enabled with 5ms per iteration


Try to upgrade to r3486 or add
-use_sleep_ex 1
to your current command line for 8.12.



Copyright © 2016 University of California