Some Help with SoG builds

Message boards : Number crunching : Some Help with SoG builds

juan BFP
Volunteer tester
Joined: 16 Mar 07
Posts: 6059
Credit: 349,998,906
RAC: 119,921
Panama
Message 1899750 - Posted: 8 Nov 2017, 0:38:55 UTC
Last modified: 8 Nov 2017, 0:47:34 UTC

I'm not sure if this is what you asked for. I manually stopped the crunching process at about 65% of the WU. It was an Arecibo WU.

The lag is still there.

This was the command line I used:

<cmdline>-v 6 -sbs 256 -period_iterations_num 500 -tt 10 -high_prec_timer -cpu_lock_fixed_cpu 2 -no_defaults_scaling</cmdline>

-----------------------------------------
Priority of worker thread raised successfully
Priority of process adjusted successfully, below normal priority class used
OpenCL platform detected: NVIDIA Corporation
BOINC assigns device 0
Info: BOINC provided OpenCL device ID used

Build features: SETI8 Non-graphics OpenCL USE_OPENCL_NV OCL_ZERO_COPY SIGNALS_ON_GPU OCL_CHIRP3 FFTW USE_SSE3 x86
CPUID: Intel(R) Core(TM) i7-6800K CPU @ 3.40GHz

Cache: L1=64K L2=256K

CPU features: FPU TSC PAE CMPXCHG8B APIC SYSENTER MTRR CMOV/CCMP MMX FXSAVE/FXRSTOR SSE SSE2 HT SSE3 SSSE3 FMA3 SSE4.1 SSE4.2 AVX
OpenCL-kernels filename : MultiBeam_Kernels_r3557.cl
ar=0.420478 NumCfft=197513 NumGauss=1123475102 NumPulse=226399745046 NumTriplet=452801441942
Currently allocated 229 MB for GPU buffers
In v_BaseLineSmooth: NumDataPoints=1048576, BoxCarLength=8192, NumPointsInChunk=32768

Windows optimized setiathome_v8 application
Based on Intel, Core 2-optimized v8-nographics V5.13 by Alex Kan
SSE3xj Win32 Build 3557 , Ported by : Raistmer, JDWhale

SETI8 update by Raistmer

OpenCL version by Raistmer, r3557

Number of OpenCL platforms: 1


OpenCL Platform Name: NVIDIA CUDA
Number of devices: 1
Max compute units: 9
Max work group size: 1024
Max clock frequency: 1771Mhz
Max memory allocation: 805306368
Cache type: Read/Write
Cache line size: 128
Cache size: 147456
Global memory size: 3221225472
Constant buffer size: 65536
Max number of constant args: 9
Local memory type: Scratchpad
Local memory size: 49152
Queue properties:
Out-of-Order: Yes
Name: GeForce GTX 1060 3GB
Vendor: NVIDIA Corporation
Driver version: 387.92
Version: OpenCL 1.2 CUDA
Extensions: cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64 cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_nv_d3d10_sharing cl_khr_d3d10_sharing cl_nv_d3d11_sharing cl_nv_copy_opts cl_nv_create_buffer


Work Unit Info:
...............
Credit multiplier is : 2.85
WU true angle range is : 0.420478
Used GPU device parameters are:
Number of compute units: 9
Single buffer allocation size: 128MB
Total device global memory: 3072MB
max WG size: 1024
local mem type: Real
FERMI path used: yes
LotOfMem path: yes
LowPerformanceGPU path: no
HighPerformanceGPU path: no
period_iterations_num=50
Autocorr: peak=18.72138, time=100.7, delay=4.8012, d_freq=1420830081.2, chirp=0.030501, fft_len=128k
Autocorr: peak=17.99791, time=100.7, delay=4.5692, d_freq=1420828858.66, chirp=-12.114, fft_len=128k
Spike: peak=24.90324, time=100.7, d_freq=1420833431.08, chirp=-7.0983, fft_len=128k
Spike: peak=25.33507, time=100.7, d_freq=1420833431.08, chirp=-7.102, fft_len=128k
Spike: peak=24.56211, time=100.7, d_freq=1420833431.08, chirp=-7.1057, fft_len=128k
Spike: peak=24.27216, time=6.711, d_freq=1420827090.33, chirp=15.316, fft_len=128k
Spike: peak=24.14864, time=6.711, d_freq=1420827090.34, chirp=15.317, fft_len=128k
Autocorr: peak=19.03537, time=100.7, delay=0.8446, d_freq=1420827980.46, chirp=-20.838, fft_len=128k
Autocorr: peak=19.33105, time=100.7, delay=0.8446, d_freq=1420827980.37, chirp=-20.839, fft_len=128k
Autocorr: peak=19.06569, time=100.7, delay=0.8446, d_freq=1420827979.35, chirp=-20.849, fft_len=128k
Autocorr: peak=20.73533, time=100.7, delay=0.8446, d_freq=1420827979.26, chirp=-20.85, fft_len=128k
Autocorr: peak=20.36203, time=100.7, delay=0.8446, d_freq=1420827978.14, chirp=-20.861, fft_len=128k
Autocorr: peak=18.07004, time=100.7, delay=0.8446, d_freq=1420827978.05, chirp=-20.862, fft_len=128k
Autocorr: peak=18.28098, time=100.7, delay=0.8446, d_freq=1420827977.02, chirp=-20.873, fft_len=128k
Pulse: peak=2.692056, time=25.51, period=0.7187, d_freq=1420832418.62, score=1.022, chirp=-21.9, fft_len=64
D: threshold 0.01467706; unscaled peak power: 0.01490948 exceeds threshold for 1.584%
Spike: peak=24.02618, time=85.56, d_freq=1420827349.31, chirp=36.985, fft_len=32k
GPU device sync requested... ...GPU device synched
Termination request detected or computations are finished. GPU device synched, exiting...

----------------------------------
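A stderr dump like the one above is easier to compare between runs once the signal reports are tallied. The following is a minimal sketch, assuming signal lines always take the form `Type: peak=<number>, ...` as they do in this log; the function name `summarize_signals` is hypothetical and not part of any SETI@home tool.

```python
import re
from collections import Counter

# Matches lines such as "Spike: peak=24.90324, time=100.7, ..."
SIGNAL_LINE = re.compile(r"^(\w+): peak=([0-9.]+)")

def summarize_signals(stderr_text):
    """Count reported signals per type and track the highest peak of each."""
    counts = Counter()
    best_peak = {}
    for line in stderr_text.splitlines():
        match = SIGNAL_LINE.match(line.strip())
        if match is None:
            continue  # not a signal report (device info, "D: threshold ...", etc.)
        kind, peak = match.group(1), float(match.group(2))
        counts[kind] += 1
        best_peak[kind] = max(peak, best_peak.get(kind, peak))
    return counts, best_peak

# A few lines copied from the log above.
sample = """\
Autocorr: peak=18.72138, time=100.7, delay=4.8012, d_freq=1420830081.2, chirp=0.030501, fft_len=128k
Spike: peak=24.90324, time=100.7, d_freq=1420833431.08, chirp=-7.0983, fft_len=128k
Spike: peak=25.33507, time=100.7, d_freq=1420833431.08, chirp=-7.102, fft_len=128k
Pulse: peak=2.692056, time=25.51, period=0.7187, d_freq=1420832418.62, score=1.022, chirp=-21.9, fft_len=64
D: threshold 0.01467706; unscaled peak power: 0.01490948 exceeds threshold for 1.584%
"""

counts, best_peak = summarize_signals(sample)
print(dict(counts))
print(best_peak)
```

Saving the full stderr to a file and passing its contents to the same function would give a per-task summary without pasting the whole log.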

About the driver you asked about: all my hosts use NVIDIA 387.92, installed with a clean installation.

AFAIK all MB drivers are updated to the latest available for each MB.
ID: 1899750
Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 5809
Credit: 76,067,204
RAC: 51,211
Russia
Message 1899793 - Posted: 8 Nov 2017, 6:09:05 UTC - in response to Message 1899750.  
Last modified: 8 Nov 2017, 6:22:35 UTC

1) Remove the -no_defaults_scaling option.
2) Allow the app to run considerably longer than in the posted sample (stderr will be longer too, so put it in a file; no need to post it on the message boards directly).
Also, such tests are preferably made as offline runs under the KWSN benchmark. That way you'll always use the same task, so results are directly comparable.

And third observation: your command line was ignored (the app reports using 50 periods while you wrote 500). So, first check that the app responds to your command line, then repeat all of the last testing.
Quite possibly the lag is there just because nothing was really changed.

4) "OpenCL version by Raistmer, r3557" - is it the latest available rev?
EDIT: already checked - nope.
My own stock-based host receives: OpenCL version by Raistmer, r3584

So, do update.
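The check described in the third observation can be automated. What follows is only a sketch, assuming the app prints a line of the form `period_iterations_num=<n>` in stderr as in the log posted earlier; the helper names `requested_options` and `reported_period_iterations` are hypothetical.

```python
import re
import shlex

def requested_options(cmdline):
    """Map each -flag to its value, or to None for a bare flag.

    Assumption: option values never start with '-' (true for the
    command lines in this thread, not guaranteed in general).
    """
    tokens = shlex.split(cmdline)
    options = {}
    i = 0
    while i < len(tokens):
        token = tokens[i]
        if token.startswith("-"):
            nxt = tokens[i + 1] if i + 1 < len(tokens) else None
            if nxt is not None and not nxt.startswith("-"):
                options[token.lstrip("-")] = nxt
                i += 2
                continue
            options[token.lstrip("-")] = None
        i += 1
    return options

def reported_period_iterations(stderr_text):
    """Return the period_iterations_num value the app printed, or None."""
    match = re.search(r"period_iterations_num=(\d+)", stderr_text)
    return int(match.group(1)) if match else None

cmdline = ("-v 6 -sbs 256 -period_iterations_num 500 -tt 10 "
           "-high_prec_timer -cpu_lock_fixed_cpu 2 -no_defaults_scaling")
stderr_text = "HighPerformanceGPU path: no\nperiod_iterations_num=50\n"

asked = int(requested_options(cmdline)["period_iterations_num"])
got = reported_period_iterations(stderr_text)
if asked != got:
    print(f"mismatch: asked for {asked}, app reports {got}")
```

A mismatch like this one suggests the command line never reached the app, so any tuning results gathered with it say nothing about the options themselves.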
SETI apps news
We're not gonna fight them. We're gonna transcend them.
ID: 1899793
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 8887
Credit: 115,106,444
RAC: 70,214
Australia
Message 1899801 - Posted: 8 Nov 2017, 6:52:26 UTC - in response to Message 1899793.  

4) "OpenCL version by Raistmer, r3557" - is it the latest available rev?
EDIT: already checked - nope.
My own stock-based host receives: OpenCL version by Raistmer, r3584

3557 was the last available update in the Lunatics Beta installer.
Grant
Darwin NT
ID: 1899801
Chris S
Crowdfunding Project Donor
Volunteer tester
Joined: 19 Nov 00
Posts: 40048
Credit: 34,858,584
RAC: 64,714
United Kingdom
Message 1899816 - Posted: 8 Nov 2017, 8:14:15 UTC

I'm happy to use R3584 but I want it by Lunatics not by longhand which could introduce mistakes. But that is down to Richard H having the time to update beta 6.

Does Mikes world have an R3584 and instructions, and what does R3584 do that R3557 doesn't?
ID: 1899816
Mike
Project Donor
Volunteer tester
Joined: 17 Feb 01
Posts: 30603
Credit: 57,612,565
RAC: 30,396
Germany
Message 1899818 - Posted: 8 Nov 2017, 10:45:48 UTC - in response to Message 1899816.  

I'm happy to use R3584 but I want it by Lunatics not by longhand which could introduce mistakes. But that is down to Richard H having the time to update beta 6.

Does Mikes world have an R3584 and instructions, and what does R3584 do that R3557 doesn't?


Of course it does.
R3584 produces fewer inconclusives than r3557 on Nvidia GPUs but is slightly slower.
With each crime and every kindness we birth our future.
ID: 1899818
juan BFP
Volunteer tester
Joined: 16 Mar 07
Posts: 6059
Credit: 349,998,906
RAC: 119,921
Panama
Message 1899929 - Posted: 8 Nov 2017, 23:18:10 UTC - in response to Message 1899793.  
Last modified: 9 Nov 2017, 0:15:09 UTC

About your points.

1) Remove -no_defaults_scaling option.

This is the command line I use now: <cmdline>-v 6 -sbs 256 -period_iterations_num 500 -tt 10 -high_prec_timer -cpu_lock_fixed_cpu 2 </cmdline>

2) Allow the app to run considerably longer than in the posted sample (stderr will be longer too, so put it in a file; no need to post it on the message boards directly).

I left it running with the -v 6 option, so you can verify
on this host: https://setiathome.berkeley.edu/show_host_detail.php?hostid=8217092

Also, such tests are preferably made as offline runs under the KWSN benchmark. That way you'll always use the same task, so results are directly comparable.

If needed I could do that; please let me know.

And third observation: your command line was ignored (the app reports using 50 periods while you wrote 500). So, first check that the app responds to your command line, then repeat all of the last testing.
Quite possibly the lag is there just because nothing was really changed.

I don't understand why that is happening; the line is supposed to be right.
Apparently my BOINC ignores the <cmdline>.
To be sure, I wrote it directly into the MB file.
Apparently it is working now.

4) "OpenCL version by Raistmer, r3557" - is it the latest available rev?
EDIT: already checked - nope.
My own stock-based host receives: OpenCL version by Raistmer, r3584

So, do update.

That's because I installed using the Lunatics Installer, which installs the r3557 build.
But done. I downloaded r3584 from Mike's site, installed it manually, and left it running.

Now I see a light at the end of the tunnel.

Apparently the problem is in the <cmdline>: not the command line itself, but the way the line is passed by the app_config file.

For some reason it does not work here, but it works on my other cruncher. Finding out why is for another day. LOL

If I write the line in the app_config file like I do on my main cruncher, it does not work; it needs to be written directly in the MB .txt file. That is the reason why nothing I tried made any difference.
I don't understand why.

Now, with the command line you provided working, the lag is almost unnoticeable, at least for the time I was able to spend on the TV.

I will keep trying and return if I have any news.

Anyway, I will leave it running with the -v 6 option for some time if you want to follow.

Thanks for your help.
ID: 1899929
Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 5809
Credit: 76,067,204
RAC: 51,211
Russia
Message 1900451 - Posted: 11 Nov 2017, 4:04:00 UTC - in response to Message 1899929.  

The last GPU stderr I could find on the 1060 host:

<core_client_version>7.6.33</core_client_version>
<![CDATA[
<stderr_txt>
Target kernel sequence time set to 10ms
Number of period iterations for PulseFind set to:30
System timer will be set in high resolution mode
Maximum single buffer size set to:1024MB
SpikeFind FFT size threshold override set to:4096
TUNE: kernel 1 now has workgroup size of (64,1,4)
oclFFT global radix override set to:256
oclFFT local radix override set to:16
oclFFT max WG size override set to:256
oclFFT max local FFT size override set to:512
oclFFT number of local memory banks set to:64
oclFFT minimal memory coalesce width set to:64
System timer will be set in high resolution mode
Sleep() & wait for event loops will be used in some places
Priority of worker thread raised successfully
Priority of process adjusted successfully, below normal priority class used
OpenCL platform detected: NVIDIA Corporation
BOINC assigns device 0
Info: BOINC provided OpenCL device ID used
....

So, options are in use now. But no signs of increased verbosity.
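One way to confirm which options took effect is to collect the "set to" lines the app echoes at startup. This is a rough sketch, assuming the echo format stays as in the excerpt above (`<description> set to:<value>` or `<description> set to <value>`); the function name `applied_settings` is hypothetical.

```python
import re

# Captures "<description> set to:<value>" and "<description> set to <value>".
SET_TO = re.compile(r"^(.*?)\s+set to:?\s*(.+)$")

def applied_settings(stderr_text):
    """Return {description: value} for every 'set to' line in stderr."""
    settings = {}
    for line in stderr_text.splitlines():
        match = SET_TO.match(line.strip())
        if match:
            settings[match.group(1)] = match.group(2)
    return settings

# Lines copied from the stderr excerpt above.
excerpt = """\
Target kernel sequence time set to 10ms
Number of period iterations for PulseFind set to:30
System timer will be set in high resolution mode
Maximum single buffer size set to:1024MB
"""

for description, value in applied_settings(excerpt).items():
    print(f"{description}: {value}")
```

Lines that merely announce a mode, such as the high-resolution timer message, carry no value and are skipped by this pattern.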
SETI apps news
We're not gonna fight them. We're gonna transcend them.
ID: 1900451
juan BFP
Volunteer tester
Joined: 16 Mar 07
Posts: 6059
Credit: 349,998,906
RAC: 119,921
Panama
Message 1900463 - Posted: 11 Nov 2017, 4:34:28 UTC - in response to Message 1900451.  
Last modified: 11 Nov 2017, 5:12:54 UTC

Restarted with the -v 6 option now. It will take about 10 min until the first crunched WU is available.

I left the number of period iterations for PulseFind set to 50.

I ran several tests, and with 50 the lag is no longer noticeable and I can watch a movie normally without complaints.

<edit> I believe -v 6 is working now.

https://setiathome.berkeley.edu/result.php?resultid=6156237343

With these settings: no lag, at least I can't see any.

<app_version>
<app_name>setiathome_v8</app_name>
<plan_class>opencl_nvidia_SoG</plan_class>
<avg_ncpus>1.0</avg_ncpus>
<ngpus>0.5</ngpus>
<cmdline>-v 6 -use_sleep -tt 10 -period_iterations_num 50</cmdline>
</app_version>
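To double-check what this fragment asks BOINC to do, it can be read back with a small script. This is a sketch only: the function name `describe_app_version` is hypothetical, and the tasks-per-GPU figure rests on the assumption that `<ngpus>` is the fraction of one GPU each task reserves, so 0.5 lets two tasks share a GPU.

```python
import shlex
import xml.etree.ElementTree as ET

# The <app_version> block exactly as posted above.
fragment = """
<app_version>
<app_name>setiathome_v8</app_name>
<plan_class>opencl_nvidia_SoG</plan_class>
<avg_ncpus>1.0</avg_ncpus>
<ngpus>0.5</ngpus>
<cmdline>-v 6 -use_sleep -tt 10 -period_iterations_num 50</cmdline>
</app_version>
"""

def describe_app_version(xml_text):
    """Summarize one <app_version> element as a plain dict."""
    node = ET.fromstring(xml_text)
    ngpus = float(node.findtext("ngpus"))
    return {
        "app_name": node.findtext("app_name"),
        "plan_class": node.findtext("plan_class"),
        # Assumption: each task reserves `ngpus` of one GPU.
        "tasks_per_gpu": int(1 / ngpus),
        "cmdline_tokens": shlex.split(node.findtext("cmdline")),
    }

info = describe_app_version(fragment)
print(info["plan_class"], info["tasks_per_gpu"], info["cmdline_tokens"])
```

Comparing `cmdline_tokens` against the options echoed in stderr is a quick way to see whether the `<cmdline>` from the config file actually reached the app.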
ID: 1900463
©2017 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.