Oddly long AP7 wu's?

Chris Adamek
Volunteer tester

Joined: 15 May 99
Posts: 251
Credit: 434,772,072
RAC: 236
United States
Message 1589057 - Posted: 20 Oct 2014, 1:28:58 UTC

I've begun to notice a number of oddly long-running AP7 work units on my ATI 7990. For some reason, the stderr doesn't list any of the ffa_block* or -unroll settings from my command line file. The other work units run on this card do list them properly. Below is an example stderr from one of the long-running WUs. Any ideas?

Thanks,

Chris

Stderr output

<core_client_version>7.2.42</core_client_version>
<![CDATA[
<stderr_txt>
Running on device number: 1
Priority of worker thread raised successfully
Priority of process adjusted successfully, below normal priority class used
OpenCL platform detected: Advanced Micro Devices, Inc.
BOINC assigns device 1
Info: BOINC provided OpenCL device ID used

Build features: Non-graphics OpenCL TWIN_FFA OCL_ZERO_COPY COMBINED_DECHIRP_KERNEL FFTW USE_INCREASED_PRECISION USE_SSE2 x86
CPUID: Intel(R) Core(TM) i7-2600K CPU @ 3.40GHz

Cache: L1=64K L2=256K

CPU features: FPU TSC PAE CMPXCHG8B APIC SYSENTER MTRR CMOV/CCMP MMX FXSAVE/FXRSTOR SSE SSE2 HT SSE3 SSSE3 SSE4.1 SSE4.2 AVX
AstroPulse v6 Windows x86 rev 2667, V6 match, by Raistmer with support of Lunatics.kwsn.net team. SSE2

OpenCL version by Raistmer

oclFFT fix for ATI GPUs by Urs Echternacht
ffa threshold mods by Joe Segur
SSE3 dechirping by JDWhale
Combined dechirp kernel by Frizz
Number of OpenCL platforms: 1


OpenCL Platform Name: AMD Accelerated Parallel Processing
Number of devices: 2
Max compute units: 32
Max work group size: 256
Max clock frequency: 1000Mhz
Max memory allocation: 1073741824
Cache type: Read/Write
Cache line size: 64
Cache size: 16384
Global memory size: 3221225472
Constant buffer size: 65536
Max number of constant args: 8
Local memory type: Scratchpad
Local memory size: 32768
Queue properties:
Out-of-Order: No
Name: Tahiti
Vendor: Advanced Micro Devices, Inc.
Driver version: 1445.5 (VM)
Version: OpenCL 1.2 AMD-APP (1445.5)
Extensions: cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_gl_sharing cl_ext_atomic_counters_32 cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_media_ops2 cl_amd_popcnt cl_khr_d3d10_sharing cl_khr_d3d11_sharing cl_khr_dx9_media_sharing cl_khr_image2d_from_buffer cl_khr_spir cl_khr_gl_event
Max compute units: 32
Max work group size: 256
Max clock frequency: 1000Mhz
Max memory allocation: 1073741824
Cache type: Read/Write
Cache line size: 64
Cache size: 16384
Global memory size: 3221225472
Constant buffer size: 65536
Max number of constant args: 8
Local memory type: Scratchpad
Local memory size: 32768
Queue properties:
Out-of-Order: No
Name: Tahiti
Vendor: Advanced Micro Devices, Inc.
Driver version: 1445.5 (VM)
Version: OpenCL 1.2 AMD-APP (1445.5)
Extensions: cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_gl_sharing cl_ext_atomic_counters_32 cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_media_ops2 cl_amd_popcnt cl_khr_d3d10_sharing cl_khr_d3d11_sharing cl_khr_dx9_media_sharing cl_khr_image2d_from_buffer cl_khr_spir cl_khr_gl_event


state.fold_buf_size_short=65536; state.fold_buf_size_long=262144

single pulses: 2
repetitive pulses: 1
percent blanked: 2.39
Single pulse: peak_power=215.7 dm=4527 fft_num=4177920 peak_bin=4189440 scale=7
Rep. pulse: num_std_devs=7.017 peak_power=4327 dm=-4704 peak_bin=4224 scale=4 ffa_scale=0 period=269.2008
Single pulse: peak_power=213.4 dm=-13028 fft_num=28246016 peak_bin=28259840 scale=7

class T_remove_radar: total=3.67e+009, N=1, <>=3.67e+009, min=3.67e+009, max=3.67e+009
class T_main_loop_L1: total=3.03e+013, N=111, <>=2.73e+011, min=1.60e+011, max=3.14e+011
class T_FFT_forward: total=4.10e+011, N=909312, <>=4.50e+005, min=1.10e+004, max=2.76e+008
class T_remove_radar_randomize: total=5.24e+011, N=1817736, <>=2.88e+005, min=3.50e+002, max=2.80e+008
class T_build_chirp_table: total=0.00e+000, N=0, <>=0.00e+000, min=1.84e+019, max=0.00e+000
class T_DataWrite: total=0.00e+000, N=0, <>=0.00e+000, min=1.84e+019, max=0.00e+000
class T_DataWrite_ns: total=0, N=0, <>=0, min=0 max=0
class T_oclReadBuf: total=0.00e+000, N=0, <>=0.00e+000, min=1.84e+019, max=0.00e+000
class T_ChirpWrite: total=0.00e+000, N=0, <>=0.00e+000, min=1.84e+019, max=0.00e+000
class T_ChirpWrite_ns: total=0, N=0, <>=0, min=0 max=0
class T_dechirp: total=7.33e+010, N=909312, <>=8.06e+004, min=1.54e+004, max=1.18e+008
class Dechirp_ns: total=0, N=0, <>=0, min=0 max=0
class Half_ns: total=0, N=0, <>=0, min=0 max=0
class T_PC_single_pulse_kernel_FFA_update: total=2.55e+013, N=909312, <>=2.80e+007, min=3.57e+006, max=7.57e+008
class PC_ns: total=0, N=0, <>=0, min=0 max=0
class T_oclReadBuf: total=0.00e+000, N=0, <>=0.00e+000, min=1.84e+019, max=0.00e+000
class T_oclWriteBuf: total=0.00e+000, N=0, <>=0.00e+000, min=1.84e+019, max=0.00e+000
class T_FFT_inverse: total=0.00e+000, N=0, <>=0.00e+000, min=1.84e+019, max=0.00e+000
class T_ffa: total=3.73e+012, N=999, <>=3.73e+009, min=5.16e+008, max=1.50e+010
class T_GPU_buffer_read_backs: total=4, N=4, <>=1, min=1 max=1
TWIN_FFA OCL_ZERO_COPY USE_OPENCL OPENCL_WRITE USE_INCREASED_PRECISION SMALL_CHIRP_TABLE COMBINED_DECHIRP_KERNEL
rev 2667
GPU device sync requested... ...GPU device synched
17:01:48 (8240): called boinc_finish(0)

</stderr_txt>
]]>
ID: 1589057 · Report as offensive
HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1589064 - Posted: 20 Oct 2014, 1:40:28 UTC
Last modified: 20 Oct 2014, 1:41:03 UTC

None of your GPU tasks returned after 17 Oct 2014, 14:48:52 UTC have the following section in their stderr_txt:
DATA_CHUNK_UNROLL set to:24
FFA thread block override value:8192
FFA thread fetchblock override value:4096
Priority of worker thread raised successfully
Priority of process adjusted successfully, high priority class used

Your tasks prior to that do. You may want to check whether your tuning settings are still in place or whether the command line file is now blank. Losing your tuning settings could cause the tasks to run longer.
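
A quick way to run that check over a saved stderr is sketched below. It is only a rough sketch: the marker strings are copied from the section quoted above, while the function names are made up for illustration and are not part of BOINC or the AstroPulse app.

```python
# Marker strings copied from the stderr section quoted above.
TUNING_MARKERS = (
    "DATA_CHUNK_UNROLL set to:",
    "FFA thread block override value:",
    "FFA thread fetchblock override value:",
)


def missing_tuning_markers(stderr_text):
    """Return the tuning markers that do NOT appear in stderr_text."""
    return [marker for marker in TUNING_MARKERS if marker not in stderr_text]


def cmdline_file_is_blank(contents):
    """True if a command line file holds nothing but whitespace."""
    return contents.strip() == ""
```

An empty list from `missing_tuning_markers` means all three override lines were logged; a full list suggests the task ran on defaults.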
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the BP6/VP6 User Group: http://tinyurl.com/8y46zvu
ID: 1589064 · Report as offensive
Josef W. Segur
Volunteer developer
Volunteer tester

Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 1589141 - Posted: 20 Oct 2014, 4:28:32 UTC - in response to Message 1589057.  

I've begun to notice a number of oddly long-running AP7 work units on my ATI 7990. For some reason, the stderr doesn't list any of the ffa_block* or -unroll settings from my command line file. The other work units run on this card do list them properly. Below is an example stderr from one of the long-running WUs. Any ideas?
...
AstroPulse v6 Windows x86 rev 2667
...
percent blanked: 2.39
...


That's an AP v6 task. Even though the blanking is low, it's enough to cause a significant runtime increase. Also, the tuning parameters from AP v7 won't be picked up, they need to be set separately. There are fewer options and less info is put into stderr, as you've seen.

There are only 4 more AP v6 tasks "in progress" on that host, but you might get more resends. I don't know if it's worth trying to set better options; that's your choice.
                                                                   Joe
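
The two lines quoted above (the version banner and the "percent blanked" line) are easy to pull out of a stderr automatically. The following is a minimal sketch, assuming the banner always has the shape "AstroPulse vN <OS> <arch> rev NNNN" seen in the two logs in this thread; the function name and the returned dictionary keys are invented for illustration.

```python
import re

# Patterns modelled on the two stderr lines quoted in this thread.
VERSION_RE = re.compile(r"AstroPulse v(\d+) \S+ \S+ rev (\d+)")
BLANKED_RE = re.compile(r"percent blanked:\s*([0-9.]+)")


def summarize_task(stderr_text):
    """Extract app version, revision and blanking percentage, if present."""
    version = VERSION_RE.search(stderr_text)
    blanked = BLANKED_RE.search(stderr_text)
    return {
        "ap_version": int(version.group(1)) if version else None,
        "rev": int(version.group(2)) if version else None,
        "percent_blanked": float(blanked.group(1)) if blanked else None,
    }
```

Running this over a batch of results would separate the v6 resends from the v7 tasks before comparing runtimes.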
ID: 1589141 · Report as offensive
BilBg
Volunteer tester
Joined: 27 May 07
Posts: 3720
Credit: 9,385,827
RAC: 0
Bulgaria
Message 1589200 - Posted: 20 Oct 2014, 7:49:28 UTC - in response to Message 1589141.  
Last modified: 20 Oct 2014, 8:06:54 UTC

Also, the tuning parameters from AP v7 won't be picked up, they need to be set separately.

Strange, I have one and the same file:
ap_cmdline_win_x86_SSE2_OpenCL_ATI.txt

... that is "picked up" for both AP v6 and AP v7 - as per app_info.xml
        <file_ref>
            <file_name>ap_cmdline_win_x86_SSE2_OpenCL_ATI.txt</file_name>
            <open_name>ap_cmdline.txt</open_name>
        </file_ref>
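
To see which command line file each app version opens as ap_cmdline.txt, something like the sketch below could be used. It assumes the usual app_info.xml layout where each `<app_version>` holds an `<app_name>`, a `<version_num>` and its `<file_ref>` entries; the app names and version numbers in the sample are hypothetical, only the file names come from the snippet above.

```python
import xml.etree.ElementTree as ET


def cmdline_files_by_app(app_info_xml):
    """Map (app_name, version_num) -> file names opened as ap_cmdline.txt."""
    root = ET.fromstring(app_info_xml)
    result = {}
    for app_version in root.iter("app_version"):
        key = (
            app_version.findtext("app_name", "").strip(),
            app_version.findtext("version_num", "").strip(),
        )
        for ref in app_version.findall("file_ref"):
            if ref.findtext("open_name", "").strip() == "ap_cmdline.txt":
                result.setdefault(key, []).append(
                    ref.findtext("file_name", "").strip()
                )
    return result


# Hypothetical sample: app names and version numbers are placeholders.
SAMPLE = """<app_info>
  <app_version>
    <app_name>astropulse_v6</app_name><version_num>604</version_num>
    <file_ref>
      <file_name>ap_cmdline_win_x86_SSE2_OpenCL_ATI.txt</file_name>
      <open_name>ap_cmdline.txt</open_name>
    </file_ref>
  </app_version>
  <app_version>
    <app_name>astropulse_v7</app_name><version_num>700</version_num>
    <file_ref>
      <file_name>ap_cmdline_win_x86_SSE2_OpenCL_ATI.txt</file_name>
      <open_name>ap_cmdline.txt</open_name>
    </file_ref>
  </app_version>
</app_info>"""

if __name__ == "__main__":
    for key, files in cmdline_files_by_app(SAMPLE).items():
        print(key, files)
```

If both keys list the same file name, one set of tuning options is shared by both app versions.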


P.S.
He probably installed the new Lunatics and needs to look for his old ap_cmdline file in 'oldApp_backup'
 


- ALF - "Find out what you don't do well ..... then don't do it!" :)
 
ID: 1589200 · Report as offensive
Chris Adamek
Volunteer tester

Joined: 15 May 99
Posts: 251
Credit: 434,772,072
RAC: 236
United States
Message 1589254 - Posted: 20 Oct 2014, 13:49:44 UTC - in response to Message 1589200.  

Ah, version 6... That's why I shouldn't be trying to solve "mysteries" late at night haha...

Thanks!

Chris
ID: 1589254 · Report as offensive
HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1589259 - Posted: 20 Oct 2014, 13:57:03 UTC - in response to Message 1589254.  

Ah, version 6... That's why I shouldn't be trying to solve "mysteries" late at night haha...

Thanks!

Chris

That still doesn't explain why your AP v7 tasks lost the command line parameters you were using at the date I mentioned previously.
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the BP6/VP6 User Group: http://tinyurl.com/8y46zvu
ID: 1589259 · Report as offensive
Chris Adamek
Volunteer tester

Joined: 15 May 99
Posts: 251
Credit: 434,772,072
RAC: 236
United States
Message 1589261 - Posted: 20 Oct 2014, 14:00:48 UTC - in response to Message 1589259.  

That's because that is my mother's computer (she's in another town) and I have to walk her slowly through updates, so we take baby steps. The first step was to remove the existing settings and run with defaults to see how that does before I have her add some different settings that may be more applicable for AP7. =)

Thanks,

Chris
ID: 1589261 · Report as offensive
BilBg
Volunteer tester
Joined: 27 May 07
Posts: 3720
Credit: 9,385,827
RAC: 0
Bulgaria
Message 1589264 - Posted: 20 Oct 2014, 14:05:21 UTC - in response to Message 1589261.  

That's because that is my mother's computer (she's in another town) ...

Then you both may find this handy:
http://www.teamviewer.com/
 


- ALF - "Find out what you don't do well ..... then don't do it!" :)
 
ID: 1589264 · Report as offensive
Chris Adamek
Volunteer tester

Joined: 15 May 99
Posts: 251
Credit: 434,772,072
RAC: 236
United States
Message 1589300 - Posted: 20 Oct 2014, 15:51:54 UTC - in response to Message 1589264.  

Ok, just to prove I wasn't entirely crazy last night, here's a WU that is AP7, has zero percent blanking, and took over an hour to run. In AP6 these would run in about 16 minutes on this machine. My wingman completed it twice as fast with a much slower card. It quit with the 30 pulses found thing, but so did his. I'm not sure if my settings need to be tweaked or what. Would adding the CPU back to the GPU WUs help? Right now I'm just using the defaults to see how it performs. It looks like it basically isn't using any CPU anyway, so I'm not sure that would help at all.

WU: http://setiathome.berkeley.edu/workunit.php?wuid=1614282750

Thanks,

Chris

Stderr output

<core_client_version>7.2.42</core_client_version>
<![CDATA[
<stderr_txt>
Running on device number: 0
Priority of worker thread raised successfully
Priority of process adjusted successfully, below normal priority class used
OpenCL platform detected: Advanced Micro Devices, Inc.
BOINC assigns device 0
Info: BOINC provided OpenCL device ID used
Used GPU device parameters are:
Number of compute units: 32
Single buffer allocation size: 256MB
Total device global memory: 3072MB
max WG size: 256
local mem type: Real
-unroll default value used: 18
-ffa_block default value used: 8192
-ffa_block_fetch default value used: 4096

Build features: Non-graphics BLANKIT OpenCL TWIN_FFA OCL_ZERO_COPY COMBINED_DECHIRP_KERNEL FFTW USE_INCREASED_PRECISION USE_SSE2 x86
CPUID: Intel(R) Core(TM) i7-2600K CPU @ 3.40GHz

Cache: L1=64K L2=256K

CPU features: FPU TSC PAE CMPXCHG8B APIC SYSENTER MTRR CMOV/CCMP MMX FXSAVE/FXRSTOR SSE SSE2 HT SSE3 SSSE3 SSE4.1 SSE4.2 AVX
AstroPulse v7 Windows x86 rev 2721, V7 match, by Raistmer with support of Lunatics.kwsn.net team. SSE2

OpenCL version by Raistmer

oclFFT fix for ATI GPUs by Urs Echternacht
ffa threshold mods by Joe Segur
SSE3 dechirping by JDWhale
Combined dechirp kernel by Frizz
Built with uncommitted modifications
Number of OpenCL platforms: 1


OpenCL Platform Name: AMD Accelerated Parallel Processing
Number of devices: 2
Max compute units: 32
Max work group size: 256
Max clock frequency: 1000Mhz
Max memory allocation: 1073741824
Cache type: Read/Write
Cache line size: 64
Cache size: 16384
Global memory size: 3221225472
Constant buffer size: 65536
Max number of constant args: 8
Local memory type: Scratchpad
Local memory size: 32768
Queue properties:
Out-of-Order: No
Name: Tahiti
Vendor: Advanced Micro Devices, Inc.
Driver version: 1445.5 (VM)
Version: OpenCL 1.2 AMD-APP (1445.5)
Extensions: cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_gl_sharing cl_ext_atomic_counters_32 cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_media_ops2 cl_amd_popcnt cl_khr_d3d10_sharing cl_khr_d3d11_sharing cl_khr_dx9_media_sharing cl_khr_image2d_from_buffer cl_khr_spir cl_khr_gl_event
Max compute units: 32
Max work group size: 256
Max clock frequency: 1000Mhz
Max memory allocation: 1073741824
Cache type: Read/Write
Cache line size: 64
Cache size: 16384
Global memory size: 3221225472
Constant buffer size: 65536
Max number of constant args: 8
Local memory type: Scratchpad
Local memory size: 32768
Queue properties:
Out-of-Order: No
Name: Tahiti
Vendor: Advanced Micro Devices, Inc.
Driver version: 1445.5 (VM)
Version: OpenCL 1.2 AMD-APP (1445.5)
Extensions: cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_gl_sharing cl_ext_atomic_counters_32 cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_media_ops2 cl_amd_popcnt cl_khr_d3d10_sharing cl_khr_d3d11_sharing cl_khr_dx9_media_sharing cl_khr_image2d_from_buffer cl_khr_spir cl_khr_gl_event


state.fold_buf_size_short=65536; state.fold_buf_size_long=262144
Found 30 single pulses and 30 repeating pulses, exiting.
percent blanked: 0.00
Rep. pulse: num_std_devs=7.074 peak_power=3413.804 dm=960 peak_bin=4016 scale=4 ffa_scale=0 period=345.7128
Rep. pulse: num_std_devs=7.91 peak_power=9069696 dm=-896 peak_bin=65536 scale=7 ffa_scale=9 period=1889.473
Rep. pulse: num_std_devs=7.94 peak_power=9069791 dm=-896 peak_bin=65536 scale=7 ffa_scale=8 period=945.2627
Rep. pulse: num_std_devs=7.912 peak_power=9069701 dm=-896 peak_bin=0 scale=7 ffa_scale=7 period=473.3242
Rep. pulse: num_std_devs=7.973 peak_power=9069898 dm=-896 peak_bin=0 scale=7 ffa_scale=6 period=236.7398
Rep. pulse: num_std_devs=8.248 peak_power=9005147 dm=-896 peak_bin=8192 scale=7 ffa_scale=5 period=119.4203
Rep. pulse: num_std_devs=8.408 peak_power=9005664 dm=-896 peak_bin=2048 scale=7 ffa_scale=4 period=59.77418
Rep. pulse: num_std_devs=9.18 peak_power=4542300 dm=-896 peak_bin=2560 scale=7 ffa_scale=2 period=29.64422
Rep. pulse: num_std_devs=8.873 peak_power=9007169 dm=-896 peak_bin=1024 scale=7 ffa_scale=3 period=29.80409
Rep. pulse: num_std_devs=9.772 peak_power=9075752 dm=-896 peak_bin=0 scale=7 ffa_scale=2 period=14.7431
Rep. pulse: num_std_devs=10.1 peak_power=9076810 dm=-896 peak_bin=0 scale=7 ffa_scale=1 period=7.398146
Rep. pulse: num_std_devs=11.28 peak_power=1.814374e+007 dm=-896 peak_bin=0 scale=7 ffa_scale=2 period=7.371552
Rep. pulse: num_std_devs=10.79 peak_power=9079049 dm=-896 peak_bin=384 scale=7 ffa_scale=0 period=3.685776
Rep. pulse: num_std_devs=11.54 peak_power=1.801366e+007 dm=-896 peak_bin=256 scale=7 ffa_scale=1 period=3.734875
Rep. pulse: num_std_devs=8.756 peak_power=4508520 dm=-896 peak_bin=768 scale=7 ffa_scale=0 period=7.451507
Rep. pulse: num_std_devs=8.784 peak_power=4508580 dm=-896 peak_bin=512 scale=7 ffa_scale=1 period=14.85392
Rep. pulse: num_std_devs=7.396 peak_power=4505521 dm=-896 peak_bin=2048 scale=7 ffa_scale=3 period=59.77441
Rep. pulse: num_std_devs=7.889 peak_power=2273097 dm=-896 peak_bin=2560 scale=7 ffa_scale=1 period=29.64422
Rep. pulse: num_std_devs=7.741 peak_power=2256440 dm=-896 peak_bin=256 scale=7 ffa_scale=0 period=14.885
Rep. pulse: num_std_devs=6.761 peak_power=4504121 dm=-896 peak_bin=10240 scale=7 ffa_scale=4 period=119.0015
Rep. pulse: num_std_devs=6.603 peak_power=4503773 dm=-896 peak_bin=16384 scale=7 ffa_scale=5 period=239.1049
Rep. pulse: num_std_devs=7.024 peak_power=2271770 dm=-896 peak_bin=2560 scale=7 ffa_scale=2 period=59.28843
Rep. pulse: num_std_devs=8.262 peak_power=9070838 dm=-896 peak_bin=4096 scale=7 ffa_scale=4 period=59.13484
Rep. pulse: num_std_devs=7.025 peak_power=4537531 dm=-896 peak_bin=1024 scale=7 ffa_scale=3 period=59.00347
Rep. pulse: num_std_devs=11.4 peak_power=1.81443e+007 dm=-896 peak_bin=0 scale=7 ffa_scale=1 period=3.692151
Rep. pulse: num_std_devs=8.04 peak_power=9070117 dm=-896 peak_bin=8192 scale=7 ffa_scale=5 period=118.0096
Rep. pulse: num_std_devs=8.59 peak_power=4540996 dm=-896 peak_bin=384 scale=7 ffa_scale=0 period=7.371412
Rep. pulse: num_std_devs=8.418 peak_power=9071346 dm=-896 peak_bin=1024 scale=7 ffa_scale=3 period=29.50117
Rep. pulse: num_std_devs=6.423 peak_power=4536199 dm=-896 peak_bin=16384 scale=7 ffa_scale=5 period=236.0391
Rep. pulse: num_std_devs=6.405 peak_power=4568983 dm=-896 peak_bin=24576 scale=7 ffa_scale=6 period=471.3782
Single pulse: peak_power=51.9643 dm=3512 fft_num=17088512 peak_bin=17096776 scale=3
Single pulse: peak_power=129.631 dm=-3647 fft_num=4194304 peak_bin=4206912 scale=6
Single pulse: peak_power=62.1228 dm=4889 fft_num=835584 peak_bin=841504 scale=4
Single pulse: peak_power=91.1106 dm=5043 fft_num=4653056 peak_bin=4663808 scale=5
Single pulse: peak_power=62.0734 dm=5118 fft_num=4653056 peak_bin=4663824 scale=4
Single pulse: peak_power=63.7816 dm=5246 fft_num=4653056 peak_bin=4663840 scale=4
Single pulse: peak_power=50.9821 dm=5285 fft_num=4653056 peak_bin=4663848 scale=3
Single pulse: peak_power=90.4884 dm=5333 fft_num=4653056 peak_bin=4663840 scale=5
Single pulse: peak_power=50.2062 dm=5341 fft_num=4653056 peak_bin=4663856 scale=3
Single pulse: peak_power=67.1771 dm=5379 fft_num=4653056 peak_bin=4663856 scale=4
Single pulse: peak_power=63.1898 dm=5552 fft_num=27033600 peak_bin=27044320 scale=4
Single pulse: peak_power=132.345 dm=-5657 fft_num=17186816 peak_bin=17194624 scale=6
Single pulse: peak_power=367.452 dm=-6308 fft_num=163840 peak_bin=175872 scale=8
Single pulse: peak_power=131.997 dm=6556 fft_num=32440320 peak_bin=32445952 scale=6
Single pulse: peak_power=133.527 dm=-6780 fft_num=28016640 peak_bin=28025856 scale=6
Single pulse: peak_power=61.5109 dm=6784 fft_num=3260416 peak_bin=3269648 scale=4
Single pulse: peak_power=40.3592 dm=6891 fft_num=18153472 peak_bin=18167252 scale=2
Single pulse: peak_power=213.025 dm=7131 fft_num=14942208 peak_bin=14952832 scale=7
Single pulse: peak_power=61.49 dm=7289 fft_num=25575424 peak_bin=25585104 scale=4
Single pulse: peak_power=133.695 dm=-7325 fft_num=25968640 peak_bin=25981376 scale=6
Single pulse: peak_power=94.9201 dm=-7376 fft_num=30588928 peak_bin=30600512 scale=5
Single pulse: peak_power=129.122 dm=-7578 fft_num=3080192 peak_bin=3096512 scale=6
Single pulse: peak_power=373.281 dm=-7836 fft_num=31080448 peak_bin=31093504 scale=8
Single pulse: peak_power=215.015 dm=-7810 fft_num=29491200 peak_bin=29496192 scale=7
Single pulse: peak_power=214.059 dm=-8181 fft_num=17842176 peak_bin=17848320 scale=7
Single pulse: peak_power=62.0821 dm=8429 fft_num=17301504 peak_bin=17303728 scale=4
Single pulse: peak_power=87.8093 dm=8727 fft_num=22118400 peak_bin=22129184 scale=5
Single pulse: peak_power=89.1103 dm=-8874 fft_num=12861440 peak_bin=12863008 scale=5
Single pulse: peak_power=129.344 dm=-8942 fft_num=12861440 peak_bin=12862976 scale=6
Single pulse: peak_power=61.3828 dm=-9173 fft_num=8486912 peak_bin=8497984 scale=4

class T_remove_radar: total=1.96e+009, N=1, <>=1.96e+009, min=1.96e+009, max=1.96e+009
class T_main_loop_L1: total=1.31e+013, N=64, <>=2.05e+011, min=5.91e+010, max=8.38e+012
class T_FFT_forward: total=1.74e+009, N=58967, <>=2.95e+004, min=1.66e+004, max=3.56e+006
class T_remove_radar_randomize: total=0.00e+000, N=0, <>=0.00e+000, min=1.84e+019, max=0.00e+000
class T_build_chirp_table: total=0.00e+000, N=0, <>=0.00e+000, min=1.84e+019, max=0.00e+000
class T_DataWrite: total=0.00e+000, N=0, <>=0.00e+000, min=1.84e+019, max=0.00e+000
class T_DataWrite_ns: total=0, N=0, <>=0, min=0 max=0
class T_oclReadBuf: total=0.00e+000, N=0, <>=0.00e+000, min=1.84e+019, max=0.00e+000
class T_ChirpWrite: total=0.00e+000, N=0, <>=0.00e+000, min=1.84e+019, max=0.00e+000
class T_ChirpWrite_ns: total=0, N=0, <>=0, min=0 max=0
class T_dechirp: total=2.03e+009, N=58967, <>=3.45e+004, min=1.97e+004, max=4.71e+007
class Dechirp_ns: total=0, N=0, <>=0, min=0 max=0
class Half_ns: total=0, N=0, <>=0, min=0 max=0
class T_PC_single_pulse_kernel_FFA_update: total=4.80e+012, N=58967, <>=8.14e+007, min=2.49e+007, max=4.39e+008
class PC_ns: total=0, N=0, <>=0, min=0 max=0
class T_oclReadBuf: total=0.00e+000, N=0, <>=0.00e+000, min=1.84e+019, max=0.00e+000
class T_oclWriteBuf: total=0.00e+000, N=0, <>=0.00e+000, min=1.84e+019, max=0.00e+000
class T_FFT_inverse: total=0.00e+000, N=0, <>=0.00e+000, min=1.84e+019, max=0.00e+000
class T_ffa: total=8.32e+012, N=9, <>=9.25e+011, min=4.79e+008, max=8.32e+012

FFA blocks counters:
class T_FFA_fetch: total=0.00e+000, N=0, <>=0.00e+000, min=1.84e+019, max=0.00e+000
class T_FFA_tt_build: total=0.00e+000, N=0, <>=0.00e+000, min=1.84e+019, max=0.00e+000
class T_FFA_compare: total=1.56e+012, N=18, <>=8.68e+010, min=1.55e+006, max=1.12e+012
class T_FFA_coadd: total=6.99e+006, N=379, <>=1.85e+004, min=3.23e+003, max=6.58e+005
class T_FFA_stride_add: total=7.85e+005, N=16, <>=4.90e+004, min=1.91e+004, max=1.30e+005
class T_GPU_buffer_read_backs: total=77, N=77, <>=1, min=1 max=1
TWIN_FFA OCL_ZERO_COPY USE_OPENCL OPENCL_WRITE USE_INCREASED_PRECISION SMALL_CHIRP_TABLE COMBINED_DECHIRP_KERNEL BLANKIT
rev 2721
GPU device sync requested... ...GPU device synched
02:21:28 (3024): called boinc_finish(0)

</stderr_txt>
]]>
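
The "class ..." counter lines in a stderr like the one above can be parsed to see whether a single call dominates a counter. In this log, for instance, T_ffa shows total=8.32e+012 and max=8.32e+012 with N=9. The sketch below assumes only the line layout visible in the log; the units of the counters are not stated there, and the function names are invented for illustration.

```python
import re

# Matches both layouts seen in the log: ", max=" and " max=" (the *_ns lines).
COUNTER_RE = re.compile(
    r"class (\S+): total=([^,\s]+), N=(\d+), <>=([^,\s]+), "
    r"min=([^,\s]+),? max=(\S+)"
)


def parse_counters(stderr_text):
    """Return one dict per 'class ...' counter line, in log order."""
    rows = []
    for match in COUNTER_RE.finditer(stderr_text):
        name, total, n, mean, lowest, highest = match.groups()
        rows.append({
            "name": name,
            "total": float(total),
            "n": int(n),
            "mean": float(mean),
            "min": float(lowest),
            "max": float(highest),
        })
    return rows


def single_call_share(row):
    """Fraction of a counter's total spent in its single longest call."""
    if row["n"] == 0 or row["total"] == 0:
        return None  # counter was never hit
    return row["max"] / row["total"]
```

A share close to 1.0 with N greater than 1 points at one unusually long call rather than uniformly slow ones.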
ID: 1589300 · Report as offensive
HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1589328 - Posted: 20 Oct 2014, 16:47:35 UTC

I have noticed that tasks with many pulses take longer than ones with only a few or none. Could it be that the 30/30 pulse exit wasn't triggered until late in the task instead of early?
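
Counting the reported pulses in a stderr is at least easy to script. This rough sketch only confirms whether a task hit 30 of each, based on the "Single pulse:" and "Rep. pulse:" line prefixes seen in the logs in this thread; it says nothing about when during the run the exit happened. The function name is made up for illustration.

```python
def count_pulses(stderr_text):
    """Return (single_pulses, repeating_pulses) reported in a stderr."""
    single = 0
    repeating = 0
    for line in stderr_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("Single pulse:"):
            single += 1
        elif stripped.startswith("Rep. pulse:"):
            repeating += 1
    return single, repeating
```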
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the BP6/VP6 User Group: http://tinyurl.com/8y46zvu
ID: 1589328 · Report as offensive
Josef W. Segur
Volunteer developer
Volunteer tester

Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 1589339 - Posted: 20 Oct 2014, 17:18:10 UTC - in response to Message 1589200.  

BilBg wrote:
Also, the tuning parameters from AP v7 won't be picked up, they need to be set separately.

Strange, I have one and the same file:
ap_cmdline_win_x86_SSE2_OpenCL_ATI.txt

... that is "picked up" for both AP v6 and AP v7 - as per app_info.xml
        <file_ref>
            <file_name>ap_cmdline_win_x86_SSE2_OpenCL_ATI.txt</file_name>
            <open_name>ap_cmdline.txt</open_name>
        </file_ref>

...

Thanks for catching that, I assumed they were separate without checking. And since there's only one ReadMe_AstroPulse_OpenCL_ATI.txt I'm not sure about the options being different anyhow.

Now that Chris has noted an actual AP v7 task which ran long on that host, it's probably best to focus on that side of the problem. I don't have any specific ideas on that yet.
                                                                  Joe
ID: 1589339 · Report as offensive
Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1590159 - Posted: 22 Oct 2014, 13:52:43 UTC

Either free a CPU core or use -cpu_lock in the command line.
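
For the second option, a small helper can add the switch to an existing command line string without duplicating it. This is only a sketch under two assumptions that are not confirmed in this thread: that the ap_cmdline file holds whitespace-separated options, and that -cpu_lock is a plain switch that takes no value. The helper name is hypothetical.

```python
def ensure_flag(cmdline, flag):
    """Return cmdline with flag appended, unless it is already present."""
    tokens = cmdline.split()
    if flag not in tokens:
        tokens.append(flag)
    return " ".join(tokens)


if __name__ == "__main__":
    # Option values below are the defaults reported in the stderr earlier
    # in this thread; writing them out explicitly is just for illustration.
    current = "-unroll 18 -ffa_block 8192 -ffa_block_fetch 4096"
    print(ensure_flag(current, "-cpu_lock"))
```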
ID: 1590159 · Report as offensive



 