I've Built a Couple OSX CUDA Apps...

Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1754410 - Posted: 6 Jan 2016, 19:09:27 UTC - in response to Message 1754398.  

TBar, try rebuilding from revision 3321.
I added a -no_caching option that inhibits caching of CL files.
Caching can also be disabled on a per-device basis in multi-GPU configs.

TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1754558 - Posted: 7 Jan 2016, 12:19:51 UTC - in response to Message 1754410.  

I downloaded r3321 and compiled the App without any problems, but on the bench run I get:
07:02:51 (12550): Can't open init data file - running in standalone mode
Not using mb_cmdline.txt-file, using commandline options.
07:02:51 (12550): Can't open init data file - running in standalone mode
WARNING: init_data.xml missing
OpenCL platform detected: Apple
WARNING: BOINC supplied wrong platform!
Number of OpenCL devices found : 1 
BOINC assigns slot on device #0.
WARNING: BOINC failed to provide OpenCL device, using own enumeration abilities

Build features: SETI8 Non-graphics OpenCL OCL_CHIRP3 ASYNC_SPIKE FFTW SSSE3 64bit 
 System: Darwin  x86_64  Kernel: 12.6.0
CPU : Intel(R) Xeon(R) CPU           E5472  @ 3.00GHz 
 GenuineIntel x86, Family 6 Model 23 Stepping 6
 Features : FPU TSC PAE APIC MTRR MMX SSE  SSE2 HT  SSE3 SSSE3 SSE4.1  

OpenCL-kernels filename : MultiBeam_Kernels_r3321.cl 
INFO: can't open binary kernel file: .//MultiBeam_Kernels_r3321.cl_ATIRadeonBartsXTPrototype.bin_V7_12.6.0_10, continue with recompile...
Info : Building Program (binary, clBuildProgram):main kernels: OK code 0
INFO: binary kernel file created
WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_524288_gr64_lr16_wg256_tw0_r3321.bin_12.6.0_10, continue with recompile...
WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_8_gr64_lr16_wg256_tw0_r3321.bin_12.6.0_10, continue with recompile...
WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_16_gr64_lr16_wg256_tw0_r3321.bin_12.6.0_10, continue with recompile...
WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_32_gr64_lr16_wg256_tw0_r3321.bin_12.6.0_10, continue with recompile...
WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_64_gr64_lr16_wg256_tw0_r3321.bin_12.6.0_10, continue with recompile...
WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_128_gr64_lr16_wg256_tw0_r3321.bin_12.6.0_10, continue with recompile...
WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_256_gr64_lr16_wg256_tw0_r3321.bin_12.6.0_10, continue with recompile...
WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_512_gr64_lr16_wg256_tw0_r3321.bin_12.6.0_10, continue with recompile...
WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_1024_gr64_lr16_wg256_tw0_r3321.bin_12.6.0_10, continue with recompile...
WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_2048_gr64_lr16_wg256_tw0_r3321.bin_12.6.0_10, continue with recompile...
WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_4096_gr64_lr16_wg256_tw0_r3321.bin_12.6.0_10, continue with recompile...
WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_8192_gr64_lr16_wg256_tw0_r3321.bin_12.6.0_10, continue with recompile...
WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_16384_gr64_lr16_wg256_tw0_r3321.bin_12.6.0_10, continue with recompile...
WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_32768_gr64_lr16_wg256_tw0_r3321.bin_12.6.0_10, continue with recompile...
WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_65536_gr64_lr16_wg256_tw0_r3321.bin_12.6.0_10, continue with recompile...
WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_131072_gr64_lr16_wg256_tw0_r3321.bin_12.6.0_10, continue with recompile...
ar=0.775000  NumCfft=1169  NumGauss=6087368  NumPulse=1197108460  NumTriplet=2300559776
Currently allocated 209 MB for GPU buffers
ERROR: OpenCL kernel/call 'Creating RepackInput_kernel_cl from program' call failed (-46) in file analyzeFuncs.cpp near line 1559.
Waiting 30 sec before restart...

I thought it might be worth compiling an Intel GPU App. Using the same downloaded r3321, everything went normally right up to the very end and then stopped with:
Undefined symbols for architecture x86_64:
  "boinc_resolve_filename_s(char const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >&)", referenced from:
      _main in seti_boinc-main.o
      seti_init_state() in seti_boinc-seti.o
      worker() in seti_boinc-worker.o
      initializeCL(char const*) in seti_boinc-GPU_lock.o
  "std::string::_M_leak_hard()", referenced from:
      strip_whitespace(std::string&) in libboinc.a(libboinc_la-str_util.o)
  "std::string::_Rep::_M_destroy(std::allocator<char> const&)", referenced from:
      std::vector<std::string, std::allocator<std::string> >::_M_insert_aux(__gnu_cxx::__normal_iterator<std::string*, std::vector<std::string, std::allocator<std::string> > >, std::string const&) in libboinc.a(libboinc_la-util.o)
      timer_thread(void*) in libboinc_api.a(boinc_api.o)
      _boinc_receive_trickle_down in libboinc_api.a(boinc_api.o)
      boinc_upload_file(std::string&) in libboinc_api.a(boinc_api.o)
      std::vector<UPLOAD_FILE_STATUS, std::allocator<UPLOAD_FILE_STATUS> >::~vector() in libboinc_api.a(boinc_api.o)
      std::vector<UPLOAD_FILE_STATUS, std::allocator<UPLOAD_FILE_STATUS> >::_M_insert_aux(__gnu_cxx::__normal_iterator<UPLOAD_FILE_STATUS*, std::vector<UPLOAD_FILE_STATUS, std::allocator<UPLOAD_FILE_STATUS> > >, UPLOAD_FILE_STATUS const&) in libboinc_api.a(boinc_api.o)
      skip_unrecognized(char*, MIOFILE&) in libboinc.a(libboinc_la-parse.o)
//Lots More Similar to the Above and then;
  "std::basic_istream<char, std::char_traits<char> >& std::getline<char, std::char_traits<char>, std::allocator<char> >(std::basic_istream<char, std::char_traits<char> >&, std::basic_string<char, std::char_traits<char>, std::allocator<char> >&, char)", referenced from:
      split(std::string, char) in libboinc.a(libboinc_la-str_util.o)
  "___sincos_stret", referenced from:
      v_vChirpData(float (*) [2], float (*) [2], int, double, int, int, double) in seti_boinc-analyzeFuncs.o
      v_vpChirpData(float (*) [2], float (*) [2], int, double, int, int, double) in seti_boinc-analyzeFuncs.o
      _clFFT_CreatePlanAdv in seti_boinc-fft_setup.o
ld: symbol(s) not found for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see invocation)
cp: seti_boinc: No such file or directory
error: strip: can't open file: MBv8_8.4r3320_ssse3_clGPU_x86_64-apple-darwin (No such file or directory)

This "clang: error: linker command failed with exit code 1" is very similar to what I was getting with boinc-master ver 7.7. There isn't much difference between the two compiles, so it is strange that only the Intel GPU compile gives the linker error. I'll try a different configure line on the Intel build.

So, what about "ERROR: OpenCL kernel/call 'Creating RepackInput_kernel_cl from program' call failed (-46) in file analyzeFuncs.cpp near line 1559" on the first build?

Raistmer
Message 1754562 - Posted: 7 Jan 2016, 12:46:09 UTC - in response to Message 1754558.  


So, what about "ERROR: OpenCL kernel/call 'Creating RepackInput_kernel_cl from program' call failed (-46) in file analyzeFuncs.cpp near line 1559" on the first build?

Define SETI7 and rebuild.
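
For context, the (-46) in that log is an OpenCL status code. In the standard OpenCL headers, -46 is CL_INVALID_KERNEL_NAME, which clCreateKernel returns when the requested kernel name is not found in the built program. That would be consistent with RepackInput_kernel_cl being compiled out of the CL file when SETI7 is not defined, although the thread does not show the kernel source, so treat that as an inference. A minimal Python sketch for decoding such codes from a log line follows; the helper name `explain_cl_error` and the small lookup table are illustrative and not part of the app.

```python
import re

# A few status codes from the standard OpenCL header (CL/cl.h).
CL_ERRORS = {
    -45: "CL_INVALID_PROGRAM_EXECUTABLE",
    -46: "CL_INVALID_KERNEL_NAME",
    -47: "CL_INVALID_KERNEL_DEFINITION",
    -48: "CL_INVALID_KERNEL",
}

def explain_cl_error(log_line):
    """Return (code, symbolic name) for a 'call failed (N)' log line, or None."""
    match = re.search(r"call failed \((-?\d+)\)", log_line)
    if match is None:
        return None
    code = int(match.group(1))
    # Codes outside the small table above are reported as "unknown".
    return code, CL_ERRORS.get(code, "unknown")

line = ("ERROR: OpenCL kernel/call 'Creating RepackInput_kernel_cl from program' "
        "call failed (-46) in file analyzeFuncs.cpp near line 1559.")
print(explain_cl_error(line))
```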

TBar
Message 1754578 - Posted: 7 Jan 2016, 13:48:50 UTC - in response to Message 1754562.  


Well, that does seem to fix the error. However, now the results say:
Build features: SETI7 Non-graphics OpenCL OCL_CHIRP3 ASYNC_SPIKE FFTW SSSE3 64bit
OS X optimized S@H v7 application (based on S@H Enhanced by Alex Kan)

It does say SSSE3x OS X 64bit Build 3321 though.
So, it will just confuse the Version Police?

For the Intel build I found that changing -mmacosx-version-min=10.9 to -mmacosx-version-min=10.8 allowed it to finish. However, it too crashed with "OpenCL kernel/call 'Creating RepackInput_kernel_cl from program' call failed (-46) in file analyzeFuncs.cpp near line 1559". So I guess if I use the same r3321 I'll have to change that one to SETI7 as well?
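
A possible reading of the linker failure quoted earlier in the thread: the undefined symbols mix two C++ standard libraries. Names under std::__1:: belong to libc++, while std::string::_Rep and _M_leak_hard are libstdc++ internals, so the app objects and the BOINC static libraries appear to have been compiled against different runtimes. Clang selects libc++ by default once the deployment target is 10.9 or later, which would explain why dropping to 10.8 let the link finish. The Python sketch below sorts symbol names by runtime; the function name and the marker lists are illustrative assumptions, not a complete demangler.

```python
# Substrings that hint at the C++ runtime a demangled symbol came from.
LIBCXX_MARKERS = ("std::__1::",)
LIBSTDCXX_MARKERS = ("::_Rep::", "::_M_", "__gnu_cxx::")

def classify_symbol(symbol):
    """Guess which C++ runtime a demangled symbol name belongs to."""
    if any(marker in symbol for marker in LIBCXX_MARKERS):
        return "libc++"
    if any(marker in symbol for marker in LIBSTDCXX_MARKERS):
        return "libstdc++"
    return "unknown"

# Symbol names copied from the linker output in the thread.
symbols = [
    "boinc_resolve_filename_s(char const*, std::__1::basic_string<char, "
    "std::__1::char_traits<char>, std::__1::allocator<char> >&)",
    "std::string::_M_leak_hard()",
    "std::string::_Rep::_M_destroy(std::allocator<char> const&)",
    "___sincos_stret",
]
for name in symbols:
    print(classify_symbol(name), "<-", name[:50])
```

If both labels show up in one link, rebuilding the BOINC libraries and the app with the same -mmacosx-version-min (or the same explicit -stdlib setting) is the usual way to make them agree.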

Richard Haselgrove
Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14659
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1754581 - Posted: 7 Jan 2016, 13:53:34 UTC - in response to Message 1754578.  


Builds for public release should always report their capabilities accurately.

It helps everyone, not least the application user, to check that they've done everything right. This isn't just a game.

Raistmer
Message 1754588 - Posted: 7 Jan 2016, 14:12:16 UTC - in response to Message 1754578.  
Last modified: 7 Jan 2016, 14:50:53 UTC


Did I say "undefine SETI8"?
Define SETI8 back, leave SETI7 untouched (defined!) and rebuild.

And update to r3323: it will not allow a wrong mix of defines.

TBar
Message 1754606 - Posted: 7 Jan 2016, 15:14:22 UTC
Last modified: 7 Jan 2016, 15:16:26 UTC

Success
It completes the Benchmarks correctly, giving results consistent with the other v8 Apps. So...I suppose I'll post it at Crunchers Anonymous so the other Tom can try it on his iMacs. Since I can't test the Intel GPU App either, I guess I'm also going to have to post that somewhere where others can test it.

It would be nice to receive a little feedback on the Apps posted at Crunchers Anonymous, especially the Intel GPU App...when it's posted. I'm going to rebuild the AVX CPU App, and I can't test that one either...

09:55:17 (26504): Can't open init data file - running in standalone mode
Not using mb_cmdline.txt-file, using commandline options.
09:55:17 (26504): Can't open init data file - running in standalone mode
WARNING: init_data.xml missing
OpenCL platform detected: Apple
WARNING: BOINC supplied wrong platform!
Number of OpenCL devices found : 1 
BOINC assigns slot on device #0.
WARNING: BOINC failed to provide OpenCL device, using own enumeration abilities
Build features: SETI8 Non-graphics OpenCL OCL_CHIRP3 ASYNC_SPIKE FFTW SSSE3 64bit 
 System: Darwin  x86_64  Kernel: 12.6.0
CPU : Intel(R) Xeon(R) CPU           E5472  @ 3.00GHz 
 GenuineIntel x86, Family 6 Model 23 Stepping 6
 Features : FPU TSC PAE APIC MTRR MMX SSE  SSE2 HT  SSE3 SSSE3 SSE4.1  

OpenCL-kernels filename : MultiBeam_Kernels_r3321.cl 
INFO: can't open binary kernel file: .//MultiBeam_Kernels_r3321.cl_ATIRadeonBartsXTPrototype.bin_V7_12.6.0_10, continue with recompile...
Info : Building Program (binary, clBuildProgram):main kernels: OK code 0
INFO: binary kernel file created
WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_524288_gr64_lr16_wg256_tw0_r3321.bin_12.6.0_10, continue with recompile...
WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_8_gr64_lr16_wg256_tw0_r3321.bin_12.6.0_10, continue with recompile...
WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_16_gr64_lr16_wg256_tw0_r3321.bin_12.6.0_10, continue with recompile...
WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_32_gr64_lr16_wg256_tw0_r3321.bin_12.6.0_10, continue with recompile...
WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_64_gr64_lr16_wg256_tw0_r3321.bin_12.6.0_10, continue with recompile...
WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_128_gr64_lr16_wg256_tw0_r3321.bin_12.6.0_10, continue with recompile...
WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_256_gr64_lr16_wg256_tw0_r3321.bin_12.6.0_10, continue with recompile...
WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_512_gr64_lr16_wg256_tw0_r3321.bin_12.6.0_10, continue with recompile...
WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_1024_gr64_lr16_wg256_tw0_r3321.bin_12.6.0_10, continue with recompile...
WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_2048_gr64_lr16_wg256_tw0_r3321.bin_12.6.0_10, continue with recompile...
WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_4096_gr64_lr16_wg256_tw0_r3321.bin_12.6.0_10, continue with recompile...
WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_8192_gr64_lr16_wg256_tw0_r3321.bin_12.6.0_10, continue with recompile...
WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_16384_gr64_lr16_wg256_tw0_r3321.bin_12.6.0_10, continue with recompile...
WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_32768_gr64_lr16_wg256_tw0_r3321.bin_12.6.0_10, continue with recompile...
WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_65536_gr64_lr16_wg256_tw0_r3321.bin_12.6.0_10, continue with recompile...
WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_131072_gr64_lr16_wg256_tw0_r3321.bin_12.6.0_10, continue with recompile...
ar=0.775000  NumCfft=1169  NumGauss=6087368  NumPulse=1197108460  NumTriplet=2300559776
Currently allocated 209 MB for GPU buffers
In v_BaseLineSmooth: NumDataPoints=1048576, BoxCarLength=8192, NumPointsInChunk=32768
OS X optimized setiathome_v8 application
Version info: SSSE3x (Intel, Core 2-optimized v8-nographics) V5.13 by Alex Kan
SSSE3x OS X 64bit Build 3321 , Ported by : Raistmer, JDWhale, Urs Echternacht

OpenCL version by Raistmer, r3321
Number of OpenCL platforms:				 1

 OpenCL Platform Name:					 Apple
Number of devices:				 1
  Max compute units:				 14
  Max work group size:				 1024
  Max clock frequency:				 900Mhz
  Max memory allocation:			 268435456
  Cache type:					 None
  Cache line size:				 0
  Cache size:					 0
  Global memory size:				 1073741824
  Constant buffer size:				 65536
  Max number of constant args:			 8
  Local memory type:				 Scratchpad
  Local memory size:				 32768
  Queue properties:				 
    Out-of-Order:				 No
  Name:						 ATI Radeon Barts XT Prototype
  Vendor:					 AMD
  Driver version:				 1.0
  Version:					 OpenCL 1.1 
  Extensions:					 cl_APPLE_SetMemObjectDestructor cl_APPLE_ContextLoggingFunctions cl_APPLE_clut cl_APPLE_query_kernel_names cl_APPLE_gl_sharing cl_khr_gl_event cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store

Work Unit Info:
...............
Credit multiplier is :  2.85
WU true angle range is :  0.775000
Used GPU device parameters are:
	Number of compute units: 14
	Single buffer allocation size: 128MB
	Total device global memory: 1024MB
	max WG size: 1024
	local mem type: Real
	LotOfMem path: no
period_iterations_num=50
Pulse: peak=0.6048146, time=69.31, period=0.1079, d_freq=1418922119.13, score=1.045, chirp=0, fft_len=32 
Triplet: peak=8.319599, time=80.39, period=0.5439, d_freq=1418924865.71, chirp=0, fft_len=128 
Triplet: peak=8.403753, time=80.39, period=0.5439, d_freq=1418924865.71, chirp=0, fft_len=128 
Autocorr: peak=12.93238, time=20.13, delay=4.969, d_freq=1418920271.34, chirp=-0.83183, fft_len=128k
Spike: peak=19.36639, time=46.98, d_freq=1418916062.37, chirp=2.4955, fft_len=128k
Spike: peak=19.67712, time=73.82, d_freq=1418918905.44, chirp=-4.8986, fft_len=128k
Spike: peak=21.39583, time=33.55, d_freq=1418915733.83, chirp=-5.638, fft_len=128k
Spike: peak=18.88078, time=20.13, d_freq=1418920463.53, chirp=-7.0244, fft_len=128k
Autocorr: peak=13.35424, time=33.55, delay=2.1202, d_freq=1418920561, chirp=8.1335, fft_len=128k
Spike: peak=18.92986, time=20.13, d_freq=1418917368.68, chirp=-9.335, fft_len=128k
Spike: peak=19.28573, time=87.24, d_freq=1418920545.39, chirp=-11.276, fft_len=128k
Spike: peak=19.21027, time=90.26, d_freq=1418919043.86, chirp=-11.461, fft_len=512 
Gaussian: peak=2.568508, mean=0.6189198, ChiSq=0.9575716, time=57.88, d_freq=1418922601.42,
	score=0.1431332, null_hyp=1.422762, chirp=-11.831, fft_len=16k
Gaussian: peak=2.695651, mean=0.6148685, ChiSq=0.93757, time=59.56, d_freq=1418922581.57,
	score=0.1267667, null_hyp=1.421357, chirp=-11.831, fft_len=16k
Gaussian: peak=2.417054, mean=0.6235663, ChiSq=0.9921176, time=61.24, d_freq=1418922561.73,
	score=0.376936, null_hyp=1.442612, chirp=-11.831, fft_len=16k
Gaussian: peak=1.821032, mean=0.7009013, ChiSq=0.98085, time=17.62, d_freq=1418922404.5,
	score=0.2538466, null_hyp=1.432214, chirp=20.056, fft_len=8k
Gaussian: peak=1.701029, mean=0.6977942, ChiSq=1.018962, time=19.29, d_freq=1418922438.15,
	score=0.4377575, null_hyp=1.456448, chirp=20.056, fft_len=8k
Spike: peak=19.76917, time=23.07, d_freq=1418916784.01, chirp=-20.796, fft_len=8k
Spike: peak=19.70372, time=90.26, d_freq=1418919039.42, chirp=-22.922, fft_len=512 
Pulse: peak=8.095228, time=89.99, period=4.535, d_freq=1418919045.43, score=1.089, chirp=-22.922, fft_len=512 
Gaussian: peak=2.40371, mean=0.6120579, ChiSq=1.006094, time=9.227, d_freq=1418920421.68,
	score=0.7348762, null_hyp=1.474921, chirp=-22.922, fft_len=16k
Gaussian: peak=2.599831, mean=0.609823, ChiSq=0.9212406, time=12.58, d_freq=1418920344.77,
	score=0.1807008, null_hyp=1.425981, chirp=-22.922, fft_len=16k
Gaussian: peak=2.473597, mean=0.6138971, ChiSq=0.9926966, time=14.26, d_freq=1418920306.31,
	score=0.5439019, null_hyp=1.456539, chirp=-22.922, fft_len=16k
Spike: peak=20.13485, time=90.26, d_freq=1418919042.56, chirp=34.382, fft_len=512 
Gaussian: peak=2.614028, mean=0.6185501, ChiSq=1.010624, time=39.43, d_freq=1418919421.81,
	score=0.9043522, null_hyp=1.490616, chirp=-36.878, fft_len=16k
Gaussian: peak=2.58274, mean=0.6202052, ChiSq=0.995735, time=41.1, d_freq=1418919359.94,
	score=0.8745441, null_hyp=1.483554, chirp=-36.878, fft_len=16k
Spike: peak=18.54538, time=95.63, d_freq=1418917049.85, chirp=-41.407, fft_len=32k
Pulse: peak=2.358688, time=76.4, period=0.7864, d_freq=1418920638.41, score=1.132, chirp=-45.843, fft_len=256 
Gaussian: peak=2.380434, mean=0.5830942, ChiSq=1.010076, time=31.04, d_freq=1418920482.46,
	score=1.3315, null_hyp=1.524061, chirp=-46.49, fft_len=16k

Best spike: peak=21.39583, time=33.55, d_freq=1418915733.83, chirp=-5.638, fft_len=128k
Best autocorr: peak=13.35424, time=33.55, delay=2.1202, d_freq=1418920561, chirp=8.1335, fft_len=128k
Best gaussian: peak=2.380434, mean=0.5830942, ChiSq=1.010076, time=31.04, d_freq=1418920482.46,
	score=1.3315, null_hyp=1.524061, chirp=-46.49, fft_len=16k
Best pulse: peak=2.358688, time=76.4, period=0.7864, d_freq=1418920638.41, score=1.132, chirp=-45.843, fft_len=256 
Best triplet: peak=8.403753, time=80.39, period=0.5439, d_freq=1418924865.71, chirp=0, fft_len=128 

Flopcounter: 132485111757.204300

Spike count:    11
Autocorr count: 2
Pulse count:    3
Triplet count:  2
Gaussian count: 11
Time cpu in use since last restart: 16.4 seconds

GPU device sync requested...  ...GPU device synched
09:55:44 (26504): called boinc_finish(0)

Now You say 3323?

Raistmer
Message 1754608 - Posted: 7 Jan 2016, 15:29:20 UTC - in response to Message 1754606.  


Now You say 3323?

You can stay with r3321 as long as the proper defines are selected. Read the SVN log :)

William
Volunteer tester
Joined: 14 Feb 13
Posts: 2037
Credit: 17,689,662
RAC: 0
Message 1754612 - Posted: 7 Jan 2016, 15:41:48 UTC - in response to Message 1754606.  

Now You say 3323?

It's the newest commit from Raistmer, and it makes sure both v7 and v8 get defined (for the autocorr and v8 changes respectively, I presume).

While things are still in flux, make sure you have grabbed the latest changes from the svn before building. At this point, they should all be bugfixes for proper v8 running and not fresh optimisations.
You don't want to release apps before they've been properly ironed out. Not everyone pays attention to running the latest release, and then we'll have half-cooked apps out in the wild.
The name 'Crunch3r' still brings bad memories to some here.

In particular, we want the inconclusive rate at 5% or below (that's what the thresholds are set for) to avoid unnecessary load to the project.

Please keep that in mind before making builds publicly available.

@Raistmer you may want to do a cleanup commit at some point, getting rid of the old and unnecessary v6 codepaths and the switch for it.
A person who won't read has no advantage over one who can't read. (Mark Twain)

TBar
Message 1754620 - Posted: 7 Jan 2016, 16:58:13 UTC - in response to Message 1754612.  

You may want to go through the AKv8 folder and make a few additions there, because I don't see any configure files that have both -DSETI7 and -DSETI8.
https://setisvn.ssl.berkeley.edu/trac/browser/branches/sah_v7_opt/AKv8/ConfigureOSX_AKv8d_OPENCL_SSE3_MBv8.txt
Configure commandline for MBv8 OpenCL AMD/ATI GPU application for OS X 10.7 and higher (current) :
./configure --disable-server --disable-graphics --disable-shared --enable-bitness=64 --enable-client --enable-static-client --enable-static --enable-dependency-tracking --enable-intrinsics --enable-sse3 --host=x86_64-apple-darwin --target=x86_64-apple-darwin --build=x86_64-apple-darwin --with-boinc-platform=x86_64-apple-darwin CC="/usr/bin/clang" CXXFLAGS=" -mmacosx-version-min=10.7 -fast -mtune=generic --param inline-unit-growth=3000 -I./ -I./client -I./../src" CPPFLAGS=" -DUSE_I386_OPTIMIZATIONS -DUSE_I386_XEON -DUSE_FFTW -DUSE_OPENCL -DASYNC_SPIKE -DSETI8 -DOCL_CHIRP3 -DUSE_SSE3" LDFLAGS=" -mmacosx-version-min=10.7 -ldl /usr/lib/libz.1.dylib -lpthread -framework Carbon -static-libgcc -static-stdc++" BOINCDIR=" ../../../boinc"

Raistmer
Message 1754637 - Posted: 7 Jan 2016, 17:57:32 UTC - in response to Message 1754620.  
Last modified: 7 Jan 2016, 18:00:31 UTC

Good spot, but it's Urs' domain. He will add it if needed (and we saw it's needed).
(Perhaps he currently handles it with an #ifdef inside the sources, but that's not enough, which is why I added an #error instead of a #define SETI7. One should ensure that each and every file receives the define, CL file included. Hence the best place is the configure line, not a header.)
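
The point about putting the defines on the configure line can be checked mechanically before a build. The Python sketch below pulls the CPPFLAGS value out of a configure command and reports which of SETI7 and SETI8 are missing. The function names are made up for illustration, the sample line is a shortened version of the one quoted in this thread, and the script is not part of the SVN tree.

```python
import shlex

REQUIRED_DEFINES = ("SETI7", "SETI8")

def cppflags_defines(configure_line):
    """Collect the -D macro names from the CPPFLAGS="..." assignment."""
    defines = set()
    for token in shlex.split(configure_line):
        if token.startswith("CPPFLAGS="):
            flags = token[len("CPPFLAGS="):]
            for flag in flags.split():
                if flag.startswith("-D"):
                    # Keep only the macro name, dropping any =value part.
                    defines.add(flag[2:].split("=", 1)[0])
    return defines

def missing_defines(configure_line):
    """List the required defines that the configure line does not set."""
    found = cppflags_defines(configure_line)
    return [name for name in REQUIRED_DEFINES if name not in found]

line = ('./configure --enable-sse3 CC="/usr/bin/clang" '
        'CPPFLAGS=" -DUSE_FFTW -DUSE_OPENCL -DASYNC_SPIKE -DSETI8 '
        '-DOCL_CHIRP3 -DUSE_SSE3"')
print(missing_defines(line))
```

An empty list means both defines are present; anything else names what still has to be added to CPPFLAGS.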

Richard Haselgrove
Message 1754641 - Posted: 7 Jan 2016, 18:09:27 UTC - in response to Message 1754637.  

The alternative route is to include both (actually, all three - although SETI6 is never going to be used again) codepaths, and switch at runtime. Then the configure line doesn't have to be kept in step. That should be foolproof, even though they keep evolving better fools...

Which reminds me - I was going to run benchtests for the code in SVN r3312 and r3315.
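
The runtime-switch idea can be sketched as a dispatch table: every codepath is compiled in, and the one to run is chosen from the task rather than from build-time defines. The Python below only illustrates that pattern. How the real app would learn a workunit's version is not stated in the thread, so the `wu_version` argument and both handler functions are hypothetical.

```python
def analyze_v7(data):
    # Hypothetical placeholder for the v7 processing path.
    return ("v7", len(data))

def analyze_v8(data):
    # Hypothetical placeholder for the v8 processing path.
    return ("v8", len(data))

# All codepaths are always available; nothing is selected at build time.
CODEPATHS = {7: analyze_v7, 8: analyze_v8}

def analyze(wu_version, data):
    """Pick the codepath at runtime instead of via -DSETI7/-DSETI8."""
    try:
        handler = CODEPATHS[wu_version]
    except KeyError:
        raise ValueError(f"unsupported workunit version: {wu_version}")
    return handler(data)

print(analyze(8, [0.1, 0.2, 0.3]))
```

The trade-off is a larger single binary in exchange for a configure line that no longer has to be kept in step with the code.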

Raistmer
Message 1754643 - Posted: 7 Jan 2016, 18:15:20 UTC - in response to Message 1754641.  


Just as the v7 one.
But... do you know how long everyone has been using "magnetic induction" instead of "magnetic field strength" and vice versa? Traditions: the English, of all people, should revere them ;)

Urs Echternacht
Volunteer tester
Joined: 15 May 99
Posts: 692
Credit: 135,197,781
RAC: 211
Germany
Message 1754644 - Posted: 7 Jan 2016, 18:21:44 UTC - in response to Message 1754620.  

Yep, that should be "-DSETI7 -DSETI8" in the CPPFLAGS and in the ConfigureOSX file; adding it was probably just overlooked in the last commit. I've used it that way for stock GPU apps on Linux and OS X at least.
_\|/_
U r s
ID: 1754644
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14659
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1754645 - Posted: 7 Jan 2016, 18:28:11 UTC - in response to Message 1754643.  


The alternative route is to include both (actually, all three - although SETI6 is never going to be used again) codepaths, and switch at runtime. Then the configure line doesn't have to be kept in step. That should be foolproof, even though they keep evolving better fools...

Which reminds me - I was going to run benchtests for the code in SVN r3312 and r3315.

Just like the v7 one.
But... do you know how long everyone has been using "magnetic induction" in place of "magnetic field strength" and vice versa? Traditions: the English should revere them most of all ;)

Not sure I get your point about magnetic induction, but I'll have a think about it.

If you're saying that SETI7 is obsolete - wrong, think again. My oldest pending is WU 1941216665, from 23 October - 2015, fortunately. It looks like the Android resend is likely to time out on 9 February and require at least one more reissue then. There will be a SETI7 codepath in the next installer, for all MB apps: it would save user space if it was a single switching app, rather than a completely separate (but mostly redundant) application. Suite of applications, plural.
ID: 1754645
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1754657 - Posted: 7 Jan 2016, 19:19:30 UTC
Last modified: 7 Jan 2016, 20:18:07 UTC

Speaking of Configure lines...
This one: https://setisvn.ssl.berkeley.edu/trac/browser/branches/sah_v7_opt/AKv8/ConfigureOSX64_AKv8d_SSE3-AVX2_MBv8.txt
To me, that appears to be a CPU line. I've tried it a few different ways without success. If you use the same line as in the file, you get:
./configure --disable-server --disable-graphics --disable-shared --enable-bitness=64 --enable-client --enable-static-client --enable-static --enable-dependency-tracking --enable-intrinsics --enable-sse3 --enable-comoptions --enable-fast-math --host=x86_64-apple-darwin --target=x86_64-apple-darwin --build=x86_64-apple-darwin --with-boinc-platform=x86_64-apple-darwin CC="/usr/bin/clang" CXXFLAGS=" -mmacosx-version-min=10.7 -fast -mtune=generic --param inline-unit-growth=3000 -I/Users/Tom/sah_v7_opt/AKv8/client -I/Users/Tom/sah_v7_opt/src -I/Users/Tom/sah_v7_opt/src/boinc" CPPFLAGS=" -DUSE_I386_OPTIMIZATIONS -DUSE_I386_XEON -DUSE_FFTW -DSETI8 -DUSE_SSE3" LDFLAGS=" -mmacosx-version-min=10.7 -ldl /usr/lib/libz.1.dylib -lpthread -framework Carbon" BOINCDIR="/Users/Tom/boinc"
...
clang: error: unknown argument: '-mpreferred-stack-boundary=8' [-Wunused-command-line-argument-hard-error-in-future]
clang: note: this will be a hard error (cannot be downgraded to a warning) in the future
clang: error: unknown argument: '-fira-algorithm=CB' [-Wunused-command-line-argument-hard-error-in-future]
clang: note: this will be a hard error (cannot be downgraded to a warning) in the future
clang: error: unknown argument: '-floop-interchange' [-Wunused-command-line-argument-hard-error-in-future]
clang: note: this will be a hard error (cannot be downgraded to a warning) in the future
clang: error: unknown argument: '-floop-strip-mine' [-Wunused-command-line-argument-hard-error-in-future]
clang: note: this will be a hard error (cannot be downgraded to a warning) in the future
clang: error: unknown argument: '-floop-block' [-Wunused-command-line-argument-hard-error-in-future]
clang: note: this will be a hard error (cannot be downgraded to a warning) in the future
clang: error: unknown argument: '-fpeel-loops' [-Wunused-command-line-argument-hard-error-in-future]
clang: note: this will be a hard error (cannot be downgraded to a warning) in the future
clang: error: unknown argument: '-ftree-loop-distribution' [-Wunused-command-line-argument-hard-error-in-future]
clang: note: this will be a hard error (cannot be downgraded to a warning) in the future
clang: error: unknown argument: '-fgcse-sm' [-Wunused-command-line-argument-hard-error-in-future]
clang: note: this will be a hard error (cannot be downgraded to a warning) in the future
clang: error: unknown argument: '-mpreferred-stack-boundary=8' [-Wunused-command-line-argument-hard-error-in-future]
clang: note: this will be a hard error (cannot be downgraded to a warning) in the future
clang: error: unknown argument: '-fira-algorithm=CB' [-Wunused-command-line-argument-hard-error-in-future]
clang: note: this will be a hard error (cannot be downgraded to a warning) in the future
clang: error: unknown argument: '-floop-interchange' [-Wunused-command-line-argument-hard-error-in-future]
clang: note: this will be a hard error (cannot be downgraded to a warning) in the future
clang: error: unknown argument: '-floop-strip-mine' [-Wunused-command-line-argument-hard-error-in-future]
clang: note: this will be a hard error (cannot be downgraded to a warning) in the future
clang: error: unknown argument: '-floop-block' [-Wunused-command-line-argument-hard-error-in-future]
clang: note: this will be a hard error (cannot be downgraded to a warning) in the future
clang: error: unknown argument: '-fpeel-loops' [-Wunused-command-line-argument-hard-error-in-future]
clang: note: this will be a hard error (cannot be downgraded to a warning) in the future
clang: error: unknown argument: '-ftree-loop-distribution' [-Wunused-command-line-argument-hard-error-in-future]
clang: note: this will be a hard error (cannot be downgraded to a warning) in the future
clang: error: unknown argument: '-fgcse-sm' [-Wunused-command-line-argument-hard-error-in-future]
//It's much longer than that, but, you get the idea.

I have done much better with different configure lines, but I still get over a dozen errors.
I guess this will not work in OS X?
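
The errors above are clang rejecting optimization flags it does not know. One way to find out which flags a given compiler accepts before running configure is to probe them one at a time. A rough sketch, assuming a C compiler that understands -Werror, -x c and -fsyntax-only (both clang and GCC do); the function name filter_flags is made up for illustration:

```shell
# filter_flags CC FLAG...: print only those FLAGs that CC accepts.
filter_flags() {
    cc=$1
    shift
    accepted=""
    for flag in "$@"; do
        # Syntax-check a trivial program read from stdin. -Werror turns
        # "unknown/ignored argument" warnings into failures as well.
        if echo 'int main(void){return 0;}' |
            "$cc" -Werror "$flag" -x c -fsyntax-only - >/dev/null 2>&1; then
            accepted="$accepted $flag"
        fi
    done
    echo "${accepted# }"
}

# Probe a few of the flags from the log with whatever compiler is set in CC.
# If no compiler is found, every probe fails and an empty line is printed.
filter_flags "${CC:-cc}" -O2 -fpeel-loops -floop-block -fgcse-sm
```

Whatever the function prints could then be used in CXXFLAGS in place of the full list. It does not say which configure option adds the rejected flags in the first place.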

!!!!!!!!!!!!!!!!!!

I'm not having much luck compiling any CPU App right now. I'm just going to post the new MBv8_8.4r3323_clGPU_ssse3_x86_64-apple-darwin for now and call it quits for today.
See if the New App works on the iMacs.
ID: 1754657
Tom Rinehart
Volunteer tester

Joined: 12 Dec 01
Posts: 113
Credit: 13,255,975
RAC: 6
United States
Message 1754676 - Posted: 7 Jan 2016, 21:06:46 UTC - in response to Message 1754657.  

I will try it tonight on both iMacs.
ID: 1754676
Tom Rinehart
Volunteer tester

Joined: 12 Dec 01
Posts: 113
Credit: 13,255,975
RAC: 6
United States
Message 1754696 - Posted: 7 Jan 2016, 21:59:59 UTC - in response to Message 1754657.  

One other thing: I looked up all the Macs that had ATI HD 4XXX GPUs. They all came with processors that supported SSE4.1 or higher, if that makes a difference in how you compile the app.
ID: 1754696
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1754725 - Posted: 7 Jan 2016, 23:17:40 UTC - in response to Message 1754696.  

There are still some first- and second-generation Mac Pros with just SSSE3 out there. As soon as you compiled it with SSE4.1, someone would find an HD4 card to stick in one of those Mac Pros.
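
For anyone unsure which of the two a given Mac supports, the CPU feature list can be checked from Terminal. A small sketch, assuming an Intel Mac where `sysctl -n machdep.cpu.features` prints the feature names; the helper pick_build and the labels it prints are made up for illustration:

```shell
# pick_build FEATURES: given a CPU feature list as one string, print the most
# specific build label that CPU could run (labels are illustrative only).
pick_build() {
    case " $1 " in
        *" SSE4.1 "*) echo sse41 ;;
        *" SSSE3 "*)  echo ssse3 ;;
        *)            echo generic ;;
    esac
}

# On an Intel Mac this reads the real feature list; elsewhere the lookup
# fails quietly and the list is empty, so "generic" is printed.
features=$(sysctl -n machdep.cpu.features 2>/dev/null || true)
pick_build "$features"
```

An SSSE3 build runs on both kinds of machine, which is the reason for not moving to SSE4.1.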

The app seems to work fine on my machine, and it appears to be about as fast as the last version:
SSE4.1x OS X 64bit Build 3300
WU true angle range is : 0.415045
Run time: 15 min 38 sec
CPU time: 7 min 42 sec

SSSE3x OS X 64bit Build 3323
WU true angle range is : 0.414855
Run time: 16 min 5 sec
CPU time: 8 min 30 sec
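
To put a number on "about as fast", the two results can be converted to seconds and compared. A quick sketch using only the times listed above; the helper names are made up, and the two runs have slightly different angle ranges, so the comparison is only approximate:

```shell
# to_seconds MIN SEC: convert minutes and seconds to seconds.
to_seconds() { echo $(( $1 * 60 + $2 )); }

# pct_change OLD NEW: percentage change from OLD to NEW, one decimal place.
pct_change() {
    awk -v o="$1" -v n="$2" 'BEGIN { printf "%.1f\n", (n - o) * 100 / o }'
}

run_old=$(to_seconds 15 38)   # Build 3300 run time
run_new=$(to_seconds 16 5)    # Build 3323 run time
cpu_old=$(to_seconds 7 42)    # Build 3300 CPU time
cpu_new=$(to_seconds 8 30)    # Build 3323 CPU time

echo "run time change: $(pct_change "$run_old" "$run_new")%"
echo "CPU time change: $(pct_change "$cpu_old" "$cpu_new")%"
```

For these figures that works out to roughly 2.9% more run time and 10.4% more CPU time for Build 3323.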

I believe you will have to add -no_caching or something similar to the cmdline to have it remove the generated files after completion. Maybe Raistmer can give an example cmdline for your cards.
ID: 1754725
Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1754726 - Posted: 7 Jan 2016, 23:45:32 UTC - in response to Message 1754725.  

I believe you will have to add -no_caching or something similar to the cmdline to have it remove the generated files after completion. Maybe Raistmer can give an example cmdline for your cards.


With -no_caching on the command line or in mb_cmdline.txt, the app should not generate *bin* files at all. But nobody has tested this feature so far.
If it works as intended, please report (and if it doesn't, report that too).
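
One possible way to test it is to count the *bin* files in the app's working directory before and after a run. A minimal sketch: the helper name count_bin_files is made up, and the directory to check is an assumption (here the current directory, as in a standalone bench run).

```shell
# count_bin_files DIR: number of regular files directly in DIR whose names
# contain "bin" (hypothetical helper for checking the -no_caching behaviour).
count_bin_files() {
    find "$1" -maxdepth 1 -type f -name '*bin*' | wc -l | tr -d ' '
}

dir=.
before=$(count_bin_files "$dir")

# ... run the app here, with -no_caching on the command line
#     or in mb_cmdline.txt ...

after=$(count_bin_files "$dir")
if [ "$after" -gt "$before" ]; then
    echo "new *bin* files appeared: caching was not inhibited"
else
    echo "no new *bin* files"
fi
```

Any *bin* files left over from earlier runs without -no_caching are counted in both numbers, so only files created during the test run show up as a difference.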
ID: 1754726


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.