I've Built a Couple OSX CUDA Apps...
Author | Message |
---|---|
Raistmer Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
TBar, try to rebuild from revision 3321. I added a way to inhibit caching of CL files via the -no_caching option. Caching can also be disabled on a per-device basis in multi-GPU configs. |
TBar Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
TBar, try to rebuild from revision: 3321 I downloaded r3321, compiled the App without any problems, but on the benchrun I get; 07:02:51 (12550): Can't open init data file - running in standalone mode Not using mb_cmdline.txt-file, using commandline options. 07:02:51 (12550): Can't open init data file - running in standalone mode WARNING: init_data.xml missing OpenCL platform detected: Apple WARNING: BOINC supplied wrong platform! Number of OpenCL devices found : 1 BOINC assigns slot on device #0. WARNING: BOINC failed to provide OpenCL device, using own enumeration abilities Build features: SETI8 Non-graphics OpenCL OCL_CHIRP3 ASYNC_SPIKE FFTW SSSE3 64bit System: Darwin x86_64 Kernel: 12.6.0 CPU : Intel(R) Xeon(R) CPU E5472 @ 3.00GHz GenuineIntel x86, Family 6 Model 23 Stepping 6 Features : FPU TSC PAE APIC MTRR MMX SSE SSE2 HT SSE3 SSSE3 SSE4.1 OpenCL-kernels filename : MultiBeam_Kernels_r3321.cl INFO: can't open binary kernel file: .//MultiBeam_Kernels_r3321.cl_ATIRadeonBartsXTPrototype.bin_V7_12.6.0_10, continue with recompile... Info : Building Program (binary, clBuildProgram):main kernels: OK code 0 INFO: binary kernel file created WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_524288_gr64_lr16_wg256_tw0_r3321.bin_12.6.0_10, continue with recompile... WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_8_gr64_lr16_wg256_tw0_r3321.bin_12.6.0_10, continue with recompile... WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_16_gr64_lr16_wg256_tw0_r3321.bin_12.6.0_10, continue with recompile... WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_32_gr64_lr16_wg256_tw0_r3321.bin_12.6.0_10, continue with recompile... 
WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_64_gr64_lr16_wg256_tw0_r3321.bin_12.6.0_10, continue with recompile... WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_128_gr64_lr16_wg256_tw0_r3321.bin_12.6.0_10, continue with recompile... WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_256_gr64_lr16_wg256_tw0_r3321.bin_12.6.0_10, continue with recompile... WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_512_gr64_lr16_wg256_tw0_r3321.bin_12.6.0_10, continue with recompile... WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_1024_gr64_lr16_wg256_tw0_r3321.bin_12.6.0_10, continue with recompile... WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_2048_gr64_lr16_wg256_tw0_r3321.bin_12.6.0_10, continue with recompile... WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_4096_gr64_lr16_wg256_tw0_r3321.bin_12.6.0_10, continue with recompile... WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_8192_gr64_lr16_wg256_tw0_r3321.bin_12.6.0_10, continue with recompile... WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_16384_gr64_lr16_wg256_tw0_r3321.bin_12.6.0_10, continue with recompile... WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_32768_gr64_lr16_wg256_tw0_r3321.bin_12.6.0_10, continue with recompile... WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_65536_gr64_lr16_wg256_tw0_r3321.bin_12.6.0_10, continue with recompile... 
WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_131072_gr64_lr16_wg256_tw0_r3321.bin_12.6.0_10, continue with recompile... ar=0.775000 NumCfft=1169 NumGauss=6087368 NumPulse=1197108460 NumTriplet=2300559776 Currently allocated 209 MB for GPU buffers ERROR: OpenCL kernel/call 'Creating RepackInput_kernel_cl from program' call failed (-46) in file analyzeFuncs.cpp near line 1559. Waiting 30 sec before restart... I thought it might be worth compiling a Intel GPU App. Using the same downloaded 3321 everything went normal right up to the very end and then stopped with; Undefined symbols for architecture x86_64: "boinc_resolve_filename_s(char const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >&)", referenced from: _main in seti_boinc-main.o seti_init_state() in seti_boinc-seti.o worker() in seti_boinc-worker.o initializeCL(char const*) in seti_boinc-GPU_lock.o "std::string::_M_leak_hard()", referenced from: strip_whitespace(std::string&) in libboinc.a(libboinc_la-str_util.o) "std::string::_Rep::_M_destroy(std::allocator<char> const&)", referenced from: std::vector<std::string, std::allocator<std::string> >::_M_insert_aux(__gnu_cxx::__normal_iterator<std::string*, std::vector<std::string, std::allocator<std::string> > >, std::string const&) in libboinc.a(libboinc_la-util.o) timer_thread(void*) in libboinc_api.a(boinc_api.o) _boinc_receive_trickle_down in libboinc_api.a(boinc_api.o) boinc_upload_file(std::string&) in libboinc_api.a(boinc_api.o) std::vector<UPLOAD_FILE_STATUS, std::allocator<UPLOAD_FILE_STATUS> >::~vector() in libboinc_api.a(boinc_api.o) std::vector<UPLOAD_FILE_STATUS, std::allocator<UPLOAD_FILE_STATUS> >::_M_insert_aux(__gnu_cxx::__normal_iterator<UPLOAD_FILE_STATUS*, std::vector<UPLOAD_FILE_STATUS, std::allocator<UPLOAD_FILE_STATUS> > >, UPLOAD_FILE_STATUS const&) in libboinc_api.a(boinc_api.o) skip_unrecognized(char*, MIOFILE&) in libboinc.a(libboinc_la-parse.o) 
//Lots More Similar to the Above and then; "std::basic_istream<char, std::char_traits<char> >& std::getline<char, std::char_traits<char>, std::allocator<char> >(std::basic_istream<char, std::char_traits<char> >&, std::basic_string<char, std::char_traits<char>, std::allocator<char> >&, char)", referenced from: split(std::string, char) in libboinc.a(libboinc_la-str_util.o) "___sincos_stret", referenced from: v_vChirpData(float (*) [2], float (*) [2], int, double, int, int, double) in seti_boinc-analyzeFuncs.o v_vpChirpData(float (*) [2], float (*) [2], int, double, int, int, double) in seti_boinc-analyzeFuncs.o _clFFT_CreatePlanAdv in seti_boinc-fft_setup.o ld: symbol(s) not found for architecture x86_64 clang: error: linker command failed with exit code 1 (use -v to see invocation) cp: seti_boinc: No such file or directory error: strip: can't open file: MBv8_8.4r3320_ssse3_clGPU_x86_64-apple-darwin (No such file or directory) This "clang: error: linker command failed with exit code 1" is very similar to what I was getting with boinc-master ver 7.7. There isn't much difference between the 2 compiles, strange it gives the Linker Error with the Intel GPU compile. I'll try a different Configure line on the Intel build. So, what about "ERROR: OpenCL kernel/call 'Creating RepackInput_kernel_cl from program' call failed (-46) in file analyzeFuncs.cpp near line 1559" on the first build? |
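For reference, the (-46) in the error above is an OpenCL status code. In the Khronos CL/cl.h header, -46 is CL_INVALID_KERNEL_NAME, which clCreateKernel returns when the requested kernel name is not found in the built program. Here is a minimal Python sketch of a lookup for that code and a few of its neighbours; the helper name `describe_cl_error` is hypothetical and is not part of the app.

```python
# Subset of OpenCL status codes, values as defined in the Khronos CL/cl.h header.
CL_ERRORS = {
    0: "CL_SUCCESS",
    -11: "CL_BUILD_PROGRAM_FAILURE",
    -44: "CL_INVALID_PROGRAM",
    -45: "CL_INVALID_PROGRAM_EXECUTABLE",
    -46: "CL_INVALID_KERNEL_NAME",
    -47: "CL_INVALID_KERNEL_DEFINITION",
    -48: "CL_INVALID_KERNEL",
}


def describe_cl_error(code: int) -> str:
    """Return the symbolic name for an OpenCL status code (hypothetical helper)."""
    return CL_ERRORS.get(code, f"unknown OpenCL status {code}")


print(describe_cl_error(-46))
```

A kernel name going missing would be consistent with that kernel being compiled out of the .cl file when a define is absent, although that is an inference from the status code and not something the log itself states.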
Raistmer Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
Define SETI7 and rebuild. |
TBar Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
Well, that does seem to fix the error. However, now the results say: Build features: SETI7 Non-graphics OpenCL OCL_CHIRP3 ASYNC_SPIKE FFTW SSSE3 64bit OS X optimized S@H v7 application (based on S@H Enhanced by Alex Kan) It does say SSSE3x OS X 64bit Build 3321 though. So, it will just confuse the Version Police? For the Intel build I found that changing -mmacosx-version-min=10.9 to -mmacosx-version-min=10.8 allowed it to finish. However, it too crashed with "OpenCL kernel/call 'Creating RepackInput_kernel_cl from program' call failed (-46) in file analyzeFuncs.cpp near line 1559". So I guess if I use the same r3321 I'll have to change that one to SETI7 as well? |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14674 Credit: 200,643,578 RAC: 874 |
Builds for public release should always report their capabilities accurately. It helps everyone, not least the application user, to check that they've done everything right. This isn't just a game. |
Raistmer Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
Did I say "undefine SETI8"? Define SETI8 again, leave SETI7 untouched (defined!) and rebuild. And update to 3323: it will not allow a wrong mix of defines. |
TBar Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
Success It completes the Benchmarks correctly, giving results consistent with the other v8 Apps. So...I suppose I'll post it at Crunchers Anonymous so the other Tom can try it on his iMacs. Since I can't test the Intel GPU App either, I guess I'm also going to have to post that somewhere where others can test it. It would be nice to receive a little feedback on the Apps posted at Crunchers Anonymous, especially the Intel GPU App...when it's posted. I'm going to rebuild the AVX CPU App, and I can't test that one either... 09:55:17 (26504): Can't open init data file - running in standalone mode Not using mb_cmdline.txt-file, using commandline options. 09:55:17 (26504): Can't open init data file - running in standalone mode WARNING: init_data.xml missing OpenCL platform detected: Apple WARNING: BOINC supplied wrong platform! Number of OpenCL devices found : 1 BOINC assigns slot on device #0. WARNING: BOINC failed to provide OpenCL device, using own enumeration abilities Build features: SETI8 Non-graphics OpenCL OCL_CHIRP3 ASYNC_SPIKE FFTW SSSE3 64bit System: Darwin x86_64 Kernel: 12.6.0 CPU : Intel(R) Xeon(R) CPU E5472 @ 3.00GHz GenuineIntel x86, Family 6 Model 23 Stepping 6 Features : FPU TSC PAE APIC MTRR MMX SSE SSE2 HT SSE3 SSSE3 SSE4.1 OpenCL-kernels filename : MultiBeam_Kernels_r3321.cl INFO: can't open binary kernel file: .//MultiBeam_Kernels_r3321.cl_ATIRadeonBartsXTPrototype.bin_V7_12.6.0_10, continue with recompile... Info : Building Program (binary, clBuildProgram):main kernels: OK code 0 INFO: binary kernel file created WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_524288_gr64_lr16_wg256_tw0_r3321.bin_12.6.0_10, continue with recompile... WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_8_gr64_lr16_wg256_tw0_r3321.bin_12.6.0_10, continue with recompile... 
WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_16_gr64_lr16_wg256_tw0_r3321.bin_12.6.0_10, continue with recompile... WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_32_gr64_lr16_wg256_tw0_r3321.bin_12.6.0_10, continue with recompile... WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_64_gr64_lr16_wg256_tw0_r3321.bin_12.6.0_10, continue with recompile... WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_128_gr64_lr16_wg256_tw0_r3321.bin_12.6.0_10, continue with recompile... WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_256_gr64_lr16_wg256_tw0_r3321.bin_12.6.0_10, continue with recompile... WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_512_gr64_lr16_wg256_tw0_r3321.bin_12.6.0_10, continue with recompile... WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_1024_gr64_lr16_wg256_tw0_r3321.bin_12.6.0_10, continue with recompile... WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_2048_gr64_lr16_wg256_tw0_r3321.bin_12.6.0_10, continue with recompile... WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_4096_gr64_lr16_wg256_tw0_r3321.bin_12.6.0_10, continue with recompile... WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_8192_gr64_lr16_wg256_tw0_r3321.bin_12.6.0_10, continue with recompile... WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_16384_gr64_lr16_wg256_tw0_r3321.bin_12.6.0_10, continue with recompile... 
WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_32768_gr64_lr16_wg256_tw0_r3321.bin_12.6.0_10, continue with recompile... WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_65536_gr64_lr16_wg256_tw0_r3321.bin_12.6.0_10, continue with recompile... WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_131072_gr64_lr16_wg256_tw0_r3321.bin_12.6.0_10, continue with recompile... ar=0.775000 NumCfft=1169 NumGauss=6087368 NumPulse=1197108460 NumTriplet=2300559776 Currently allocated 209 MB for GPU buffers In v_BaseLineSmooth: NumDataPoints=1048576, BoxCarLength=8192, NumPointsInChunk=32768 OS X optimized setiathome_v8 application Version info: SSSE3x (Intel, Core 2-optimized v8-nographics) V5.13 by Alex Kan SSSE3x OS X 64bit Build 3321 , Ported by : Raistmer, JDWhale, Urs Echternacht OpenCL version by Raistmer, r3321 Number of OpenCL platforms: 1 OpenCL Platform Name: Apple Number of devices: 1 Max compute units: 14 Max work group size: 1024 Max clock frequency: 900Mhz Max memory allocation: 268435456 Cache type: None Cache line size: 0 Cache size: 0 Global memory size: 1073741824 Constant buffer size: 65536 Max number of constant args: 8 Local memory type: Scratchpad Local memory size: 32768 Queue properties: Out-of-Order: No Name: ATI Radeon Barts XT Prototype Vendor: AMD Driver version: 1.0 Version: OpenCL 1.1 Extensions: cl_APPLE_SetMemObjectDestructor cl_APPLE_ContextLoggingFunctions cl_APPLE_clut cl_APPLE_query_kernel_names cl_APPLE_gl_sharing cl_khr_gl_event cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store Work Unit Info: ............... 
Credit multiplier is : 2.85 WU true angle range is : 0.775000 Used GPU device parameters are: Number of compute units: 14 Single buffer allocation size: 128MB Total device global memory: 1024MB max WG size: 1024 local mem type: Real LotOfMem path: no period_iterations_num=50 Pulse: peak=0.6048146, time=69.31, period=0.1079, d_freq=1418922119.13, score=1.045, chirp=0, fft_len=32 Triplet: peak=8.319599, time=80.39, period=0.5439, d_freq=1418924865.71, chirp=0, fft_len=128 Triplet: peak=8.403753, time=80.39, period=0.5439, d_freq=1418924865.71, chirp=0, fft_len=128 Autocorr: peak=12.93238, time=20.13, delay=4.969, d_freq=1418920271.34, chirp=-0.83183, fft_len=128k Spike: peak=19.36639, time=46.98, d_freq=1418916062.37, chirp=2.4955, fft_len=128k Spike: peak=19.67712, time=73.82, d_freq=1418918905.44, chirp=-4.8986, fft_len=128k Spike: peak=21.39583, time=33.55, d_freq=1418915733.83, chirp=-5.638, fft_len=128k Spike: peak=18.88078, time=20.13, d_freq=1418920463.53, chirp=-7.0244, fft_len=128k Autocorr: peak=13.35424, time=33.55, delay=2.1202, d_freq=1418920561, chirp=8.1335, fft_len=128k Spike: peak=18.92986, time=20.13, d_freq=1418917368.68, chirp=-9.335, fft_len=128k Spike: peak=19.28573, time=87.24, d_freq=1418920545.39, chirp=-11.276, fft_len=128k Spike: peak=19.21027, time=90.26, d_freq=1418919043.86, chirp=-11.461, fft_len=512 Gaussian: peak=2.568508, mean=0.6189198, ChiSq=0.9575716, time=57.88, d_freq=1418922601.42, score=0.1431332, null_hyp=1.422762, chirp=-11.831, fft_len=16k Gaussian: peak=2.695651, mean=0.6148685, ChiSq=0.93757, time=59.56, d_freq=1418922581.57, score=0.1267667, null_hyp=1.421357, chirp=-11.831, fft_len=16k Gaussian: peak=2.417054, mean=0.6235663, ChiSq=0.9921176, time=61.24, d_freq=1418922561.73, score=0.376936, null_hyp=1.442612, chirp=-11.831, fft_len=16k Gaussian: peak=1.821032, mean=0.7009013, ChiSq=0.98085, time=17.62, d_freq=1418922404.5, score=0.2538466, null_hyp=1.432214, chirp=20.056, fft_len=8k Gaussian: peak=1.701029, 
mean=0.6977942, ChiSq=1.018962, time=19.29, d_freq=1418922438.15, score=0.4377575, null_hyp=1.456448, chirp=20.056, fft_len=8k Spike: peak=19.76917, time=23.07, d_freq=1418916784.01, chirp=-20.796, fft_len=8k Spike: peak=19.70372, time=90.26, d_freq=1418919039.42, chirp=-22.922, fft_len=512 Pulse: peak=8.095228, time=89.99, period=4.535, d_freq=1418919045.43, score=1.089, chirp=-22.922, fft_len=512 Gaussian: peak=2.40371, mean=0.6120579, ChiSq=1.006094, time=9.227, d_freq=1418920421.68, score=0.7348762, null_hyp=1.474921, chirp=-22.922, fft_len=16k Gaussian: peak=2.599831, mean=0.609823, ChiSq=0.9212406, time=12.58, d_freq=1418920344.77, score=0.1807008, null_hyp=1.425981, chirp=-22.922, fft_len=16k Gaussian: peak=2.473597, mean=0.6138971, ChiSq=0.9926966, time=14.26, d_freq=1418920306.31, score=0.5439019, null_hyp=1.456539, chirp=-22.922, fft_len=16k Spike: peak=20.13485, time=90.26, d_freq=1418919042.56, chirp=34.382, fft_len=512 Gaussian: peak=2.614028, mean=0.6185501, ChiSq=1.010624, time=39.43, d_freq=1418919421.81, score=0.9043522, null_hyp=1.490616, chirp=-36.878, fft_len=16k Gaussian: peak=2.58274, mean=0.6202052, ChiSq=0.995735, time=41.1, d_freq=1418919359.94, score=0.8745441, null_hyp=1.483554, chirp=-36.878, fft_len=16k Spike: peak=18.54538, time=95.63, d_freq=1418917049.85, chirp=-41.407, fft_len=32k Pulse: peak=2.358688, time=76.4, period=0.7864, d_freq=1418920638.41, score=1.132, chirp=-45.843, fft_len=256 Gaussian: peak=2.380434, mean=0.5830942, ChiSq=1.010076, time=31.04, d_freq=1418920482.46, score=1.3315, null_hyp=1.524061, chirp=-46.49, fft_len=16k Best spike: peak=21.39583, time=33.55, d_freq=1418915733.83, chirp=-5.638, fft_len=128k Best autocorr: peak=13.35424, time=33.55, delay=2.1202, d_freq=1418920561, chirp=8.1335, fft_len=128k Best gaussian: peak=2.380434, mean=0.5830942, ChiSq=1.010076, time=31.04, d_freq=1418920482.46, score=1.3315, null_hyp=1.524061, chirp=-46.49, fft_len=16k Best pulse: peak=2.358688, time=76.4, period=0.7864, 
d_freq=1418920638.41, score=1.132, chirp=-45.843, fft_len=256 Best triplet: peak=8.403753, time=80.39, period=0.5439, d_freq=1418924865.71, chirp=0, fft_len=128 Flopcounter: 132485111757.204300 Spike count: 11 Autocorr count: 2 Pulse count: 3 Triplet count: 2 Gaussian count: 11 Time cpu in use since last restart: 16.4 seconds GPU device sync requested... ...GPU device synched 09:55:44 (26504): called boinc_finish(0) Now You say 3323? |
Raistmer Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
You can stay with 3321 as long as the proper defines are selected. Read the SVN log :) |
William Send message Joined: 14 Feb 13 Posts: 2037 Credit: 17,689,662 RAC: 0 |
Now You say 3323? It's the newest commit from Raistmer, and it makes sure both v7 and v8 are defined (for the autocorr and v8 changes respectively, I presume). While things are still in flux, make sure you have grabbed the latest changes from the svn before building. At this point, they should all be bugfixes for proper v8 running and not fresh optimisations. You don't want to release apps before they've been properly ironed out. Not all people pay attention to running the latest release, and then we'll have half-cooked apps out in the wild. The name 'Crunch3r' still brings bad memories to some here. In particular, we want the inconclusive rate at 5% or below (that's what the thresholds are set for) to avoid unnecessary load on the project. Please keep that in mind before making builds publicly available. @Raistmer you may want to do a cleanup commit at some point, getting rid of the old and unnecessary v6 codepaths and the switch for it. A person who won't read has no advantage over one who can't read. (Mark Twain) |
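The 5% inconclusive target mentioned above is easy to check once you have counts from a test run. This is only a rough Python sketch: the function names and the idea of counting results by hand are assumptions, and the thread does not say how the project itself computes the rate.

```python
def inconclusive_rate(inconclusive: int, total: int) -> float:
    """Fraction of validated results that came back inconclusive."""
    if total <= 0:
        raise ValueError("total must be positive")
    return inconclusive / total


def within_threshold(inconclusive: int, total: int, threshold: float = 0.05) -> bool:
    """True if the rate is at or below the threshold (5% by default)."""
    return inconclusive_rate(inconclusive, total) <= threshold


# Hypothetical counts, purely for illustration.
print(within_threshold(3, 100))
print(within_threshold(9, 100))
```

A handful of results is not enough to judge a build; the more tasks in the sample, the more meaningful the rate.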
TBar Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
You may want to go through the AKv8 folder and make a few additions there. Because I don't see any of them that have both -DSETI7 & -DSETI8. https://setisvn.ssl.berkeley.edu/trac/browser/branches/sah_v7_opt/AKv8/ConfigureOSX_AKv8d_OPENCL_SSE3_MBv8.txt Configure commandline for MBv8 OpenCL AMD/ATI GPU application for OS X 10.7 and higher (current) : |
Raistmer Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
You may want to go through the AKv8 folder and make a few additions there. Because I don't see any of them that have both -DSETI7 & -DSETI8. Good spot, but it's Urs' domain. He will add it if needed (and we saw it's needed). (Perhaps he currently handles it with #ifdef inside the sources, but that's not enough, which is why I added #error instead of #define SETI7. One should ensure each and every file receives the define, CL file included. Hence the best place is the configure line, not a header.) |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14674 Credit: 200,643,578 RAC: 874 |
You may want to go through the AKv8 folder and make a few additions there. Because I don't see any of them that have both -DSETI7 & -DSETI8. The alternative route is to include both (actually, all three - although SETI6 is never going to be used again) codepaths, and switch at runtime. Then the configure line doesn't have to be kept in step. That should be foolproof, even though they keep evolving better fools... Which reminds me - I was going to run benchtests for the code in SVN r3312 and r3315. |
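The runtime-switch idea above can be sketched as a simple dispatch table. This is a toy illustration in Python, not the app's code: the handler names, the `wu_version` argument and the way a task's version would be detected are all assumptions.

```python
def analyze_v7(wu_name: str) -> str:
    """Placeholder for the v7 analysis codepath (hypothetical)."""
    return f"{wu_name}: v7 codepath"


def analyze_v8(wu_name: str) -> str:
    """Placeholder for the v8 analysis codepath (hypothetical)."""
    return f"{wu_name}: v8 codepath"


# Both codepaths are built into one binary and selected per task at run time.
CODEPATHS = {7: analyze_v7, 8: analyze_v8}


def run_workunit(wu_name: str, wu_version: int) -> str:
    """Pick the codepath matching the workunit version, or fail loudly."""
    try:
        handler = CODEPATHS[wu_version]
    except KeyError:
        raise ValueError(f"no codepath for workunit version {wu_version}") from None
    return handler(wu_name)


print(run_workunit("example_wu", 8))
```

With this layout the configure line no longer decides which format the binary can process; the trade-off is that every codepath has to be compiled in and kept working.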
Raistmer Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
Just like the v7 one. But... do you know how long everyone has used "magnetic induction" instead of "magnetic field strength" and vice versa? Traditions. English people should revere them most of all ;) |
Urs Echternacht Send message Joined: 15 May 99 Posts: 692 Credit: 135,197,781 RAC: 211 |
You may want to go through the AKv8 folder and make a few additions there. Because I don't see any of them that have both -DSETI7 & -DSETI8. Yep, that should be "-DSETI7 -DSETI8" in the CPPFLAGS and in the ConfigureOSX file; adding it was probably just overlooked in the last commit. I've used it that way for stock GPU apps on Linux and OS X at least. _\|/_ U r s |
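Since the fix is to have both defines in CPPFLAGS, a quick pre-build check on the configure line can catch a missing one. A minimal Python sketch follows, assuming the defines are written in the joined `-DNAME` form; the helper `missing_defines` is hypothetical and not part of the build system.

```python
import shlex

# Defines the thread says must both be present for a v8 build.
REQUIRED_DEFINES = ("SETI7", "SETI8")


def missing_defines(cppflags: str, required=REQUIRED_DEFINES) -> list:
    """Return the required defines that do not appear as -DNAME in cppflags."""
    defined = set()
    for token in shlex.split(cppflags):
        if token.startswith("-D"):
            # Drop an optional "=value" part, e.g. -DFOO=1 -> FOO.
            defined.add(token[2:].split("=", 1)[0])
    return [name for name in required if name not in defined]


flags = "-DUSE_I386_OPTIMIZATIONS -DUSE_FFTW -DSETI8 -DUSE_SSE3"
print(missing_defines(flags))
```

The separated form `-D SETI7` is not handled by this sketch; it only recognises the joined spelling used in the configure lines quoted in this thread.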
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14674 Credit: 200,643,578 RAC: 874 |
Not sure I get your point about magnetic induction, but I'll have a think about it. If you're saying that SETI7 is obsolete - wrong, think again. My oldest pending is WU 1941216665, from 23 October - 2015, fortunately. It looks like the Android resend is likely to time out on 9 February and require at least one more reissue then. There will be a SETI7 codepath in the next installer, for all MB apps: it would save user space if it was a single switching app, rather than a completely separate (but mostly redundant) application. Suite of applications, plural. |
TBar Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
Speaking of Configure lines... This one, https://setisvn.ssl.berkeley.edu/trac/browser/branches/sah_v7_opt/AKv8/ConfigureOSX64_AKv8d_SSE3-AVX2_MBv8.txt To Me, that appears to be a CPU line. I've tried it a few different ways without success. If you use the same Line in the file you get; ./configure --disable-server --disable-graphics --disable-shared --enable-bitness=64 --enable-client --enable-static-client --enable-static --enable-dependency-tracking --enable-intrinsics --enable-sse3 --enable-comoptions --enable-fast-math --host=x86_64-apple-darwin --target=x86_64-apple-darwin --build=x86_64-apple-darwin --with-boinc-platform=x86_64-apple-darwin CC="/usr/bin/clang" CXXFLAGS=" -mmacosx-version-min=10.7 -fast -mtune=generic --param inline-unit-growth=3000 -I/Users/Tom/sah_v7_opt/AKv8/client -I/Users/Tom/sah_v7_opt/src -I/Users/Tom/sah_v7_opt/src/boinc" CPPFLAGS=" -DUSE_I386_OPTIMIZATIONS -DUSE_I386_XEON -DUSE_FFTW -DSETI8 -DUSE_SSE3" LDFLAGS=" -mmacosx-version-min=10.7 -ldl /usr/lib/libz.1.dylib -lpthread -framework Carbon" BOINCDIR="/Users/Tom/boinc" ... 
clang: error: unknown argument: '-mpreferred-stack-boundary=8' [-Wunused-command-line-argument-hard-error-in-future] clang: note: this will be a hard error (cannot be downgraded to a warning) in the future clang: error: unknown argument: '-fira-algorithm=CB' [-Wunused-command-line-argument-hard-error-in-future] clang: note: this will be a hard error (cannot be downgraded to a warning) in the future clang: error: unknown argument: '-floop-interchange' [-Wunused-command-line-argument-hard-error-in-future] clang: note: this will be a hard error (cannot be downgraded to a warning) in the future clang: error: unknown argument: '-floop-strip-mine' [-Wunused-command-line-argument-hard-error-in-future] clang: note: this will be a hard error (cannot be downgraded to a warning) in the future clang: error: unknown argument: '-floop-block' [-Wunused-command-line-argument-hard-error-in-future] clang: note: this will be a hard error (cannot be downgraded to a warning) in the future clang: error: unknown argument: '-fpeel-loops' [-Wunused-command-line-argument-hard-error-in-future] clang: note: this will be a hard error (cannot be downgraded to a warning) in the future clang: error: unknown argument: '-ftree-loop-distribution' [-Wunused-command-line-argument-hard-error-in-future] clang: note: this will be a hard error (cannot be downgraded to a warning) in the future clang: error: unknown argument: '-fgcse-sm' [-Wunused-command-line-argument-hard-error-in-future] clang: note: this will be a hard error (cannot be downgraded to a warning) in the future clang: error: unknown argument: '-mpreferred-stack-boundary=8' [-Wunused-command-line-argument-hard-error-in-future] clang: note: this will be a hard error (cannot be downgraded to a warning) in the future clang: error: unknown argument: '-fira-algorithm=CB' [-Wunused-command-line-argument-hard-error-in-future] clang: note: this will be a hard error (cannot be downgraded to a warning) in the future clang: error: unknown argument: 
'-floop-interchange' [-Wunused-command-line-argument-hard-error-in-future] clang: note: this will be a hard error (cannot be downgraded to a warning) in the future clang: error: unknown argument: '-floop-strip-mine' [-Wunused-command-line-argument-hard-error-in-future] clang: note: this will be a hard error (cannot be downgraded to a warning) in the future clang: error: unknown argument: '-floop-block' [-Wunused-command-line-argument-hard-error-in-future] clang: note: this will be a hard error (cannot be downgraded to a warning) in the future clang: error: unknown argument: '-fpeel-loops' [-Wunused-command-line-argument-hard-error-in-future] clang: note: this will be a hard error (cannot be downgraded to a warning) in the future clang: error: unknown argument: '-ftree-loop-distribution' [-Wunused-command-line-argument-hard-error-in-future] clang: note: this will be a hard error (cannot be downgraded to a warning) in the future clang: error: unknown argument: '-fgcse-sm' [-Wunused-command-line-argument-hard-error-in-future] //It's much longer than that, but, you get the idea. I have managed much better with different Configure Lines, but still have over a dozen Errors. I guess this will not work in OSX? !!!!!!!!!!!!!!!!!! I'm not having much luck compiling any CPU App right now. I'm just going to post the new MBv8_8.4r3323_clGPU_ssse3_x86_64-apple-darwin for now and call it quits for today. See if the New App works on the iMacs. |
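The unknown-argument errors above all name GCC optimisation flags that this clang does not accept. One way to experiment is to strip those flags from the flag string before handing it to clang. The Python sketch below uses only the flags quoted in the error output above; the full output was longer, so the list is certainly incomplete, and the helper name `split_flags` is hypothetical.

```python
import shlex

# Flags reported as "unknown argument" in the clang output quoted above.
CLANG_REJECTED = {
    "-mpreferred-stack-boundary=8",
    "-fira-algorithm=CB",
    "-floop-interchange",
    "-floop-strip-mine",
    "-floop-block",
    "-fpeel-loops",
    "-ftree-loop-distribution",
    "-fgcse-sm",
}


def split_flags(flags: str):
    """Split a flag string into (kept, dropped) lists, preserving order."""
    kept, dropped = [], []
    for flag in shlex.split(flags):
        (dropped if flag in CLANG_REJECTED else kept).append(flag)
    return kept, dropped


kept, dropped = split_flags("-O2 -fpeel-loops -floop-block -msse3")
print("keep:", " ".join(kept))
print("drop:", " ".join(dropped))
```

Removing the flags only gets past the argument errors; it says nothing about whether the resulting binary performs as well as a GCC build using them.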
Tom Rinehart Send message Joined: 12 Dec 01 Posts: 113 Credit: 13,255,975 RAC: 6 |
I will try it tonight on both iMacs. |
Tom Rinehart Send message Joined: 12 Dec 01 Posts: 113 Credit: 13,255,975 RAC: 6 |
One other thing, I looked up all the Macs that had ATI HD 4XXX GPUs. They all came with processors that supported SSE4.1 or higher if that makes a difference in how you compile the app. |
TBar Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
There are still some first & second generation Mac Pros with just ssse3 out there. As soon as you compiled it with sse41 one would find a HD4 card to stick in his Mac Pro. The App seems to work fine in My machine and it appears about as fast as the last version; SSE4.1x OS X 64bit Build 3300 WU true angle range is : 0.415045 Run time: 15 min 38 sec CPU time: 7 min 42 sec SSSE3x OS X 64bit Build 3323 WU true angle range is : 0.414855 Run time: 16 min 5 sec CPU time: 8 min 30 sec I believe you will have to add -no_caching or something similar to the cmdline to have it remove the generated files after completion. Maybe Raistmer can give an example cmdline for your cards. |
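To put "about as fast" into numbers, the two sets of times quoted above can be converted to seconds and compared. This is a small Python sketch; note the two tasks have slightly different angle ranges (0.415045 vs 0.414855), so treat the percentages as a rough indication only.

```python
def to_seconds(minutes: int, seconds: int) -> int:
    """Convert a 'M min S sec' reading to seconds."""
    return minutes * 60 + seconds


def percent_change(old: float, new: float) -> float:
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


# Times quoted above: Build 3300 (SSE4.1x) vs Build 3323 (SSSE3x).
run_r3300, run_r3323 = to_seconds(15, 38), to_seconds(16, 5)
cpu_r3300, cpu_r3323 = to_seconds(7, 42), to_seconds(8, 30)

print(f"run time: {percent_change(run_r3300, run_r3323):+.1f}%")  # +2.9%
print(f"CPU time: {percent_change(cpu_r3300, cpu_r3323):+.1f}%")  # +10.4%
```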
Raistmer Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
I believe you will have to add -no_caching or something similar to the cmdline to have it remove the generated files after completion. Maybe Raistmer can give an example cmdline for your cards. With -no_caching on the command line or in mb_cmdline.txt, the app should not generate *bin* files at all. But nobody has tested this feature so far. If it works as intended, please report (if not, report too). |
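For anyone testing -no_caching, one way to verify it is to look for leftover cache files in the directory the app ran in. The Python sketch below assumes the file name patterns seen in the benchrun logs earlier in this thread (MultiBeam_Kernels_*.bin_* and MB_clFFTplan_*.bin_*); the helper name `find_cached_binaries` is hypothetical.

```python
import glob
import os

# Patterns modelled on the cache file names printed in the benchrun logs.
CACHE_PATTERNS = ("MultiBeam_Kernels_*.bin_*", "MB_clFFTplan_*.bin_*")


def find_cached_binaries(directory: str) -> list:
    """Return a sorted list of cached kernel/FFT-plan binaries in directory."""
    found = []
    for pattern in CACHE_PATTERNS:
        found.extend(glob.glob(os.path.join(directory, pattern)))
    return sorted(found)


leftovers = find_cached_binaries(".")
print(f"{len(leftovers)} cached binary file(s) found")
```

Run it in the slot or bench directory after a task finishes with -no_caching set; an empty result would suggest the option works as intended, provided old cache files were cleared beforehand.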