The Highest Ranked SETI AMD Host is a MAC: Time for a STOCK MAC APP?
Author | Message |
---|---|
TBar Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
I just downloaded the new files from yesterday and built a new App, https://setisvn.ssl.berkeley.edu/trac/browser/branches/sah_v7_opt/AKv8/client There are no apparent changes, the App still fails in standalone Mountain Lion with; ERROR: OpenCL kernel/call 'Enqueueing kernel: (strip) PC_find_spike32_kernel_cl' call failed (-54) in file analyzeFuncs.cpp near line 3171. It still works in standalone Mavericks with this recent task; http://setiathome.berkeley.edu/result.php?resultid=4007468474 r2839 Run time: 9 min 21 sec Spike count: 3 Autocorr count: 1 Pulse count: 0 Triplet count: 0 Gaussian count: 0 Time cpu in use since last restart: 87.6 seconds r2857 Run time: 8 min 17 sec 17:51:35 (564): Can't open init data file - running in standalone mode 17:51:35 (564): Can't open init data file - running in standalone mode Not using mb_cmdline.txt-file, using commandline options. 17:51:35 (564): Can't open init data file - running in standalone mode WARNING: init_data.xml missing OpenCL platform detected: Apple WARNING: BOINC supplied wrong platform! Number of OpenCL devices found : 3 BOINC assigns slot on device #0. 
WARNING: BOINC failed to provide OpenCL device, using own enumeration abilities Build features: SETI7 Non-graphics OpenCL USE_OPENCL_HD5xxx OCL_CHIRP3 FFTW SSSE3 64bit System: Darwin x86_64 Kernel: 13.4.0 CPU : Intel(R) Xeon(R) CPU E5462 @ 2.80GHz GenuineIntel x86, Family 6 Model 23 Stepping 6 Features : FPU VME DE PSE TSC MSR PAE MCE CX8 APIC SEP MTRR PGE MCA CMOV PAT PSE36 CLFSH DS ACPI MMX FXSR SSE SSE2 SS HTT TM PBE SSE3 DTES64 MON DSCPL VMX EST TM2 SSSE3 CX16 TPR PDCM SSE4.1 OpenCL-kernels filename : MultiBeam_Kernels_r2857.cl ar=14.514889 NumCfft=99189 NumGauss=0 NumPulse=6511290906 NumTriplet=6511290906 Currently allocated 121 MB for GPU buffers In v_BaseLineSmooth: NumDataPoints=1048576, BoxCarLength=8192, NumPointsInChunk=32768 OS X optimized S@H Enhanced application by Alex Kan OpenCL version by Raistmer, r2857 AMD HD5 version by Raistmer Number of OpenCL platforms: 1 +++++++++++++++ Work Unit Info: ............... Credit multiplier is : 2.85 WU true angle range is : 14.514889 Used GPU device parameters are: Number of compute units: 14 Single buffer allocation size: 64MB Total device global memory: 1024MB max WG size: 256 local mem type: Real period_iterations_num=23 Autocorr: peak=19.33234, time=46.98, delay=2.9724, d_freq=1418916080.93, chirp=1.3901, fft_len=128k Spike: peak=25.09566, time=97.31, d_freq=1418912870.58, chirp=23.606, fft_len=64k Spike: peak=25.26082, time=97.31, d_freq=1418912870.59, chirp=23.624, fft_len=64k Spike: peak=24.09933, time=20.13, d_freq=1418915911.38, chirp=-27.249, fft_len=128k Best spike: peak=25.26082, time=97.31, d_freq=1418912870.59, chirp=23.624, fft_len=64k Best autocorr: peak=19.33234, time=46.98, delay=2.9724, d_freq=1418916080.93, chirp=1.3901, fft_len=128k Best gaussian: peak=0, mean=0, ChiSq=0, time=-2.122e+11, d_freq=0, score=-12, null_hyp=0, chirp=0, fft_len=0 Best pulse: peak=4.642614, time=100.9, period=0.06349, d_freq=1418911132.81, score=0.8823, chirp=0, fft_len=16 Best triplet: peak=0, time=-2.122e+11, 
period=0, d_freq=0, chirp=0, fft_len=0 Flopcounter: 14663271842475.634766 Spike count: 3 Autocorr count: 1 Pulse count: 0 Triplet count: 0 Gaussian count: 0 Time cpu in use since last restart: 84.4 seconds GPU device sync requested... ...GPU device synched 17:59:52 (564): called boinc_finish(0) It still crashes immediately in BOINC with; INFO: can't open binary kernel file: /Volumes/Mov1/BOINC/Maverick/BOINC Data/projects/setiathome.berkeley.edu/MultiBeam_Kernels_r2857.clHD5_ATIRadeonBartsPROPrototype.bin_V7_13.4.0_12Aug17201420, continue with recompile... |
TBar Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
clang, clang, clang The compiler spit out another App, MBv7_7.05r2856_ssse3_x86_64-apple-darwin. Here is the comparison; 19:04:34 (5943): Can't open init data file - running in standalone mode 19:04:34 (5943): Can't open init data file - running in standalone mode Not using mb_cmdline.txt-file, using commandline options. Build features: SETI7 Non-graphics FFTW SSSE3 64bit System: Darwin x86_64 Kernel: 12.5.0 CPU : Intel(R) Xeon(R) CPU E5462 @ 2.80GHz GenuineIntel x86, Family 6 Model 23 Stepping 6 Features : FPU VME DE PSE TSC MSR PAE MCE CX8 APIC SEP MTRR PGE MCA CMOV PAT PSE36 CLFSH DS ACPI MMX FXSR SSE SSE2 SS HTT TM PBE SSE3 DTES64 MON DSCPL VMX EST TM2 SSSE3 CX16 TPR PDCM SSE4.1 ar=2.713638 NumCfft=99655 NumGauss=0 NumPulse=27714939457 NumTriplet=27714939457 In v_BaseLineSmooth: NumDataPoints=1048576, BoxCarLength=8192, NumPointsInChunk=32768 OS X optimized S@H Enhanced application by Alex Kan Work Unit Info: ............... Credit multiplier is : 2.85 WU true angle range is : 2.713638 Spike: peak=24.59059, time=87.24, d_freq=1419809053.52, chirp=-29.68, fft_len=128k Triplet: peak=7.689766, time=102.3, period=0.4555, d_freq=1419805120.58, chirp=38.524, fft_len=64 Best spike: peak=24.59059, time=87.24, d_freq=1419809053.52, chirp=-29.68, fft_len=128k Best autocorr: peak=17.02802, time=60.4, delay=1.5946, d_freq=1419806313.14, chirp=26.915, fft_len=128k Best gaussian: peak=0, mean=0, ChiSq=0, time=-2.122e+11, d_freq=0, score=-12, null_hyp=0, chirp=0, fft_len=0 Best pulse: peak=4.139598, time=6.966, period=0.2523, d_freq=1419804798.06, score=0.8801, chirp=70.628, fft_len=128 Best triplet: peak=7.689766, time=102.3, period=0.4555, d_freq=1419805120.58, chirp=38.524, fft_len=64 Flopcounter: 16154043149204.416016 Spike count: 1 Autocorr count: 0 Pulse count: 0 Triplet count: 1 Gaussian count: 0 Time cpu in use since last restart: 3214.3 seconds 19:58:31 (5943): called boinc_finish(0) Versus; 20:03:35 (6390): Can't open init data file - running in standalone mode 
20:03:35 (6390): Can't open init data file - running in standalone mode Not using mb_cmdline.txt-file, using commandline options. Build features: SETI7 Non-graphics FFTW FFTOUT JSPF SSE3 64bit System: Darwin x86_64 Kernel: 12.5.0 CPU : Intel(R) Xeon(R) CPU E5462 @ 2.80GHz GenuineIntel x86, Family 6 Model 23 Stepping 6 Features : FPU VME DE PSE TSC MSR PAE MCE CX8 APIC SEP MTRR PGE MCA CMOV PAT PSE36 CLFSH DS ACPI MMX FXSR SSE SSE2 SS HTT TM PBE SSE3 DTES64 MON DSCPL VMX EST TM2 SSSE3 CX16 TPR PDCM SSE4.1 ar=2.713638 NumCfft=99655 NumGauss=0 NumPulse=27714939457 NumTriplet=27714939457 In v_BaseLineSmooth: NumDataPoints=1048576, BoxCarLength=8192, NumPointsInChunk=32768 OS X optimized S@H v7 application (based on S@H Enhanced by Alex Kan SSE3xjf OS X 64bit Build 2549 , Ported by : JDWhale, Urs Echternacht, Raistmer, Joe Segur Work Unit Info: ............... Credit multiplier is : 2.85 WU true angle range is : 2.713638 Spike: peak=24.59058, time=87.24, d_freq=1419809053.52, chirp=-29.68, fft_len=128k Triplet: peak=7.689764, time=102.3, period=0.4555, d_freq=1419805120.58, chirp=38.524, fft_len=64 Best spike: peak=24.59058, time=87.24, d_freq=1419809053.52, chirp=-29.68, fft_len=128k Best autocorr: peak=17.02803, time=60.4, delay=1.5946, d_freq=1419806313.14, chirp=26.915, fft_len=128k Best gaussian: peak=0, mean=0, ChiSq=0, time=-2.122e+11, d_freq=0, score=-12, null_hyp=0, chirp=0, fft_len=0 Best pulse: peak=4.139598, time=6.966, period=0.2523, d_freq=1419804798.06, score=0.8801, chirp=70.628, fft_len=128 Best triplet: peak=7.689764, time=102.3, period=0.4555, d_freq=1419805120.58, chirp=38.524, fft_len=64 Flopcounter: 16154043149204.416016 Spike count: 1 Autocorr count: 0 Pulse count: 0 Triplet count: 1 Gaussian count: 0 Time cpu in use since last restart: 3838.1 seconds 21:08:03 (6390): called boinc_finish r2856 = 3214.3 r2549 = 3838.1 That's a pretty big difference. 
=============================== MBv7_7.05r2856_sse41_x86_64-apple-darwin shows about the same run time as the ssse3 version; http://setiathome.berkeley.edu/result.php?resultid=4007325307 20:55:18 (12450): Can't open init data file - running in standalone mode 20:55:18 (12450): Can't open init data file - running in standalone mode Not using mb_cmdline.txt-file, using commandline options. Build features: SETI7 Non-graphics FFTW SSE4.1 64bit System: Darwin x86_64 Kernel: 12.5.0 CPU : Intel(R) Xeon(R) CPU E5462 @ 2.80GHz GenuineIntel x86, Family 6 Model 23 Stepping 6 Features : FPU VME DE PSE TSC MSR PAE MCE CX8 APIC SEP MTRR PGE MCA CMOV PAT PSE36 CLFSH DS ACPI MMX FXSR SSE SSE2 SS HTT TM PBE SSE3 DTES64 MON DSCPL VMX EST TM2 SSSE3 CX16 TPR PDCM SSE4.1 ar=4.587967 NumCfft=99321 NumGauss=0 NumPulse=14066417035 NumTriplet=28086600373 In v_BaseLineSmooth: NumDataPoints=1048576, BoxCarLength=8192, NumPointsInChunk=32768 OS X optimized S@H Enhanced application by Alex Kan Work Unit Info: ............... 
Credit multiplier is : 2.85 WU true angle range is : 4.587967 Triplet: peak=7.709885, time=68.42, period=0.2474, d_freq=1421141662.6, chirp=0, fft_len=32 Triplet: peak=7.252985, time=4.679, period=0.07864, d_freq=1421146990.92, chirp=-75.989, fft_len=256 Triplet: peak=7.277875, time=29.21, period=0.1343, d_freq=1421146945.67, chirp=86.845, fft_len=16 Triplet: peak=7.495828, time=89.32, period=0.1163, d_freq=1421143621.27, chirp=86.845, fft_len=16 Best spike: peak=23.07402, time=87.24, d_freq=1421145570.02, chirp=28.74, fft_len=128k Best autocorr: peak=17.70593, time=33.55, delay=6.2105, d_freq=1421142441.39, chirp=-4.0751, fft_len=128k Best gaussian: peak=0, mean=0, ChiSq=0, time=-2.122e+11, d_freq=0, score=-12, null_hyp=0, chirp=0, fft_len=0 Best pulse: peak=2.015295, time=28.9, period=0.06663, d_freq=1421140594.48, score=0.9054, chirp=0, fft_len=64 Best triplet: peak=7.709885, time=68.42, period=0.2474, d_freq=1421141662.6, chirp=0, fft_len=32 Flopcounter: 15984517635636.164062 Spike count: 0 Autocorr count: 0 Pulse count: 0 Triplet count: 4 Gaussian count: 0 Time cpu in use since last restart: 3229.1 seconds 21:49:26 (12450): called boinc_finish(0) |
TBar Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
That was a nice speedup. From taking just over an hour on shorties with r2549 to around 50 minutes with the homebrew r2856, http://setiathome.berkeley.edu/result.php?resultid=4008732485. What's more interesting is all I did was remove the 2 OpenCL options from the configure line and not insert the 2 correct OpenCL paths into the configure file. The results not only work in Mountain Lion but also work in BOINC as well. Makes one wonder why adding the two OpenCL configure lines causes the results to Not work in Mountain Lion and Not work in BOINC. It would be nice if my OpenCL App worked in ML and BOINC, it's also a little faster... |
Raistmer Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
Yep. And all those changes in sources themselves that constitute new revision :) |
TBar Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
Still no luck getting the OpenCL version to work with BOINC. Yesterday I downloaded a new sah_v7_opt folder but had to download it in pieces as I kept getting a 'some data can't be read' error when trying to download the complete folder. Downloading the folder contents independently seemed to work. So far every CPU MBv7_7.05r2856_sse41_x86_64-apple-darwin task has validated. On my Harpertown Core2 Xeon it is saving about 10 minutes on the shorties and 30 minutes on the tasks in the 0.40 angle range. I posted it on Crunchers Anonymous if someone wants to give it a try, http://www.arkayn.us/forum/index.php?topic=130.msg4140#msg4140 |
TBar Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
Raistmer wrote: "Yep. And all those changes in sources themselves that constitute new revision :)" It would seem I need one more change. The error; ERROR: OpenCL kernel/call 'Enqueueing kernel: (strip) PC_find_spike32_kernel_cl' call failed (-54) in file analyzeFuncs.cpp near line 3171. Waiting 30 sec before restart... will disappear if you simply remove that line from here; #endif err |= clSetKernelArg(PC_find_spike32_kernel_cl,2,sizeof(cl_int),(void *)&Nchunks); err |= clSetKernelArg(PC_find_spike32_kernel_cl,3,sizeof(cl_float4)*optimal_block_x,NULL);//R: local memory allocation OCL_LOG_ERR("Setting kernel argument: (strip) PC_find_spike32_kernel_cl"); err = clEnqueueNDRangeKernel(cq,PC_find_spike32_kernel_cl, 2, NULL,globalThreads, localThreads, 0,NULL, NULL); 3171 OCL_LOG_ERR("Enqueueing kernel: (strip) PC_find_spike32_kernel_cl"); spike_logging_needed=true;//R: faster to read array than flag and array... SpikeSearchBinStart=0,SpikeSearchBinStop=NumBlockFfts; #if GPU_TWINCHIRP Remove that line and the App will work in Mountain Lion & BOINC. However, it will Not find any Spikes. On this task (among others) the spike count is Zero, http://setiathome.berkeley.edu/result.php?resultid=4010250229 So, is there any way to change that line and still find Spikes? |
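One possible reading of this result, assuming OCL_LOG_ERR only inspects the err value returned by the preceding call (its definition is not shown in the thread): deleting the check hides the -54 status, but the enqueue call has still refused to queue the kernel, so the spike search never runs and no spikes are reported. The standalone sketch below mimics that behaviour with hypothetical stand-ins (fake_enqueue, run_spike_search); it does not call the real OpenCL API and is not taken from the SETI sources.

```cpp
#include <cstdio>

// Numeric value of CL_INVALID_WORK_GROUP_SIZE in the OpenCL headers.
const int kInvalidWorkGroupSize = -54;

// Hypothetical stand-in for clEnqueueNDRangeKernel: it refuses to queue work
// when the work-group size exceeds the limit, and only "runs" the kernel
// (here: writes a spike count) when the launch is accepted.
int fake_enqueue(int local_size, int max_local_size, int* spikes_out) {
    if (local_size > max_local_size) return kInvalidWorkGroupSize;
    *spikes_out = 3;  // pretend the kernel found three spikes
    return 0;
}

// Returns the number of spikes found, or -1 if the error check fires.
int run_spike_search(int local_size, int max_local_size, bool check_error) {
    int spikes = 0;
    int err = fake_enqueue(local_size, max_local_size, &spikes);
    if (check_error && err != 0) {
        std::fprintf(stderr, "enqueue failed (%d)\n", err);
        return -1;
    }
    // With the check removed, a rejected launch silently leaves spikes at 0.
    return spikes;
}
```

Under these assumptions, run_spike_search(512, 256, true) reports the error, while run_spike_search(512, 256, false) "works" but returns zero spikes, which matches the symptom described above. The fix would then lie in the launch sizes, not in the logging line.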
TBar Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
This is strange. If you go here, https://setisvn.ssl.berkeley.edu/trac/browser/branches/sah_v7_opt?rev=2760 You can download what is supposed to be 2760. The main folder download draws another error when hitting "Zip Archive" but you can download the lower levels, ie branches_sah_v7_opt_AKv8-2760.zip etc. It would be nice if you could just download sah_v7_opt. So, I place all the files in the proper locations, it builds without any trouble in Mountain Lion, I run it in standalone and receive; 23:23:09 (50228): Can't open init data file - running in standalone mode 23:23:09 (50228): Can't open init data file - running in standalone mode Not using mb_cmdline.txt-file, using commandline options. 23:23:09 (50228): Can't open init data file - running in standalone mode WARNING: init_data.xml missing OpenCL platform detected: Apple WARNING: BOINC supplied wrong platform! Number of OpenCL devices found : 3 BOINC assigns slot on device #0. WARNING: BOINC failed to provide OpenCL device, using own enumeration abilities Build features: SETI7 Non-graphics OpenCL USE_OPENCL_HD5xxx OCL_CHIRP3 FFTW SSE4.1 64bit System: Darwin x86_64 Kernel: 12.5.0 CPU : Intel(R) Xeon(R) CPU E5462 @ 2.80GHz GenuineIntel x86, Family 6 Model 23 Stepping 6 Features : FPU VME DE PSE TSC MSR PAE MCE CX8 APIC SEP MTRR PGE MCA CMOV PAT PSE36 CLFSH DS ACPI MMX FXSR SSE SSE2 SS HTT TM PBE SSE3 DTES64 MON DSCPL VMX EST TM2 SSSE3 CX16 TPR PDCM SSE4.1 OpenCL-kernels filename : MultiBeam_Kernels_r2760.cl INFO: can't open binary kernel file: .//MultiBeam_Kernels_r2760.clHD5_ATIRadeonBartsXTPrototype.bin_V7_12.5.0_10, continue with recompile... Info : Building Program (binary, clBuildProgram):main kernels: OK code 0 INFO: binary kernel file created WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_524288_gr64_lr16_wg256_tw0_r2760.bin_12.5.0_10, continue with recompile... 
WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_8_gr64_lr16_wg256_tw0_r2760.bin_12.5.0_10, continue with recompile... WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_16_gr64_lr16_wg256_tw0_r2760.bin_12.5.0_10, continue with recompile... WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_32_gr64_lr16_wg256_tw0_r2760.bin_12.5.0_10, continue with recompile... WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_64_gr64_lr16_wg256_tw0_r2760.bin_12.5.0_10, continue with recompile... WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_128_gr64_lr16_wg256_tw0_r2760.bin_12.5.0_10, continue with recompile... WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_256_gr64_lr16_wg256_tw0_r2760.bin_12.5.0_10, continue with recompile... WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_512_gr64_lr16_wg256_tw0_r2760.bin_12.5.0_10, continue with recompile... WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_1024_gr64_lr16_wg256_tw0_r2760.bin_12.5.0_10, continue with recompile... WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_2048_gr64_lr16_wg256_tw0_r2760.bin_12.5.0_10, continue with recompile... WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_4096_gr64_lr16_wg256_tw0_r2760.bin_12.5.0_10, continue with recompile... WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_8192_gr64_lr16_wg256_tw0_r2760.bin_12.5.0_10, continue with recompile... 
WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_16384_gr64_lr16_wg256_tw0_r2760.bin_12.5.0_10, continue with recompile... WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_32768_gr64_lr16_wg256_tw0_r2760.bin_12.5.0_10, continue with recompile... WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_65536_gr64_lr16_wg256_tw0_r2760.bin_12.5.0_10, continue with recompile... WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_131072_gr64_lr16_wg256_tw0_r2760.bin_12.5.0_10, continue with recompile... ar=1.185405 NumCfft=101385 NumGauss=0 NumPulse=56257780816 NumTriplet=56257780816 Currently allocated 121 MB for GPU buffers In v_BaseLineSmooth: NumDataPoints=1048576, BoxCarLength=8192, NumPointsInChunk=32768 OS X optimized S@H Enhanced application by Alex Kan OpenCL version by Raistmer, r2760 AMD HD5 version by Raistmer Number of OpenCL platforms: 1 *************** Work Unit Info: ............... Credit multiplier is : 2.85 WU true angle range is : 1.185405 Used GPU device parameters are: Number of compute units: 14 Single buffer allocation size: 64MB Total device global memory: 1024MB max WG size: 1024 local mem type: Real period_iterations_num=23 ERROR: OpenCL kernel/call 'Enqueueing kernel: (strip) PC_find_spike32_kernel_cl' call failed (-54) in file analyzeFuncs.cpp near line 3029. Waiting 30 sec before restart... Yes, it must be a different version because line 3029 in this analyzeFuncs.cpp is the same line as 3171 in the current version. Just as before, it runs in standalone Mavericks; 00:36:58 (388): Can't open init data file - running in standalone mode 00:36:58 (388): Can't open init data file - running in standalone mode Not using mb_cmdline.txt-file, using commandline options. 
00:36:58 (388): Can't open init data file - running in standalone mode WARNING: init_data.xml missing OpenCL platform detected: Apple WARNING: BOINC supplied wrong platform! Number of OpenCL devices found : 3 BOINC assigns slot on device #0. WARNING: BOINC failed to provide OpenCL device, using own enumeration abilities Build features: SETI7 Non-graphics OpenCL USE_OPENCL_HD5xxx OCL_CHIRP3 FFTW SSE4.1 64bit System: Darwin x86_64 Kernel: 13.4.0 CPU : Intel(R) Xeon(R) CPU E5462 @ 2.80GHz GenuineIntel x86, Family 6 Model 23 Stepping 6 Features : FPU VME DE PSE TSC MSR PAE MCE CX8 APIC SEP MTRR PGE MCA CMOV PAT PSE36 CLFSH DS ACPI MMX FXSR SSE SSE2 SS HTT TM PBE SSE3 DTES64 MON DSCPL VMX EST TM2 SSSE3 CX16 TPR PDCM SSE4.1 OpenCL-kernels filename : MultiBeam_Kernels_r2760.cl INFO: can't open binary kernel file: .//MultiBeam_Kernels_r2760.clHD5_ATIRadeonBartsPROPrototype.bin_V7_13.4.0_12Aug17201420, continue with recompile... Info : Building Program (binary, clBuildProgram):main kernels: OK code 0 INFO: binary kernel file created WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsPROPrototype_524288_gr64_lr16_wg256_tw0_r2760.bin_13.4.0_12Aug17201420, continue with recompile... WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsPROPrototype_8_gr64_lr16_wg256_tw0_r2760.bin_13.4.0_12Aug17201420, continue with recompile... WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsPROPrototype_16_gr64_lr16_wg256_tw0_r2760.bin_13.4.0_12Aug17201420, continue with recompile... WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsPROPrototype_32_gr64_lr16_wg256_tw0_r2760.bin_13.4.0_12Aug17201420, continue with recompile... WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsPROPrototype_64_gr64_lr16_wg256_tw0_r2760.bin_13.4.0_12Aug17201420, continue with recompile... 
WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsPROPrototype_128_gr64_lr16_wg256_tw0_r2760.bin_13.4.0_12Aug17201420, continue with recompile... WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsPROPrototype_256_gr64_lr16_wg256_tw0_r2760.bin_13.4.0_12Aug17201420, continue with recompile... WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsPROPrototype_512_gr64_lr16_wg256_tw0_r2760.bin_13.4.0_12Aug17201420, continue with recompile... WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsPROPrototype_1024_gr64_lr16_wg256_tw0_r2760.bin_13.4.0_12Aug17201420, continue with recompile... WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsPROPrototype_2048_gr64_lr16_wg256_tw0_r2760.bin_13.4.0_12Aug17201420, continue with recompile... WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsPROPrototype_4096_gr64_lr16_wg256_tw0_r2760.bin_13.4.0_12Aug17201420, continue with recompile... WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsPROPrototype_8192_gr64_lr16_wg256_tw0_r2760.bin_13.4.0_12Aug17201420, continue with recompile... WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsPROPrototype_16384_gr64_lr16_wg256_tw0_r2760.bin_13.4.0_12Aug17201420, continue with recompile... WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsPROPrototype_32768_gr64_lr16_wg256_tw0_r2760.bin_13.4.0_12Aug17201420, continue with recompile... WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsPROPrototype_65536_gr64_lr16_wg256_tw0_r2760.bin_13.4.0_12Aug17201420, continue with recompile... 
WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsPROPrototype_131072_gr64_lr16_wg256_tw0_r2760.bin_13.4.0_12Aug17201420, continue with recompile... ar=1.185405 NumCfft=101385 NumGauss=0 NumPulse=56257780816 NumTriplet=56257780816 Currently allocated 121 MB for GPU buffers In v_BaseLineSmooth: NumDataPoints=1048576, BoxCarLength=8192, NumPointsInChunk=32768 OS X optimized S@H Enhanced application by Alex Kan OpenCL version by Raistmer, r2760 AMD HD5 version by Raistmer *************** Work Unit Info: ............... Credit multiplier is : 2.85 WU true angle range is : 1.185405 Used GPU device parameters are: Number of compute units: 14 Single buffer allocation size: 64MB Total device global memory: 1024MB max WG size: 256 local mem type: Real period_iterations_num=23 Spike: peak=24.47872, time=83.89, d_freq=1421126615.33, chirp=0.2514, fft_len=64k Spike: peak=24.07581, time=87.24, d_freq=1421126616.18, chirp=0.25232, fft_len=128k Spike: peak=25.0097, time=87.24, d_freq=1421126616.19, chirp=0.25325, fft_len=128k Spike: peak=25.16989, time=87.24, d_freq=1421126616.2, chirp=0.25417, fft_len=128k Spike: peak=24.65972, time=83.89, d_freq=1421126615.35, chirp=0.2551, fft_len=64k Spike: peak=24.59426, time=87.24, d_freq=1421126616.2, chirp=0.2551, fft_len=128k Spike: peak=24.38238, time=6.711, d_freq=1421125858.53, chirp=-8.7435, fft_len=128k Spike: peak=24.15315, time=6.711, d_freq=1421125858.53, chirp=-8.7444, fft_len=128k Spike: peak=24.46849, time=6.711, d_freq=1421125858.55, chirp=-8.7527, fft_len=128k Spike: peak=24.65364, time=6.711, d_freq=1421125858.54, chirp=-8.7537, fft_len=128k Spike: peak=24.40533, time=6.711, d_freq=1421119059.91, chirp=-14.735, fft_len=128k Spike: peak=24.47264, time=6.711, d_freq=1421119059.9, chirp=-14.735, fft_len=128k Autocorr: peak=17.84166, time=87.24, delay=1.2622, d_freq=1421121197.62, chirp=-21.197, fft_len=128k Autocorr: peak=18.20847, time=87.24, delay=1.2622, d_freq=1421121196.81, 
chirp=-21.206, fft_len=128k Spike: peak=24.54893, time=104.9, d_freq=1421126487.7, chirp=71.456, fft_len=16k Spike: peak=25.44987, time=104.9, d_freq=1421126487.83, chirp=71.634, fft_len=16k Spike: peak=26.741, time=104.9, d_freq=1421126487.72, chirp=71.752, fft_len=16k Spike: peak=25.83725, time=104.9, d_freq=1421126487.85, chirp=71.93, fft_len=16k Spike: peak=26.78933, time=104.9, d_freq=1421126487.74, chirp=72.048, fft_len=16k Spike: peak=24.39994, time=104.9, d_freq=1421126487.87, chirp=72.225, fft_len=16k Spike: peak=24.67293, time=104.9, d_freq=1421126487.76, chirp=72.344, fft_len=16k Best spike: peak=26.78933, time=104.9, d_freq=1421126487.74, chirp=72.048, fft_len=16k Best autocorr: peak=18.20847, time=87.24, delay=1.2622, d_freq=1421121196.81, chirp=-21.206, fft_len=128k Best gaussian: peak=0, mean=0, ChiSq=0, time=-2.122e+11, d_freq=0, score=-12, null_hyp=0, chirp=0, fft_len=0 Best pulse: peak=4.527436, time=38.76, period=0.6554, d_freq=1421123960.85, score=0.9209, chirp=-92.558, fft_len=256 Best triplet: peak=0, time=-2.122e+11, period=0, d_freq=0, chirp=0, fft_len=0 Flopcounter: 14788174568041.134766 Spike count: 19 Autocorr count: 2 Pulse count: 0 Triplet count: 0 Gaussian count: 0 Time cpu in use since last restart: 66.2 seconds GPU device sync requested... ...GPU device synched 00:46:10 (388): called boinc_finish(0) It even gives the right answer. Joe Fox was using Mavericks. Just thinking out loud here... Joe Fox; Unfortunately, it looks like I still get the same error. The only difference I see in the stderr is now local mem type is "Emulated" rather than "Real". Urs Echternacht; "Emulated" is what i am working at currently. I hate to be the odd man here but my Mac Pro says; local mem type: Real What does this "Enqueueing kernel: (strip,neg) PC_find_spike32_kernel_cl" do actually? 
It kinda sounds like it might deal with memory; //R: local memory allocation I recently discovered you can't go above unroll 16 in Mountain Lion because it handles GPU memory differently than Mavericks, or Yosemite. So, all my builds will not work in Mountain Lion but work in Mavericks and Yosemite. Do you think this could be a local memory allocation problem caused by building in Mountain Lion? |
Raistmer Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
It does exactly what its name says - it finds spikes, one of those signals SETI MultiBeam reports back as interesting ones. And it does this using local/shared GPU memory inside the kernel, just as my comment says. The error you report constantly is: #define CL_INVALID_WORK_GROUP_SIZE -54 So, the kernel run got misconfigured somehow. One needs to print the real launch configuration for those calls to see why it became misconfigured. But the fact that other builds show no such error makes me think the issue is local to your side. Post the whole CLinfo output for a start. You just stripped out that part in stderr. |
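The kind of check suggested here can be sketched without touching the application itself. The helper below is hypothetical (check_launch is not from the SETI sources) and takes the launch sizes and the limit as plain numbers. In real host code the limit would typically come from clGetDeviceInfo with CL_DEVICE_MAX_WORK_GROUP_SIZE or clGetKernelWorkGroupInfo with CL_KERNEL_WORK_GROUP_SIZE, and the sizes would be the globalThreads and localThreads arrays passed to clEnqueueNDRangeKernel. It tests two common reasons an OpenCL 1.x launch returns CL_INVALID_WORK_GROUP_SIZE: a global size that is not a multiple of the local size, and a work-group with more items than the limit allows.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: explains why a 2-D launch would be rejected,
// or returns "ok" when neither of the two checked conditions applies.
std::string check_launch(const std::size_t global[2], const std::size_t local[2],
                         std::size_t max_work_group_size) {
    std::size_t items = 1;  // total work-items in one work-group
    for (int d = 0; d < 2; ++d) {
        if (local[d] == 0)
            return "local size is zero in dimension " + std::to_string(d);
        if (global[d] % local[d] != 0)
            return "global size not divisible by local size in dimension " +
                   std::to_string(d);
        items *= local[d];
    }
    if (items > max_work_group_size)
        return "work-group has " + std::to_string(items) +
               " items, limit is " + std::to_string(max_work_group_size);
    return "ok";
}
```

Printing the result of such a check next to the actual sizes just before the enqueue call would show which condition trips. The stderr logs earlier in the thread report "max WG size: 1024" on the failing Mountain Lion run and "max WG size: 256" on the working Mavericks run; whether that difference is the cause is not established here, but it is the sort of value worth comparing.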
TBar Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
Well, this problem has appeared on Two separate Mountain Lion Systems, and recently survived Three OS reinstalls on the current system. Hard to imagine it could survive all that. I found how to run clinfo on a Mac from here; https://groups.google.com/d/msg/aparapi-discuss/ryA8IvjI0AU/Mz_H8C7aV6UJ TomsMacPro:~ Tom$ ./clinfo Found 1 platform(s). platform[0x7fff0000]: profile: FULL_PROFILE platform[0x7fff0000]: version: OpenCL 1.2 (Apr 25 2013 18:32:06) platform[0x7fff0000]: name: Apple platform[0x7fff0000]: vendor: Apple platform[0x7fff0000]: extensions: cl_APPLE_SetMemObjectDestructor cl_APPLE_ContextLoggingFunctions cl_APPLE_clut cl_APPLE_query_kernel_names cl_APPLE_gl_sharing cl_khr_gl_event platform[0x7fff0000]: Found 4 device(s). device[0xffffffff]: NAME: Intel(R) Xeon(R) CPU E5462 @ 2.80GHz device[0xffffffff]: VENDOR: Intel device[0xffffffff]: PROFILE: FULL_PROFILE device[0xffffffff]: VERSION: OpenCL 1.2 device[0xffffffff]: EXTENSIONS: cl_APPLE_SetMemObjectDestructor cl_APPLE_ContextLoggingFunctions cl_APPLE_clut cl_APPLE_query_kernel_names cl_APPLE_gl_sharing cl_khr_gl_event cl_khr_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_APPLE_fp64_basic_ops cl_APPLE_fixed_alpha_channel_orders cl_APPLE_biased_fixed_point_image_formats device[0xffffffff]: DRIVER_VERSION: 1.1 device[0xffffffff]: Type: CPU device[0xffffffff]: EXECUTION_CAPABILITIES: Kernel Native device[0xffffffff]: GLOBAL_MEM_CACHE_TYPE: Read-Write (2) device[0xffffffff]: CL_DEVICE_LOCAL_MEM_TYPE: Global (2) device[0xffffffff]: SINGLE_FP_CONFIG: 0xbf device[0xffffffff]: QUEUE_PROPERTIES: 0x2 device[0xffffffff]: VENDOR_ID: 4294967295 device[0xffffffff]: MAX_COMPUTE_UNITS: 4 device[0xffffffff]: MAX_WORK_ITEM_DIMENSIONS: 3 device[0xffffffff]: MAX_WORK_GROUP_SIZE: 1024 device[0xffffffff]: 
PREFERRED_VECTOR_WIDTH_CHAR: 16 device[0xffffffff]: PREFERRED_VECTOR_WIDTH_SHORT: 8 device[0xffffffff]: PREFERRED_VECTOR_WIDTH_INT: 4 device[0xffffffff]: PREFERRED_VECTOR_WIDTH_LONG: 2 device[0xffffffff]: PREFERRED_VECTOR_WIDTH_FLOAT: 4 device[0xffffffff]: PREFERRED_VECTOR_WIDTH_DOUBLE: 2 device[0xffffffff]: MAX_CLOCK_FREQUENCY: 2800 device[0xffffffff]: ADDRESS_BITS: 64 device[0xffffffff]: MAX_MEM_ALLOC_SIZE: 1610612736 device[0xffffffff]: IMAGE_SUPPORT: 1 device[0xffffffff]: MAX_READ_IMAGE_ARGS: 128 device[0xffffffff]: MAX_WRITE_IMAGE_ARGS: 8 device[0xffffffff]: IMAGE2D_MAX_WIDTH: 8192 device[0xffffffff]: IMAGE2D_MAX_HEIGHT: 8192 device[0xffffffff]: IMAGE3D_MAX_WIDTH: 2048 device[0xffffffff]: IMAGE3D_MAX_HEIGHT: 2048 device[0xffffffff]: IMAGE3D_MAX_DEPTH: 2048 device[0xffffffff]: MAX_SAMPLERS: 16 device[0xffffffff]: MAX_PARAMETER_SIZE: 4096 device[0xffffffff]: MEM_BASE_ADDR_ALIGN: 1024 device[0xffffffff]: MIN_DATA_TYPE_ALIGN_SIZE: 128 device[0xffffffff]: GLOBAL_MEM_CACHELINE_SIZE: 6291456 device[0xffffffff]: GLOBAL_MEM_CACHE_SIZE: 64 device[0xffffffff]: GLOBAL_MEM_SIZE: 6442450944 device[0xffffffff]: MAX_CONSTANT_BUFFER_SIZE: 65536 device[0xffffffff]: MAX_CONSTANT_ARGS: 8 device[0xffffffff]: LOCAL_MEM_SIZE: 32768 device[0xffffffff]: ERROR_CORRECTION_SUPPORT: 0 device[0xffffffff]: PROFILING_TIMER_RESOLUTION: 1 device[0xffffffff]: ENDIAN_LITTLE: 1 device[0xffffffff]: AVAILABLE: 1 device[0xffffffff]: COMPILER_AVAILABLE: 1 device[0x1021b00]: NAME: ATI Radeon Barts PRO Prototype device[0x1021b00]: VENDOR: AMD device[0x1021b00]: PROFILE: FULL_PROFILE device[0x1021b00]: VERSION: OpenCL 1.1 device[0x1021b00]: EXTENSIONS: cl_APPLE_SetMemObjectDestructor cl_APPLE_ContextLoggingFunctions cl_APPLE_clut cl_APPLE_query_kernel_names cl_APPLE_gl_sharing cl_khr_gl_event cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store device[0x1021b00]: DRIVER_VERSION: 1.0 
device[0x1021b00]: Type: GPU device[0x1021b00]: EXECUTION_CAPABILITIES: Kernel device[0x1021b00]: GLOBAL_MEM_CACHE_TYPE: None (0) device[0x1021b00]: CL_DEVICE_LOCAL_MEM_TYPE: Local (1) device[0x1021b00]: SINGLE_FP_CONFIG: 0x1e device[0x1021b00]: QUEUE_PROPERTIES: 0x2 device[0x1021b00]: VENDOR_ID: 16915200 device[0x1021b00]: MAX_COMPUTE_UNITS: 14 device[0x1021b00]: MAX_WORK_ITEM_DIMENSIONS: 3 device[0x1021b00]: MAX_WORK_GROUP_SIZE: 1024 device[0x1021b00]: PREFERRED_VECTOR_WIDTH_CHAR: 16 device[0x1021b00]: PREFERRED_VECTOR_WIDTH_SHORT: 8 device[0x1021b00]: PREFERRED_VECTOR_WIDTH_INT: 4 device[0x1021b00]: PREFERRED_VECTOR_WIDTH_LONG: 2 device[0x1021b00]: PREFERRED_VECTOR_WIDTH_FLOAT: 4 device[0x1021b00]: PREFERRED_VECTOR_WIDTH_DOUBLE: 0 device[0x1021b00]: MAX_CLOCK_FREQUENCY: 775 device[0x1021b00]: ADDRESS_BITS: 32 device[0x1021b00]: MAX_MEM_ALLOC_SIZE: 268435456 device[0x1021b00]: IMAGE_SUPPORT: 1 device[0x1021b00]: MAX_READ_IMAGE_ARGS: 128 device[0x1021b00]: MAX_WRITE_IMAGE_ARGS: 8 device[0x1021b00]: IMAGE2D_MAX_WIDTH: 8192 device[0x1021b00]: IMAGE2D_MAX_HEIGHT: 8192 device[0x1021b00]: IMAGE3D_MAX_WIDTH: 2048 device[0x1021b00]: IMAGE3D_MAX_HEIGHT: 2048 device[0x1021b00]: IMAGE3D_MAX_DEPTH: 2048 device[0x1021b00]: MAX_SAMPLERS: 16 device[0x1021b00]: MAX_PARAMETER_SIZE: 1024 device[0x1021b00]: MEM_BASE_ADDR_ALIGN: 32768 device[0x1021b00]: MIN_DATA_TYPE_ALIGN_SIZE: 128 device[0x1021b00]: GLOBAL_MEM_CACHELINE_SIZE: 0 device[0x1021b00]: GLOBAL_MEM_CACHE_SIZE: 0 device[0x1021b00]: GLOBAL_MEM_SIZE: 1073741824 device[0x1021b00]: MAX_CONSTANT_BUFFER_SIZE: 65536 device[0x1021b00]: MAX_CONSTANT_ARGS: 8 device[0x1021b00]: LOCAL_MEM_SIZE: 32768 device[0x1021b00]: ERROR_CORRECTION_SUPPORT: 0 device[0x1021b00]: PROFILING_TIMER_RESOLUTION: 37 device[0x1021b00]: ENDIAN_LITTLE: 1 device[0x1021b00]: AVAILABLE: 1 device[0x1021b00]: COMPILER_AVAILABLE: 1 device[0x2021b00]: NAME: ATI Radeon Barts XT Prototype device[0x2021b00]: VENDOR: AMD device[0x2021b00]: PROFILE: FULL_PROFILE 
device[0x2021b00]: VERSION: OpenCL 1.1 device[0x2021b00]: EXTENSIONS: cl_APPLE_SetMemObjectDestructor cl_APPLE_ContextLoggingFunctions cl_APPLE_clut cl_APPLE_query_kernel_names cl_APPLE_gl_sharing cl_khr_gl_event cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store device[0x2021b00]: DRIVER_VERSION: 1.0 device[0x2021b00]: Type: GPU device[0x2021b00]: EXECUTION_CAPABILITIES: Kernel device[0x2021b00]: GLOBAL_MEM_CACHE_TYPE: None (0) device[0x2021b00]: CL_DEVICE_LOCAL_MEM_TYPE: Local (1) device[0x2021b00]: SINGLE_FP_CONFIG: 0x1e device[0x2021b00]: QUEUE_PROPERTIES: 0x2 device[0x2021b00]: VENDOR_ID: 33692416 device[0x2021b00]: MAX_COMPUTE_UNITS: 14 device[0x2021b00]: MAX_WORK_ITEM_DIMENSIONS: 3 device[0x2021b00]: MAX_WORK_GROUP_SIZE: 1024 device[0x2021b00]: PREFERRED_VECTOR_WIDTH_CHAR: 16 device[0x2021b00]: PREFERRED_VECTOR_WIDTH_SHORT: 8 device[0x2021b00]: PREFERRED_VECTOR_WIDTH_INT: 4 device[0x2021b00]: PREFERRED_VECTOR_WIDTH_LONG: 2 device[0x2021b00]: PREFERRED_VECTOR_WIDTH_FLOAT: 4 device[0x2021b00]: PREFERRED_VECTOR_WIDTH_DOUBLE: 0 device[0x2021b00]: MAX_CLOCK_FREQUENCY: 900 device[0x2021b00]: ADDRESS_BITS: 32 device[0x2021b00]: MAX_MEM_ALLOC_SIZE: 268435456 device[0x2021b00]: IMAGE_SUPPORT: 1 device[0x2021b00]: MAX_READ_IMAGE_ARGS: 128 device[0x2021b00]: MAX_WRITE_IMAGE_ARGS: 8 device[0x2021b00]: IMAGE2D_MAX_WIDTH: 8192 device[0x2021b00]: IMAGE2D_MAX_HEIGHT: 8192 device[0x2021b00]: IMAGE3D_MAX_WIDTH: 2048 device[0x2021b00]: IMAGE3D_MAX_HEIGHT: 2048 device[0x2021b00]: IMAGE3D_MAX_DEPTH: 2048 device[0x2021b00]: MAX_SAMPLERS: 16 device[0x2021b00]: MAX_PARAMETER_SIZE: 1024 device[0x2021b00]: MEM_BASE_ADDR_ALIGN: 32768 device[0x2021b00]: MIN_DATA_TYPE_ALIGN_SIZE: 128 device[0x2021b00]: GLOBAL_MEM_CACHELINE_SIZE: 0 device[0x2021b00]: GLOBAL_MEM_CACHE_SIZE: 0 device[0x2021b00]: GLOBAL_MEM_SIZE: 1073741824 device[0x2021b00]: MAX_CONSTANT_BUFFER_SIZE: 65536 
device[0x2021b00]: MAX_CONSTANT_ARGS: 8 device[0x2021b00]: LOCAL_MEM_SIZE: 32768 device[0x2021b00]: ERROR_CORRECTION_SUPPORT: 0 device[0x2021b00]: PROFILING_TIMER_RESOLUTION: 37 device[0x2021b00]: ENDIAN_LITTLE: 1 device[0x2021b00]: AVAILABLE: 1 device[0x2021b00]: COMPILER_AVAILABLE: 1 device[0x3021b00]: NAME: ATI Radeon Barts PRO Prototype device[0x3021b00]: VENDOR: AMD device[0x3021b00]: PROFILE: FULL_PROFILE device[0x3021b00]: VERSION: OpenCL 1.1 device[0x3021b00]: EXTENSIONS: cl_APPLE_SetMemObjectDestructor cl_APPLE_ContextLoggingFunctions cl_APPLE_clut cl_APPLE_query_kernel_names cl_APPLE_gl_sharing cl_khr_gl_event cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store device[0x3021b00]: DRIVER_VERSION: 1.0 device[0x3021b00]: Type: GPU device[0x3021b00]: EXECUTION_CAPABILITIES: Kernel device[0x3021b00]: GLOBAL_MEM_CACHE_TYPE: None (0) device[0x3021b00]: CL_DEVICE_LOCAL_MEM_TYPE: Local (1) device[0x3021b00]: SINGLE_FP_CONFIG: 0x1e device[0x3021b00]: QUEUE_PROPERTIES: 0x2 device[0x3021b00]: VENDOR_ID: 50469632 device[0x3021b00]: MAX_COMPUTE_UNITS: 14 device[0x3021b00]: MAX_WORK_ITEM_DIMENSIONS: 3 device[0x3021b00]: MAX_WORK_GROUP_SIZE: 1024 device[0x3021b00]: PREFERRED_VECTOR_WIDTH_CHAR: 16 device[0x3021b00]: PREFERRED_VECTOR_WIDTH_SHORT: 8 device[0x3021b00]: PREFERRED_VECTOR_WIDTH_INT: 4 device[0x3021b00]: PREFERRED_VECTOR_WIDTH_LONG: 2 device[0x3021b00]: PREFERRED_VECTOR_WIDTH_FLOAT: 4 device[0x3021b00]: PREFERRED_VECTOR_WIDTH_DOUBLE: 0 device[0x3021b00]: MAX_CLOCK_FREQUENCY: 775 device[0x3021b00]: ADDRESS_BITS: 32 device[0x3021b00]: MAX_MEM_ALLOC_SIZE: 268435456 device[0x3021b00]: IMAGE_SUPPORT: 1 device[0x3021b00]: MAX_READ_IMAGE_ARGS: 128 device[0x3021b00]: MAX_WRITE_IMAGE_ARGS: 8 device[0x3021b00]: IMAGE2D_MAX_WIDTH: 8192 device[0x3021b00]: IMAGE2D_MAX_HEIGHT: 8192 device[0x3021b00]: IMAGE3D_MAX_WIDTH: 2048 device[0x3021b00]: IMAGE3D_MAX_HEIGHT: 
2048 device[0x3021b00]: IMAGE3D_MAX_DEPTH: 2048 device[0x3021b00]: MAX_SAMPLERS: 16 device[0x3021b00]: MAX_PARAMETER_SIZE: 1024 device[0x3021b00]: MEM_BASE_ADDR_ALIGN: 32768 device[0x3021b00]: MIN_DATA_TYPE_ALIGN_SIZE: 128 device[0x3021b00]: GLOBAL_MEM_CACHELINE_SIZE: 0 device[0x3021b00]: GLOBAL_MEM_CACHE_SIZE: 0 device[0x3021b00]: GLOBAL_MEM_SIZE: 1073741824 device[0x3021b00]: MAX_CONSTANT_BUFFER_SIZE: 65536 device[0x3021b00]: MAX_CONSTANT_ARGS: 8 device[0x3021b00]: LOCAL_MEM_SIZE: 32768 device[0x3021b00]: ERROR_CORRECTION_SUPPORT: 0 device[0x3021b00]: PROFILING_TIMER_RESOLUTION: 37 device[0x3021b00]: ENDIAN_LITTLE: 1 device[0x3021b00]: AVAILABLE: 1 device[0x3021b00]: COMPILER_AVAILABLE: 1
I also tried to use my Mavericks system, the system that was the second Mountain Lion system, to build revision 2760 on my Mac Pro. It failed, the same way it fails in Yosemite. Best I can tell, it has a problem building the source files:
Undefined symbols for architecture x86_64:
  "boinc_resolve_filename_s(char const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >&)", referenced from:
      _main in seti_boinc-main.o
      seti_init_state() in seti_boinc-seti.o
      worker() in seti_boinc-worker.o
  "std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >::find(char const*, unsigned long, unsigned long) const", referenced from:
      std::__1::vector<unsigned char, std::__1::allocator<unsigned char> > xml_decode_field<unsigned char>(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, char const*) in seti_boinc-seti.o
      coordinate_t::parse_xml(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, char const*) in seti_boinc-schema_master.o
      chirp_parameter_t::parse_xml(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, char const*) in seti_boinc-schema_master.o
      subband_description_t::parse_xml(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, char const*) in seti_boinc-schema_master.o
      data_description_t::parse_xml(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, char const*) in seti_boinc-schema_master.o
      receiver_config::parse_xml(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, char const*) in seti_boinc-schema_master.o
      recorder_config::parse_xml(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, char const*) in seti_boinc-schema_master.o
      ...
  "std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >::~basic_string()", referenced from:
      _main in seti_boinc-main.o
      seti_error::~seti_error() in seti_boinc-main.o
      seti_analyze(ANALYSIS_STATE&) in seti_boinc-analyzeFuncs.o
      seti_error::~seti_error() in seti_boinc-analyzeFuncs.o
      result_spike(SPIKE_INFO&) in seti_boinc-analyzeReport.o
      result_autocorr(AUTOCORR_INFO&) in seti_boinc-analyzeReport.o
      result_gaussian(GAUSS_INFO&) in seti_boinc-analyzeReport.o
      ...
The only system that will compile it is Mountain Lion... on my Mac Pro. |
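One possible reading of the linker output above, offered as a guess rather than a diagnosis: every undefined symbol lives in `std::__1`, which is the inline namespace libc++ uses, while the `basic_string::_S_construct` exception quoted elsewhere in this thread is a libstdc++ message. That would fit object files and BOINC libraries built against different C++ standard libraries. The sketch below only classifies demangled symbol names; `guess_stdlib` and the sample list are hypothetical helpers, not part of the SETI or BOINC sources.

```python
def guess_stdlib(demangled_symbol):
    """Heuristically name the C++ standard library a demangled symbol refers to."""
    if "std::__1::" in demangled_symbol:
        # libc++ places its types in the inline namespace std::__1
        return "libc++"
    if "std::" in demangled_symbol:
        # Assumption: any other std:: name is treated as libstdc++
        return "libstdc++"
    return "unknown"


# Two of the undefined symbols copied from the linker output in the post
undefined = [
    "boinc_resolve_filename_s(char const*, std::__1::basic_string<char, "
    "std::__1::char_traits<char>, std::__1::allocator<char> >&)",
    "std::__1::basic_string<char, std::__1::char_traits<char>, "
    "std::__1::allocator<char> >::~basic_string()",
]

for symbol in undefined:
    print(guess_stdlib(symbol), "<-", symbol[:40])
```

If the guess holds, the thing to compare between the 10.8 and 10.9 builds would be which standard library the BOINC libraries and the app objects were each compiled against.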
jason_gee Send message Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0 |
Curious, and most definitely not my area. Why does Apple only have OpenCL 1.0 drivers in there for a 1.1-capable device? [Edit: and 1.1 for a 1.2...] [Edit2:] Another question in ignorance. These are on the same machine, with a card and a CPU, right? When did FreeBSD/Darwin/OSX start having multiple video driver stacks, if that's what's going on there? Can a display be used on both simultaneously? It wouldn't surprise me if there is some installation-order juggling needed there, like when mixing Intel CPU, AMD and NV. "Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions. |
TBar Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
Curious, and most definitely not my area. Why does Apple only have OpenCL 1.0 drivers in there for a 1.1-capable device? [Edit: and 1.1 for a 1.2...] The way I read it, the Driver version is OpenCL 1.2 and the Device version is OpenCL 1.1. These are ATI 6850s and a 6870; back then there wasn't an OpenCL 1.2. The DRIVER_VERSION: 1.0 is an Apple designation; note that it doesn't say OpenCL... This is the machine: http://setiathome.berkeley.edu/show_host_detail.php?hostid=6796479 I believe you're going to find the problem is that 10.8 doesn't have Virtual GPU memory, where 10.9 & 10.10 do. I get the 'call failed (-54) Error' in 10.8; I don't get the Error in 10.9 or 10.10. Somehow, the way the app is compiled in 10.8 doesn't agree with BOINC either. The GPU app compiled in 10.8 will fail immediately in BOINC, as noted previously, with "uncaught exception of type std::logic_error: basic_string::_S_construct NULL not valid" |
jason_gee Send message Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0 |
And do other generic OpenCL test programs (ones that allow you to select the device) run correctly on each of those devices? And also at the same time? |
jason_gee Send message Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0 |
I believe you're going to find the problem is 10.8 doesn't have Virtual GPU memory where 10.9 & 10.10 do. I get the 'call failed (-54) Error' in 10.8, I don't get the Error in 10.9 or 10.10. Somehow, the way the app is compiled in 10.8 doesn't agree with BOINC either. The GPU app compiled in 10.8 will fail immediately in BOINC as noted previously, "uncaught exception of type std::logic_error: basic_string::_S_construct NULL not valid" Hmmm, sounds like driver model changes, something like MS's continual transition. |
TBar Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
LuxMark doesn't have any trouble running the GPUs. The machine has been working AstroPulses without any problems ever since there were APs for the Mac. In fact, it spent a good part of last summer on the front page of the Top Computer list... |
Raistmer Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
What is different from Windows is workgroup size. Apple allows 1024 while AMD allows only 256 for its own cards.... |
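For reference, the -54 in the 'call failed (-54)' message quoted in this thread is a standard OpenCL status code. The small lookup below uses values from the Khronos `cl.h` header; the dictionary and the `status_name` helper are illustrative names, not anything from the app. Under that table -54 is `CL_INVALID_WORK_GROUP_SIZE`, which would be consistent with the workgroup-size difference described above, although that link is an inference and not something the log states.

```python
# A few OpenCL status codes, values as defined in the Khronos cl.h header
OPENCL_STATUS_NAMES = {
    0: "CL_SUCCESS",
    -5: "CL_OUT_OF_RESOURCES",
    -52: "CL_INVALID_KERNEL_ARGS",
    -53: "CL_INVALID_WORK_DIMENSION",
    -54: "CL_INVALID_WORK_GROUP_SIZE",
    -55: "CL_INVALID_WORK_ITEM_SIZE",
}


def status_name(code):
    """Return the symbolic name for an OpenCL status code, if it is in the table."""
    return OPENCL_STATUS_NAMES.get(code, f"unknown status {code}")


print(status_name(-54))
```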
TBar Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
What is different from Windows is workgroup size. So, how can we fix that? My newly revitalized 10.8.5 can crank out a nicely working App in about 5 minutes. It would be nice to pass this hurdle and move on. My MBv7 CPU App is doing very nicely, http://setiathome.berkeley.edu/result.php?resultid=4011766589. Next I'd like to recover that missing 90 minutes that was lost between APv6 and APv7. My machine completed an Unblanked APv6 CPU task in 8 hours, now an Unblanked APv7 CPU task takes 9.5 hours. Something wrong there. ;-) |
Chris Adamek Send message Joined: 15 May 99 Posts: 251 Credit: 434,772,072 RAC: 236 |
Curious, and most definitely not my area. Why does Apple only have OpenCL 1.0 drivers in there for a 1.1 capable device ? [Edit: and 1.1 for a 1.2...] OpenCL 1.2 didn't exist in OS X until Mavericks. Hence why iGPUs didn't show up in BOINC on Macs until Mavericks rolled out... I know zilch about what y'all are talking about, but are any of the OpenCL functions 1.2-specific? That might be why things are dying in pre-Mavericks versions... Chris |
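The 1.2 question can be checked against what a device reports. The device dump earlier in the thread shows `VERSION: OpenCL 1.1` for the Barts cards, and the OpenCL specification gives that string the form `OpenCL <major>.<minor> <vendor text>`. A minimal sketch of comparing it with a required version follows; both function names are made up for illustration.

```python
import re


def parse_opencl_version(version_string):
    """Return (major, minor) from a device VERSION string such as 'OpenCL 1.1'."""
    match = re.match(r"OpenCL (\d+)\.(\d+)", version_string)
    if match is None:
        raise ValueError(f"not an OpenCL version string: {version_string!r}")
    return int(match.group(1)), int(match.group(2))


def supports(version_string, required):
    """True if the reported version is at least the required (major, minor) tuple."""
    return parse_opencl_version(version_string) >= required


# The Barts devices in the log report OpenCL 1.1
print(supports("OpenCL 1.1", (1, 2)))
print(supports("OpenCL 1.1", (1, 1)))
```

A device that fails the `(1, 2)` check could still run the app if no 1.2-only functions are called, so this only narrows the question down.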
Raistmer Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
What is different from Windows is workgroup size. Only by deceiving the rest of the app with different values. I'm not sure it should be done, though; on NV the app usually deals with a 1024 WG OK. Anyway, Urs should have encountered the same situation when he did his port to OS X. Because his build works, and works almost OK, he has already overcome this issue, if there is any issue here. And AFAIK all his changes are already committed to Berkeley's tree. Hence, you downloaded source code that already works on OS X. Taking all this into account, I would concentrate on the differences versus Urs' build and why they happen, not on Apple's workgroup size per se. BTW, you mentioned "emulated" instead of "real" memory type. It's quite possible that this is his workaround for Apple: making the app think it has emulated shared memory, so that it doesn't use it much. |
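To make "deceiving the rest of the app with different values" concrete, here is a rough sketch of the idea: cap whatever maximum workgroup size the device reports at the value the kernels were tuned for, then pad the global size to a multiple of it. This is not Raistmer's or Urs' code. The 256 cap is taken from the AMD limit mentioned in this thread, and the function names are invented for the example.

```python
def effective_wg_size(device_max_wg, app_cap=256):
    """Workgroup size the app would use: the device limit, capped at app_cap.

    app_cap=256 is an assumption based on the AMD limit quoted in the thread.
    """
    return min(device_max_wg, app_cap)


def round_up_global(n_items, local_size):
    """Smallest multiple of local_size that is >= n_items."""
    return ((n_items + local_size - 1) // local_size) * local_size


# Apple's driver reports 1024 for the Barts cards; the cap brings it back to 256
local = effective_wg_size(1024)
print(local, round_up_global(1000, local))
```

Whether a cap like this is the right fix is exactly what the post above questions, since the NV path reportedly copes with 1024.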
TBar Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
What is different from Windows is workgroup size. That's not exactly what I was hoping to hear. Yes, nVidia uses a WG size of 1024, so there shouldn't be a problem, yet there obviously is a problem. Urs and Joe are using a Mobility GPU, and Joe wasn't even using Mountain Lion; I'm using a Mac Pro with real video cards and real video memory. There are obviously other differences with Urs' machine, as some of the configure options he recommends simply do not work on my Mac Pro. If I try those options, configure stops and complains about the compiler not working. The Apps that compile on my machine also work OK, and seem to work a little better, with the exception of the WG size error. If the only problem is the code not expecting a larger GPU WG, then allowing a larger GPU WG on Apple GPUs would seem to be the fix. I have been running the compiled GPU Apps in standalone for around a week and they work perfectly OK on my Mac Pro in Mavericks and Yosemite. I have been reporting the same Error for over a week; you would think that if it was a recognized Error someone would have mentioned it. The Mac GPU Apps work fine when compiled on my Mac Pro with real GPUs. Removing the Error message should be the objective, not performing some trick to avoid some erroneous error message. |
Raistmer Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
Does Urs' binary give that error when run on your machine? Does it run at all? |
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.