The Highest Ranked SETI AMD Host is a MAC: Time for a STOCK MAC APP?

TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1648622 - Posted: 2 Mar 2015, 23:52:09 UTC

I just downloaded the new files from yesterday and built a new App, https://setisvn.ssl.berkeley.edu/trac/browser/branches/sah_v7_opt/AKv8/client
There are no apparent changes; the App still fails in standalone Mountain Lion with;
ERROR: OpenCL kernel/call 'Enqueueing kernel: (strip) PC_find_spike32_kernel_cl' call failed (-54) in file analyzeFuncs.cpp near line 3171.

It still works in standalone Mavericks with this recent task; http://setiathome.berkeley.edu/result.php?resultid=4007468474
r2839
Run time: 9 min 21 sec
Spike count: 3
Autocorr count: 1
Pulse count: 0
Triplet count: 0
Gaussian count: 0
Time cpu in use since last restart: 87.6 seconds


r2857 Run time: 8 min 17 sec
17:51:35 (564): Can't open init data file - running in standalone mode
17:51:35 (564): Can't open init data file - running in standalone mode
Not using mb_cmdline.txt-file, using commandline options.
17:51:35 (564): Can't open init data file - running in standalone mode
WARNING: init_data.xml missing
OpenCL platform detected: Apple
WARNING: BOINC supplied wrong platform!
Number of OpenCL devices found : 3 
BOINC assigns slot on device #0.
WARNING: BOINC failed to provide OpenCL device, using own enumeration abilities

Build features: SETI7 Non-graphics OpenCL USE_OPENCL_HD5xxx OCL_CHIRP3 FFTW SSSE3 64bit 
 System: Darwin  x86_64  Kernel: 13.4.0
CPU : Intel(R) Xeon(R) CPU           E5462  @ 2.80GHz 
 GenuineIntel x86, Family 6 Model 23 Stepping 6
 Features : FPU VME DE PSE TSC MSR PAE MCE CX8 APIC SEP MTRR PGE MCA CMOV PAT PSE36 CLFSH DS ACPI MMX FXSR SSE SSE2 SS HTT TM PBE SSE3 DTES64 MON DSCPL VMX EST TM2 SSSE3 CX16 TPR PDCM SSE4.1

OpenCL-kernels filename : MultiBeam_Kernels_r2857.cl 
ar=14.514889  NumCfft=99189  NumGauss=0  NumPulse=6511290906  NumTriplet=6511290906
Currently allocated 121 MB for GPU buffers
In v_BaseLineSmooth: NumDataPoints=1048576, BoxCarLength=8192, NumPointsInChunk=32768
OS X optimized S@H Enhanced application by Alex Kan
OpenCL version by Raistmer, r2857
AMD HD5 version by Raistmer
Number of OpenCL platforms:				 1
+++++++++++++++
Work Unit Info:
...............
Credit multiplier is :  2.85
WU true angle range is :  14.514889
Used GPU device parameters are:
	Number of compute units: 14
	Single buffer allocation size: 64MB
	Total device global memory: 1024MB
	max WG size: 256
	local mem type: Real
period_iterations_num=23
Autocorr: peak=19.33234, time=46.98, delay=2.9724, d_freq=1418916080.93, chirp=1.3901, fft_len=128k
Spike: peak=25.09566, time=97.31, d_freq=1418912870.58, chirp=23.606, fft_len=64k
Spike: peak=25.26082, time=97.31, d_freq=1418912870.59, chirp=23.624, fft_len=64k
Spike: peak=24.09933, time=20.13, d_freq=1418915911.38, chirp=-27.249, fft_len=128k

Best spike: peak=25.26082, time=97.31, d_freq=1418912870.59, chirp=23.624, fft_len=64k
Best autocorr: peak=19.33234, time=46.98, delay=2.9724, d_freq=1418916080.93, chirp=1.3901, fft_len=128k
Best gaussian: peak=0, mean=0, ChiSq=0, time=-2.122e+11, d_freq=0,
	score=-12, null_hyp=0, chirp=0, fft_len=0 
Best pulse: peak=4.642614, time=100.9, period=0.06349, d_freq=1418911132.81, score=0.8823, chirp=0, fft_len=16 
Best triplet: peak=0, time=-2.122e+11, period=0, d_freq=0, chirp=0, fft_len=0 

Flopcounter: 14663271842475.634766

Spike count:    3
Autocorr count: 1
Pulse count:    0
Triplet count:  0
Gaussian count: 0
Time cpu in use since last restart: 84.4 seconds
GPU device sync requested...  ...GPU device synched
17:59:52 (564): called boinc_finish(0)

It still crashes immediately in BOINC with;
INFO: can't open binary kernel file: /Volumes/Mov1/BOINC/Maverick/BOINC Data/projects/setiathome.berkeley.edu/MultiBeam_Kernels_r2857.clHD5_ATIRadeonBartsPROPrototype.bin_V7_13.4.0_12Aug17201420, continue with recompile...
libc++abi.dylib: terminating with uncaught exception of type std::logic_error: basic_string::_S_construct NULL not valid
SIGABRT: abort called

Crashed executable name: MBv7_7.05r2857_ati5_ssse3_x86_64-apple-darwin
Machine type Intel 80486 (64-bit executable)
System version: Macintosh OS 10.9.5 build 13F34
Mon Mar 2 18:25:58 2015
ID: 1648622
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1648705 - Posted: 3 Mar 2015, 5:38:47 UTC
Last modified: 3 Mar 2015, 6:03:23 UTC

clang, clang, clang
The compiler spit out another App, MBv7_7.05r2856_ssse3_x86_64-apple-darwin.
Here is the comparison;

19:04:34 (5943): Can't open init data file - running in standalone mode
19:04:34 (5943): Can't open init data file - running in standalone mode
Not using mb_cmdline.txt-file, using commandline options.

Build features: SETI7 Non-graphics FFTW SSSE3 64bit 
 System: Darwin  x86_64  Kernel: 12.5.0
CPU : Intel(R) Xeon(R) CPU           E5462  @ 2.80GHz 
 GenuineIntel x86, Family 6 Model 23 Stepping 6
 Features : FPU VME DE PSE TSC MSR PAE MCE CX8 APIC SEP MTRR PGE MCA CMOV PAT PSE36 CLFSH DS ACPI MMX FXSR SSE SSE2 SS HTT TM PBE SSE3 DTES64 MON DSCPL VMX EST TM2 SSSE3 CX16 TPR PDCM SSE4.1

ar=2.713638  NumCfft=99655  NumGauss=0  NumPulse=27714939457  NumTriplet=27714939457
In v_BaseLineSmooth: NumDataPoints=1048576, BoxCarLength=8192, NumPointsInChunk=32768
OS X optimized S@H Enhanced application by Alex Kan
Work Unit Info:
...............
Credit multiplier is :  2.85
WU true angle range is :  2.713638
Spike: peak=24.59059, time=87.24, d_freq=1419809053.52, chirp=-29.68, fft_len=128k
Triplet: peak=7.689766, time=102.3, period=0.4555, d_freq=1419805120.58, chirp=38.524, fft_len=64 

Best spike: peak=24.59059, time=87.24, d_freq=1419809053.52, chirp=-29.68, fft_len=128k
Best autocorr: peak=17.02802, time=60.4, delay=1.5946, d_freq=1419806313.14, chirp=26.915, fft_len=128k
Best gaussian: peak=0, mean=0, ChiSq=0, time=-2.122e+11, d_freq=0,
	score=-12, null_hyp=0, chirp=0, fft_len=0 
Best pulse: peak=4.139598, time=6.966, period=0.2523, d_freq=1419804798.06, score=0.8801, chirp=70.628, fft_len=128 
Best triplet: peak=7.689766, time=102.3, period=0.4555, d_freq=1419805120.58, chirp=38.524, fft_len=64 

Flopcounter: 16154043149204.416016

Spike count:    1
Autocorr count: 0
Pulse count:    0
Triplet count:  1
Gaussian count: 0
Time cpu in use since last restart: 3214.3 seconds
19:58:31 (5943): called boinc_finish(0)


Versus;

20:03:35 (6390): Can't open init data file - running in standalone mode
20:03:35 (6390): Can't open init data file - running in standalone mode
Not using mb_cmdline.txt-file, using commandline options.

Build features: SETI7 Non-graphics FFTW FFTOUT JSPF SSE3 64bit 
 System: Darwin  x86_64  Kernel: 12.5.0
CPU : Intel(R) Xeon(R) CPU           E5462  @ 2.80GHz 
 GenuineIntel x86, Family 6 Model 23 Stepping 6
 Features : FPU VME DE PSE TSC MSR PAE MCE CX8 APIC SEP MTRR PGE MCA CMOV PAT PSE36 CLFSH DS ACPI MMX FXSR SSE SSE2 SS HTT TM PBE SSE3 DTES64 MON DSCPL VMX EST TM2 SSSE3 CX16 TPR PDCM SSE4.1

ar=2.713638  NumCfft=99655  NumGauss=0  NumPulse=27714939457  NumTriplet=27714939457
In v_BaseLineSmooth: NumDataPoints=1048576, BoxCarLength=8192, NumPointsInChunk=32768
OS X optimized S@H v7 application (based on S@H Enhanced by Alex Kan
SSE3xjf OS X 64bit Build 2549 , Ported by : JDWhale, Urs Echternacht, Raistmer, Joe Segur

Work Unit Info:
...............
Credit multiplier is :  2.85
WU true angle range is :  2.713638
Spike: peak=24.59058, time=87.24, d_freq=1419809053.52, chirp=-29.68, fft_len=128k
Triplet: peak=7.689764, time=102.3, period=0.4555, d_freq=1419805120.58, chirp=38.524, fft_len=64 

Best spike: peak=24.59058, time=87.24, d_freq=1419809053.52, chirp=-29.68, fft_len=128k
Best autocorr: peak=17.02803, time=60.4, delay=1.5946, d_freq=1419806313.14, chirp=26.915, fft_len=128k
Best gaussian: peak=0, mean=0, ChiSq=0, time=-2.122e+11, d_freq=0,
	score=-12, null_hyp=0, chirp=0, fft_len=0 
Best pulse: peak=4.139598, time=6.966, period=0.2523, d_freq=1419804798.06, score=0.8801, chirp=70.628, fft_len=128 
Best triplet: peak=7.689764, time=102.3, period=0.4555, d_freq=1419805120.58, chirp=38.524, fft_len=64 

Flopcounter: 16154043149204.416016

Spike count:    1
Autocorr count: 0
Pulse count:    0
Triplet count:  1
Gaussian count: 0
Time cpu in use since last restart: 3838.1 seconds
21:08:03 (6390): called boinc_finish


r2856 = 3214.3
r2549 = 3838.1

That's a pretty big difference.
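
To put a number on that difference, here is a minimal C++ sketch of the arithmetic, using only the two CPU times reported above. The function names are made up for illustration and are not part of the SETI@home sources.

```cpp
#include <cassert>
#include <cmath>
#include <cstdio>

// CPU seconds reported by the two standalone runs above.
constexpr double kR2856Seconds = 3214.3;  // new homebrew build
constexpr double kR2549Seconds = 3838.1;  // older build

// Fraction of the old run time that the new build saves.
double time_saved_fraction(double old_s, double new_s) {
    assert(old_s > 0.0);
    return (old_s - new_s) / old_s;
}

// How many times faster the new build is (old time divided by new time).
double speedup_ratio(double old_s, double new_s) {
    assert(new_s > 0.0);
    return old_s / new_s;
}

// Prints a one-line summary of the r2549 vs r2856 comparison.
void print_comparison() {
    std::printf("saved %.1f s (%.1f%%), speedup x%.2f\n",
                kR2549Seconds - kR2856Seconds,
                100.0 * time_saved_fraction(kR2549Seconds, kR2856Seconds),
                speedup_ratio(kR2549Seconds, kR2856Seconds));
}
```

With these inputs the new build saves roughly 16% of the run time on this task, which makes it about 1.19 times as fast. One task is a small sample, so treat that as an indication rather than a benchmark.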

===============================
MBv7_7.05r2856_sse41_x86_64-apple-darwin shows about the same run time as the ssse3 version;
http://setiathome.berkeley.edu/result.php?resultid=4007325307

20:55:18 (12450): Can't open init data file - running in standalone mode
20:55:18 (12450): Can't open init data file - running in standalone mode
Not using mb_cmdline.txt-file, using commandline options.

Build features: SETI7 Non-graphics FFTW SSE4.1 64bit 
 System: Darwin  x86_64  Kernel: 12.5.0
CPU : Intel(R) Xeon(R) CPU           E5462  @ 2.80GHz 
 GenuineIntel x86, Family 6 Model 23 Stepping 6
 Features : FPU VME DE PSE TSC MSR PAE MCE CX8 APIC SEP MTRR PGE MCA CMOV PAT PSE36 CLFSH DS ACPI MMX FXSR SSE SSE2 SS HTT TM PBE SSE3 DTES64 MON DSCPL VMX EST TM2 SSSE3 CX16 TPR PDCM SSE4.1

ar=4.587967  NumCfft=99321  NumGauss=0  NumPulse=14066417035  NumTriplet=28086600373
In v_BaseLineSmooth: NumDataPoints=1048576, BoxCarLength=8192, NumPointsInChunk=32768
OS X optimized S@H Enhanced application by Alex Kan
Work Unit Info:
...............
Credit multiplier is :  2.85
WU true angle range is :  4.587967
Triplet: peak=7.709885, time=68.42, period=0.2474, d_freq=1421141662.6, chirp=0, fft_len=32 
Triplet: peak=7.252985, time=4.679, period=0.07864, d_freq=1421146990.92, chirp=-75.989, fft_len=256 
Triplet: peak=7.277875, time=29.21, period=0.1343, d_freq=1421146945.67, chirp=86.845, fft_len=16 
Triplet: peak=7.495828, time=89.32, period=0.1163, d_freq=1421143621.27, chirp=86.845, fft_len=16 

Best spike: peak=23.07402, time=87.24, d_freq=1421145570.02, chirp=28.74, fft_len=128k
Best autocorr: peak=17.70593, time=33.55, delay=6.2105, d_freq=1421142441.39, chirp=-4.0751, fft_len=128k
Best gaussian: peak=0, mean=0, ChiSq=0, time=-2.122e+11, d_freq=0,
	score=-12, null_hyp=0, chirp=0, fft_len=0 
Best pulse: peak=2.015295, time=28.9, period=0.06663, d_freq=1421140594.48, score=0.9054, chirp=0, fft_len=64 
Best triplet: peak=7.709885, time=68.42, period=0.2474, d_freq=1421141662.6, chirp=0, fft_len=32 

Flopcounter: 15984517635636.164062

Spike count:    0
Autocorr count: 0
Pulse count:    0
Triplet count:  4
Gaussian count: 0
Time cpu in use since last restart: 3229.1 seconds
21:49:26 (12450): called boinc_finish(0)
ID: 1648705
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1648828 - Posted: 3 Mar 2015, 16:28:39 UTC

That was a nice speedup: from just over an hour on shorties with r2549 to around 50 minutes with the homebrew r2856, http://setiathome.berkeley.edu/result.php?resultid=4008732485. What's more interesting is that all I did was remove the 2 OpenCL options from the configure line and leave the 2 correct OpenCL paths out of the configure file. The resulting App not only works in Mountain Lion but also works in BOINC. Makes one wonder why adding the two OpenCL configure lines causes the App to Not work in Mountain Lion and Not work in BOINC. It would be nice if my OpenCL App worked in ML and BOINC; it's also a little faster...
ID: 1648828
Raistmer
Volunteer developer
Volunteer tester

Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1648890 - Posted: 3 Mar 2015, 23:05:24 UTC - in response to Message 1648828.  

Yep. And all those changes in sources themselves that constitute new revision :)
ID: 1648890
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1649236 - Posted: 4 Mar 2015, 19:16:31 UTC - in response to Message 1648890.  

Still no luck getting the OpenCL version to work with BOINC. Yesterday I downloaded a new sah_v7_opt folder but had to download it in pieces as I kept getting a 'some data can't be read' error when trying to download the complete folder. Downloading the folder contents independently seemed to work.

So far every CPU MBv7_7.05r2856_sse41_x86_64-apple-darwin task has validated. On my Harpertown Core2 Xeon it is saving about 10 minutes on the shorties and 30 minutes on the tasks in the 0.40 angle range. I posted it on Crunchers Anonymous if someone wants to give it a try, http://www.arkayn.us/forum/index.php?topic=130.msg4140#msg4140
ID: 1649236
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1649760 - Posted: 5 Mar 2015, 23:55:31 UTC - in response to Message 1648890.  

Yep. And all those changes in sources themselves that constitute new revision :)

It would seem I need one more change. The error;
ERROR: OpenCL kernel/call 'Enqueueing kernel: (strip) PC_find_spike32_kernel_cl' call failed (-54) in file analyzeFuncs.cpp near line 3171.
Waiting 30 sec before restart...
will disappear if you simply remove the line marked 3171 from here;
#endif
    err |= clSetKernelArg(PC_find_spike32_kernel_cl,2,sizeof(cl_int),(void *)&Nchunks);
    err |= clSetKernelArg(PC_find_spike32_kernel_cl,3,sizeof(cl_float4)*optimal_block_x,NULL);//R: local memory allocation
    OCL_LOG_ERR("Setting kernel argument: (strip) PC_find_spike32_kernel_cl");
    err = clEnqueueNDRangeKernel(cq,PC_find_spike32_kernel_cl,
                                 2,
                                 NULL,globalThreads,
                                 localThreads,
                                 0,NULL, NULL);
    OCL_LOG_ERR("Enqueueing kernel: (strip) PC_find_spike32_kernel_cl"); // line 3171
    spike_logging_needed=true;//R: faster to read array than flag and array...
    SpikeSearchBinStart=0,SpikeSearchBinStop=NumBlockFfts;
#if GPU_TWINCHIRP

Remove that line and the App will work in Mountain Lion & BOINC. However, it will Not find any Spikes. On this task (among others) the spike count is Zero, http://setiathome.berkeley.edu/result.php?resultid=4010250229

So, is there any way to change that line and still find Spikes?
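
For what it's worth, the zero spike count is what you would expect if the enqueue call itself still fails and only the report is gone. Below is a minimal C++ sketch of that situation. Everything in it is a hypothetical stand-in (fake_enqueue_find_spikes, search_spikes, the work-group limit used as the failure trigger); it does not call OpenCL and is not the code in analyzeFuncs.cpp.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Same numeric value as the -54 in the log; defined here so the sketch needs no OpenCL headers.
constexpr int kInvalidWorkGroupSize = -54;
constexpr int kSuccess = 0;

// Hypothetical stand-in for the enqueue call. When the launch is rejected,
// the kernel never runs and the result buffer is left untouched.
int fake_enqueue_find_spikes(std::size_t local_size, std::size_t max_local_size,
                             std::vector<float>& results) {
    if (local_size > max_local_size) {
        return kInvalidWorkGroupSize;  // launch refused, nothing executed
    }
    results.assign(1, 24.5f);  // pretend the kernel found one spike
    return kSuccess;
}

// Runs one search. check_errors mimics keeping the error-logging line after
// the enqueue; with it removed, the failure is simply never reported.
std::size_t search_spikes(bool check_errors, std::size_t local_size,
                          std::size_t max_local_size, int* reported_error) {
    std::vector<float> results;
    const int err = fake_enqueue_find_spikes(local_size, max_local_size, results);
    *reported_error = check_errors ? err : kSuccess;
    return results.size();  // stays 0 whenever the launch was rejected
}
```

In this sketch a rejected launch yields zero spikes whether or not the check is present; the check only decides whether anyone hears about it. So the fix has to make the launch itself succeed, not silence the message.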
ID: 1649760
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1649837 - Posted: 6 Mar 2015, 5:11:28 UTC
Last modified: 6 Mar 2015, 6:10:33 UTC

This is strange. If you go here, https://setisvn.ssl.berkeley.edu/trac/browser/branches/sah_v7_opt?rev=2760 you can download what is supposed to be 2760. The main folder download throws another error when hitting "Zip Archive", but you can download the lower levels, i.e. branches_sah_v7_opt_AKv8-2760.zip etc. It would be nice if you could just download sah_v7_opt.

So, I place all the files in the proper locations, it builds without any trouble in Mountain Lion, I run it in standalone and receive;

23:23:09 (50228): Can't open init data file - running in standalone mode
23:23:09 (50228): Can't open init data file - running in standalone mode
Not using mb_cmdline.txt-file, using commandline options.
23:23:09 (50228): Can't open init data file - running in standalone mode
WARNING: init_data.xml missing
OpenCL platform detected: Apple
WARNING: BOINC supplied wrong platform!
Number of OpenCL devices found : 3 
BOINC assigns slot on device #0.
WARNING: BOINC failed to provide OpenCL device, using own enumeration abilities

Build features: SETI7 Non-graphics OpenCL USE_OPENCL_HD5xxx OCL_CHIRP3 FFTW SSE4.1 64bit 
 System: Darwin  x86_64  Kernel: 12.5.0
CPU : Intel(R) Xeon(R) CPU           E5462  @ 2.80GHz 
 GenuineIntel x86, Family 6 Model 23 Stepping 6
 Features : FPU VME DE PSE TSC MSR PAE MCE CX8 APIC SEP MTRR PGE MCA CMOV PAT PSE36 CLFSH DS ACPI MMX FXSR SSE SSE2 SS HTT TM PBE SSE3 DTES64 MON DSCPL VMX EST TM2 SSSE3 CX16 TPR PDCM SSE4.1

OpenCL-kernels filename : MultiBeam_Kernels_r2760.cl 
INFO: can't open binary kernel file: .//MultiBeam_Kernels_r2760.clHD5_ATIRadeonBartsXTPrototype.bin_V7_12.5.0_10, continue with recompile...
Info : Building Program (binary, clBuildProgram):main kernels: OK code 0
INFO: binary kernel file created
WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_524288_gr64_lr16_wg256_tw0_r2760.bin_12.5.0_10, continue with recompile...
WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_8_gr64_lr16_wg256_tw0_r2760.bin_12.5.0_10, continue with recompile...
WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_16_gr64_lr16_wg256_tw0_r2760.bin_12.5.0_10, continue with recompile...
WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_32_gr64_lr16_wg256_tw0_r2760.bin_12.5.0_10, continue with recompile...
WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_64_gr64_lr16_wg256_tw0_r2760.bin_12.5.0_10, continue with recompile...
WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_128_gr64_lr16_wg256_tw0_r2760.bin_12.5.0_10, continue with recompile...
WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_256_gr64_lr16_wg256_tw0_r2760.bin_12.5.0_10, continue with recompile...
WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_512_gr64_lr16_wg256_tw0_r2760.bin_12.5.0_10, continue with recompile...
WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_1024_gr64_lr16_wg256_tw0_r2760.bin_12.5.0_10, continue with recompile...
WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_2048_gr64_lr16_wg256_tw0_r2760.bin_12.5.0_10, continue with recompile...
WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_4096_gr64_lr16_wg256_tw0_r2760.bin_12.5.0_10, continue with recompile...
WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_8192_gr64_lr16_wg256_tw0_r2760.bin_12.5.0_10, continue with recompile...
WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_16384_gr64_lr16_wg256_tw0_r2760.bin_12.5.0_10, continue with recompile...
WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_32768_gr64_lr16_wg256_tw0_r2760.bin_12.5.0_10, continue with recompile...
WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_65536_gr64_lr16_wg256_tw0_r2760.bin_12.5.0_10, continue with recompile...
WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_131072_gr64_lr16_wg256_tw0_r2760.bin_12.5.0_10, continue with recompile...
ar=1.185405  NumCfft=101385  NumGauss=0  NumPulse=56257780816  NumTriplet=56257780816
Currently allocated 121 MB for GPU buffers
In v_BaseLineSmooth: NumDataPoints=1048576, BoxCarLength=8192, NumPointsInChunk=32768
OS X optimized S@H Enhanced application by Alex Kan
OpenCL version by Raistmer, r2760
AMD HD5 version by Raistmer
Number of OpenCL platforms:				 1
***************
Work Unit Info:
...............
Credit multiplier is :  2.85
WU true angle range is :  1.185405
Used GPU device parameters are:
	Number of compute units: 14
	Single buffer allocation size: 64MB
	Total device global memory: 1024MB
	max WG size: 1024
	local mem type: Real
period_iterations_num=23
ERROR: OpenCL kernel/call 'Enqueueing kernel: (strip) PC_find_spike32_kernel_cl' call failed (-54) in file analyzeFuncs.cpp near line 3029.
Waiting 30 sec before restart...



Yes, it must be a different version because line 3029 in this analyzeFuncs.cpp is the same line as 3171 in the current version. Just as before, it runs in standalone Mavericks;

00:36:58 (388): Can't open init data file - running in standalone mode
00:36:58 (388): Can't open init data file - running in standalone mode
Not using mb_cmdline.txt-file, using commandline options.
00:36:58 (388): Can't open init data file - running in standalone mode
WARNING: init_data.xml missing
OpenCL platform detected: Apple
WARNING: BOINC supplied wrong platform!
Number of OpenCL devices found : 3 
BOINC assigns slot on device #0.
WARNING: BOINC failed to provide OpenCL device, using own enumeration abilities

Build features: SETI7 Non-graphics OpenCL USE_OPENCL_HD5xxx OCL_CHIRP3 FFTW SSE4.1 64bit 
 System: Darwin  x86_64  Kernel: 13.4.0
CPU : Intel(R) Xeon(R) CPU           E5462  @ 2.80GHz 
 GenuineIntel x86, Family 6 Model 23 Stepping 6
 Features : FPU VME DE PSE TSC MSR PAE MCE CX8 APIC SEP MTRR PGE MCA CMOV PAT PSE36 CLFSH DS ACPI MMX FXSR SSE SSE2 SS HTT TM PBE SSE3 DTES64 MON DSCPL VMX EST TM2 SSSE3 CX16 TPR PDCM SSE4.1

OpenCL-kernels filename : MultiBeam_Kernels_r2760.cl 
INFO: can't open binary kernel file: .//MultiBeam_Kernels_r2760.clHD5_ATIRadeonBartsPROPrototype.bin_V7_13.4.0_12Aug17201420, continue with recompile...
Info : Building Program (binary, clBuildProgram):main kernels: OK code 0
INFO: binary kernel file created
WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsPROPrototype_524288_gr64_lr16_wg256_tw0_r2760.bin_13.4.0_12Aug17201420, continue with recompile...
WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsPROPrototype_8_gr64_lr16_wg256_tw0_r2760.bin_13.4.0_12Aug17201420, continue with recompile...
WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsPROPrototype_16_gr64_lr16_wg256_tw0_r2760.bin_13.4.0_12Aug17201420, continue with recompile...
WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsPROPrototype_32_gr64_lr16_wg256_tw0_r2760.bin_13.4.0_12Aug17201420, continue with recompile...
WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsPROPrototype_64_gr64_lr16_wg256_tw0_r2760.bin_13.4.0_12Aug17201420, continue with recompile...
WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsPROPrototype_128_gr64_lr16_wg256_tw0_r2760.bin_13.4.0_12Aug17201420, continue with recompile...
WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsPROPrototype_256_gr64_lr16_wg256_tw0_r2760.bin_13.4.0_12Aug17201420, continue with recompile...
WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsPROPrototype_512_gr64_lr16_wg256_tw0_r2760.bin_13.4.0_12Aug17201420, continue with recompile...
WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsPROPrototype_1024_gr64_lr16_wg256_tw0_r2760.bin_13.4.0_12Aug17201420, continue with recompile...
WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsPROPrototype_2048_gr64_lr16_wg256_tw0_r2760.bin_13.4.0_12Aug17201420, continue with recompile...
WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsPROPrototype_4096_gr64_lr16_wg256_tw0_r2760.bin_13.4.0_12Aug17201420, continue with recompile...
WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsPROPrototype_8192_gr64_lr16_wg256_tw0_r2760.bin_13.4.0_12Aug17201420, continue with recompile...
WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsPROPrototype_16384_gr64_lr16_wg256_tw0_r2760.bin_13.4.0_12Aug17201420, continue with recompile...
WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsPROPrototype_32768_gr64_lr16_wg256_tw0_r2760.bin_13.4.0_12Aug17201420, continue with recompile...
WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsPROPrototype_65536_gr64_lr16_wg256_tw0_r2760.bin_13.4.0_12Aug17201420, continue with recompile...
WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsPROPrototype_131072_gr64_lr16_wg256_tw0_r2760.bin_13.4.0_12Aug17201420, continue with recompile...
ar=1.185405  NumCfft=101385  NumGauss=0  NumPulse=56257780816  NumTriplet=56257780816
Currently allocated 121 MB for GPU buffers
In v_BaseLineSmooth: NumDataPoints=1048576, BoxCarLength=8192, NumPointsInChunk=32768
OS X optimized S@H Enhanced application by Alex Kan
OpenCL version by Raistmer, r2760
AMD HD5 version by Raistmer
***************
Work Unit Info:
...............
Credit multiplier is :  2.85
WU true angle range is :  1.185405
Used GPU device parameters are:
	Number of compute units: 14
	Single buffer allocation size: 64MB
	Total device global memory: 1024MB
	max WG size: 256
	local mem type: Real
period_iterations_num=23
Spike: peak=24.47872, time=83.89, d_freq=1421126615.33, chirp=0.2514, fft_len=64k
Spike: peak=24.07581, time=87.24, d_freq=1421126616.18, chirp=0.25232, fft_len=128k
Spike: peak=25.0097, time=87.24, d_freq=1421126616.19, chirp=0.25325, fft_len=128k
Spike: peak=25.16989, time=87.24, d_freq=1421126616.2, chirp=0.25417, fft_len=128k
Spike: peak=24.65972, time=83.89, d_freq=1421126615.35, chirp=0.2551, fft_len=64k
Spike: peak=24.59426, time=87.24, d_freq=1421126616.2, chirp=0.2551, fft_len=128k
Spike: peak=24.38238, time=6.711, d_freq=1421125858.53, chirp=-8.7435, fft_len=128k
Spike: peak=24.15315, time=6.711, d_freq=1421125858.53, chirp=-8.7444, fft_len=128k
Spike: peak=24.46849, time=6.711, d_freq=1421125858.55, chirp=-8.7527, fft_len=128k
Spike: peak=24.65364, time=6.711, d_freq=1421125858.54, chirp=-8.7537, fft_len=128k
Spike: peak=24.40533, time=6.711, d_freq=1421119059.91, chirp=-14.735, fft_len=128k
Spike: peak=24.47264, time=6.711, d_freq=1421119059.9, chirp=-14.735, fft_len=128k
Autocorr: peak=17.84166, time=87.24, delay=1.2622, d_freq=1421121197.62, chirp=-21.197, fft_len=128k
Autocorr: peak=18.20847, time=87.24, delay=1.2622, d_freq=1421121196.81, chirp=-21.206, fft_len=128k
Spike: peak=24.54893, time=104.9, d_freq=1421126487.7, chirp=71.456, fft_len=16k
Spike: peak=25.44987, time=104.9, d_freq=1421126487.83, chirp=71.634, fft_len=16k
Spike: peak=26.741, time=104.9, d_freq=1421126487.72, chirp=71.752, fft_len=16k
Spike: peak=25.83725, time=104.9, d_freq=1421126487.85, chirp=71.93, fft_len=16k
Spike: peak=26.78933, time=104.9, d_freq=1421126487.74, chirp=72.048, fft_len=16k
Spike: peak=24.39994, time=104.9, d_freq=1421126487.87, chirp=72.225, fft_len=16k
Spike: peak=24.67293, time=104.9, d_freq=1421126487.76, chirp=72.344, fft_len=16k

Best spike: peak=26.78933, time=104.9, d_freq=1421126487.74, chirp=72.048, fft_len=16k
Best autocorr: peak=18.20847, time=87.24, delay=1.2622, d_freq=1421121196.81, chirp=-21.206, fft_len=128k
Best gaussian: peak=0, mean=0, ChiSq=0, time=-2.122e+11, d_freq=0,
	score=-12, null_hyp=0, chirp=0, fft_len=0 
Best pulse: peak=4.527436, time=38.76, period=0.6554, d_freq=1421123960.85, score=0.9209, chirp=-92.558, fft_len=256 
Best triplet: peak=0, time=-2.122e+11, period=0, d_freq=0, chirp=0, fft_len=0 

Flopcounter: 14788174568041.134766

Spike count:    19
Autocorr count: 2
Pulse count:    0
Triplet count:  0
Gaussian count: 0
Time cpu in use since last restart: 66.2 seconds
GPU device sync requested...  ...GPU device synched
00:46:10 (388): called boinc_finish(0)

It even gives the right answer. Joe Fox was using Mavericks.
Just thinking out loud here...
Joe Fox; Unfortunately, it looks like I still get the same error. The only difference I see in the stderr is now local mem type is "Emulated" rather than "Real".
Urs Echternacht; "Emulated" is what i am working at currently.
I hate to be the odd man here but my Mac Pro says; local mem type: Real

What does this "Enqueueing kernel: (strip,neg) PC_find_spike32_kernel_cl" do actually? It kinda sounds like it might deal with memory; //R: local memory allocation
I recently discovered you can't go above unroll 16 in Mountain Lion because it handles GPU memory differently than Mavericks or Yosemite. So, all my builds will not work in Mountain Lion but work in Mavericks and Yosemite.

Do you think this could be a local memory allocation problem caused by building in Mountain Lion?
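
One way to test the local-memory idea is to work out how many bytes that fourth kernel argument actually requests. The snippet above passes sizeof(cl_float4)*optimal_block_x, and a cl_float4 is four 32-bit floats. Here is a rough C++ sketch of that arithmetic. The helper names are invented, the block sizes are hypothetical (the real optimal_block_x is not shown in this thread), and whether an oversized request would surface as error -54 on this driver is an open assumption.

```cpp
#include <cassert>
#include <cstddef>

// One cl_float4 element is four 32-bit floats, i.e. 16 bytes.
constexpr std::size_t kFloat4Bytes = 4 * sizeof(float);

// Bytes of local memory requested for a given block width, mirroring
// sizeof(cl_float4) * optimal_block_x from the snippet.
std::size_t local_bytes_requested(std::size_t block_x) {
    return kFloat4Bytes * block_x;
}

// True when the request fits in the device's local memory. The limit would be
// the device's CL_DEVICE_LOCAL_MEM_SIZE value, e.g. as printed by clinfo.
bool fits_in_local_memory(std::size_t block_x, std::size_t local_mem_size) {
    return local_bytes_requested(block_x) <= local_mem_size;
}
```

For example, a block width of 256 would request 4096 bytes, which fits easily in a 32768-byte local memory, while a width of 4096 would request 65536 bytes and would not. Comparing the real optimal_block_x against the GPU's reported local memory size would confirm or rule this out.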
ID: 1649837
Raistmer
Volunteer developer
Volunteer tester

Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1649945 - Posted: 6 Mar 2015, 13:42:38 UTC - in response to Message 1649837.  


What does this "Enqueueing kernel: (strip,neg) PC_find_spike32_kernel_cl" do actually? It kinda sounds like it might deal with memory; //R: local memory allocation


It does exactly what its name says: it finds spikes, one of the signal types SETI MultiBeam reports back as interesting.
And it does this using local/shared GPU memory inside the kernel, just as my comment says.

The error you keep reporting is:

#define CL_INVALID_WORK_GROUP_SIZE -54

So, the kernel launch got misconfigured somehow. One needs to print the real launch configuration for that call to see why it became misconfigured. But the fact that other builds show no such error makes me think the issue is local to your side. Post the whole clinfo output for a start. You stripped that part out of the stderr.
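
As a rough illustration of what printing and checking the launch configuration could look like, here is a small C++ sketch. It encodes two conditions under which the OpenCL 1.x specification has clEnqueueNDRangeKernel return CL_INVALID_WORK_GROUP_SIZE: the work-group (product of the local sizes) is larger than the maximum allowed, or a global size is not a multiple of the matching local size. The struct and function names are hypothetical and the sizes in the usage note are invented; the real globalThreads and localThreads values are not shown in this thread. Note also that the Mountain Lion log above reports "max WG size: 1024" while the Mavericks log reports 256 for the same card; whether that is related is not established here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// A 2-D launch configuration, as passed to the enqueue call (work_dim = 2).
struct Launch2D {
    std::size_t global[2];
    std::size_t local[2];
};

// Returns an empty string when the configuration passes both checks,
// otherwise a short description of the first problem found.
std::string check_launch(const Launch2D& cfg, std::size_t max_wg_size) {
    const std::size_t wg = cfg.local[0] * cfg.local[1];
    if (wg == 0) {
        return "local size has a zero dimension";
    }
    if (wg > max_wg_size) {
        return "work-group size exceeds the maximum";
    }
    for (int d = 0; d < 2; ++d) {
        if (cfg.global[d] % cfg.local[d] != 0) {
            return "global size is not a multiple of local size";
        }
    }
    return "";
}

// Dumps the configuration to stderr so it ends up next to the error message.
void print_launch(const Launch2D& cfg, std::size_t max_wg_size) {
    std::fprintf(stderr, "global=%zux%zu local=%zux%zu max WG size=%zu\n",
                 cfg.global[0], cfg.global[1],
                 cfg.local[0], cfg.local[1], max_wg_size);
}
```

A made-up configuration with local sizes 64x8 (512 work-items) passes check_launch against a limit of 1024 but fails against 256. That is the kind of mismatch a printout of the real values would expose, if the limit the app assumes differs from what the device accepts.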
ID: 1649945
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1649983 - Posted: 6 Mar 2015, 16:34:52 UTC - in response to Message 1649945.  
Last modified: 6 Mar 2015, 16:57:20 UTC

Well, this problem has appeared on Two separate Mountain Lion Systems, and recently survived Three OS reinstalls on the current system. Hard to imagine it could survive all that. I found how to run clinfo on a Mac from here; https://groups.google.com/d/msg/aparapi-discuss/ryA8IvjI0AU/Mz_H8C7aV6UJ

TomsMacPro:~ Tom$ ./clinfo
Found 1 platform(s).
platform[0x7fff0000]: profile: FULL_PROFILE
platform[0x7fff0000]: version: OpenCL 1.2 (Apr 25 2013 18:32:06)
platform[0x7fff0000]: name: Apple
platform[0x7fff0000]: vendor: Apple
platform[0x7fff0000]: extensions: cl_APPLE_SetMemObjectDestructor cl_APPLE_ContextLoggingFunctions cl_APPLE_clut cl_APPLE_query_kernel_names cl_APPLE_gl_sharing cl_khr_gl_event
platform[0x7fff0000]: Found 4 device(s).
device[0xffffffff]: NAME: Intel(R) Xeon(R) CPU E5462 @ 2.80GHz
device[0xffffffff]: VENDOR: Intel
device[0xffffffff]: PROFILE: FULL_PROFILE
device[0xffffffff]: VERSION: OpenCL 1.2
device[0xffffffff]: EXTENSIONS: cl_APPLE_SetMemObjectDestructor cl_APPLE_ContextLoggingFunctions cl_APPLE_clut cl_APPLE_query_kernel_names cl_APPLE_gl_sharing cl_khr_gl_event cl_khr_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_APPLE_fp64_basic_ops cl_APPLE_fixed_alpha_channel_orders cl_APPLE_biased_fixed_point_image_formats
device[0xffffffff]: DRIVER_VERSION: 1.1

device[0xffffffff]: Type: CPU
device[0xffffffff]: EXECUTION_CAPABILITIES: Kernel Native
device[0xffffffff]: GLOBAL_MEM_CACHE_TYPE: Read-Write (2)
device[0xffffffff]: CL_DEVICE_LOCAL_MEM_TYPE: Global (2)
device[0xffffffff]: SINGLE_FP_CONFIG: 0xbf
device[0xffffffff]: QUEUE_PROPERTIES: 0x2

device[0xffffffff]: VENDOR_ID: 4294967295
device[0xffffffff]: MAX_COMPUTE_UNITS: 4
device[0xffffffff]: MAX_WORK_ITEM_DIMENSIONS: 3
device[0xffffffff]: MAX_WORK_GROUP_SIZE: 1024
device[0xffffffff]: PREFERRED_VECTOR_WIDTH_CHAR: 16
device[0xffffffff]: PREFERRED_VECTOR_WIDTH_SHORT: 8
device[0xffffffff]: PREFERRED_VECTOR_WIDTH_INT: 4
device[0xffffffff]: PREFERRED_VECTOR_WIDTH_LONG: 2
device[0xffffffff]: PREFERRED_VECTOR_WIDTH_FLOAT: 4
device[0xffffffff]: PREFERRED_VECTOR_WIDTH_DOUBLE: 2
device[0xffffffff]: MAX_CLOCK_FREQUENCY: 2800
device[0xffffffff]: ADDRESS_BITS: 64
device[0xffffffff]: MAX_MEM_ALLOC_SIZE: 1610612736
device[0xffffffff]: IMAGE_SUPPORT: 1
device[0xffffffff]: MAX_READ_IMAGE_ARGS: 128
device[0xffffffff]: MAX_WRITE_IMAGE_ARGS: 8
device[0xffffffff]: IMAGE2D_MAX_WIDTH: 8192
device[0xffffffff]: IMAGE2D_MAX_HEIGHT: 8192
device[0xffffffff]: IMAGE3D_MAX_WIDTH: 2048
device[0xffffffff]: IMAGE3D_MAX_HEIGHT: 2048
device[0xffffffff]: IMAGE3D_MAX_DEPTH: 2048
device[0xffffffff]: MAX_SAMPLERS: 16
device[0xffffffff]: MAX_PARAMETER_SIZE: 4096
device[0xffffffff]: MEM_BASE_ADDR_ALIGN: 1024
device[0xffffffff]: MIN_DATA_TYPE_ALIGN_SIZE: 128
device[0xffffffff]: GLOBAL_MEM_CACHELINE_SIZE: 6291456
device[0xffffffff]: GLOBAL_MEM_CACHE_SIZE: 64
device[0xffffffff]: GLOBAL_MEM_SIZE: 6442450944
device[0xffffffff]: MAX_CONSTANT_BUFFER_SIZE: 65536
device[0xffffffff]: MAX_CONSTANT_ARGS: 8
device[0xffffffff]: LOCAL_MEM_SIZE: 32768
device[0xffffffff]: ERROR_CORRECTION_SUPPORT: 0
device[0xffffffff]: PROFILING_TIMER_RESOLUTION: 1
device[0xffffffff]: ENDIAN_LITTLE: 1
device[0xffffffff]: AVAILABLE: 1
device[0xffffffff]: COMPILER_AVAILABLE: 1
device[0x1021b00]: NAME: ATI Radeon Barts PRO Prototype
device[0x1021b00]: VENDOR: AMD
device[0x1021b00]: PROFILE: FULL_PROFILE
device[0x1021b00]: VERSION: OpenCL 1.1
device[0x1021b00]: EXTENSIONS: cl_APPLE_SetMemObjectDestructor cl_APPLE_ContextLoggingFunctions cl_APPLE_clut cl_APPLE_query_kernel_names cl_APPLE_gl_sharing cl_khr_gl_event cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store
device[0x1021b00]: DRIVER_VERSION: 1.0

device[0x1021b00]: Type: GPU
device[0x1021b00]: EXECUTION_CAPABILITIES: Kernel
device[0x1021b00]: GLOBAL_MEM_CACHE_TYPE: None (0)
device[0x1021b00]: CL_DEVICE_LOCAL_MEM_TYPE: Local (1)
device[0x1021b00]: SINGLE_FP_CONFIG: 0x1e
device[0x1021b00]: QUEUE_PROPERTIES: 0x2

device[0x1021b00]: VENDOR_ID: 16915200
device[0x1021b00]: MAX_COMPUTE_UNITS: 14
device[0x1021b00]: MAX_WORK_ITEM_DIMENSIONS: 3
device[0x1021b00]: MAX_WORK_GROUP_SIZE: 1024
device[0x1021b00]: PREFERRED_VECTOR_WIDTH_CHAR: 16
device[0x1021b00]: PREFERRED_VECTOR_WIDTH_SHORT: 8
device[0x1021b00]: PREFERRED_VECTOR_WIDTH_INT: 4
device[0x1021b00]: PREFERRED_VECTOR_WIDTH_LONG: 2
device[0x1021b00]: PREFERRED_VECTOR_WIDTH_FLOAT: 4
device[0x1021b00]: PREFERRED_VECTOR_WIDTH_DOUBLE: 0
device[0x1021b00]: MAX_CLOCK_FREQUENCY: 775
device[0x1021b00]: ADDRESS_BITS: 32
device[0x1021b00]: MAX_MEM_ALLOC_SIZE: 268435456
device[0x1021b00]: IMAGE_SUPPORT: 1
device[0x1021b00]: MAX_READ_IMAGE_ARGS: 128
device[0x1021b00]: MAX_WRITE_IMAGE_ARGS: 8
device[0x1021b00]: IMAGE2D_MAX_WIDTH: 8192
device[0x1021b00]: IMAGE2D_MAX_HEIGHT: 8192
device[0x1021b00]: IMAGE3D_MAX_WIDTH: 2048
device[0x1021b00]: IMAGE3D_MAX_HEIGHT: 2048
device[0x1021b00]: IMAGE3D_MAX_DEPTH: 2048
device[0x1021b00]: MAX_SAMPLERS: 16
device[0x1021b00]: MAX_PARAMETER_SIZE: 1024
device[0x1021b00]: MEM_BASE_ADDR_ALIGN: 32768
device[0x1021b00]: MIN_DATA_TYPE_ALIGN_SIZE: 128
device[0x1021b00]: GLOBAL_MEM_CACHELINE_SIZE: 0
device[0x1021b00]: GLOBAL_MEM_CACHE_SIZE: 0
device[0x1021b00]: GLOBAL_MEM_SIZE: 1073741824
device[0x1021b00]: MAX_CONSTANT_BUFFER_SIZE: 65536
device[0x1021b00]: MAX_CONSTANT_ARGS: 8
device[0x1021b00]: LOCAL_MEM_SIZE: 32768
device[0x1021b00]: ERROR_CORRECTION_SUPPORT: 0
device[0x1021b00]: PROFILING_TIMER_RESOLUTION: 37
device[0x1021b00]: ENDIAN_LITTLE: 1
device[0x1021b00]: AVAILABLE: 1
device[0x1021b00]: COMPILER_AVAILABLE: 1
device[0x2021b00]: NAME: ATI Radeon Barts XT Prototype
device[0x2021b00]: VENDOR: AMD
device[0x2021b00]: PROFILE: FULL_PROFILE
device[0x2021b00]: VERSION: OpenCL 1.1
device[0x2021b00]: EXTENSIONS: cl_APPLE_SetMemObjectDestructor cl_APPLE_ContextLoggingFunctions cl_APPLE_clut cl_APPLE_query_kernel_names cl_APPLE_gl_sharing cl_khr_gl_event cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store
device[0x2021b00]: DRIVER_VERSION: 1.0

device[0x2021b00]: Type: GPU
device[0x2021b00]: EXECUTION_CAPABILITIES: Kernel
device[0x2021b00]: GLOBAL_MEM_CACHE_TYPE: None (0)
device[0x2021b00]: CL_DEVICE_LOCAL_MEM_TYPE: Local (1)
device[0x2021b00]: SINGLE_FP_CONFIG: 0x1e
device[0x2021b00]: QUEUE_PROPERTIES: 0x2

device[0x2021b00]: VENDOR_ID: 33692416
device[0x2021b00]: MAX_COMPUTE_UNITS: 14
device[0x2021b00]: MAX_WORK_ITEM_DIMENSIONS: 3
device[0x2021b00]: MAX_WORK_GROUP_SIZE: 1024
device[0x2021b00]: PREFERRED_VECTOR_WIDTH_CHAR: 16
device[0x2021b00]: PREFERRED_VECTOR_WIDTH_SHORT: 8
device[0x2021b00]: PREFERRED_VECTOR_WIDTH_INT: 4
device[0x2021b00]: PREFERRED_VECTOR_WIDTH_LONG: 2
device[0x2021b00]: PREFERRED_VECTOR_WIDTH_FLOAT: 4
device[0x2021b00]: PREFERRED_VECTOR_WIDTH_DOUBLE: 0
device[0x2021b00]: MAX_CLOCK_FREQUENCY: 900
device[0x2021b00]: ADDRESS_BITS: 32
device[0x2021b00]: MAX_MEM_ALLOC_SIZE: 268435456
device[0x2021b00]: IMAGE_SUPPORT: 1
device[0x2021b00]: MAX_READ_IMAGE_ARGS: 128
device[0x2021b00]: MAX_WRITE_IMAGE_ARGS: 8
device[0x2021b00]: IMAGE2D_MAX_WIDTH: 8192
device[0x2021b00]: IMAGE2D_MAX_HEIGHT: 8192
device[0x2021b00]: IMAGE3D_MAX_WIDTH: 2048
device[0x2021b00]: IMAGE3D_MAX_HEIGHT: 2048
device[0x2021b00]: IMAGE3D_MAX_DEPTH: 2048
device[0x2021b00]: MAX_SAMPLERS: 16
device[0x2021b00]: MAX_PARAMETER_SIZE: 1024
device[0x2021b00]: MEM_BASE_ADDR_ALIGN: 32768
device[0x2021b00]: MIN_DATA_TYPE_ALIGN_SIZE: 128
device[0x2021b00]: GLOBAL_MEM_CACHELINE_SIZE: 0
device[0x2021b00]: GLOBAL_MEM_CACHE_SIZE: 0
device[0x2021b00]: GLOBAL_MEM_SIZE: 1073741824
device[0x2021b00]: MAX_CONSTANT_BUFFER_SIZE: 65536
device[0x2021b00]: MAX_CONSTANT_ARGS: 8
device[0x2021b00]: LOCAL_MEM_SIZE: 32768
device[0x2021b00]: ERROR_CORRECTION_SUPPORT: 0
device[0x2021b00]: PROFILING_TIMER_RESOLUTION: 37
device[0x2021b00]: ENDIAN_LITTLE: 1
device[0x2021b00]: AVAILABLE: 1
device[0x2021b00]: COMPILER_AVAILABLE: 1
device[0x3021b00]: NAME: ATI Radeon Barts PRO Prototype
device[0x3021b00]: VENDOR: AMD
device[0x3021b00]: PROFILE: FULL_PROFILE
device[0x3021b00]: VERSION: OpenCL 1.1
device[0x3021b00]: EXTENSIONS: cl_APPLE_SetMemObjectDestructor cl_APPLE_ContextLoggingFunctions cl_APPLE_clut cl_APPLE_query_kernel_names cl_APPLE_gl_sharing cl_khr_gl_event cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store
device[0x3021b00]: DRIVER_VERSION: 1.0

device[0x3021b00]: Type: GPU
device[0x3021b00]: EXECUTION_CAPABILITIES: Kernel
device[0x3021b00]: GLOBAL_MEM_CACHE_TYPE: None (0)
device[0x3021b00]: CL_DEVICE_LOCAL_MEM_TYPE: Local (1)
device[0x3021b00]: SINGLE_FP_CONFIG: 0x1e
device[0x3021b00]: QUEUE_PROPERTIES: 0x2

device[0x3021b00]: VENDOR_ID: 50469632
device[0x3021b00]: MAX_COMPUTE_UNITS: 14
device[0x3021b00]: MAX_WORK_ITEM_DIMENSIONS: 3
device[0x3021b00]: MAX_WORK_GROUP_SIZE: 1024
device[0x3021b00]: PREFERRED_VECTOR_WIDTH_CHAR: 16
device[0x3021b00]: PREFERRED_VECTOR_WIDTH_SHORT: 8
device[0x3021b00]: PREFERRED_VECTOR_WIDTH_INT: 4
device[0x3021b00]: PREFERRED_VECTOR_WIDTH_LONG: 2
device[0x3021b00]: PREFERRED_VECTOR_WIDTH_FLOAT: 4
device[0x3021b00]: PREFERRED_VECTOR_WIDTH_DOUBLE: 0
device[0x3021b00]: MAX_CLOCK_FREQUENCY: 775
device[0x3021b00]: ADDRESS_BITS: 32
device[0x3021b00]: MAX_MEM_ALLOC_SIZE: 268435456
device[0x3021b00]: IMAGE_SUPPORT: 1
device[0x3021b00]: MAX_READ_IMAGE_ARGS: 128
device[0x3021b00]: MAX_WRITE_IMAGE_ARGS: 8
device[0x3021b00]: IMAGE2D_MAX_WIDTH: 8192
device[0x3021b00]: IMAGE2D_MAX_HEIGHT: 8192
device[0x3021b00]: IMAGE3D_MAX_WIDTH: 2048
device[0x3021b00]: IMAGE3D_MAX_HEIGHT: 2048
device[0x3021b00]: IMAGE3D_MAX_DEPTH: 2048
device[0x3021b00]: MAX_SAMPLERS: 16
device[0x3021b00]: MAX_PARAMETER_SIZE: 1024
device[0x3021b00]: MEM_BASE_ADDR_ALIGN: 32768
device[0x3021b00]: MIN_DATA_TYPE_ALIGN_SIZE: 128
device[0x3021b00]: GLOBAL_MEM_CACHELINE_SIZE: 0
device[0x3021b00]: GLOBAL_MEM_CACHE_SIZE: 0
device[0x3021b00]: GLOBAL_MEM_SIZE: 1073741824
device[0x3021b00]: MAX_CONSTANT_BUFFER_SIZE: 65536
device[0x3021b00]: MAX_CONSTANT_ARGS: 8
device[0x3021b00]: LOCAL_MEM_SIZE: 32768
device[0x3021b00]: ERROR_CORRECTION_SUPPORT: 0
device[0x3021b00]: PROFILING_TIMER_RESOLUTION: 37
device[0x3021b00]: ENDIAN_LITTLE: 1
device[0x3021b00]: AVAILABLE: 1
device[0x3021b00]: COMPILER_AVAILABLE: 1


I also tried to use my Mavericks system, the system that was the second Mountain Lion system, to build revision 2760 on my Mac Pro. It failed the same way it fails in Yosemite. As best I can tell, it has a problem building the source files:
Undefined symbols for architecture x86_64:
  "boinc_resolve_filename_s(char const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >&)", referenced from:
      _main in seti_boinc-main.o
      seti_init_state() in seti_boinc-seti.o
      worker() in seti_boinc-worker.o
  "std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >::find(char const*, unsigned long, unsigned long) const", referenced from:
      std::__1::vector<unsigned char, std::__1::allocator<unsigned char> > xml_decode_field<unsigned char>(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, char const*) in seti_boinc-seti.o
      coordinate_t::parse_xml(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, char const*) in seti_boinc-schema_master.o
      chirp_parameter_t::parse_xml(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, char const*) in seti_boinc-schema_master.o
      subband_description_t::parse_xml(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, char const*) in seti_boinc-schema_master.o
      data_description_t::parse_xml(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, char const*) in seti_boinc-schema_master.o
      receiver_config::parse_xml(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, char const*) in seti_boinc-schema_master.o
      recorder_config::parse_xml(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, char const*) in seti_boinc-schema_master.o
      ...
  "std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >::~basic_string()", referenced from:
      _main in seti_boinc-main.o
      seti_error::~seti_error() in seti_boinc-main.o
      seti_analyze(ANALYSIS_STATE&) in seti_boinc-analyzeFuncs.o
      seti_error::~seti_error() in seti_boinc-analyzeFuncs.o
      result_spike(SPIKE_INFO&) in seti_boinc-analyzeReport.o
      result_autocorr(AUTOCORR_INFO&) in seti_boinc-analyzeReport.o
      result_gaussian(GAUSS_INFO&) in seti_boinc-analyzeReport.o
      ...

The only system that will compile it is Mountain Lion... on my Mac Pro.
ID: 1649983
jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1649988 - Posted: 6 Mar 2015, 17:08:55 UTC - in response to Message 1649983.  
Last modified: 6 Mar 2015, 17:20:42 UTC

Curious, and most definitely not my area. Why does Apple only have OpenCL 1.0 drivers in there for a 1.1-capable device? [Edit: and 1.1 for a 1.2...]

[Edit2:] Another question in ignorance. These are on the same machine, with a card and CPU, right? When did FreeBSD/Darwin/OS X start having multiple video driver stacks, if that's what's going on there? Can a display be used on both simultaneously? It wouldn't surprise me if there is some installation-order juggling needed there, like when mixing Intel CPU, AMD and NV.
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1649988
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1650000 - Posted: 6 Mar 2015, 17:43:17 UTC - in response to Message 1649988.  
Last modified: 6 Mar 2015, 17:53:35 UTC

Curious, and most definitely not my area. Why does Apple only have OpenCL 1.0 drivers in there for a 1.1-capable device? [Edit: and 1.1 for a 1.2...]

[Edit2:] Another question in ignorance. These are on the same machine, with a card and CPU, right? When did FreeBSD/Darwin/OS X start having multiple video driver stacks, if that's what's going on there? Can a display be used on both simultaneously? It wouldn't surprise me if there is some installation-order juggling needed there, like when mixing Intel CPU, AMD and NV.

The way I read it, the driver (platform) version is OpenCL 1.2 and the device version is OpenCL 1.1. These are two ATI 6850s and a 6870; back then there wasn't an OpenCL 1.2. The DRIVER_VERSION: 1.0 is an Apple designation; note it doesn't say OpenCL...

This is the Machine;
http://setiathome.berkeley.edu/show_host_detail.php?hostid=6796479

I believe you're going to find the problem is that 10.8 doesn't have virtual GPU memory, where 10.9 and 10.10 do. I get the 'call failed (-54)' error in 10.8; I don't get the error in 10.9 or 10.10. Somehow, the way the app is compiled in 10.8 doesn't agree with BOINC either. The GPU app compiled in 10.8 fails immediately in BOINC, as noted previously, with "uncaught exception of type std::logic_error: basic_string::_S_construct NULL not valid".
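
For reference, the numeric status in that message can be decoded against the Khronos `CL/cl.h` header, where -54 is `CL_INVALID_WORK_GROUP_SIZE`. The lookup below is only a minimal sketch: the table is a small hand-picked subset of status codes, and `cl_error_name` is a hypothetical helper name, not something taken from the SETI app source.

```python
# Subset of OpenCL status codes, with values as defined in the Khronos CL/cl.h header.
# This table is deliberately incomplete; it only covers codes close to the one reported.
CL_ERROR_NAMES = {
    0: "CL_SUCCESS",
    -4: "CL_MEM_OBJECT_ALLOCATION_FAILURE",
    -5: "CL_OUT_OF_RESOURCES",
    -52: "CL_INVALID_KERNEL_ARGS",
    -53: "CL_INVALID_WORK_DIMENSION",
    -54: "CL_INVALID_WORK_GROUP_SIZE",
    -55: "CL_INVALID_WORK_ITEM_SIZE",
}


def cl_error_name(code: int) -> str:
    """Translate a numeric OpenCL status into its symbolic name (hypothetical helper)."""
    return CL_ERROR_NAMES.get(code, f"unknown OpenCL status ({code})")


print(cl_error_name(-54))  # CL_INVALID_WORK_GROUP_SIZE
```

So the code itself points at the work-group size passed when enqueueing the kernel, whatever the underlying reason turns out to be.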
ID: 1650000
jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1650004 - Posted: 6 Mar 2015, 17:53:23 UTC - in response to Message 1650000.  

Do other generic OpenCL test programs (ones that let you select the device) run correctly on each of those devices? And also at the same time?
ID: 1650004
jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1650007 - Posted: 6 Mar 2015, 18:00:16 UTC - in response to Message 1650000.  

I believe you're going to find the problem is that 10.8 doesn't have virtual GPU memory, where 10.9 and 10.10 do. I get the 'call failed (-54)' error in 10.8; I don't get the error in 10.9 or 10.10. Somehow, the way the app is compiled in 10.8 doesn't agree with BOINC either. The GPU app compiled in 10.8 fails immediately in BOINC, as noted previously, with "uncaught exception of type std::logic_error: basic_string::_S_construct NULL not valid".


Hmmm, sounds like driver model changes, something like MS's continual transition.
ID: 1650007
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1650009 - Posted: 6 Mar 2015, 18:00:46 UTC - in response to Message 1650004.  

LuxMark doesn't have any trouble running the GPUs. The machine has been working AstroPulses without any problems since there were APs for a Mac. In fact, it spent a good part of last summer on the front page of the Top Computers list...
ID: 1650009
Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1650096 - Posted: 6 Mar 2015, 21:45:40 UTC

What is different from Windows is the workgroup size.
Apple allows 1024 while AMD allows only 256 for its own cards....
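
To make the mismatch concrete, here is a minimal sketch in Python of how host code could cap the local size at what it was tuned for instead of taking the device maximum at face value. `pick_launch_sizes`, `app_cap` and `kernel_max_wg` are hypothetical names and nothing here is from the actual app. In real OpenCL host code the per-kernel limit comes from `clGetKernelWorkGroupInfo` with `CL_KERNEL_WORK_GROUP_SIZE`, and it can be lower than the device-wide `MAX_WORK_GROUP_SIZE` that clinfo prints.

```python
def pick_launch_sizes(n_items, device_max_wg, kernel_max_wg=None, app_cap=256):
    """Return (global_size, local_size) for a 1-D kernel launch.

    device_max_wg : device-wide limit (1024 on Apple's stack, 256 on AMD's own drivers)
    kernel_max_wg : optional per-kernel limit, if the host queried one
    app_cap       : assumed size the kernel code was tuned for (256 is an assumption)
    """
    limits = [device_max_wg, app_cap]
    if kernel_max_wg is not None:
        limits.append(kernel_max_wg)
    local = min(limits)
    # OpenCL 1.x wants the global size to be a multiple of the local size,
    # so round the item count up to the next multiple.
    global_size = -(-n_items // local) * local
    return global_size, local


print(pick_launch_sizes(1000, 1024))                     # (1024, 256)
print(pick_launch_sizes(1000, 1024, kernel_max_wg=128))  # (1024, 128)
```

With this shape the same kernel gets the same local size on both platforms, and the extra padded work-items have to be bounds-checked inside the kernel.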
ID: 1650096
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1650110 - Posted: 6 Mar 2015, 22:22:02 UTC - in response to Message 1650096.  

What is different from Windows is the workgroup size.
Apple allows 1024 while AMD allows only 256 for its own cards....

So, how can we fix that? My newly revitalized 10.8.5 can crank out a nicely working App in about 5 minutes. It would be nice to pass this hurdle and move on. My MBv7 CPU App is doing very nicely, http://setiathome.berkeley.edu/result.php?resultid=4011766589. Next I'd like to recover that missing 90 minutes that was lost between APv6 and APv7. My machine completed an Unblanked APv6 CPU task in 8 hours, now an Unblanked APv7 CPU task takes 9.5 hours. Something wrong there.
;-)
ID: 1650110
Chris Adamek
Volunteer tester
Joined: 15 May 99
Posts: 251
Credit: 434,772,072
RAC: 236
United States
Message 1650123 - Posted: 6 Mar 2015, 22:33:00 UTC - in response to Message 1650000.  

Curious, and most definitely not my area. Why does Apple only have OpenCL 1.0 drivers in there for a 1.1-capable device? [Edit: and 1.1 for a 1.2...]

[Edit2:] Another question in ignorance. These are on the same machine, with a card and CPU, right? When did FreeBSD/Darwin/OS X start having multiple video driver stacks, if that's what's going on there? Can a display be used on both simultaneously? It wouldn't surprise me if there is some installation-order juggling needed there, like when mixing Intel CPU, AMD and NV.

The way I read it, the driver (platform) version is OpenCL 1.2 and the device version is OpenCL 1.1. These are two ATI 6850s and a 6870; back then there wasn't an OpenCL 1.2. The DRIVER_VERSION: 1.0 is an Apple designation; note it doesn't say OpenCL...


OpenCL 1.2 didn't exist in OS X until Mavericks. Hence why iGPUs didn't show up in BOINC on Macs until Mavericks rolled out... I know zilch about what y'all are talking about, but are any of the OpenCL functions being used 1.2-specific? That might be why things are dying in pre-Mavericks versions...
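
The version strings in the clinfo dump earlier in the thread can be compared mechanically. A minimal sketch, assuming the strings follow the `OpenCL <major>.<minor> <vendor-specific info>` layout that the OpenCL spec prescribes; `parse_cl_version` is a hypothetical helper, and the two sample strings are copied from that clinfo output.

```python
import re


def parse_cl_version(text):
    """Extract (major, minor) from an OpenCL version string (hypothetical helper)."""
    match = re.match(r"OpenCL (\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"not an OpenCL version string: {text!r}")
    return int(match.group(1)), int(match.group(2))


platform_version = parse_cl_version("OpenCL 1.2 (Apr 25 2013 18:32:06)")
gpu_version = parse_cl_version("OpenCL 1.1")

# Device code can only rely on the lower of the two versions.
usable = min(platform_version, gpu_version)
print(usable)  # (1, 1)
```

On these cards that comes out as 1.1, so anything 1.2-specific would only be safe on the CPU device.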

Chris
ID: 1650123
Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1650284 - Posted: 7 Mar 2015, 11:45:48 UTC - in response to Message 1650110.  

What is different from Windows is the workgroup size.
Apple allows 1024 while AMD allows only 256 for its own cards....

So, how can we fix that?

Only by deceiving the rest of the app with different values. I'm not sure it should be done though; on NV the app usually deals with a 1024 WG OK.
Anyway, Urs should have encountered the same situation when he did his port to OS X. Because his build works, and works almost OK, he has already overcome this issue, if there is any issue here. And AFAIK all his changes are already committed to Berkeley's tree. Hence, you downloaded source code that already works on OS X. Taking all this into account, I would concentrate on the differences versus Urs's build and why they happen, not on Apple's workgroup size per se.
BTW, you mentioned "emulated" instead of "real" memory type. Quite possibly it's his workaround for Apple: making the app think it has emulated shared memory so that it doesn't use it much.
ID: 1650284
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1650350 - Posted: 7 Mar 2015, 17:05:45 UTC - in response to Message 1650284.  
Last modified: 7 Mar 2015, 17:18:48 UTC

What is different from Windows is the workgroup size.
Apple allows 1024 while AMD allows only 256 for its own cards....

So, how can we fix that?

Only by deceiving the rest of the app with different values. I'm not sure it should be done though; on NV the app usually deals with a 1024 WG OK.
Anyway, Urs should have encountered the same situation when he did his port to OS X. Because his build works, and works almost OK, he has already overcome this issue, if there is any issue here. And AFAIK all his changes are already committed to Berkeley's tree. Hence, you downloaded source code that already works on OS X. Taking all this into account, I would concentrate on the differences versus Urs's build and why they happen, not on Apple's workgroup size per se.
BTW, you mentioned "emulated" instead of "real" memory type. Quite possibly it's his workaround for Apple: making the app think it has emulated shared memory so that it doesn't use it much.

That's not exactly what I was hoping to hear. Yes, nVidia uses a WG size of 1024, so there shouldn't be a problem, yet there obviously is a problem. Urs and Joe are using a Mobility GPU, and Joe wasn't even using Mountain Lion; I'm using a Mac Pro with real video cards and real video memory. There are obviously other differences with Urs's machine, as some of the configure options he recommends simply do not work on my Mac Pro. If I try those options, configure stops and complains about the compiler not working. The apps that compile on my machine also work OK, and seem to work a little better, with the exception of the WG size error. If the only problem is the code not expecting a larger GPU WG, then allowing a larger GPU WG on Apple GPUs would seem to be the fix. I have been running the compiled GPU apps in standalone for around a week and they work perfectly OK on my Mac Pro in Mavericks and Yosemite. I have been reporting the same error for over a week; you would think that if it was a recognized error someone would have mentioned it. The Mac GPU apps work fine when compiled on my Mac Pro with real GPUs. Removing the error message should be the objective, not performing some trick to avoid some erroneous error message.
ID: 1650350
Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1650387 - Posted: 7 Mar 2015, 18:32:09 UTC - in response to Message 1650350.  

Does Urs' binary give that error when running on your machine? Does it run at all?
ID: 1650387
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.