The Highest Ranked SETI AMD Host is a MAC: Time for a STOCK MAC APP?

TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1651465 - Posted: 10 Mar 2015, 22:45:34 UTC - in response to Message 1651461.  

That's odd. The version number has gone up by 2, from 2867 to 2869 yet there aren't any changes listed...

Don't worry about that.

setisvn is a single, common repository used for the safekeeping of all sorts of files relevant to SETI at Berkeley. 2868 and 2869 were two modifications to a paper being prepared for publication by a visiting researcher - not even computer source code.

If you're going to take this project further, I do suggest that you take time to understand what an SVN repository is and, at the most elementary level (which is all I've bothered with), how to use it.

So you're hinting that there is something wrong with downloading the latest files from Berkeley's own site? Is there a problem with downloading from here? https://setisvn.ssl.berkeley.edu/trac/browser/branches/sah_v7_opt BTW, if the entire folder doesn't download, all you have to do is drop down one level and download the folder contents. Tracking down and installing yet more repetitive third-party software is not something some people desire. I just spent days eliminating a faulty install of FFTW, so I'd like to refrain from installing anything not absolutely needed.
ID: 1651465
Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1651469 - Posted: 10 Mar 2015, 22:58:12 UTC - in response to Message 1651465.  
Last modified: 10 Mar 2015, 23:22:34 UTC

That's odd. The version number has gone up by 2, from 2867 to 2869 yet there aren't any changes listed...

Don't worry about that.

setisvn is a single, common repository used for the safekeeping of all sorts of files relevant to SETI at Berkeley. 2868 and 2869 were two modifications to a paper being prepared for publication by a visiting researcher - not even computer source code.

If you're going to take this project further, I do suggest that you take time to understand what an SVN repository is and, at the most elementary level (which is all I've bothered with), how to use it.

So you're hinting that there is something wrong with downloading the latest files from Berkeley's own site? Is there a problem with downloading from here? https://setisvn.ssl.berkeley.edu/trac/browser/branches/sah_v7_opt BTW, if the entire folder doesn't download, all you have to do is drop down one level and download the folder contents. Tracking down and installing yet more repetitive third-party software is not something some people desire. I just spent days eliminating a faulty install of FFTW, so I'd like to refrain from installing anything not absolutely needed.

SVN is just a mechanism for keeping a set of resource files up-to-date on your computer. It doesn't "install" anything - what you do with the resource files after updating them is entirely up to you.

Once you have the basic file-set replicated on your computer (which, admittedly, is a big download), only changes (differences) need to be downloaded, which is much, much quicker than downloading the whole darn kit'n'caboodle from scratch every time.

And then you can move on to stage 2, which I've barely scratched the surface of myself. For example, you can find out who changed what, and when. You can compare the previous version with the current version and, if you're lucky, see why the author decided to make that particular change (that does rely on the author remembering to explain the change clearly, though - not all do, or at least not everyone does it every time).

For further details (my bedtime looms), try reading http://en.wikipedia.org/wiki/Apache_Subversion.

Edit - on second thoughts, that looks a lot heavier than most simple Wikipedia introductions to subjects. Maybe one of the other regular users can come up with a better introduction?

2nd (and positively last) edit on the way to bed.

a) I've just downloaded the final three changes to bring my copy of the source code fully up to date. It took 5 seconds.

b) The procedure is: use SVN to keep a local copy of everything fully up-to-date - but keep that copy sacrosanct on another partition/drive/computer, well away from any experiments with a compiler. Then make a working scratchpad copy you can fiddle with to your heart's content - that's an 'export' from the local master copy synchronised from Berkeley, and a damn sight faster than a fresh download. That's what I've learned by working on the Lunatics Installer.
ID: 1651469
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1651479 - Posted: 10 Mar 2015, 23:18:05 UTC - in response to Message 1651469.  

The entire zipped copy, sah_v7_opt.zip, is less than 30 MB and takes less than 10 seconds to download. For now, I think I will stick to just downloading the latest version when needed.
ID: 1651479
Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1651572 - Posted: 11 Mar 2015, 7:59:30 UTC - in response to Message 1651267.  

Any luck on assigning the WG to Apple GPUs yet? If there is a workaround in the repository somewhere has anyone found it?
I've concluded my MB7 CPU work and would like to finish MB work before moving on to the AP project.

Wait a little, under investigation for now.
ID: 1651572
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1651699 - Posted: 11 Mar 2015, 16:29:04 UTC - in response to Message 1651572.  

Any luck on assigning the WG to Apple GPUs yet? If there is a workaround in the repository somewhere has anyone found it?
I've concluded my MB7 CPU work and would like to finish MB work before moving on to the AP project.

Wait a little, under investigation for now.

You might want to look at this one as well. It also reports "max WG size: 1024", although it appears to work:
Not using ap_cmdline.txt-file, using commandline options.
shmget in attach_shmem: Invalid argument
05:04:17 (88348): Can't set up shared mem: -1. Will run in standalone mode.
Illegal value for gpu_device_num: -1 in BOINC Client 0.0.0
WARNING: boinc_get_opencl_ids failed with code -33
OpenCL platform detected: Apple
WARNING: BOINC supplied wrong platform!
Number of OpenCL devices found : 3 
BOINC assigns slot on device #0.
WARNING: BOINC failed to provide OpenCL device, using own enumeration abilities
Used GPU device parameters are:
	Number of compute units: 14
	Single buffer allocation size: 256MB
	Total device global memory: 1024MB
	max WG size: 1024
	local mem type: Real
	-unroll default value used: 14
	-ffa_block default value used: 3584
	-ffa_block_fetch default value used: 1792
AstroPulse v7.01
Darwin 10.7+ 64 bit,  rel.  Rev 1864, OpenCL version by Raistmer, GPU mode
 V7, by Raistmer ported to  OS X   by Lunatics.kwsn.net team. 

Build features: Non-graphics OpenCL OPENCL_WRITE COMBINED_DECHIRP_KERNEL
SMALL_CHIRP_TABLE TWIN_FFA FFTW BLANKIT USE_INCREASED_PRECISION SSE4.1 64bit 
 System: Darwin  x86_64  Kernel: 12.6.0
CPU : Intel(R) Xeon(R) CPU           E5462  @ 2.80GHz 
 GenuineIntel x86, Family 6 Model 23 Stepping 6
 Features : FPU VME DE PSE TSC MSR PAE MCE CX8 APIC SEP MTRR PGE MCA CMOV PAT PSE36 CLFSH DS ACPI
 MMX FXSR SSE SSE2 SS HTT TM PBE SSE3 DTES64 MON DSCPL VMX EST TM2 SSSE3 CX16 TPR PDCM SSE4.1

Number of OpenCL platforms:				 1
OpenCL Platform Name:					 Apple
Number of devices:				 3
  Max compute units:				 14
  Max work group size:				 1024
  Max clock frequency:				 900Mhz
  Max memory allocation:			 268435456
  Cache type:					 None
  Cache line size:				 0
  Cache size:					 0
  Global memory size:				 1073741824
  Constant buffer size:				 65536
  Max number of constant args:			 8
  Local memory type:				 Scratchpad
  Local memory size:				 32768
  Queue properties:				 
    Out-of-Order:				 No
  Name:						 ATI Radeon Barts XT Prototype
  Vendor:					 AMD
  Driver version:				 1.0
  Version:					 OpenCL 1.1 
  Max compute units:				 14
  Max work group size:				 1024
  Max clock frequency:				 775Mhz
  Max memory allocation:			 268435456
  Cache type:					 None
  Cache line size:				 0
  Cache size:					 0
  Global memory size:				 1073741824
  Constant buffer size:				 65536
  Max number of constant args:			 8
  Local memory type:				 Scratchpad
  Local memory size:				 32768
  Queue properties:				 
    Out-of-Order:				 No
  Name:						 ATI Radeon Barts PRO Prototype
  Vendor:					 AMD
  Driver version:				 1.0
  Version:					 OpenCL 1.1 
  Max compute units:				 14
  Max work group size:				 1024
  Max clock frequency:				 775Mhz
  Max memory allocation:			 268435456
  Cache type:					 None
  Cache line size:				 0
  Cache size:					 0
  Global memory size:				 1073741824
  Constant buffer size:				 65536
  Max number of constant args:			 8
  Local memory type:				 Scratchpad
  Local memory size:				 32768
  Queue properties:				 
    Out-of-Order:				 No
  Name:						 ATI Radeon Barts PRO Prototype
  Vendor:					 AMD
  Driver version:				 1.0
  Version:					 OpenCL 1.1 

INFO: can't open binary kernel file: .//AstroPulse_Kernels_r1864.cl_ATIRadeonBartsXTPrototype.bin_V7_TWIN_FFA_12.6.0_10, continue with recompile...
Info : Building Program (binary, clBuildProgram):main kernels: OK code 0
INFO: binary kernel file created
WARNING: can't open binary kernel file for oclFFT plan: .//AP_clFFTplan_ATIRadeonBartsXTPrototype_32768_gr64_lr16_wg256_tw0_r1864.bin_12.6.0_10, continue with recompile...

    single pulses: 0
repetitive pulses: 9
  percent blanked: 63.03
Rep. pulse: num_std_devs=7.085 peak_power=2436105 dm=-3072 peak_bin=512 scale=7 ffa_scale=2 period=55.0985
Rep. pulse: num_std_devs=9.825 peak_power=980516.5 dm=2944 peak_bin=11264 scale=7 ffa_scale=2 period=137.7462
Rep. pulse: num_std_devs=8.737 peak_power=491489.5 dm=-2944 peak_bin=14848 scale=7 ffa_scale=2 period=275.4913
Rep. pulse: num_std_devs=8.603 peak_power=979302.1 dm=3072 peak_bin=28672 scale=7 ffa_scale=3 period=275.4903
Rep. pulse: num_std_devs=7.544 peak_power=490654.6 dm=-2944 peak_bin=0 scale=7 ffa_scale=1 period=137.7467
Rep. pulse: num_std_devs=9.826 peak_power=1955434 dm=-2944 peak_bin=0 scale=7 ffa_scale=3 period=137.7451
Rep. pulse: num_std_devs=7.253 peak_power=246268.9 dm=3072 peak_bin=8192 scale=7 ffa_scale=1 period=275.4924
Rep. pulse: num_std_devs=6.666 peak_power=2435438 dm=-3072 peak_bin=512 scale=7 ffa_scale=1 period=27.54915
Rep. pulse: num_std_devs=7.222 peak_power=4866263 dm=3072 peak_bin=512 scale=7 ffa_scale=2 period=27.54894

remove_radar                        	 total=9.942E+08 , N=1         , <>=9.942E+08 , min=9.942E+08 , max=9.942E+08 
main_loop_L1                        	 total=1.167E+11 , N=2         , <>=5.837E+10 , min=5.835E+10 , max=5.840E+10 
 FFT_forward                         	 total=1.949E+08 , N=2352      , <>=8.286E+04 , min=2.062E+04 , max=4.025E+05 
 remove_radar_randomize              	 total=0.000E+00 , N=0         , <>=0.000E+00 , min=1.845E+19 , max=0.000E+00 
 build_chirp_table                   	 total=0.000E+00 , N=0         , <>=0.000E+00 , min=1.845E+19 , max=0.000E+00 
 DataWrite                           	 total=0.000E+00 , N=0         , <>=0.000E+00 , min=1.845E+19 , max=0.000E+00 
  DataWrite_ns                        	 total=0.0000E+00, N=0         , <>=0         , min=0         , max=0         
 oclReadBuf                          	 total=0.000E+00 , N=0         , <>=0.000E+00 , min=1.845E+19 , max=0.000E+00 
   ChirpWrite                          	 total=0.000E+00 , N=0         , <>=0.000E+00 , min=1.845E+19 , max=0.000E+00 
    ChirpWrite_ns                       	 total=0.0000E+00, N=0         , <>=0         , min=0         , max=0         
 dechirp                             	 total=1.148E+08 , N=2352      , <>=4.881E+04 , min=2.699E+04 , max=8.592E+04 
  Dechirp_ns                          	 total=0.0000E+00, N=0         , <>=0         , min=0         , max=0         
  Half_ns                             	 total=0.0000E+00, N=0         , <>=0         , min=0         , max=0         
 PC_single_pulse_kernel_FFA_update   	 total=7.850E+10 , N=2352      , <>=3.338E+07 , min=3.088E+07 , max=6.946E+07 
  PC_ns                               	 total=0.0000E+00, N=0         , <>=0         , min=0         , max=0         
oclReadBuf                          	 total=0.000E+00 , N=0         , <>=0.000E+00 , min=1.845E+19 , max=0.000E+00 
oclWriteBuf                         	 total=0.000E+00 , N=0         , <>=0.000E+00 , min=1.845E+19 , max=0.000E+00 
  FFT_inverse                         	 total=0.000E+00 , N=0         , <>=0.000E+00 , min=1.845E+19 , max=0.000E+00 
 ffa                                 	 total=3.393E+10 , N=18        , <>=1.885E+09 , min=7.572E+08 , max=1.087E+10 
FFA blocks counters:
FFA_fetch                           	 total=0.000E+00 , N=0         , <>=0.000E+00 , min=1.845E+19 , max=0.000E+00 
FFA_tt_build                        	 total=0.000E+00 , N=0         , <>=0.000E+00 , min=1.845E+19 , max=0.000E+00 
FFA_compare                         	 total=1.926E+08 , N=40        , <>=4.815E+06 , min=8.445E+05 , max=2.608E+07 
FFA_coadd                           	 total=3.191E+07 , N=2526      , <>=1.263E+04 , min=6.993E+03 , max=6.266E+04 
FFA_stride_add                      	 total=3.589E+06 , N=36        , <>=9.968E+04 , min=6.511E+04 , max=1.184E+05 
GPU_buffer_read_backs               	 total=0.0000E+00, N=0         , <>=0         , min=0         , max=0         
TWIN_FFA	USE_OPENCL	OPENCL_WRITE	USE_INCREASED_PRECISION	SMALL_CHIRP_TABLE	COMBINED_DECHIRP_KERNEL	BLANKIT	
rev 1864
GPU device sync requested...  ...GPU device synched
05:05:03 (88348): called boinc_finish(0)

Why is it ap_7.01? Is the Blankit folder the correct folder?
At least the benchmark appears to work for AP:

KWSN-OSX-APbench v2.0.24
Running on TomsMacPro.local at Wed Mar 11 09:03:29 2015
-------------------------------------------------------------------------
Starting benchmark run...
-------------------------------------------------------------------------
Listing wu-file(s) in /testWUs :
sigind_v5.wu

Listing executable(s) in /testAPPs :
ap_7.01r1864_sse41_clGPU_x86_64-apple-darwin

Listing executable in /refAPPs :
astropulse_7.04_x86_64-apple-darwin__opencl_ati_mac

Listing additional reference results in /testData/refResults :
No additional reference results in /testData/refResults found.
-------------------------------------------------------------------------
Current WU: sigind_v5.wu
-------------------------------------------------------------------------
Running default app with command : ... astropulse_7.04_x86_64-apple-darwin__opencl_ati_mac
Elapsed Time: ........................ 48 seconds
-------------------------------------------------------------------------
Running app with command : ........... ap_7.01r1864_sse41_clGPU_x86_64-apple-darwin
Elapsed Time: ........................ 48 seconds
Speed compared to default : .......... 100 %
-----------------
Comparing results
Actual results : 
./testData/ref-pulse.astropulse_7.04_x86_64-apple-darwin__opencl_ati_mac.sigind_v5.wu.out: <ap_signal>19,<pulses>9,<best_pulses>10
         ./pulse.out: <ap_signal>19,<pulses>9,<best_pulses>10
             All Signals: Checked  19, 19 , Strongly Similar
                  Pulses: Checked   9,  9 , Strongly Similar
             Best Pulses: Checked  10, 10 , Strongly Similar
-(./testData/ref-pulse.astropulse_7.04_x86_64-apple-darwin__opencl_ati_mac.sigind_v5.wu.out)-
    Reportable Single Pulses: 0 [OK], 0 above threshold*THRESHOLD_FUDGE
 Reportable Repeating Pulses: 9 [OK]
        Single Pulses (Best): 0 [OK], 0 above threshold*THRESHOLD_FUDGE
-(./pulse.out)-
    Reportable Single Pulses: 0 [OK], 0 above threshold*THRESHOLD_FUDGE
 Reportable Repeating Pulses: 9 [OK]
        Single Pulses (Best): 0 [OK], 0 above threshold*THRESHOLD_FUDGE

I really didn't expect much change with the GPU AP App; the CPU App will be interesting...
ID: 1651699
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1651822 - Posted: 11 Mar 2015, 21:10:00 UTC
Last modified: 11 Mar 2015, 21:53:06 UTC

The build seems to be hanging on the AP CPU App:

ld: warning: directory not found for option '-L/opt/local/lib'
Undefined symbols for architecture x86_64:
  "Astropulse::number_of_logged_signals", referenced from:
      Astropulse::add_signal(std::vector<ap_signal, std::allocator<ap_signal> >&, ap_signal) in ap_client-ap_fileio.o
ld: symbol(s) not found for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see invocation)
make: [ap_client] Error 1 (ignored)
/bin/cp ap_client ap_7.01r1864_sse41_x86_64-apple-darwin
cp: ap_client: No such file or directory
make: [ap_7.01r1864_sse41_x86_64-apple-darwin] Error 1 (ignored)
strip ap_7.01r1864_sse41_x86_64-apple-darwin
error: strip: can't open file: ap_7.01r1864_sse41_x86_64-apple-darwin (No such file or directory)
make: [ap_7.01r1864_sse41_x86_64-apple-darwin] Error 1 (ignored)
/bin/rm -f astropulse-7.01_x86_64-apple-darwin.debug
/bin/ln ap_client astropulse-7.01_x86_64-apple-darwin.debug
ln: ap_client: No such file or directory
make: [astropulse-7.01_x86_64-apple-darwin.debug] Error 1 (ignored)
make  all-am

There doesn't seem to be an ap_signal file. Strange that it didn't have any problems building the GPU App.
If you use the other folder, "AP", it will build you a nice AP CPU 6.09 App... if you ever need an APv6 CPU App.
But it hangs on the APv7 CPU App...
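
A note on this class of linker error: "Undefined symbols ... Astropulse::number_of_logged_signals" means the linker found code (here add_signal in ap_fileio) that refers to the variable, but no object file in the link that defines it. The log does not show whether Astropulse is a namespace or a class, or which source file is supposed to hold the definition, so the following is only a minimal sketch with hypothetical names (demo, signal_t) of how such a symbol is normally declared once and defined once. It is not the project's actual code.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the real signal type; the field is illustrative.
struct signal_t {
    double peak_power;
};

namespace demo {

// Declaration only: this normally lives in a header so that every
// translation unit can refer to the counter. It reserves no storage.
extern int number_of_logged_signals;

// Definition: must appear in exactly one .cpp file that is actually
// compiled and linked. If that file is missing from the build (or comes
// from a mismatched source folder), the linker reports an undefined
// symbol like the one in the log above.
int number_of_logged_signals = 0;

// Uses the counter, so any object file containing this function needs
// the definition above to be present at link time.
std::size_t add_signal(std::vector<signal_t>& signals, signal_t s) {
    signals.push_back(s);
    ++number_of_logged_signals;
    return signals.size();
}

}  // namespace demo
```

Deleting the definition line while keeping the extern declaration should reproduce the same kind of "symbol(s) not found" failure at link time, which is why a source tree that mixes files from two different folders can compile cleanly and still fail to link.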
ID: 1651822
Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1651857 - Posted: 11 Mar 2015, 22:12:51 UTC

OK, try revision 2870 for OpenCL MultiBeam.
The WG size limiting code wasn't in my working copy either, so I added it. There are other improvements as well.

Regarding AP: if you download the whole folder, you need to delete the AP folder afterwards and rename the AP_BLANKIT folder to AP. None of this would be needed if SVN were used properly. AP and AP_BLANKIT actually belong to different branches.
ID: 1651857
Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1651878 - Posted: 11 Mar 2015, 23:04:43 UTC

And after that, please check revision 2871 - will it work on OS X?
It has a more generalized approach to kernel launch geometry, but implements this approach only for the FindSpike32 kernel. All other kernels will use the maximum device limit for now.
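
The idea of limiting the work-group (WG) size, rather than always launching with the device maximum, can be sketched as follows. This is only an illustration under assumptions: the function names, the per-kernel limit and the numbers are hypothetical, and the actual code in r2870/r2871 is not shown in this thread. The sketch clamps a requested local size to both the device limit and a per-kernel limit, then rounds the global size up to a multiple of the local size, because OpenCL 1.x requires the global work size to be evenly divisible by the local work size.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical helper: pick a local (work-group) size that respects both
// the device-wide maximum and a possibly tighter per-kernel limit.
std::size_t clamp_wg_size(std::size_t requested,
                          std::size_t device_max,
                          std::size_t kernel_max) {
    assert(requested > 0 && device_max > 0 && kernel_max > 0);
    return std::min(requested, std::min(device_max, kernel_max));
}

// Round the number of work items up to the next multiple of the local
// size. A kernel launched this way needs its own bounds check so that
// the padding work items do nothing.
std::size_t padded_global_size(std::size_t work_items, std::size_t local) {
    assert(local > 0);
    return ((work_items + local - 1) / local) * local;
}
```

For a device that reports "Max work group size: 1024" and an assumed per-kernel limit of 256, clamp_wg_size(1024, 1024, 256) gives 256. In real OpenCL code the per-kernel figure would typically be queried with clGetKernelWorkGroupInfo and CL_KERNEL_WORK_GROUP_SIZE; whether the project does that is not stated here.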
ID: 1651878
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1651910 - Posted: 12 Mar 2015, 0:17:09 UTC - in response to Message 1651857.  
Last modified: 12 Mar 2015, 0:19:27 UTC

OK, try revision 2870 for OpenCL MultiBeam.
The WG size limiting code wasn't in my working copy either, so I added it. There are other improvements as well.

Regarding AP: if you download the whole folder, you need to delete the AP folder afterwards and rename the AP_BLANKIT folder to AP. None of this would be needed if SVN were used properly. AP and AP_BLANKIT actually belong to different branches.

Nice work. 2870 is running standalone in Mountain Lion. I'll have to test it for a while before seeing whether it will work in BOINC:

20:07:52 (60901): Can't open init data file - running in standalone mode
20:07:52 (60901): Can't open init data file - running in standalone mode
Not using mb_cmdline.txt-file, using commandline options.
20:07:52 (60901): Can't open init data file - running in standalone mode
WARNING: init_data.xml missing
OpenCL platform detected: Apple
WARNING: BOINC supplied wrong platform!
Number of OpenCL devices found : 3 
BOINC assigns slot on device #0.
WARNING: BOINC failed to provide OpenCL device, using own enumeration abilities

Build features: SETI7 Non-graphics OpenCL USE_OPENCL_HD5xxx OCL_CHIRP3 FFTW SSE4.1 64bit 
 System: Darwin  x86_64  Kernel: 12.6.0
CPU : Intel(R) Xeon(R) CPU           E5462  @ 2.80GHz 
 GenuineIntel x86, Family 6 Model 23 Stepping 6

OpenCL-kernels filename : MultiBeam_Kernels_r2870.cl 
INFO: can't open binary kernel file: .//MultiBeam_Kernels_r2870.clHD5_ATIRadeonBartsXTPrototype.bin_V7_12.6.0_10, continue with recompile...
Info : Building Program (binary, clBuildProgram):main kernels: OK code 0
INFO: binary kernel file created
WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_524288_gr64_lr16_wg256_tw0_r2870.bin_12.6.0_10, continue with recompile...
WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_8_gr64_lr16_wg256_tw0_r2870.bin_12.6.0_10, continue with recompile...
WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_16_gr64_lr16_wg256_tw0_r2870.bin_12.6.0_10, continue with recompile...
WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_32_gr64_lr16_wg256_tw0_r2870.bin_12.6.0_10, continue with recompile...
WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_64_gr64_lr16_wg256_tw0_r2870.bin_12.6.0_10, continue with recompile...
WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_128_gr64_lr16_wg256_tw0_r2870.bin_12.6.0_10, continue with recompile...
WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_256_gr64_lr16_wg256_tw0_r2870.bin_12.6.0_10, continue with recompile...
WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_512_gr64_lr16_wg256_tw0_r2870.bin_12.6.0_10, continue with recompile...
WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_1024_gr64_lr16_wg256_tw0_r2870.bin_12.6.0_10, continue with recompile...
WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_2048_gr64_lr16_wg256_tw0_r2870.bin_12.6.0_10, continue with recompile...
WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_4096_gr64_lr16_wg256_tw0_r2870.bin_12.6.0_10, continue with recompile...
WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_8192_gr64_lr16_wg256_tw0_r2870.bin_12.6.0_10, continue with recompile...
WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_16384_gr64_lr16_wg256_tw0_r2870.bin_12.6.0_10, continue with recompile...
WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_32768_gr64_lr16_wg256_tw0_r2870.bin_12.6.0_10, continue with recompile...
WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_65536_gr64_lr16_wg256_tw0_r2870.bin_12.6.0_10, continue with recompile...
WARNING: can't open binary kernel file for oclFFT plan: .//MB_clFFTplan_ATIRadeonBartsXTPrototype_131072_gr64_lr16_wg256_tw0_r2870.bin_12.6.0_10, continue with recompile...
ar=1.185405  NumCfft=101385  NumGauss=0  NumPulse=56257780816  NumTriplet=56257780816
Currently allocated 121 MB for GPU buffers
In v_BaseLineSmooth: NumDataPoints=1048576, BoxCarLength=8192, NumPointsInChunk=32768
OS X optimized S@H v7 application (based on S@H Enhanced by Alex Kan)
Version info: SSE4.1ux (Intel, Core 2-optimized v8-nographics) V5.13 by Alex Kan
SSE4.1ux OS X 64bit Build 2870 , Ported by : Raistmer, JDWhale, Urs Echternacht
OpenCL version by Raistmer, r2870
AMD HD5 version by Raistmer
Number of OpenCL platforms:				 1


 OpenCL Platform Name:					 Apple
Number of devices:				 3
  Max compute units:				 14
  Max work group size:				 1024
  Max clock frequency:				 900Mhz
  Max memory allocation:			 268435456
  Cache type:					 None
  Cache line size:				 0
  Cache size:					 0
  Global memory size:				 1073741824
  Constant buffer size:				 65536
  Max number of constant args:			 8
  Local memory type:				 Scratchpad
  Local memory size:				 32768
  Queue properties:				 
    Out-of-Order:				 No
  Name:						 ATI Radeon Barts XT Prototype
  Vendor:					 AMD
  Driver version:				 1.0
  Version:					 OpenCL 1.1 
  Max compute units:				 14
  Max work group size:				 1024
  Max clock frequency:				 775Mhz
  Max memory allocation:			 268435456
  Cache type:					 None
  Cache line size:				 0
  Cache size:					 0
  Global memory size:				 1073741824
  Constant buffer size:				 65536
  Max number of constant args:			 8
  Local memory type:				 Scratchpad
  Local memory size:				 32768
  Queue properties:				 
    Out-of-Order:				 No
  Name:						 ATI Radeon Barts PRO Prototype
  Vendor:					 AMD
  Driver version:				 1.0
  Version:					 OpenCL 1.1 
  Max compute units:				 14
  Max work group size:				 1024
  Max clock frequency:				 775Mhz
  Max memory allocation:			 268435456
  Cache type:					 None
  Cache line size:				 0
  Cache size:					 0
  Global memory size:				 1073741824
  Constant buffer size:				 65536
  Max number of constant args:			 8
  Local memory type:				 Scratchpad
  Local memory size:				 32768
  Queue properties:				 
    Out-of-Order:				 No
  Name:						 ATI Radeon Barts PRO Prototype
  Vendor:					 AMD
  Driver version:				 1.0
  Version:					 OpenCL 1.1 

Work Unit Info:
...............
Credit multiplier is :  2.85
WU true angle range is :  1.185405
Used GPU device parameters are:
	Number of compute units: 14
	Single buffer allocation size: 64MB
	Total device global memory: 1024MB
	max WG size: 256
	local mem type: Real
period_iterations_num=23
Spike: peak=24.47872, time=83.89, d_freq=1421126615.33, chirp=0.2514, fft_len=64k
Spike: peak=24.07581, time=87.24, d_freq=1421126616.18, chirp=0.25232, fft_len=128k
Spike: peak=25.0097, time=87.24, d_freq=1421126616.19, chirp=0.25325, fft_len=128k
Spike: peak=25.16989, time=87.24, d_freq=1421126616.2, chirp=0.25417, fft_len=128k
Spike: peak=24.65972, time=83.89, d_freq=1421126615.35, chirp=0.2551, fft_len=64k
Spike: peak=24.59426, time=87.24, d_freq=1421126616.2, chirp=0.2551, fft_len=128k
Spike: peak=24.38239, time=6.711, d_freq=1421125858.53, chirp=-8.7435, fft_len=128k
Spike: peak=24.15315, time=6.711, d_freq=1421125858.53, chirp=-8.7444, fft_len=128k
Spike: peak=24.46851, time=6.711, d_freq=1421125858.55, chirp=-8.7527, fft_len=128k
Spike: peak=24.65365, time=6.711, d_freq=1421125858.54, chirp=-8.7537, fft_len=128k
Spike: peak=24.40533, time=6.711, d_freq=1421119059.91, chirp=-14.735, fft_len=128k
Spike: peak=24.47265, time=6.711, d_freq=1421119059.9, chirp=-14.735, fft_len=128k

I think the task it's on now is the 19/2 shorty.

Yes, it found the correct numbers;
Spike count: 19
Autocorr count: 2
Pulse count: 0
Triplet count: 0
Gaussian count: 0
Time cpu in use since last restart: 102.8 seconds

GPU device sync requested... ...GPU device synched
20:17:00 (60901): called boinc_finish(0)

Success!
ID: 1651910
Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1652044 - Posted: 12 Mar 2015, 12:10:34 UTC - in response to Message 1651910.  

Success!

Please check the next revision too.
ID: 1652044
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1652156 - Posted: 12 Mar 2015, 18:38:16 UTC - in response to Message 1652044.  

There seem to be a number of differences with r2871.
The first is that a new numbered .cl file is now created, so you no longer have to take the already present MultiBeam_Kernels.cl and add numbers to it. In standalone testing I noticed that the max WG size is still listed as 1024, rather than the 256 that r2870 reported. When running in BOINC it seems to take much less time to build the binaries, about half as long. So far it appears to be working well, though possibly a little slower. We'll have to see how it goes; the new builds seem to start out slow but speed up after a while. You can see the results here:
http://setiathome.berkeley.edu/results.php?hostid=6796479&offset=160&show_names=0&state=0&appid=0

Thanks.
ID: 1652156
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1652517 - Posted: 13 Mar 2015, 15:38:24 UTC
Last modified: 13 Mar 2015, 16:10:40 UTC

Well, I've tried about 4 different builds and it doesn't appear to get any better than the current build. Compared with r2839, this build produces almost exactly the same numbers, and I've run thousands of r2839 tasks. The only problem I've noticed is that you can't use values over 128 in the settings, or it will produce errors on launch. I'm not sure that's even a problem, since it seems to work best with low numbers anyway, just like the other Mac MBv7 Apps I've tested. What I haven't tested is running more than one task at a time, and I think I'll leave that for others to test.
Just as I'll leave it to others to comment on any screen lag ;-)
ID: 1652517
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1652606 - Posted: 13 Mar 2015, 20:40:06 UTC - in response to Message 1651857.  
Last modified: 13 Mar 2015, 21:31:44 UTC

...Regarding AP: if you download the whole folder, you need to delete the AP folder afterwards and rename the AP_BLANKIT folder to AP. None of this would be needed if SVN were used properly. AP and AP_BLANKIT actually belong to different branches.

It's still hanging at:

checking if clang supports -c -o file.o... (cached) yes
checking whether the clang linker (/usr/bin/ld) supports shared libraries... yes
checking dynamic linker characteristics... darwin dyld
checking how to hardcode library paths into programs... immediate
checking for dlopen in -ldl... yes
checking whether a program can dlopen itself... yes
checking whether a statically linked program can dlopen itself... yes
checking whether stripping libraries is possible... yes
checking if libtool supports shared libraries... yes
checking whether to build shared libraries... no
checking whether to build static libraries... yes
checking how to run the C++ preprocessor... g++ -E
checking for ld used by g++... /usr/bin/ld
checking if the linker (/usr/bin/ld) is GNU ld... no
checking whether the g++ linker (/usr/bin/ld) supports shared libraries... yes
checking for g++ option to produce PIC... -fno-common -DPIC
checking if g++ PIC flag -fno-common -DPIC works... yes
checking if g++ static flag -static works... no
checking if g++ supports -c -o file.o... rm: conftest*: No such file or directory
yes
checking if g++ supports -c -o file.o... (cached) yes
checking whether the g++ linker (/usr/bin/ld) supports shared libraries... yes
checking dynamic linker characteristics... darwin dyld
checking how to hardcode library paths into programs... immediate
checking whether byte ordering is bigendian... no
./configure: line 17300: AX_C_FLOAT_WORDS_BIGENDIAN: command not found
./configure: line 17302: SAH_OPTION_BITNESS: command not found
./configure: line 17304: BOINC_PLATFORM: command not found
checking for special C compiler options needed for large files... no
checking for _FILE_OFFSET_BITS value needed for large files... no
./configure: line 17515: SAH_DLLEXT: command not found
./configure: line 17516: SAH_LIBEXT: command not found
checking for special C compiler options needed for large files... (cached) no
checking for _FILE_OFFSET_BITS value needed for large files... (cached) no
./configure: line 17722: syntax error near unexpected token `AC_DEFINE'
./configure: line 17722: `ACX_PTHREAD(AC_DEFINE(HAVE_PTHREAD,1, [Have pthread]))'
TomsMacPro:client Tom$
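
A note on the "command not found" lines: when the shell reports a name like AX_C_FLOAT_WORDS_BIGENDIAN as a missing command, that macro was left unexpanded when configure was generated, so the shell is trying to run the macro name itself. A rough way to list such leftovers, sketched in Python; the prefix list is only an assumption taken from the names in this log, and find_unexpanded_macros is a hypothetical helper, not part of any build tool.

```python
import re

# Macro-style names at the start of a line. The prefixes are assumptions
# taken from the log (AX_, SAH_, BOINC_, ACX_) plus the standard AC_.
MACRO_LINE = re.compile(r"^\s*((?:AC|ACX|AX|SAH|BOINC)_[A-Z0-9_]+)")


def find_unexpanded_macros(configure_text: str) -> list[tuple[int, str]]:
    """Return (line number, macro name) for lines that begin with a macro name."""
    hits = []
    for lineno, line in enumerate(configure_text.splitlines(), start=1):
        match = MACRO_LINE.match(line)
        if match:
            hits.append((lineno, match.group(1)))
    return hits
```

Feeding it the text of the generated configure file should point at the same line numbers the shell complained about, which makes it easier to see which macro definitions were missing when the script was generated.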

Strange it doesn't have a problem making the AP GPU App. I made the AP GPU App, ran make clean, ran the AP CPU CL, then make -i -k, and it says;
  "Astropulse::number_of_logged_signals", referenced from:
      Astropulse::add_signal(std::vector<ap_signal, std::allocator<ap_signal> >&, ap_signal) in ap_client-ap_fileio.o
ld: symbol(s) not found for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see invocation)
make: [ap_client] Error 1 (ignored)
/bin/cp ap_client ap_7.01r2871_sse41_x86_64-apple-darwin
cp: ap_client: No such file or directory
make: [ap_7.01r2871_sse41_x86_64-apple-darwin] Error 1 (ignored)
strip ap_7.01r2871_sse41_x86_64-apple-darwin
error: strip: can't open file: ap_7.01r2871_sse41_x86_64-apple-darwin (No such file or directory)
make: [ap_7.01r2871_sse41_x86_64-apple-darwin] Error 1 (ignored)
/bin/rm -f astropulse-7.01_x86_64-apple-darwin.debug
/bin/ln ap_client astropulse-7.01_x86_64-apple-darwin.debug
ln: ap_client: No such file or directory
make: [astropulse-7.01_x86_64-apple-darwin.debug] Error 1 (ignored)
make  all-am
g++ -I/opt/local/include  -mmacosx-version-min=10.7 -O3 -mtune=generic -I./ -I./Users/Tom/sah_v7_opt/src -I./Users/Tom/sah_v7_opt/AKv8
-march=corei7 -msse4.1  -mmacosx-version-min=10.7 -O3 -mtune=generic -I./ -I./Users/Tom/sah_v7_opt/src -I./Users/Tom/sah_v7_opt/AKv8
-DHAVE_CONFIG_H -DTEXT_UI -DNDEBUG -DCLIENT -I../server/db -I/Users/Tom/sah_v7_opt/AKv8/../src -I/Users/Tom/sah_v7_opt/AKv8 -I/Users
/Tom/sah_v7_opt/AKv8/db -I/Users/Tom/sah_v7_opt/AKv8/client -I/Users/Tom/boinc -I/Users/Tom/boinc/api -I/Users/Tom/boinc/lib -I/Users
/Tom/boinc/sched -I/Users/Tom/boinc/db -D_THREAD_SAFE   -D_THREAD_SAFE  -mmacosx-version-min=10.7 -ldl -lz -lpthread -framework Carbon
-lm  -L/opt/local/lib  -mmacosx-version-min=10.7 -ldl -lz -lpthread -framework Carbon -o ap_client ap_client-sqlblob.o ap_client-sqlrow.o
ap_client-xml_util.o ap_client-lcgamm.o ap_client-ap_schema.o ap_client-ap_client_main.o ap_client-ap_science.o ap_client-ap_fileio.o
ap_client-ap_fold.o ap_client-ap_timer.o ap_client-ap_debug.o ap_client-mtrand.o ap_client-ap_version.o ap_client-malloc_a.o ap_client
GPU_lock.o ap_client-sbtf.o ap_client-ap_shmem.o ap_client-ap_remove_radar.o   -L/Users/Tom/boinc/api -L/Users/Tom/boinc/api/.libs
-lboinc_api -L/Users/Tom/boinc/lib -L/Users/Tom/boinc/lib/.libs -lboinc  -lfftw3f   ../../lib/OSX64/libfftw3f.a   -lz -lstdc++ -lm
-lpthread -ldl  /Users/Tom/sah_v7_opt/lib/OSX64/libfftw3f.a -lz -lstdc++ -lm -lpthread -ldl  /Users/Tom/sah_v7_opt/lib/OSX64/libfftw3f.a
ld: warning: directory not found for option '-L/opt/local/lib'
Undefined symbols for architecture x86_64:
  "Astropulse::number_of_logged_signals", referenced from:
      Astropulse::add_signal(std::vector<ap_signal, std::allocator<ap_signal> >&, ap_signal) in ap_client-ap_fileio.o
ld: symbol(s) not found for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see invocation)
make[1]: [ap_client] Error 1 (ignored)
TomsMacPro:client Tom$

Which is the same as before...
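
For reference, the ld message says that ap_client-ap_fileio.o refers to Astropulse::number_of_logged_signals, but none of the objects or libraries on the link line supplies a definition for it. With long make output it can help to pull out just the missing symbols and the object files that want them. A minimal sketch in Python, assuming the ld output layout shown above; undefined_symbols is a hypothetical helper name.

```python
import re

# Matches lines such as:   "Some::symbol", referenced from:
SYMBOL = re.compile(r'^\s*"(.+)", referenced from:\s*$')


def undefined_symbols(linker_output: str) -> dict[str, list[str]]:
    """Map each undefined symbol to the object files that reference it."""
    symbols: dict[str, list[str]] = {}
    current = None
    for line in linker_output.splitlines():
        match = SYMBOL.match(line)
        if match:
            current = match.group(1)
            symbols.setdefault(current, [])
        elif current is not None and line.startswith(" ") and " in " in line:
            # Reference lines end with "in <object file>".
            obj = line.rsplit(" in ", 1)[1].strip()
            if obj not in symbols[current]:
                symbols[current].append(obj)
        else:
            current = None  # any other line ends the current symbol block
    return symbols
```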
ID: 1652606
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1652743 - Posted: 14 Mar 2015, 5:30:05 UTC

I also had the same problem with the CPU AP build in Mavericks. You can build the AP GPU App in ML & Mavericks, but the CPU build hangs with the same error.

The MBv7 ATI5 GPU App seems to work better in Mavericks. Considering the AP creation rate, you might need the MBv7 ATI5 App for a while. I posted it here;
MBv7_r2871_ati5.zip
Works for me...
ID: 1652743
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1659828 - Posted: 31 Mar 2015, 14:52:52 UTC

After 2 weeks MBv7_r2871_ati5 is still working fine. It's now up to Consecutive valid tasks: 5025. Considering some "Lost" tasks are going to start expiring in the next day or so, I guess that is about as high as the Consecutive valid tasks will go. It would be nice if someone was to enable resend lost tasks for a few hours a week.

As far as I can tell, the ATI OSX MB App works about as well as the ATI MB App in Windows, which is much better than the OSX ATI MB App on Beta at present...

;-)
ID: 1659828
Dena Wiltsie
Volunteer tester

Joined: 19 Apr 01
Posts: 1628
Credit: 24,230,968
RAC: 26
United States
Message 1660448 - Posted: 1 Apr 2015, 20:37:44 UTC

I think you have made the big time. I went over to Beta for a few AP work units, and when they ran out I was about to switch back to production when I noticed I was getting MB units on my graphics processor. The graphics processor seems to be about five times faster than the CPU on my system, and the few records I have looked at indicate the application compares well with the other existing MB applications.
I am not sure if this is right but this should be the link to my work units on Beta if you are interested.
Thanks for your effort to make this possible.
ID: 1660448
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1660503 - Posted: 1 Apr 2015, 21:56:57 UTC - in response to Message 1660448.  

Unfortunately the ATI App r2728 on Beta at present was one of Joe Fox's early attempts. His later version r2760 was much better and was tested months ago, 2760 running 2 tasks at once. The main problem with version 2760 was Low GPU load and High CPU usage. The Lower Angle Range tasks ran Slow and running 2 at a time produced massive screen Lag. The newer Apps are better and produce good GPU load while running 1 at a time.

According to Raistmer's post we should see newer Apps soon: "Let's hope they will emerge here as stock ones quite soon." The latest versions of my Apps at Crunchers Anonymous are the best yet on my MacPro. We'll see how the others work.

It was a frustrating start, but it appears to have worked out rather nicely ;-)
ID: 1660503
Zalster Special Project $250 donor
Volunteer tester

Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1660812 - Posted: 2 Apr 2015, 18:19:42 UTC - in response to Message 1660503.  

It's good to see that the Beta MB apps seem to be working.

At least the last few that have been processed today. Both Intel and Nvidia GPUs are crunching away after that full day of all errored out MBs yesterday.

I was worried it was the apps until I looked at the history of all those MBs and saw that every configuration (CPU, ATI, NV, Intel) had errored, so it was a problem with the work units themselves.

I would have posted in Beta but there isn't a MB Mac OS discussion thread.

Just in case anyone with an iMac was wondering, lol

Zalster
ID: 1660812
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1661188 - Posted: 3 Apr 2015, 21:18:44 UTC - in response to Message 1660812.  
Last modified: 3 Apr 2015, 21:23:16 UTC

That didn't stop me, http://setiweb.ssl.berkeley.edu/beta/forum_thread.php?id=2182&postid=53953#53953. As you mentioned, it's not really a Mac problem. That place is a minefield, I'm aborting many more tasks than I'm running.

Here's a comparison between the r2728 that's on Beta and my latest r2871 both at 0.416 Angle Range.

MBv7 2871
http://setiweb.ssl.berkeley.edu/beta/workunit.php?wuid=6950487
Run time: 22 min 24 sec
CPU time: 7 min 6 sec

MBv7 2728
http://setiweb.ssl.berkeley.edu/beta/workunit.php?wuid=6949536
Run time: 59 min 3 sec
CPU time: 33 min 25 sec

That's a fair difference in my book. Dena is a Wingperson on the 2871 task.
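
To put numbers on that difference, the times above can be converted to seconds and divided. A quick sketch in Python; the variable names are only for illustration, and this is a single pair of tasks at the same Angle Range, not a benchmark.

```python
def to_seconds(minutes: int, seconds: int) -> int:
    """Convert a 'min sec' pair from the task page into seconds."""
    return minutes * 60 + seconds


# Times copied from the two Beta work units listed above.
r2871_run = to_seconds(22, 24)  # 1344 s
r2871_cpu = to_seconds(7, 6)    # 426 s
r2728_run = to_seconds(59, 3)   # 3543 s
r2728_cpu = to_seconds(33, 25)  # 2005 s

run_ratio = r2728_run / r2871_run  # how many times longer r2728 ran
cpu_ratio = r2728_cpu / r2871_cpu  # how many times more CPU time r2728 used

print(f"Run time: r2728 took {run_ratio:.2f}x as long as r2871")
print(f"CPU time: r2728 used {cpu_ratio:.2f}x as much as r2871")
```

That works out to roughly 2.64 times the run time and 4.71 times the CPU time for r2728 on this pair of tasks.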
ID: 1661188
Dena Wiltsie
Volunteer tester

Joined: 19 Apr 01
Posts: 1628
Credit: 24,230,968
RAC: 26
United States
Message 1661208 - Posted: 3 Apr 2015, 21:58:43 UTC - in response to Message 1661188.  

I will crunch that work unit shortly. It seems like my RAC is only about 10,000 crunching MB units where I could hit 20,000 crunching AP units. Of that number, about 5,000 credits come from crunching MB units on the CPU.
ID: 1661208
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.