Linux/NVIDIA/AP questions
Author | Message |
---|---|
skildude Joined: 4 Oct 00 Posts: 9541 Credit: 50,759,529 RAC: 60 |
You didn't mention whether you installed the NVIDIA Linux drivers or are using the stock Linux ones that get installed when the system sees the GPU. Installing the NVIDIA drivers might just be the first step in getting OpenCL recognized. I believe arkayn or Mike have a repository of current "Lunatics" apps that work with Linux, Windows, and Apple products. Just look at their websites for more info. Also, don't forget to set the permissions on the files. No permissions = not able to use it. In a rich man's house there is no place to spit but his face. Diogenes Of Sinope |
Raistmer Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
Well, perhaps Urs could answer in more detail about the Linux config, because he ported the app to Linux; I acted mostly as a guinea pig with a portable Linux setup for this release. But a few comments: 1) BOINC doesn't see OpenCL support. Your BOINC is recent enough, so it's safe to assume that if BOINC doesn't see OpenCL, there is still no runnable OpenCL runtime on your system. How to get OpenCL properly installed is a question for the Linux gurus. The app's stderr just confirms that: it says no OpenCL platforms were found, so the host has no properly configured OpenCL runtime. 2) The app can be built from the publicly available sources. It's also available on our (Lunatics) site here: http://lunatics.kwsn.net/index.php?module=Downloads;catd=1 Particularly, the NV Linux build: http://lunatics.kwsn.net/index.php?module=Downloads;sa=dlview;id=370 EDIT: yes, different rev number. Perhaps Petri33 built a more recent rev by himself. SETI apps news We're not gonna fight them. We're gonna transcend them. |
spitfire_mk_2 Joined: 14 Apr 00 Posts: 563 Credit: 27,306,885 RAC: 0 |
C2D E7200 MMX instructions SSE / Streaming SIMD Extensions SSE2 / Streaming SIMD Extensions 2 SSE3 / Streaming SIMD Extensions 3 SSSE3 / Supplemental Streaming SIMD Extensions 3 SSE4.1 / Streaming SIMD Extensions 4.1 EM64T / Extended Memory 64 technology / Intel 64 http://www.cpu-world.com/CPUs/Core_2/Intel-Core%202%20Duo%20E7200%20EU80571PH0613M%20%28BX80571E7200%20-%20BXC80571E7200%29.html#cmp_cpus |
petri33 Joined: 6 Jun 02 Posts: 1668 Credit: 623,086,772 RAC: 156 |
[quote]Petri33,[/quote] I have seen Your message, but I'm just about to go to sleep after a family Sauna and .... I'm sorry that I have no more juice (in me) to write a reply to You. Please remind me if I forget tomorrow. *winding down* Sound THX slowing... #to be awake tomorrow# -- p.l.z. Chrunch on! I'll nanoSleep(x). 'I know, You'll take time from this till the next reply' PLZ -2x ()____)_________)))))~~~. p.s. The ASUS P9X79-E WS is still idle on my desk and the 560Ti and GTX-660 are on the shelf. The Corsair AX-1200 is still being replaced. My 2x780GF and i7-3930K@4.3GHz are purring on AX-650. Scratch on whatever hide, may it be kibbles Chrunchy, Itchy, WoW!, juicy, milky, new, never seen before, just heard about it, dried out, misbalanced, screwed, badly credited, newly found, worked on, twice thinked, three times checked, beta tested, alpha discussed, I've done that, .... Waiting for a) my mama b) ET .. to call. :D To overcome Heisenbergs: "You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones |
ausymark Joined: 9 Aug 99 Posts: 95 Credit: 10,175,128 RAC: 0 |
Hi Guy, I think for a start you should just try the standard BOINC/SETI setup, which will get you most of what you are after under Linux, i.e. crunching AP on the GPU and SETI 7.x on your CPU. Remove the app_info file as well (it's probably best just to do a fresh start in a blank directory under your "/home" directory; I always run BOINC/SETI this way, since it bypasses any OS updates etc.). Remember that most of what is in the standard SETI programs from SETI these days is actually optimised code from the Lunatics team. At this point Linux users (I'm running Ubuntu here) can't process SETI 7 work units on their GPU, but I believe that will change when the Lunatics/SETI team produces the next version of the client, which is a major rewrite of pretty much everything, I believe lol. BTW, with your setup you should leave one core free to feed the GPU, leaving the other to crunch SETI 7.x work units. BTW, my setup doesn't know what driver I'm using either, yet it does see which CUDA and OpenCL versions are available for crunching. Just my 2c worth, and just be patient ;) Cheers Mark |
petri33 Joined: 6 Jun 02 Posts: 1668 Credit: 623,086,772 RAC: 156 |
Hi, now awake. I have a very old Linux installation (Fedora 14), so I had to download the source for a newer compiler and build the compiler myself. I downloaded the NVIDIA drivers from nvidia downloads. The source code is from the svn branch sah_v7_opt (Xbranch and AP are both there). Xbranch is the NVIDIA CUDA app and works on Linux; AP is the NVIDIA OpenCL AstroPulse app and works on Linux too. I have not touched the version number, and I guess the current revision numbers in that repository are over 2000. You can look here: Astropulse source. I also downloaded the CUDA developer package from NVIDIA. I have made some modifications to the code, and I attach two examples here after the BOINC log. I'd like to hear about your experience should you try them. To build under Linux I had to get the libraries right and the path right, then just run _autosetup, configure and make. PATH=/usr/local/cuda-5.5/bin:/usr/local/bin:/usr/bin:/bin:/usr/local/sbin:/usr/sbin:/sbin:/usr/local/cuda/bin:/home/petri/bin LD_LIBRARY_PATH=/usr/local/cuda-5.5/lib64:/usr/local/cuda-5.5/lib: My configure script (too complicated, but it works): #!/bin/sh export CFLAGS=-mavx ./configure BOINCDIR=/home/petri/boinc_repo --enable-sse3 CFLAGS='-O3 -march=core2 -mtune=core2 -msse2avx -mavx -mpreferred-stack-boundary=8 -fexceptions -fno-rounding-math -fno-signaling-nans -fcx-limited-range -fno-math-errno -fno-trapping-math --param inline-unit-growth=3000 -DPINNED -DNDEBUG -DHAVE_STRCASECMP -fpeel-loops -funroll-loops -fgcse-sm -fgcse-las -fweb -I/usr/local/cuda-5.5/include -L/opt/lib-4.12/lib ' LIBS="/opt/lib-4.12/lib/libm.so.6 /opt/lib-4.12/lib/libc.so /opt/lib-4.12/lib/libpthread.so /usr/lib64/libstdc++.so /usr/lib64/lib/libm.so.6" And ... 
Here is my log Sat 01 Feb 2014 10:29:02 AM EET Unrecognized tag in cc_config.xml: <http_11_0> Sat 01 Feb 2014 10:29:02 AM EET Starting BOINC client version 6.10.58 for x86_64-pc-linux-gnu Sat 01 Feb 2014 10:29:02 AM EET Config: report completed tasks immediately Sat 01 Feb 2014 10:29:02 AM EET Config: use all coprocessors Sat 01 Feb 2014 10:29:02 AM EET log flags: file_xfer, sched_ops, task, cpu_sched Sat 01 Feb 2014 10:29:02 AM EET Libraries: libcurl/7.21.0 NSS/3.12.10.0 zlib/1.2.5 libidn/1.18 libssh2/1.2.4 Sat 01 Feb 2014 10:29:02 AM EET Data directory: /var/lib/boinc Sat 01 Feb 2014 10:29:02 AM EET Processor: 12 GenuineIntel Intel(R) Core(TM) i7-3930K CPU @ 3.20GHz [Family 6 Model 45 Stepping 6] Sat 01 Feb 2014 10:29:02 AM EET Processor: 12.00 MB cache Sat 01 Feb 2014 10:29:02 AM EET Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology nonstop_tsc aperfmperf pni pclmu Sat 01 Feb 2014 10:29:02 AM EET OS: Linux: 2.6.35.14-106.fc14.x86_64 Sat 01 Feb 2014 10:29:02 AM EET Memory: 7.79 GB physical, 9.78 GB virtual Sat 01 Feb 2014 10:29:02 AM EET Disk: 47.37 GB total, 30.28 GB free Sat 01 Feb 2014 10:29:02 AM EET Local time is UTC +2 hours Sat 01 Feb 2014 10:29:02 AM EET NVIDIA GPU 0: GeForce GTX 780 (driver version unknown, CUDA version 6000, compute capability 3.5, 3072MB, 723 GFLOPS peak) Sat 01 Feb 2014 10:29:02 AM EET NVIDIA GPU 1: GeForce GTX 780 (driver version unknown, CUDA version 6000, compute capability 3.5, 3071MB, 723 GFLOPS peak) Sat 01 Feb 2014 10:29:02 AM EET SETI@home Found app_info.xml; using anonymous platform Sat 01 Feb 2014 10:29:02 AM EET SETI@home URL http://setiathome.berkeley.edu/; Computer ID 5643864; resource share 9100 Sat 01 Feb 2014 10:29:02 AM EET SETI@home General prefs: from SETI@home (last modified 21-Nov-2013 17:26:39) Sat 01 Feb 2014 10:29:02 AM EET SETI@home 
Computer location: home Sat 01 Feb 2014 10:29:02 AM EET General prefs: using separate prefs for home Sat 01 Feb 2014 10:29:02 AM EET Reading preferences override file Sat 01 Feb 2014 10:29:02 AM EET Preferences: Sat 01 Feb 2014 10:29:02 AM EET max memory usage when active: 7182.89MB Sat 01 Feb 2014 10:29:02 AM EET max memory usage when idle: 7182.89MB Sat 01 Feb 2014 10:29:02 AM EET max disk usage: 23.68GB Sat 01 Feb 2014 10:29:02 AM EET max CPUs used: 6 Sat 01 Feb 2014 10:29:02 AM EET (to change preferences, visit the web site of an attached project, or select Preferences in the Manager) Sat 01 Feb 2014 10:29:02 AM EET Using proxy info from GUI Sat 01 Feb 2014 10:29:02 AM EET Not using a proxy Sat 01 Feb 2014 10:29:02 AM EET SETI@home [cpu_sched] Starting ap_21my13ag_B6_P1_00090_20140129_01158.wu_0(resume) Sat 01 Feb 2014 10:29:02 AM EET SETI@home Restarting task ap_21my13ag_B6_P1_00090_20140129_01158.wu_0 using astropulse_v6 version 603 Sat 01 Feb 2014 10:29:02 AM EET SETI@home [cpu_sched] Starting ap_21my13ag_B6_P1_00146_20140129_01158.wu_1(resume) Sat 01 Feb 2014 10:29:02 AM EET SETI@home Restarting task ap_21my13ag_B6_P1_00146_20140129_01158.wu_1 using astropulse_v6 version 603 Sat 01 Feb 2014 10:29:02 AM EET SETI@home [cpu_sched] Starting ap_22au13ac_B4_P1_00141_20140129_06644.wu_0(resume) Sat 01 Feb 2014 10:29:02 AM EET SETI@home Restarting task ap_22au13ac_B4_P1_00141_20140129_06644.wu_0 using astropulse_v6 version 603 Sat 01 Feb 2014 10:29:02 AM EET SETI@home [cpu_sched] Starting ap_22au13ac_B4_P1_00178_20140129_06644.wu_0(resume) Sat 01 Feb 2014 10:29:02 AM EET SETI@home Restarting task ap_22au13ac_B4_P1_00178_20140129_06644.wu_0 using astropulse_v6 version 603 Sat 01 Feb 2014 10:29:02 AM EET SETI@home [cpu_sched] Starting ap_22au13ac_B5_P0_00308_20140129_18763.wu_0(resume) Sat 01 Feb 2014 10:29:02 AM EET SETI@home Restarting task ap_22au13ac_B5_P0_00308_20140129_18763.wu_0 using astropulse_v6 version 603 Sat 01 Feb 2014 10:29:02 AM EET SETI@home 
[cpu_sched] Starting ap_22au13ac_B6_P1_00214_20140129_32125.wu_0(resume) Sat 01 Feb 2014 10:29:02 AM EET SETI@home Restarting task ap_22au13ac_B6_P1_00214_20140129_32125.wu_0 using astropulse_v6 version 603 Sat 01 Feb 2014 10:29:02 AM EET SETI@home [cpu_sched] Starting 11ap13ac.4503.13324.438086664201.12.213_0(resume) Sat 01 Feb 2014 10:29:02 AM EET SETI@home Restarting task 11ap13ac.4503.13324.438086664201.12.213_0 using setiathome_v7 version 709 Sat 01 Feb 2014 10:29:02 AM EET SETI@home [cpu_sched] Starting 08se13aa.19612.16427.438086664206.12.57_1(resume) Sat 01 Feb 2014 10:29:02 AM EET SETI@home Restarting task 08se13aa.19612.16427.438086664206.12.57_1 using setiathome_v7 version 709 Sat 01 Feb 2014 10:29:02 AM EET SETI@home [cpu_sched] Starting 08se13aa.19612.16427.438086664206.12.21_0(resume) Sat 01 Feb 2014 10:29:02 AM EET SETI@home Restarting task 08se13aa.19612.16427.438086664206.12.21_0 using setiathome_v7 version 709 Sat 01 Feb 2014 10:29:02 AM EET SETI@home [cpu_sched] Starting 08se13ad.19679.18237.438086664206.12.44_0(resume) Sat 01 Feb 2014 10:29:02 AM EET SETI@home Restarting task 08se13ad.19679.18237.438086664206.12.44_0 using setiathome_v7 version 709 Sat 01 Feb 2014 10:29:02 AM EET SETI@home [cpu_sched] Starting 08se13ad.19679.18237.438086664206.12.38_1(resume) Sat 01 Feb 2014 10:29:02 AM EET SETI@home Restarting task 08se13ad.19679.18237.438086664206.12.38_1 using setiathome_v7 version 709 Sat 01 Feb 2014 10:29:02 AM EET SETI@home [cpu_sched] Starting 11ap13ac.4503.17414.438086664201.12.83_0(resume) Sat 01 Feb 2014 10:29:02 AM EET SETI@home Restarting task 11ap13ac.4503.17414.438086664201.12.83_0 using setiathome_v7 version 709 and some code ... from cudaAcc_CalcChiprData.cu ... // modified by petri33 #define NEWW 1 #ifdef NEWW #define B 8 #define N_TIMES B/2 #define THREADS 192 #else ... #endif ... 
// some new stuff by petri33 #ifdef NEWW __global__ void __launch_bounds__(THREADS, 4) cudaAcc_CalcChirpData_kernel_sm13(int NumDataPoints, double ccr, const float2 * const __restrict__ cx_DataArray, float2 * const __restrict__ cx_ChirpDataArray) //cudaAcc_CalcChirpData_kernel_sm13(int NumDataPoints, double ccr, float2 *cx_DataArray, float2 *cx_ChirpDataArray) { int iblock = blockIdx.x + blockIdx.y * gridDim.x; int ix = (iblock * blockDim.x + threadIdx.x) * B; double time = ix; float time2; float time3; float4 cx[N_TIMES]; for(int i = 0; i < N_TIMES; i++) // load cx[i] = *(float4 *)(&cx_DataArray[ix + (i<<1)]); time = __dmul_rn(time, time); time2 = (float)((ix << 1) + 1); time3 = __fmul_rn((float)ccr, 2.0f); time = __dmul_rn(ccr, time); time2 = __fmul_rn((float)ccr, time2); time = __dsub_rn(time, __double2int_rd(time)); // time2 = __fsub_rn(time2, __float2int_rd(time2)); // time3 = __fsub_rn(time3, __float2int_rd(time3)); float ft1 = time; float ft2 = time2; float ft3 = time3; ft1 = __fmul_rn(ft1, M_2PIf); ft2 = __fmul_rn(ft2, M_2PIf); ft3 = __fmul_rn(ft3, M_2PIf); float cf, sf, ca, sa, cb, sb; __sincosf(ft1, &sf, &cf); __sincosf(ft2, &sa, &ca); __sincosf(ft3, &sb, &cb); float4 tmp = cx[0]; const float nsb = -sb; for(int i = 0; i < N_TIMES; i++) // use f and g to rot { float tsca, tcca, sg, cg, sacb, cacb, tsa; float ft1f = __fmul_rn(tmp.y, -sf); float ft2f = __fmul_rn(tmp.y, cf); tsca = __fmul_rn(sf, ca); // rot f by a to make g tcca = __fmul_rn(cf, ca); // sacb = __fmul_rn(sa, cb); // rot a by b cacb = __fmul_rn(ca, cb); // sg = __fmaf_rn(cf, sa, tsca); // cg = __fmaf_rn(sf, -sa, tcca); // rot f to g by a ready tsa = sa; // sa = __fmaf_rn(ca, sb, sacb); // ca = __fmaf_rn(tsa, nsb, cacb); // rot a by b ready float ft3g = __fmul_rn(tmp.w, -sg); float ft4g = __fmul_rn(tmp.w, cg); cx[i].y = __fmaf_rn(tmp.x, sf, ft2f); cx[i].x = __fmaf_rn(tmp.x, cf, ft1f); tsca = __fmul_rn(sg, ca); // rot g by a to make f tcca = __fmul_rn(cg, ca); // cx[i].w = __fmaf_rn(tmp.z, sg, 
ft4g); cx[i].z = __fmaf_rn(tmp.z, cg, ft3g); tmp = cx[i+1]; sacb = __fmul_rn(sa, cb); // rot a by b again cacb = __fmul_rn(ca, cb); // tsa = sa; // sf = __fmaf_rn(cg, sa, tsca); // sf = __fmaf_rn(cg, sa, tsca); // cf = __fmaf_rn(sg, -sa, tcca); // rot g to f by a ready sa = __fmaf_rn(ca, sb, sacb); // ca = __fmaf_rn(tsa, nsb, cacb); // rot a by b ready } for(int i = 0; i < N_TIMES; i++) // store { *(float4 *)&(cx_ChirpDataArray[ix + (i<<1)]) = cx[i]; } } #else ... #endif ... // call double ccr = 0.5*chirp_rate*recip_sample_rate*recip_sample_rate; CUDA_ACC_SAFE_LAUNCH( (cudaAcc_CalcChirpData_kernel_sm13<<<grid, block>>>(cudaAcc_NumDataPoints, ccr, dev_cx_DataArray, dev_cx_ChirpDataArray)),true); ... and for AP I can not remember what I touched here ... inline float4 calc_chirp(float4 cconst, float dm_start, int i) { float4 result; float phase1, phase2; float phase_const = M_PI * (dm_start + (float)i); float freq1 = cconst.x * cconst.x; cconst.x *= 0.00176f; float freq2 = cconst.y * cconst.y; cconst.y *= 0.00176f; cconst.x += 1.0f; cconst.y += 1.0f; cconst.x = 1.0f/cconst.x; cconst.y = 1.0f/cconst.y; freq1 = freq1 * cconst.x; freq2 = freq2 * cconst.y; phase1 = phase_const * freq1; phase2 = phase_const * freq2; result.x = native_cos(phase1); result.y = native_sin(phase1); result.z = native_cos(phase2); result.w = native_sin(phase2); result.x *= cconst.z; result.y *= cconst.z; result.z *= cconst.w; result.w *= cconst.w; return result; } __kernel void dechirp_range1_kernel ( __global float4* gpu_data , __global float4* gpu_dechirped, __const float dm_start) { uint tid = get_global_id(0); uint dchunk=get_global_id(1); int j = tid*2; float4 cconst; float nrcp = -rcp_2SigmaSqr; float g1, g2; float freq1, freq2; float multiplier1, multiplier2; if(j < (FFT_SIZE/2)) { g1 = (float)j + 0.5f; g2 = (float)j + 1.5f; float float_j1 = (float)j; float float_j2 = (float)(j + 1); g1 *= g1; g2 *= g2; g1 *= nrcp; g2 *= nrcp; cconst.x = float_j1*(1.0f/32768.0f); cconst.y = 
float_j2*(1.0f/32768.0f); multiplier1 = native_sqrt(1.0f - native_exp(g1)); multiplier2 = native_sqrt(1.0f - native_exp(g2)); multiplier1 *= normalize_val; multiplier2 *= normalize_val; cconst.z = multiplier1; cconst.w = multiplier2; } else { g1 = (float)(16383 -(j - 16384)) + 0.5f; int j1 = j - 16384; g2 = (float)(16383 -(j - 16384 + 1)) + 0.5f; int j2 = j - 16384 + 1; g1 *= g1; g2 *= g2; g1 *= nrcp; g2 *= nrcp; cconst.x = (float)(j1 + 16384)*(1.0f/32768.0f) -1.0f; cconst.y = (float)(j2 + 16384)*(1.0f/32768.0f) -1.0f; multiplier1 = native_sqrt(1.0f - native_exp(g1)); multiplier2 = native_sqrt(1.0f - native_exp(g2)); multiplier1 *= normalize_val; multiplier2 *= normalize_val; cconst.z = multiplier1; cconst.w = multiplier2; } //R: each work item will process 2 complex data elements and write 2*2*16 new complex elements into dechirped array float4 data=gpu_data[tid+dchunk*(FFT_SIZE/2)]; float4 cur_chirp1; float4 cur_chirp2; float4 cur_chirp3; float4 cur_chirp4; float4 cur_dechirp11, cur_dechirp12; float4 cur_dechirp21, cur_dechirp22; float4 cur_dechirp31, cur_dechirp32; float4 cur_dechirp41, cur_dechirp42; float xx, yy, zz, ww, yx, xy, wz, zw; float xx2, yy2, zz2, ww2, yx2, xy2, wz2, zw2; float xx3, yy3, zz3, ww3, yx3, xy3, wz3, zw3; float xx4, yy4, zz4, ww4, yx4, xy4, wz4, zw4; for(uint i = 0; i < 16; i += 4) { //R: can be optimized via mad instruction probably cur_chirp1 = calc_chirp(cconst, dm_start, i); cur_chirp2 = calc_chirp(cconst, dm_start, i+1); cur_chirp3 = calc_chirp(cconst, dm_start, i+2); cur_chirp4 = calc_chirp(cconst, dm_start, i+3); //negative sign yy = data.y*cur_chirp1.y; xy = data.x*cur_chirp1.y; ww = data.w*cur_chirp1.w; zw = data.z*cur_chirp1.w; yy2 = data.y*cur_chirp2.y; xy2 = data.x*cur_chirp2.y; ww2 = data.w*cur_chirp2.w; zw2 = data.z*cur_chirp2.w; yy3 = data.y*cur_chirp3.y; xy3 = data.x*cur_chirp3.y; ww3 = data.w*cur_chirp3.w; zw3 = data.z*cur_chirp3.w; yy4 = data.y*cur_chirp4.y; xy4 = data.x*cur_chirp4.y; ww4 = data.w*cur_chirp4.w; zw4 = 
data.z*cur_chirp4.w; cur_dechirp11.x = mad(data.x, cur_chirp1.x, yy); cur_dechirp12.y = mad(data.y, cur_chirp1.x, xy); cur_dechirp11.z = mad(data.z, cur_chirp1.z, ww); cur_dechirp12.w = mad(data.w, cur_chirp1.z, zw); cur_dechirp21.x = mad(data.x, cur_chirp2.x, yy2); cur_dechirp22.y = mad(data.y, cur_chirp2.x, xy2); cur_dechirp21.z = mad(data.z, cur_chirp2.z, ww2); cur_dechirp22.w = mad(data.w, cur_chirp2.z, zw2); cur_dechirp31.x = mad(data.x, cur_chirp3.x, yy3); cur_dechirp32.y = mad(data.y, cur_chirp3.x, xy3); cur_dechirp31.z = mad(data.z, cur_chirp3.z, ww3); cur_dechirp32.w = mad(data.w, cur_chirp3.z, zw3); cur_dechirp41.x = mad(data.x, cur_chirp4.x, yy4); cur_dechirp42.y = mad(data.y, cur_chirp4.x, xy4); cur_dechirp41.z = mad(data.z, cur_chirp4.z, ww4); cur_dechirp42.w = mad(data.w, cur_chirp4.z, zw4); yy = -yy; xy = -xy; ww = -ww; zw = -zw; yy2 = -yy2; xy2 = -xy2; ww2 = -ww2; zw2 = -zw2; yy3 = -yy3; xy3 = -xy3; ww3 = -ww3; zw3 = -zw3; yy4 = -yy4; xy4 = -xy4; ww4 = -ww4; zw4 = -zw4; cur_dechirp12.x = mad(data.x, cur_chirp1.x, yy); cur_dechirp11.y = mad(data.y, cur_chirp1.x, xy); cur_dechirp12.z = mad(data.z, cur_chirp1.z, ww); cur_dechirp11.w = mad(data.w, cur_chirp1.z, zw); cur_dechirp22.x = mad(data.x, cur_chirp2.x, yy2); cur_dechirp21.y = mad(data.y, cur_chirp2.x, xy2); cur_dechirp22.z = mad(data.z, cur_chirp2.z, ww2); cur_dechirp21.w = mad(data.w, cur_chirp2.z, zw2); cur_dechirp32.x = mad(data.x, cur_chirp3.x, yy3); cur_dechirp31.y = mad(data.y, cur_chirp3.x, xy3); cur_dechirp32.z = mad(data.z, cur_chirp3.z, ww3); cur_dechirp31.w = mad(data.w, cur_chirp3.z, zw3); cur_dechirp42.x = mad(data.x, cur_chirp4.x, yy4); cur_dechirp41.y = mad(data.y, cur_chirp4.x, xy4); cur_dechirp42.z = mad(data.z, cur_chirp4.z, ww4); cur_dechirp41.w = mad(data.w, cur_chirp4.z, zw4); gpu_dechirped[32*(FFT_SIZE/2)*dchunk+(2*i+0)*(FFT_SIZE/2)+tid] = cur_dechirp11; gpu_dechirped[32*(FFT_SIZE/2)*dchunk+(2*i+1)*(FFT_SIZE/2)+tid] = cur_dechirp12; 
gpu_dechirped[32*(FFT_SIZE/2)*dchunk+(2*i+2)*(FFT_SIZE/2)+tid] = cur_dechirp21; gpu_dechirped[32*(FFT_SIZE/2)*dchunk+(2*i+3)*(FFT_SIZE/2)+tid] = cur_dechirp22; gpu_dechirped[32*(FFT_SIZE/2)*dchunk+(2*i+4)*(FFT_SIZE/2)+tid] = cur_dechirp31; gpu_dechirped[32*(FFT_SIZE/2)*dchunk+(2*i+5)*(FFT_SIZE/2)+tid] = cur_dechirp32; gpu_dechirped[32*(FFT_SIZE/2)*dchunk+(2*i+6)*(FFT_SIZE/2)+tid] = cur_dechirp41; gpu_dechirped[32*(FFT_SIZE/2)*dchunk+(2*i+7)*(FFT_SIZE/2)+tid] = cur_dechirp42; } } #else // dechirping via look up table To overcome Heisenbergs: "You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones |
petri33 Joined: 6 Jun 02 Posts: 1668 Credit: 623,086,772 RAC: 156 |
What does your 'ldd' say? # ldd setiathome_x41zc_x86_64-pc-linux-gnu_cuda55 linux-vdso.so.1 => (0x00007fff84dff000) libm.so.6 => /lib64/libm.so.6 (0x0000003ab9000000) libc.so.6 => /var/lib/boinc/projects/setiathome.berkeley.edu/libc.so.6 (0x00007f7714520000) libpthread.so.0 => /var/lib/boinc/projects/setiathome.berkeley.edu/libpthread.so.0 (0x00007f7714303000) libstdc++.so.6 => /usr/lib64/libstdc++.so.6 (0x00007f7713fc9000) libcudart.so.5.5 => /var/lib/boinc/projects/setiathome.berkeley.edu/libcudart.so.5.5 (0x00007f7713d7b000) libcufft.so.5.5 => /var/lib/boinc/projects/setiathome.berkeley.edu/libcufft.so.5.5 (0x00007f770f25a000) libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x0000003ab9800000) /lib64/ld-linux-x86-64.so.2 (0x0000003ab7c00000) libdl.so.2 => /lib64/libdl.so.2 (0x0000003ab8800000) librt.so.1 => /lib64/librt.so.1 (0x0000003755e00000) and for AP # ldd ap_6.07r1952_avx_clGPU_x86_64-pc-linux-gnu linux-vdso.so.1 => (0x00007fff891ff000) libOpenCL.so.1 => /usr/lib64/libOpenCL.so.1 (0x00007fb50239b000) libfftw3f.so.3 => /usr/lib64/libfftw3f.so.3 (0x0000003862e00000) libz.so.1 => /lib64/libz.so.1 (0x0000003ab9400000) libnsl.so.1 => /lib64/libnsl.so.1 (0x0000003acb800000) libstdc++.so.6 => /usr/lib64/libstdc++.so.6 (0x00007fb50205f000) libm.so.6 => /lib64/libm.so.6 (0x0000003ab9000000) libdl.so.2 => /lib64/libdl.so.2 (0x0000003ab8800000) libc.so.6 (0x00007fb501cd0000) libpthread.so.0 (0x00007fb501ab3000) /lib64/ld-linux-x86-64.so.2 (0x0000003ab7c00000) libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x0000003ab9800000) To overcome Heisenbergs: "You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones |
petri33 Joined: 6 Jun 02 Posts: 1668 Credit: 623,086,772 RAC: 156 |
.. And to compile AP on my host I use this kind of configure script (again works, may be too complicated) #Pre-prepared configuration for an AstroPulse AMD/ATI-GPU build using OpenCL : export BOINC_DIR=/petri/boinc_repo export BOINCDIR=/home/petri/boinc_repo export INCLUDE=/root/NVIDIA_GPU_Computing_SDK/OpenCL/common/inc export GPU=NV #./configure --enable-bitness=64 --build=x86_64-pc-linux-gnu --target=x86_64-pc-linux-gnu --with-boinc-platform=x86_64-pc-linux-gnu --enable-static --enable-static-client --enable-avx --disable-shared --disable-graphics --enable-intrinsics CXXFLAGS=" -O3 -march=core2 -mtune=core2 -mfpmath=sse -mavx --param inline-unit-growth=3000 -I/root/NVIDIA_GPU_Computing_SDK/OpenCL/common/inc -I/root/NVIDIA_GPU_Computing_SDK/shared/inc" CPPFLAGS=" -DUSE_FFTW -DUSE_CONVERSION_OPT -DUSE_INCREASED_PRECISION -DSMALL_CHIRP_TABLE -DUSE_OPENCL -DUSE_AVX -DTWIN_FFA -DUSE_OPENCL_NV -DOPENCL_WRITE -DCOMBINED_DECHIRP_KERNEL -DOCL_ZERO_COPY -DAP_CLIENT" LIBS=" -L/usr/lib64 -lOpenCL -L/opt/lib-4.12/lib" LDFLAGS=" -static-libgcc -static-libstdc++" BOINCDIR=" /home/petri/boinc_repo" SETI_BOINC_DIR=" ../../AKv8" LIBS="/opt/lib-4.12/lib/libm.so.6 /opt/lib-4.12/lib/libc.so /opt/lib-4.12/lib/libpthread.so /usr/lib64/libstdc++.so /usr/lib64/lib/libm.so.6 " ./configure --enable-bitness=64 --build=x86_64-pc-linux-gnu --target=x86_64-pc-linux-gnu --with-boinc-platform=x86_64-pc-linux-gnu --enable-static --enable-static-client --enable-sse2 --disable-shared --disable-graphics --enable-intrinsics CXXFLAGS=" -O3 -march=core2 -mtune=core2 -mfpmath=sse -msse2 --param inline-unit-growth=3000 -I/root/NVIDIA_GPU_Computing_SDK/OpenCL/common/inc -I/root/NVIDIA_GPU_Computing_SDK/shared/inc" CPPFLAGS=" -DUSE_FFTW -DUSE_CONVERSION_OPT -DUSE_INCREASED_PRECISION -DSMALL_CHIRP_TABLE -DUSE_OPENCL -DUSE_SSE2 -DUSE_OPENCL_NV -DOPENCL_WRITE -DCOMBINED_DECHIRP_KERNEL -DOCL_ZERO_COPY -DAP_CLIENT" LIBS=" -L/usr/lib64 -lOpenCL -L/opt/lib-4.12/lib" LDFLAGS=" -static-libgcc -static-libstdc++" 
BOINCDIR=" /home/petri/boinc_repo" SETI_BOINC_DIR=" ../../AKv8" LIBS="/opt/lib-4.12/lib/libm.so.6 /opt/lib-4.12/lib/libc.so /opt/lib-4.12/lib/libpthread.so /usr/lib64/libstdc++.so /usr/lib64/lib/libm.so.6 " #For Nvidia version please add "-DUSE_OPENCL_NV" to CPPFLAGS above. To overcome Heisenbergs: "You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones |
petri33 Joined: 6 Jun 02 Posts: 1668 Credit: 623,086,772 RAC: 156 |
... If I remember correctly, I did not download any extra packages to get OpenCL. I have a feeling that it comes with the NVIDIA driver. On my machine I have to switch to init 1 and kill X before installing the driver, then just reboot. -- Petri To overcome Heisenbergs: "You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones |
Juha Joined: 7 Mar 04 Posts: 388 Credit: 1,857,738 RAC: 0 |
I like Fedora 14. It has the last of the good windows-like interface. As in GNOME 2? There's MATE, which is a fork of GNOME 2, and I think I read something about GNOME 3 having a "classic" mode. I haven't used either one, but if you want to stay in Red Hat land you could give those a try. edit: Oh, right. PNI = Prescott New Instructions aka SSE3 |
arkayn Joined: 14 May 99 Posts: 4438 Credit: 55,006,323 RAC: 0 |
If you do that, go with Mint instead of Ubuntu. http://www.linuxmint.com/ |
petri33 Joined: 6 Jun 02 Posts: 1668 Credit: 623,086,772 RAC: 156 |
In the not-so-distant future I'm going to build a new machine. Which Linux should I pick? -- Petri To overcome Heisenbergs: "You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874 |
setiathome_enhanced is the old v6 application we left behind in the spring of 2013. You need to be on setiathome_v7 now - global search and replace. I *think* x41g will handle that, but double-check. |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874 |
PRE tags. -rw-rw-r--. 1 guy guy 2488 Feb 1 14:51 app_info.xml -rw-rw-r--. 1 guy guy 2438 Feb 1 14:46 app_info.xml~ -rw-r--r--. 1 boinc boinc 52779 Feb 1 11:21 arecibo_181.png -rwxr-xr-x. 1 root root 313872 Feb 1 10:42 libcudart.so.3 -rwxr-xr-x. 1 root root 28996288 Feb 1 10:42 libcufft.so.3 -rw-r--r--. 1 boinc boinc 2536 Feb 1 11:21 sah_40.png -rw-r--r--. 1 boinc boinc 25488 Feb 1 11:21 sah_banner_290.png -rw-r--r--. 1 boinc boinc 35399 Feb 1 11:21 sah_ss_290.png -rwxr-xr-x. 1 root root 1564204 Feb 1 14:34 setiathome_7.01_x86_64-pc-linux-gnu -rwxr-xr-x. 1 root root 6373944 Feb 1 10:43 setiathome_x41g_x86_64-pc-linux-gnu_cuda32 -rw-r--r--. 1 boinc boinc 71 Feb 1 14:52 slideshow_setiathome_enhanced_00 -rw-r--r--. 1 boinc boinc 72 Feb 1 14:52 slideshow_setiathome_enhanced_01 -rw-r--r--. 1 boinc boinc 75 Feb 1 14:52 slideshow_setiathome_enhanced_02 -rw-r--r--. 1 boinc boinc 67 Feb 1 14:52 stat_icon |
arkayn Joined: 14 May 99 Posts: 4438 Credit: 55,006,323 RAC: 0 |
setiathome_enhanced is the old v6 application we left behind in the spring of 2013. It does, I have an updated app_info with it on my site. http://www.arkayn.us/forum/index.php?action=tpmod;dl=item24 |
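Several posts in this thread suggest quick checks: file permissions on the app binaries, whether an OpenCL runtime is registered, what ldd reports, and renaming setiathome_enhanced to setiathome_v7 in app_info.xml. The following is a minimal shell sketch of those checks, not anything shipped with BOINC or Lunatics. The function names are made up for illustration, and the default directory /etc/OpenCL/vendors is an assumption about where the OpenCL ICD loader looks for vendor files on a typical Linux install.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative only.

# Print the names of libraries that ldd reports as "not found".
# Reads ldd-style output on stdin, so it can be fed from a pipe.
missing_libs() {
    awk '/not found/ { print $1 }'
}

# Succeed if at least one *.icd vendor file exists in the given directory.
# Default path is an assumption (the usual ICD loader location on Linux).
has_opencl_icd() {
    dir="${1:-/etc/OpenCL/vendors}"
    ls "$dir"/*.icd >/dev/null 2>&1
}

# Succeed if the file exists and is executable by the current user.
is_runnable() {
    [ -f "$1" ] && [ -x "$1" ]
}

# Replace the old v6 app name with the v7 name; XML in on stdin, out on stdout.
# This is a blind global replace, so review the result before using it.
rename_app_v7() {
    sed 's/setiathome_enhanced/setiathome_v7/g'
}
```

Possible usage, run from the project directory: `ldd ./setiathome_x41zc_x86_64-pc-linux-gnu_cuda55 | missing_libs`, then `has_opencl_icd || echo "no OpenCL vendor file found"`, then `rename_app_v7 < app_info.xml > app_info.xml.new`. Writing to a new file rather than editing in place keeps the original app_info.xml around in case the replace touches something it should not.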
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.