Linux (ARM processor) app and alternatives
Author | Message |
---|---|
Juha Joined: 7 Mar 04 Posts: 388 Credit: 1,857,738 RAC: 0 |
If a password is set, it is always required. Whether to set one is something you have to decide. The idea behind the password is that an admin can install BOINC on a computer, and the people using that computer then can't add or remove projects or otherwise mess with BOINC's settings. If you trust everyone using your Parallella, you can leave the password empty. If you trust everyone on your LAN, you can leave the password empty and enable remote connections. BOINC Manager can be a bit difficult at times, thinking it knows better than you which password you want to use. I tried connecting to my passwordless RPi: the first attempt failed, but the second succeeded. |
Raistmer Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
FFTW3.3.6 based build considerably faster than current stock, especially on VHAR:
====================================================================
Current WU: PG0444_v8.wu
----------------------------------------------------------------
Running default app with command :... setiathome_8.02_arm-unknown-linux-gnueabihf -st -verb -nog
./setiathome_8.02_arm-unknown-linux-gnueabihf -st -verb -nog 9465.81 sec 9388.08 sec 62.09 sec
Elapsed Time: ....................... 9466 seconds
----------------------------------------------------------------
Running app with command : .......... setiathome-8.0.armv7l-unknown-linux-gnueabihf_R_default -st -verb -nog
./setiathome-8.0.armv7l-unknown-linux-gnueabihf_R_default -st -verb -nog 7800.72 sec 7724.10 sec 64.75 sec
Elapsed Time : ...................... 7801 seconds
Speed compared to default : ......... 121 %
-----------------
Comparing results
Result : Strongly similar, Q= 99.97%
----------------------------------------------------------------
Done with PG0444_v8.wu
====================================================================
Current WU: PG1327_v8.wu
----------------------------------------------------------------
Running default app with command :... setiathome_8.02_arm-unknown-linux-gnueabihf -st -verb -nog
./setiathome_8.02_arm-unknown-linux-gnueabihf -st -verb -nog 11311.02 sec 11144.48 sec 149.21 sec
Elapsed Time: ....................... 11311 seconds
----------------------------------------------------------------
Running app with command : .......... setiathome-8.0.armv7l-unknown-linux-gnueabihf_R_default -st -verb -nog
./setiathome-8.0.armv7l-unknown-linux-gnueabihf_R_default -st -verb -nog 7801.03 sec 7640.19 sec 150.95 sec
Elapsed Time : ...................... 7801 seconds
Speed compared to default : ......... 144 % <<<<<<<<<<<<<<<
-----------------
Comparing results
Result : Strongly similar, Q= 100.0%
----------------------------------------------------------------
Done with PG1327_v8.wu
SETI apps news We're not gonna fight them. We're gonna transcend them. |
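The "Speed compared to default" figures in the log above are consistent with dividing the default app's elapsed time by the new build's elapsed time and dropping the fraction. A minimal sketch of that arithmetic follows. The bench script's actual formula is not shown in the thread, so the truncation is an assumption, and `speed_percent` is a hypothetical name.

```cpp
#include <stdexcept>

// Speed of a candidate build relative to the default app, in percent.
// Assumption: the percentage is truncated rather than rounded. This matches
// both figures in the log (121 % and 144 %), but the script is not shown.
int speed_percent(long default_seconds, long candidate_seconds) {
    if (candidate_seconds <= 0) {
        throw std::invalid_argument("candidate_seconds must be positive");
    }
    return static_cast<int>(default_seconds * 100 / candidate_seconds);
}
```

With the elapsed times from the log, `speed_percent(9466, 7801)` gives 121 and `speed_percent(11311, 7801)` gives 144.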
Claggy Joined: 5 Jul 99 Posts: 4654 Credit: 47,537,079 RAC: 4 |
FFTW3.3.6 based build considerably faster than current stock, especially on VHAR: Is that because of longer, better FFTW planning, because of FFTW 3.3.6, or both? Claggy |
Raistmer Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
If password is set then it is always needed. The cc_config approach failed:
04-Feb-2017 21:18:23 [---] Unrecognized tag in cc_config.xml: <allow_remote_gui_rpc>
Also, the Windows BOINC Manager gets really slow when it fails to connect to another computer. That's a bug... EDIT: I now have the Parallella's IP listed and an empty password file, but the Windows BOINC Manager still can't connect. What's wrong with it? (BOINC 7.4.42 x86) SETI apps news We're not gonna fight them. We're gonna transcend them. |
Raistmer Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
I made some modifications to the Linux KWSN bench script, so it is now more similar to the Windows one. It suspends and resumes BOINC automatically and keeps cache files intact. Something unusual (if coming from Windows) has already been spotted:
As one can see, the elapsed time stays the same across all 3 wisgen tasks, even though wisdom.sah is available from the previous run. This is very different from the usual Windows app behavior, where planning is a 2-stage process: the first planning run is fairly short, the second is much longer, and the third run uses fully prepared wisdom and takes the least time. Evidently the current stock build does very little planning (estimate instead of bench). Let's see now how my build behaves (it was compiled with FFTW in slow-timer mode)... SETI apps news We're not gonna fight them. We're gonna transcend them. |
Raistmer Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
FFTW3.3.6 based build considerably faster than current stock, especially on VHAR: I'm just in the process of establishing that. But one thing is already obvious: stock FFTW 3.3.4 has no NEON codelets at all, while my FFTW 3.3.6-based app shows NEON codelets in its wisdom file. Regarding planning itself, just await the results of the next test... SETI apps news We're not gonna fight them. We're gonna transcend them. |
Raistmer Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
And now the same test with another build: KWSN-Linux-MBbench v3.0 cache-keeping edition
Summary:
1) The stock codebase doesn't implement the 2-stage planning Joe introduced to the AKv8 codebase.
2) 1-stage planning (a real one) still takes place, so the first run is considerably longer and subsequent ones can use wisdom.
3) In a real-life application this "use wisdom" will occur only on restart from a checkpoint, because wisdom.sah is placed in the slot instead of the project directory.
SETI apps news We're not gonna fight them. We're gonna transcend them. |
Tom Rinehart Joined: 12 Dec 01 Posts: 113 Credit: 13,255,975 RAC: 6 |
I sent Eric both the Linux ARMHF and ARM64 apps for testing on Beta. FFTW 3.3.6 makes such a big difference. |
Raistmer Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
Yes, a 20 to 40% speed improvement from a simple rebuild, that's great. Now on to the app's own NEON functions... EDIT: a question on the build procedure: should I re-run automake and configure after the make clean command, or just run make -j 2? EDIT2: it seems not, if only the sources were changed and the build options stayed the same. SETI apps news We're not gonna fight them. We're gonna transcend them. |
Tom Rinehart Joined: 12 Dec 01 Posts: 113 Credit: 13,255,975 RAC: 6 |
I think you can just run make clean and then make if you are testing code changes. |
Claggy Joined: 5 Jul 99 Posts: 4654 Credit: 47,537,079 RAC: 4 |
But one thing is obvious currently. stock FFTW3.3.4 has no NEON codelets at all while my FFTW3.3.6-based app shows NEON codelets in wisdom file. That's why I built the 8.03 app available on Seti Beta when Raspbian switched to the Debian fftw 3.3.4 that has the Neon patch. Claggy |
Raistmer Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
But one thing is obvious currently. stock FFTW3.3.4 has no NEON codelets at all while my FFTW3.3.6-based app shows NEON codelets in wisdom file. I haven't got it so far. Is it available for direct download? SETI apps news We're not gonna fight them. We're gonna transcend them. |
Claggy Joined: 5 Jul 99 Posts: 4654 Credit: 47,537,079 RAC: 4 |
But one thing is obvious currently. stock FFTW3.3.4 has no NEON codelets at all while my FFTW3.3.6-based app shows NEON codelets in wisdom file. Yes, just take the Beta url, and change the setiathome_8.02_arm-unknown-linux-gnueabihf filename to 8.03: http://boinc2.ssl.berkeley.edu/beta/download/setiathome_8.03_arm-unknown-linux-gnueabihf It is a little bit faster than the 8.02 app, but not hugely. Claggy |
Raistmer Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
Thanks, I will try it too. Regarding neon_ChirpData:
*** Error in `./setiathome-8.0.armv7l-unknown-linux-gnueabihf': free(): invalid pointer: 0xb26e7080 ***
So it seems the failure is not in the chirp itself. This needs some debugging. SETI apps news We're not gonna fight them. We're gonna transcend them. |
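A `free(): invalid pointer` abort from glibc can mean that heap bookkeeping was overwritten before the call to `free()`, for example by a routine writing past the end of its output buffer. One way to test that suspicion is to pad the output buffer with guard values and check them after the call. The sketch below is only an illustration: `writes_stay_in_bounds`, the guard size and the sentinel value are all hypothetical and not part of the SETI@home code.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Runs fn(out, n) on a buffer of n floats surrounded by guard elements and
// reports whether the guards are intact afterwards. The guard regions belong
// to the same allocation, so a small overrun is detected instead of
// corrupting the heap.
bool writes_stay_in_bounds(const std::function<void(float*, std::size_t)>& fn,
                           std::size_t n, std::size_t guard = 64) {
    const float sentinel = -12345.0f;  // arbitrary marker value (assumption)
    std::vector<float> buf(n + 2 * guard, sentinel);
    fn(buf.data() + guard, n);
    for (std::size_t i = 0; i < guard; ++i) {
        if (buf[i] != sentinel || buf[guard + n + i] != sentinel) {
            return false;  // something wrote outside [out, out + n)
        }
    }
    return true;
}
```

This only catches overruns smaller than the guard region, and it misses a stray write that happens to store the sentinel value itself.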
MarkJ Joined: 17 Feb 08 Posts: 1139 Credit: 80,854,192 RAC: 5 |
You did put the host names or IP addresses in remote_hosts.cfg as well? That is, the names of the computers you're going to allow to connect to the Parallella. You might also need to have them listed in /etc/hosts.allow. I found that, for whatever reason, my Pis and Parallellas wouldn't appear by name in the router's list of connected computers, but they always work if I use the IP address. I assume this is a bug in either the router or Linux (on the Pi/Parallella). If you want 7.6.33, you can add jessie-backports and get it from the standard Debian repo. You just need to include it in /etc/apt/sources.list and then run apt-get update followed by apt-get install -t jessie-backports boinc-client. BOINC blog |
Raistmer Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
Thanks, it's always worth learning something new about Linux management, but I think I'll stay with the current setup for now. I added both the Parallella and the Windows host to BOINC's hosts file (by IP), and now, after killing the BOINCtasks process on the Windows host, both boinccmd from the Parallella and BOINC Manager from Windows can connect. So automated suspend/resume works perfectly, keeping the Parallella busy while I'm not benching a new build. I'm currently adding some debug info to see exactly where the app crashes with the NEON chirp active. EDIT: just running make in the client folder seems to be enough to recompile changed sources and get an updated binary, which is much, much faster than a complete rebuild on this not too fast host. SETI apps news We're not gonna fight them. We're gonna transcend them. |
Raistmer Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
This is how the in/out buffers look in v_Chirp:
before Chirp test: v_ChirpData
in[0].xy=(1.00000,0.00000)]
in[1].xy=(1.00000,0.00000)]
in[2].xy=(1.00000,0.00001)]
after Chirp test: v_ChirpData
out[0].xy=(1.00000,0.00000)]
out[1].xy=(1.00000,0.00000)]
out[2].xy=(1.00000,0.00001)]
and in neon_Chirp:
before Chirp test: neon_ChirpData
in[0].xy=(1.00000,0.00000)]
in[1].xy=(1.00000,0.00000)]
in[2].xy=(1.00000,0.00001)]
after Chirp test: neon_ChirpData
out[0].xy=(0.00000,0.00000)]
out[1].xy=(0.00000,0.00000)]
out[2].xy=(0.00000,0.00000)]
neon_ChirpData 0.000004 0.00000 test
neon_ChirpData 0.000004 0.00000 choice
So, with this debug info printed, it doesn't crash immediately but shows an unbelievably fast completion, just as Claggy saw in his builds before. So neon_Chirp returns wrong data in the out buffer and perhaps also does some out-of-bounds memory writes that result in other data corruption and a crash in some of the binaries. Funny that the wisgen task per se validated OK %) SETI apps news We're not gonna fight them. We're gonna transcend them. |
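The dump above shows `neon_ChirpData` leaving zeros in the output where `v_ChirpData` produced values near (1, 0). A minimal sketch of a check that would reject such a routine before its timing is considered follows. `outputs_match` and the tolerance are hypothetical, and this is not the comparison the SETI@home app itself performs.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Returns true when every element of `candidate` is within `tol` of the
// corresponding element of `reference`. Buffers hold interleaved (x, y) pairs.
bool outputs_match(const std::vector<float>& reference,
                   const std::vector<float>& candidate,
                   float tol = 1e-4f) {
    if (reference.size() != candidate.size()) return false;
    for (std::size_t i = 0; i < reference.size(); ++i) {
        // The negated comparison also rejects NaN values.
        if (!(std::fabs(reference[i] - candidate[i]) <= tol)) return false;
    }
    return true;
}
```

Fed the three pairs from the dump, the reference output matches itself, while the all-zero NEON output is rejected.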
Raistmer Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
Well, the main difference between the Chirp that doesn't work and the other functions that do work is the doubles in its arguments. The app is tuned to pass parameters in VFP registers; it will not link if just a single object is compiled with a different convention. But can a VFP register hold a double? And should it, in the armhf calling model? The original code, intended for Android, contains:
/* r0 - input data
 * r1 - output data
 * r2 - chirprateind
 * sp[0-1] - chirp_rate
 * sp[2] - numDataPoints
 * sp[4-5] - sample_rate
 */
So each double argument takes two sp slots (stack words?). I think the correct move is to write a dummy C function, compile it to assembler and see how it treats its arguments on armhf. SETI apps news We're not gonna fight them. We're gonna transcend them. |
Raistmer Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
C:
#include "sah_config.h"
extern "C" {
// Dummy with the same signature as neon_ChirpData. It touches every argument
// so the compiler output shows where each one is passed.
int neon_ChirpData(float* in, float* out, int chirp_rate_ind, double chirp_rate,
                   int ul_NumDataPoints, double sample_rate) {
    out[0] = 2.f * in[0];
    return int(chirp_rate + sample_rate) + ul_NumDataPoints + chirp_rate_ind;
}
}
asm:
    .syntax unified
    .arch armv7-a
    .eabi_attribute 27, 3
    .eabi_attribute 28, 1
    .fpu vfpv3-d16
    .eabi_attribute 23, 1
    .eabi_attribute 24, 1
    .eabi_attribute 25, 1
    .eabi_attribute 26, 2
    .eabi_attribute 30, 2
    .eabi_attribute 34, 1
    .eabi_attribute 18, 4
    .thumb
    .file "neon_ChirpDummy.cpp"
    .text
    .align 2
    .global neon_ChirpData
    .thumb
    .thumb_func
    .type neon_ChirpData, %function
neon_ChirpData:
    .fnstart
.LFB0:
    @ args = 0, pretend = 0, frame = 0
    @ frame_needed = 0, uses_anonymous_args = 0
    @ link register save eliminated.
    vadd.f64 d1, d0, d1
    vldr.32 s15, [r0]
    vadd.f32 s15, s15, s15
    vcvt.s32.f64 s2, d1
    vstr.32 s15, [r1]
    vmov r1, s2 @ int
    add r1, r1, r3
    adds r0, r1, r2
    bx lr
    .cantunwind
    .fnend
    .size neon_ChirpData, .-neon_ChirpData
    .ident "GCC: (Ubuntu/Linaro 4.9.2-10ubuntu13) 4.9.2"
    .section .note.GNU-stack,"",%progbits
GCC emitted Thumb code, which makes comparison harder :D
And with NEON as the FPU and ARM as the ISA:
    .arch armv7-a
    .eabi_attribute 27, 3
    .eabi_attribute 28, 1
    .fpu neon
    .eabi_attribute 23, 1
    .eabi_attribute 24, 1
    .eabi_attribute 25, 1
    .eabi_attribute 26, 2
    .eabi_attribute 30, 2
    .eabi_attribute 34, 1
    .eabi_attribute 18, 4
    .file "neon_ChirpDummy.cpp"
    .text
    .align 2
    .global dummy_ChirpData
    .type dummy_ChirpData, %function
dummy_ChirpData:
    .fnstart
.LFB0:
    @ args = 0, pretend = 0, frame = 0
    @ frame_needed = 0, uses_anonymous_args = 0
    @ link register save eliminated.
    vadd.f64 d1, d0, d1       @ both double arguments
    vldr.32 s15, [r0]         @ input
    vadd.f32 s15, s15, s15
    vcvt.s32.f64 s2, d1
    vstr.32 s15, [r1]         @ output
    vmov r1, s2 @ int
    add r3, r1, r3
    add r0, r3, r2
    bx lr
    .cantunwind
    .fnend
    .size dummy_ChirpData, .-dummy_ChirpData
    .ident "GCC: (Ubuntu/Linaro 4.9.2-10ubuntu13) 4.9.2"
    .section .note.GNU-stack,"",%progbits
This little example shows that on armhf the arguments are indeed in registers, with d* apparently used for doubles. So the original NEON chirp definitely has to be rewritten, at least in the part that retrieves the function arguments: it shouldn't take them from the stack, it should use the d-registers. SETI apps news We're not gonna fight them. We're gonna transcend them. |
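Since the hand-written routine only works under one calling convention, it can help to have the build report which convention it is actually using. The sketch below assumes a compiler that defines the ACLE macros `__ARM_PCS_VFP` and `__ARM_PCS` (recent GCC and Clang releases do, but check your toolchain). `arm_float_abi` is a hypothetical helper name.

```cpp
#include <string>

// Reports which floating-point calling convention this translation unit was
// compiled for, based on ACLE predefined macros (assumed to be available).
std::string arm_float_abi() {
#if defined(__ARM_PCS_VFP)
    return "hardfp (AAPCS-VFP): floating-point arguments in VFP registers";
#elif defined(__ARM_PCS)
    return "base AAPCS (soft/softfp): floating-point arguments in core registers or on the stack";
#elif defined(__arm__)
    return "32-bit ARM, calling convention macros not defined";
#else
    return "not a 32-bit ARM target";
#endif
}
```

The same `#if defined(__ARM_PCS_VFP)` test could guard a declaration of the assembly routine, so that a softfp-only implementation is not selected in an armhf build.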
Juha Joined: 7 Mar 04 Posts: 388 Credit: 1,857,738 RAC: 0 |
cc_config approach failed: Those options go way back, so there must be something wrong in your cc_config.xml. As for the optimised functions not working on hardfp: those functions were originally coded for Android, and Android uses the softfp ABI, in which floating-point function arguments are passed in integer registers (or on the stack once those run out). The Parallella (and RPi and others) use the hardfp ABI, in which floating-point arguments are passed in floating-point registers. Other than the function headers, I don't see anything in the assembly code that takes the different ABIs into account, so I think they are just hard-coded for softfp. |
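The two layouts seen in this thread, the Android comment (r0 to r2, then sp[0-1], sp[2], sp[4-5]) and the GCC armhf output (r0 to r3 plus d0 and d1), both follow from the argument-placement rules of the 32-bit ARM procedure call standard. The sketch below is a deliberately simplified model of those rules, covering only 4-byte integers/pointers and 8-byte doubles. `place_arguments` and `ArgKind` are hypothetical names, and floats, structs and variadic functions are left out.

```cpp
#include <string>
#include <vector>

// Two argument kinds are enough for the neon_ChirpData signature:
// 4-byte integers/pointers and 8-byte doubles.
enum class ArgKind { Word, Double };

// Simplified model of argument placement on 32-bit ARM.
// hardfp == false models the base standard (used by softfp),
// hardfp == true models the VFP variant (armhf).
std::vector<std::string> place_arguments(const std::vector<ArgKind>& args, bool hardfp) {
    std::vector<std::string> where;
    int core = 0;   // next free core register, r0..r3
    int vfp = 0;    // next free double register, d0..d7 (hardfp only)
    int stack = 0;  // next free stack offset in bytes

    auto push_stack = [&](int size) {
        stack = (stack + size - 1) / size * size;  // align to the argument size
        where.push_back("sp+" + std::to_string(stack));
        stack += size;
    };

    for (ArgKind kind : args) {
        if (kind == ArgKind::Word) {
            if (core < 4) where.push_back("r" + std::to_string(core++));
            else push_stack(4);
        } else if (hardfp) {
            if (vfp < 8) where.push_back("d" + std::to_string(vfp++));
            else push_stack(8);
        } else {
            core = (core + 1) / 2 * 2;  // doubles start at an even core register
            if (core <= 2) {
                where.push_back("r" + std::to_string(core) + ":r" + std::to_string(core + 1));
                core += 2;
            } else {
                core = 4;  // once an argument goes to the stack, the rest follow
                push_stack(8);
            }
        }
    }
    return where;
}
```

For the signature (pointer, pointer, int, double, int, double), the model gives r0, r1, r2, sp+0, sp+8, sp+16 for softfp, which in 4-byte words is the sp[0-1], sp[2], sp[4-5] layout from the Android comment. For hardfp it gives r0, r1, r2, d0, r3, d1, matching the registers used in the GCC output.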