Linux (ARM processor) app and alternatives

Juha
Volunteer tester

Joined: 7 Mar 04
Posts: 388
Credit: 1,857,738
RAC: 0
Finland
Message 1846682 - Posted: 4 Feb 2017, 20:45:01 UTC - in response to Message 1846333.  

If password is set then it is always needed.

So should I make that file void?


That's something you have to decide. The idea behind the password is that an admin can install BOINC on a computer and the people using the computer then can't add or remove projects or otherwise mess with BOINC's settings.

If you trust everyone using your Parallella you can leave the password empty. If you trust everyone using your LAN you can leave the password empty and enable remote connections.

BOINC Manager can be a bit difficult at times, thinking it knows better than you what password you want to use. I tried connecting to my passwordless RPi: the first attempt failed but the second succeeded.
Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1846698 - Posted: 4 Feb 2017, 21:44:34 UTC

The FFTW 3.3.6-based build is considerably faster than the current stock app, especially on VHAR tasks:
====================================================================
Current WU: PG0444_v8.wu

----------------------------------------------------------------
Running default app with command :... setiathome_8.02_arm-unknown-linux-gnueabihf -st -verb -nog
./setiathome_8.02_arm-unknown-linux-gnueabihf -st -verb -nog 9465.81 sec 9388.08 sec 62.09 sec
Elapsed Time: ....................... 9466 seconds

----------------------------------------------------------------
Running app with command : .......... setiathome-8.0.armv7l-unknown-linux-gnueabihf_R_default -st -verb -nog
./setiathome-8.0.armv7l-unknown-linux-gnueabihf_R_default -st -verb -nog 7800.72 sec 7724.10 sec 64.75 sec
Elapsed Time : ...................... 7801 seconds
Speed compared to default : ......... 121 %
-----------------
Comparing results
Result      : Strongly similar,  Q= 99.97%

----------------------------------------------------------------
Done with PG0444_v8.wu

====================================================================
Current WU: PG1327_v8.wu

----------------------------------------------------------------
Running default app with command :... setiathome_8.02_arm-unknown-linux-gnueabihf -st -verb -nog
./setiathome_8.02_arm-unknown-linux-gnueabihf -st -verb -nog 11311.02 sec 11144.48 sec 149.21 sec
Elapsed Time: ....................... 11311 seconds

----------------------------------------------------------------
Running app with command : .......... setiathome-8.0.armv7l-unknown-linux-gnueabihf_R_default -st -verb -nog
./setiathome-8.0.armv7l-unknown-linux-gnueabihf_R_default -st -verb -nog 7801.03 sec 7640.19 sec 150.95 sec
Elapsed Time : ...................... 7801 seconds
Speed compared to default : ......... 144 % <<<<<<<<<<<<<<<
-----------------
Comparing results
Result      : Strongly similar,  Q= 100.0%

----------------------------------------------------------------
Done with PG1327_v8.wu
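
The "Speed compared to default" figures in these logs are consistent with dividing the default app's elapsed seconds by the tested app's elapsed seconds and truncating to a whole percent. A minimal sketch of that arithmetic follows. `speed_percent` is a hypothetical name, and the truncation is an assumption inferred from the logged values (11311 s against 7801 s is reported as 144 %, not 145 %), not something taken from the bench script itself.

```cpp
#include <cassert>

// Hypothetical helper: relative speed of a tested app versus the default app,
// as a whole percentage. Values above 100 mean the tested app is faster.
// Integer division truncates, which matches the figures printed in the logs.
int speed_percent(int default_seconds, int app_seconds) {
    assert(app_seconds > 0);  // a finished run always has a positive elapsed time
    return default_seconds * 100 / app_seconds;
}
```

For the two tasks above, `speed_percent(9466, 7801)` gives 121 and `speed_percent(11311, 7801)` gives 144.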

SETI apps news
We're not gonna fight them. We're gonna transcend them.
Claggy
Volunteer tester

Joined: 5 Jul 99
Posts: 4654
Credit: 47,537,079
RAC: 4
United Kingdom
Message 1846701 - Posted: 4 Feb 2017, 21:48:44 UTC - in response to Message 1846698.  
Last modified: 4 Feb 2017, 21:49:38 UTC

FFTW3.3.6 based build considerably faster than current stock, especially on VHAR:

Is that because of longer, better FFTW planning or because of FFTW 3.3.6? Or both?

Claggy
Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1846705 - Posted: 4 Feb 2017, 21:53:13 UTC - in response to Message 1846682.  
Last modified: 4 Feb 2017, 22:00:46 UTC

If password is set then it is always needed.

So should I make that file void?


That's something you have to decide. The idea behind the password is that an admin can install BOINC on a computer and then people using the computer can't add or remove projects or otherwise mess up with BOINC's settings.

If you trust everyone using your Parallela you can leave the password empty. If you trust everyone using your LAN you can leave the password empty and enable remote connections.

BOINC Manager can be at times a bit difficult thinking it knows better than you what password you want to use. I tried connecting to my passwordless RPi and on first attempt it failed but succeeded on the second attempt.

cc_config approach failed:

04-Feb-2017 21:18:23 [---] Unrecognized tag in cc_config.xml: <allow_remote_gui_rpc>
04-Feb-2017 21:18:23 [---] Unrecognized tag in cc_config.xml: <skip_cpu_benchmarks>

Also, the Windows BOINC Manager gets really slow when it fails to connect to another computer.
It's a bug...

EDIT: I now have the Parallella's IP listed and an empty password file, but the Windows BOINC Manager still can't connect. What's wrong with it? (BOINC 7.4.42 x86)
Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1846716 - Posted: 4 Feb 2017, 22:34:11 UTC
Last modified: 4 Feb 2017, 22:56:02 UTC

I made some modifications to the Linux KWSN bench script, so it's now more similar to the Windows one.
It suspends/resumes BOINC automatically and keeps cache files intact.
Something unusual (when coming from Windows) has already been spotted:

KWSN-Linux-MBbench v3.0 cache-keeping edition
Running on parallella at Sat 04 Feb 2017 10:20:13 PM UTC
----------------------------------------------------------------
Starting benchmark run...
----------------------------------------------------------------
Suspending BOINC
Listing wu-file(s) in /testWUs :
#WisGen1_v8.wu
#WisGen2_v8.wu
#WisGen3_v8.wu

Listing executable(s) in /APPS :
setiathome_8.02_arm-unknown-linux-gnueabihf

Listing executable in /REF_APPS :
setiathome_8.02_arm-unknown-linux-gnueabihf
----------------------------------------------------------------
Current WU: #WisGen1_v8.wu

----------------------------------------------------------------
Skipping default app setiathome_8.02_arm-unknown-linux-gnueabihf, displaying saved result(s)
Elapsed Time: ....................... 43 seconds
----------------------------------------------------------------
Running app with command : .......... setiathome_8.02_arm-unknown-linux-gnueabihf -st -verb -nog
./setiathome_8.02_arm-unknown-linux-gnueabihf -st -verb -nog 39.49 sec 34.88 sec 2.54 sec
Elapsed Time : ...................... 39 seconds
Speed compared to default : ......... 110 %
-----------------
Comparing results
Result : Strongly similar, Q= 100.0%

----------------------------------------------------------------
Done with #WisGen1_v8.wu

====================================================================
Current WU: #WisGen2_v8.wu

----------------------------------------------------------------
Skipping default app setiathome_8.02_arm-unknown-linux-gnueabihf, displaying saved result(s)
Elapsed Time: ....................... 41 seconds
----------------------------------------------------------------
Running app with command : .......... setiathome_8.02_arm-unknown-linux-gnueabihf -st -verb -nog
./setiathome_8.02_arm-unknown-linux-gnueabihf -st -verb -nog 40.34 sec 35.57 sec 2.49 sec
Elapsed Time : ...................... 41 seconds
Speed compared to default : ......... 100 %
-----------------
Comparing results
Result : Strongly similar, Q= 100.0%

----------------------------------------------------------------
Done with #WisGen2_v8.wu

====================================================================
Current WU: #WisGen3_v8.wu

----------------------------------------------------------------
Skipping default app setiathome_8.02_arm-unknown-linux-gnueabihf, displaying saved result(s)
Elapsed Time: ....................... 40 seconds
----------------------------------------------------------------
Running app with command : .......... setiathome_8.02_arm-unknown-linux-gnueabihf -st -verb -nog
./setiathome_8.02_arm-unknown-linux-gnueabihf -st -verb -nog 38.94 sec 34.98 sec 1.88 sec
Elapsed Time : ...................... 39 seconds
Speed compared to default : ......... 102 %
-----------------
Comparing results
Result : Strongly similar, Q= 100.0%

----------------------------------------------------------------
Done with #WisGen3_v8.wu

====================================================================
Hosts CPU data ...
model name : ARMv7 Processor rev 0 (v7l)

Done with Benchmark run! Removing temporary files!
Resuming BOINC



As one can see, elapsed time stays the same across all 3 WisGen tasks even though wisdom.sah is available from the previous run.
That's very different from the usual Windows app behavior, where planning is a 2-stage process: the first planning pass is short, the second is much longer, and the third run uses fully prepared wisdom and takes the least time.
Evidently, the current stock build does very little planning (estimate instead of bench).

Let's see now how my build behaves (it was compiled with the slow-timer FFTW mode)....
Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1846718 - Posted: 4 Feb 2017, 22:35:46 UTC - in response to Message 1846701.  

FFTW3.3.6 based build considerably faster than current stock, especially on VHAR:

Is that because of longer better fftw planning or because of fftw 3.3.6? Or both?

Claggy

I'm just in the process of establishing that.
But one thing is already obvious: stock FFTW 3.3.4 has no NEON codelets at all, while my FFTW 3.3.6-based app shows NEON codelets in the wisdom file.
Regarding planning itself, just await the results of the next test...
Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1846725 - Posted: 4 Feb 2017, 23:13:26 UTC
Last modified: 4 Feb 2017, 23:15:51 UTC

And now the same test with another build:

KWSN-Linux-MBbench v3.0 cache-keeping edition
Running on parallella at Sat 04 Feb 2017 10:57:14 PM UTC
----------------------------------------------------------------
Starting benchmark run...
----------------------------------------------------------------
Suspending BOINC
Listing wu-file(s) in /testWUs :
#WisGen1_v8.wu
#WisGen2_v8.wu
#WisGen3_v8.wu

Listing executable(s) in /APPS :
setiathome-8.0.armv7l-unknown-linux-gnueabihf_R_default

Listing executable in /REF_APPS :
setiathome_8.02_arm-unknown-linux-gnueabihf
----------------------------------------------------------------
Current WU: #WisGen1_v8.wu

----------------------------------------------------------------
Skipping default app setiathome_8.02_arm-unknown-linux-gnueabihf, displaying saved result(s)
Elapsed Time: ....................... 43 seconds
----------------------------------------------------------------
Running app with command : .......... setiathome-8.0.armv7l-unknown-linux-gnueabihf_R_default
./setiathome-8.0.armv7l-unknown-linux-gnueabihf_R_default 385.91 sec 378.05 sec 5.06 sec
Elapsed Time : ...................... 386 seconds
Speed compared to default : ......... 11 %
-----------------
Comparing results
Result : Strongly similar, Q= 99.99%

----------------------------------------------------------------
Done with #WisGen1_v8.wu

====================================================================
Current WU: #WisGen2_v8.wu

----------------------------------------------------------------
Skipping default app setiathome_8.02_arm-unknown-linux-gnueabihf, displaying saved result(s)
Elapsed Time: ....................... 41 seconds
----------------------------------------------------------------
Running app with command : .......... setiathome-8.0.armv7l-unknown-linux-gnueabihf_R_default
./setiathome-8.0.armv7l-unknown-linux-gnueabihf_R_default 98.09 sec 92.98 sec 2.90 sec
Elapsed Time : ...................... 98 seconds
Speed compared to default : ......... 41 %
-----------------
Comparing results
Result : Strongly similar, Q= 99.99%

----------------------------------------------------------------
Done with #WisGen2_v8.wu

====================================================================
Current WU: #WisGen3_v8.wu

----------------------------------------------------------------
Skipping default app setiathome_8.02_arm-unknown-linux-gnueabihf, displaying saved result(s)
Elapsed Time: ....................... 40 seconds
----------------------------------------------------------------
Running app with command : .......... setiathome-8.0.armv7l-unknown-linux-gnueabihf_R_default
./setiathome-8.0.armv7l-unknown-linux-gnueabihf_R_default 97.40 sec 92.93 sec 2.44 sec
Elapsed Time : ...................... 98 seconds
Speed compared to default : ......... 40 %
-----------------
Comparing results
Result : Strongly similar, Q= 99.99%

----------------------------------------------------------------
Done with #WisGen3_v8.wu

====================================================================
Hosts CPU data ...
model name : ARMv7 Processor rev 0 (v7l)

Done with Benchmark run! Removing temporary files!
Resuming BOINC


Summary:
1) The stock codebase doesn't implement the 2-stage planning Joe introduced in the AKv8 codebase.
2) 1-stage planning (the real one) still takes place, so the first run is considerably longer and subsequent ones can use wisdom.
3) In real-life use this "use wisdom" case will occur only on restart from a checkpoint, because wisdom.sah is placed in the slot directory instead of the project directory.
Tom Rinehart
Volunteer tester

Joined: 12 Dec 01
Posts: 113
Credit: 13,255,975
RAC: 6
United States
Message 1846726 - Posted: 4 Feb 2017, 23:15:19 UTC - in response to Message 1846718.  

I sent Eric both the Linux ARMHF and ARM64 apps for testing on Beta. FFTW 3.3.6 makes such a big difference.
Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1846727 - Posted: 4 Feb 2017, 23:18:34 UTC - in response to Message 1846726.  
Last modified: 4 Feb 2017, 23:27:11 UTC

Yes, a 20 to 40% speed improvement from a simple rebuild, that's great.

Now on to the app's own NEON functions...

EDIT: a question on the build procedure: should I re-run automake and configure after the make clean command? Or just make -j 2?
EDIT2: it seems not, if only sources were changed without any change to build options.
Tom Rinehart
Volunteer tester

Send message
Joined: 12 Dec 01
Posts: 113
Credit: 13,255,975
RAC: 6
United States
Message 1846729 - Posted: 4 Feb 2017, 23:28:16 UTC - in response to Message 1846727.  

I think you can just run make clean and then make if you are testing code changes.
Claggy
Volunteer tester

Joined: 5 Jul 99
Posts: 4654
Credit: 47,537,079
RAC: 4
United Kingdom
Message 1846734 - Posted: 4 Feb 2017, 23:53:43 UTC - in response to Message 1846718.  

But one thing is obvious currently. stock FFTW3.3.4 has no NEON codelets at all while my FFTW3.3.6-based app shows NEON codelets in wisdom file.

That's why I built the 8.03 app available on Seti Beta when Raspbian switched to the Debian fftw 3.3.4 that has the Neon patch.

Claggy
Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1846789 - Posted: 5 Feb 2017, 9:06:59 UTC - in response to Message 1846734.  

But one thing is obvious currently. stock FFTW3.3.4 has no NEON codelets at all while my FFTW3.3.6-based app shows NEON codelets in wisdom file.

That's why I built the 8.03 app available on Seti Beta when Raspbian switched to the Debian fftw 3.3.4 that has the Neon patch.

Claggy

I haven't got it so far. Is it available for direct download?
Claggy
Volunteer tester

Joined: 5 Jul 99
Posts: 4654
Credit: 47,537,079
RAC: 4
United Kingdom
Message 1846797 - Posted: 5 Feb 2017, 10:44:26 UTC - in response to Message 1846789.  
Last modified: 5 Feb 2017, 10:45:28 UTC

But one thing is obvious currently. stock FFTW3.3.4 has no NEON codelets at all while my FFTW3.3.6-based app shows NEON codelets in wisdom file.

That's why I built the 8.03 app available on Seti Beta when Raspbian switched to the Debian fftw 3.3.4 that has the Neon patch.

Claggy

I haven't got it so far. Is it available for direct download?

Yes, just take the Beta url, and change the setiathome_8.02_arm-unknown-linux-gnueabihf filename to 8.03:

http://boinc2.ssl.berkeley.edu/beta/download/setiathome_8.03_arm-unknown-linux-gnueabihf

It is a little bit faster than the 8.02 app, but not hugely.

Claggy
Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1846801 - Posted: 5 Feb 2017, 11:05:01 UTC - in response to Message 1846797.  

Thanks, will try it too.

Regarding neon_ChirpData:
*** Error in `./setiathome-8.0.armv7l-unknown-linux-gnueabihf': free(): invalid pointer: 0xb26e7080 ***
So it seems it doesn't fail in the chirp itself.
It needs some debugging.
MarkJ
Volunteer tester
Joined: 17 Feb 08
Posts: 1139
Credit: 80,854,192
RAC: 5
Australia
Message 1846803 - Posted: 5 Feb 2017, 11:39:50 UTC - in response to Message 1846705.  
Last modified: 5 Feb 2017, 11:44:07 UTC


Also, windows boinc manager getting really slow when fails to connect to another computer.
It's a bug...

EDIT: but I have parallella's IP listed and empty password file now - still windows boinc manager can't connect. What's wrong with it? (BOINC 7.4.42 x86)

Did you put the host names or IP addresses in remote_hosts.cfg as well? Those are the names of the computers you're going to allow to connect to the Parallella. You might also need to list them in /etc/hosts.allow.

I found that, for whatever reason, my Pis and Parallellas wouldn't appear by name on the router's list of connected computers, but they always work if I use the IP address. I assume this is a bug either with the router or with Linux (on the Pi/Parallella).

If you want 7.6.33 you can add jessie-backports and get it from the standard Debian repo. You just need to include it in /etc/apt/sources.list and then do an apt-get update followed by apt-get install -t jessie-backports boinc-client.
Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1846810 - Posted: 5 Feb 2017, 12:52:36 UTC - in response to Message 1846803.  
Last modified: 5 Feb 2017, 12:59:23 UTC


Also, windows boinc manager getting really slow when fails to connect to another computer.
It's a bug...

EDIT: but I have parallella's IP listed and empty password file now - still windows boinc manager can't connect. What's wrong with it? (BOINC 7.4.42 x86)

You did put the host names or IP addresses in remote_hosts.cfg as well? That is the names of the computers you're going to allow to connect to the Parallella. Also you might need to have it listed in /etc/hosts.allow

I found that for whatever reason my Pi's and Parallella's wouldn't appear on the routers list of connected computers by name but they always work if I use the IP address. I assume this is either a bug with the router or Linux (on the Pi/Parallella).

If you want 7.6.33 you can add Jessie-backports and get it from the standard Debian repo. You just need to include it in /etc/apt/sources.list and then do an apt-get update followed by apt-get install -t Jessie-backports BOINC-client.

Thanks, it's always worth learning something new about Linux management, but I think I'll stay with the current setup for now.
I added both the Parallella and Windows hosts to BOINC's hosts file (by IP), and now, after killing the BOINCtasks process on the Windows host, both boinccmd from the Parallella and BOINC Manager from Windows can connect.
So automated suspend/resume works perfectly, keeping the Parallella busy while I'm not benching a new build.

Currently adding some debug info to see where exactly the app crashes with the NEON chirp active.

EDIT: just running make in the client folder seems to be enough to recompile the changed sources and get an updated binary, which is much, much faster than a complete rebuild on this not-too-fast host.
Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1846811 - Posted: 5 Feb 2017, 13:07:10 UTC - in response to Message 1846810.  
Last modified: 5 Feb 2017, 13:10:27 UTC

This is how the in/out buffers look in v_Chirp:
before Chirp test: v_ChirpData
in[0].xy=(1.00000,0.00000)]
in[1].xy=(1.00000,0.00000)]
in[2].xy=(1.00000,0.00001)]
after Chirp test: v_ChirpData
out[0].xy=(1.00000,0.00000)]
out[1].xy=(1.00000,0.00000)]
out[2].xy=(1.00000,0.00001)]

and in neon_Chirp:
before Chirp test: neon_ChirpData
in[0].xy=(1.00000,0.00000)]
in[1].xy=(1.00000,0.00000)]
in[2].xy=(1.00000,0.00001)]
after Chirp test: neon_ChirpData
out[0].xy=(0.00000,0.00000)]
out[1].xy=(0.00000,0.00000)]
out[2].xy=(0.00000,0.00000)]
neon_ChirpData 0.000004 0.00000 test
neon_ChirpData 0.000004 0.00000 choice

So, with this debug info printing it doesn't crash immediately but shows unbelievably fast completion, just as Claggy saw in his builds before.

So neon_Chirp returns wrong data in the out buffer and perhaps also does some out-of-bounds memory writes that cause other data corruption and the crash in some of the binaries.
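
A zeroed out buffer like this can be caught automatically by running the candidate next to a trusted implementation on the same input and comparing the outputs. The sketch below shows the idea only. `Complex`, `ChirpFn`, `reference_chirp`, `broken_chirp` and `max_abs_diff` are hypothetical names, the signature is an assumption, and the reference is a simplified stand-in (a quadratic phase rotation), not the project's v_ChirpData.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Complex { float re, im; };

// Signature assumed for this sketch; the real functions take different arguments.
using ChirpFn = int (*)(const Complex* in, Complex* out, std::size_t n,
                        double chirp_rate, double sample_rate);

// Simplified stand-in for a trusted chirp: rotate each sample by a phase
// that grows quadratically with time.
int reference_chirp(const Complex* in, Complex* out, std::size_t n,
                    double chirp_rate, double sample_rate) {
    const double two_pi = 6.283185307179586;
    for (std::size_t i = 0; i < n; ++i) {
        const double t = static_cast<double>(i) / sample_rate;
        const double ang = two_pi * 0.5 * chirp_rate * t * t;
        const float c = static_cast<float>(std::cos(ang));
        const float s = static_cast<float>(std::sin(ang));
        out[i].re = in[i].re * c - in[i].im * s;
        out[i].im = in[i].re * s + in[i].im * c;
    }
    return 0;
}

// Deliberately broken candidate that mimics the symptom in the log:
// it finishes at once and leaves only zeros in the output buffer.
int broken_chirp(const Complex*, Complex* out, std::size_t n, double, double) {
    for (std::size_t i = 0; i < n; ++i) out[i] = Complex{0.0f, 0.0f};
    return 0;
}

// Largest absolute difference between candidate and reference outputs.
float max_abs_diff(ChirpFn candidate, ChirpFn reference, std::size_t n,
                   double chirp_rate, double sample_rate) {
    assert(candidate != nullptr && reference != nullptr);
    std::vector<Complex> in(n, Complex{1.0f, 0.0f});
    std::vector<Complex> ref(n), out(n);
    reference(in.data(), ref.data(), n, chirp_rate, sample_rate);
    candidate(in.data(), out.data(), n, chirp_rate, sample_rate);
    float worst = 0.0f;
    for (std::size_t i = 0; i < n; ++i) {
        worst = std::fmax(worst, std::fabs(out[i].re - ref[i].re));
        worst = std::fmax(worst, std::fabs(out[i].im - ref[i].im));
    }
    return worst;
}
```

A candidate whose difference from the reference is close to the input amplitude, as with `broken_chirp` here, should be rejected no matter how fast it ran.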

Funny that the wisgen task per se validated OK %)
Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1846844 - Posted: 5 Feb 2017, 17:16:51 UTC
Last modified: 5 Feb 2017, 17:32:34 UTC

Well, the main difference between the Chirp that doesn't work and the other functions that do work is the doubles among its arguments.
The app is tuned to pass params in VFP registers. It will not link if even a single object is compiled with a different convention.
But can a VFP register hold a double?... And should it in the armhf calling model?

The original code intended for Android contains:
/* r0 - input data
* r1 - output data
* r2 - chirprateind
* sp[0-1] - chirp_rate
* sp[2] - numDataPoints
* sp[4-5] - sample_rate
*/
So each double argument takes two sp slots (stack words?).
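
The layout in that comment matches what the base ARM procedure call standard (the softfp convention) would be expected to produce for this signature: a double needs an even-numbered register pair, r3 alone cannot hold one, so chirp_rate and everything after it goes to the stack. The sketch below is a deliberately simplified model of that allocation, with the VFP (hardfp) variant for comparison. It handles only 4-byte integers/pointers and 8-byte doubles, and `Kind`, `allocate` and `chirp_signature` are hypothetical names made up for this illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Only two argument classes are modelled: 4-byte words and 8-byte doubles.
enum class Kind { Word, Double };

// in, out, chirprateind, chirp_rate, numDataPoints, sample_rate
std::vector<Kind> chirp_signature() {
    return {Kind::Word, Kind::Word, Kind::Word,
            Kind::Double, Kind::Word, Kind::Double};
}

// Returns the location of each argument, in order.
// hardfp == false: base standard, doubles use core register pairs or the stack.
// hardfp == true : VFP variant, doubles use d0..d7 while any are free.
std::vector<std::string> allocate(const std::vector<Kind>& args, bool hardfp) {
    std::vector<std::string> where;
    int ncrn = 0;    // next core register number (r0..r3)
    int nsaa = 0;    // next stacked argument address, bytes above sp at entry
    int next_d = 0;  // next free VFP double register
    for (Kind k : args) {
        if (k == Kind::Word) {
            if (ncrn < 4) {
                where.push_back("r" + std::to_string(ncrn++));
            } else {
                where.push_back("sp[" + std::to_string(nsaa / 4) + "]");
                nsaa += 4;
            }
            continue;
        }
        if (hardfp && next_d < 8) {
            where.push_back("d" + std::to_string(next_d++));
            continue;
        }
        if (!hardfp) {
            ncrn = (ncrn + 1) & ~1;  // a double starts at an even register
            if (ncrn + 2 <= 4) {
                where.push_back("r" + std::to_string(ncrn) + ":r" +
                                std::to_string(ncrn + 1));
                ncrn += 2;
                continue;
            }
            ncrn = 4;  // core registers are not used after the first spill
        }
        nsaa = (nsaa + 7) & ~7;  // doubles are 8-byte aligned on the stack
        where.push_back("sp[" + std::to_string(nsaa / 4) + "-" +
                        std::to_string(nsaa / 4 + 1) + "]");
        nsaa += 8;
    }
    assert(where.size() == args.size());
    return where;
}
```

For the chirp signature the model gives r0, r1, r2, sp[0-1], sp[2], sp[4-5] with `hardfp == false`, and r0, r1, r2, d0, r3, d1 with `hardfp == true`.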

I think the correct move is to write a dummy C function, compile it to assembly and see how it treats its arguments on armhf.
Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1846852 - Posted: 5 Feb 2017, 18:03:30 UTC - in response to Message 1846844.  
Last modified: 5 Feb 2017, 18:49:26 UTC

C:
#include "sah_config.h" 

extern "C" {
int neon_ChirpData(float* in, float* out,
    int chirp_rate_ind, double chirp_rate, int  ul_NumDataPoints, double sample_rate){
	out[0]=2.f*in[0];
	return int(chirp_rate+sample_rate)+ul_NumDataPoints+chirp_rate_ind;
}
}

asm:
	.syntax unified
	.arch armv7-a
	.eabi_attribute 27, 3
	.eabi_attribute 28, 1
	.fpu vfpv3-d16
	.eabi_attribute 23, 1
	.eabi_attribute 24, 1
	.eabi_attribute 25, 1
	.eabi_attribute 26, 2
	.eabi_attribute 30, 2
	.eabi_attribute 34, 1
	.eabi_attribute 18, 4
	.thumb
	.file	"neon_ChirpDummy.cpp"
	.text
	.align	2
	.global	neon_ChirpData
	.thumb
	.thumb_func
	.type	neon_ChirpData, %function
neon_ChirpData:
	.fnstart
.LFB0:
	@ args = 0, pretend = 0, frame = 0
	@ frame_needed = 0, uses_anonymous_args = 0
	@ link register save eliminated.
	vadd.f64	d1, d0, d1
	vldr.32	s15, [r0]
	vadd.f32	s15, s15, s15
	vcvt.s32.f64	s2, d1
	vstr.32	s15, [r1]
	vmov	r1, s2	@ int
	add	r1, r1, r3
	adds	r0, r1, r2
	bx	lr
	.cantunwind
	.fnend
	.size	neon_ChirpData, .-neon_ChirpData
	.ident	"GCC: (Ubuntu/Linaro 4.9.2-10ubuntu13) 4.9.2"
	.section	.note.GNU-stack,"",%progbits


GCC emitted Thumb code, which makes the comparison harder :D

and with NEON as FPU and ARM as ISA:

	.arch armv7-a
	.eabi_attribute 27, 3
	.eabi_attribute 28, 1
	.fpu neon
	.eabi_attribute 23, 1
	.eabi_attribute 24, 1
	.eabi_attribute 25, 1
	.eabi_attribute 26, 2
	.eabi_attribute 30, 2
	.eabi_attribute 34, 1
	.eabi_attribute 18, 4
	.file	"neon_ChirpDummy.cpp"
	.text
	.align	2
	.global	dummy_ChirpData
	.type	dummy_ChirpData, %function
dummy_ChirpData:
	.fnstart
.LFB0:
	@ args = 0, pretend = 0, frame = 0
	@ frame_needed = 0, uses_anonymous_args = 0
	@ link register save eliminated.
	vadd.f64	d1, d0, d1	@ <<<<< both double arguments
	vldr.32	s15, [r0]	@ <<<<< input
	vadd.f32	s15, s15, s15
	vcvt.s32.f64	s2, d1
	vstr.32	s15, [r1]	@ <<<<< output
	vmov	r1, s2	@ int
	add	r3, r1, r3
	add	r0, r3, r2
	bx	lr
	.cantunwind
	.fnend
	.size	dummy_ChirpData, .-dummy_ChirpData
	.ident	"GCC: (Ubuntu/Linaro 4.9.2-10ubuntu13) 4.9.2"
	.section	.note.GNU-stack,"",%progbits


This little example shows that on armhf the arguments are indeed in registers, with d* apparently used for doubles. So the original NEON chirp definitely should be rewritten, at least in the part that retrieves function arguments. It shouldn't take them from the stack; it should use d-registers.
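
The same dummy can also serve as a caller-side check once a hand-written routine with this signature exists: call it with values whose contributions to the result are easy to tell apart and see whether every argument arrives intact. A minimal sketch follows; `ChirpLike` and `arguments_arrive_intact` are hypothetical names, and the dummy is the one from this post rather than the real chirp.

```cpp
#include <cassert>

extern "C" {
// Dummy from the post: touches every argument so none can be optimised away.
int dummy_ChirpData(float* in, float* out, int chirp_rate_ind,
                    double chirp_rate, int ul_NumDataPoints,
                    double sample_rate) {
    out[0] = 2.f * in[0];
    return int(chirp_rate + sample_rate) + ul_NumDataPoints + chirp_rate_ind;
}

typedef int (*ChirpLike)(float*, float*, int, double, int, double);
}

// Calls fn with one value per decimal place so that a misread argument
// changes a recognisable digit of the result.
bool arguments_arrive_intact(ChirpLike fn) {
    assert(fn != nullptr);
    float in[1] = {1.5f};
    float out[1] = {0.0f};
    // int(100.25 + 1000.5) + 10 + 1 == 1100 + 10 + 1 == 1111
    const int result = fn(in, out, 1, 100.25, 10, 1000.5);
    return out[0] == 3.0f && result == 1111;
}
```

An assembly routine that still reads its doubles from the stack would be expected to fail this check when linked into an armhf build.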
Juha
Volunteer tester

Joined: 7 Mar 04
Posts: 388
Credit: 1,857,738
RAC: 0
Finland
Message 1846872 - Posted: 5 Feb 2017, 20:22:24 UTC - in response to Message 1846705.  

cc_config approach failed:

04-Feb-2017 21:18:23 [---] Unrecognized tag in cc_config.xml: <allow_remote_gui_rpc>
04-Feb-2017 21:18:23 [---] Unrecognized tag in cc_config.xml: <skip_cpu_benchmarks>


Those options go way back; there must be something wrong in your cc_config.xml.

As for the optimised functions not working on hardfp: those functions were originally coded for Android, and Android uses the softfp ABI, in which floating point function arguments are passed in integer registers. Parallella (and RPi and others) use the hardfp ABI, in which floating point arguments are passed in floating point registers.

Other than the function headers I don't see anything in the assembly code that takes the different ABIs into account. So I think they are just hard-coded for softfp.
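
Code that has to behave differently under the two ABIs can test for them at compile time. The sketch below relies on `__ARM_PCS_VFP`, a predefined macro from the ARM C Language Extensions that compilers such as GCC define when the hard-float calling convention is in effect, together with `__arm__` for 32-bit ARM targets. `float_abi` is a hypothetical name, and the sketch does not distinguish softfp from pure soft-float.

```cpp
#include <cassert>
#include <string>

// Reports which floating-point calling convention this translation unit
// is being compiled for.
std::string float_abi() {
#if defined(__arm__) && defined(__ARM_PCS_VFP)
    return "hardfp";          // floating-point arguments in VFP s/d registers
#elif defined(__arm__)
    return "softfp or soft";  // floating-point arguments in r0-r3 or on the stack
#else
    return "not 32-bit ARM";
#endif
}
```

The same conditions could guard a softfp-only assembly routine so that an armhf build falls back to a portable C++ version instead of linking code that reads its arguments from the wrong place.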


 