Compiling Applications for Linux

Claggy
Volunteer tester

Joined: 5 Jul 99
Posts: 4654
Credit: 47,537,079
RAC: 4
United Kingdom
Message 1517831 - Posted: 18 May 2014, 15:36:24 UTC - in response to Message 1517820.  
Last modified: 18 May 2014, 15:40:45 UTC

Is there someplace I can just download a tarball of "branches/sah_v7_opt"? I googled but can't seem to find it.

Just do:

svn checkout https://setisvn.ssl.berkeley.edu/svn/branches/sah_v7_opt

http://setiathome.berkeley.edu/sah_porting.php

Access the SVN repository directly, e.g. with a command like
svn checkout https://setisvn.ssl.berkeley.edu/svn/seti_boinc


There are tarballs there, but they are a year out of date.

Claggy
jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1517842 - Posted: 18 May 2014, 16:17:03 UTC - in response to Message 1517820.  
Last modified: 18 May 2014, 16:19:01 UTC

Yeah, you won't be able to hurt anything with a normal checkout like the one Claggy indicated.

Was delayed here (as usual), but 680 located and Linux machine ready for some midnight surgery.
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
Claggy
Message 1517872 - Posted: 18 May 2014, 17:40:19 UTC - in response to Message 1517867.  
Last modified: 18 May 2014, 17:52:43 UTC

Thanks. Loading rapidsvn allowed me to checkout that directory.

_autosetup ran without any errors

I modified a few items in petri's configure options (like directory locations)

But now it looks like I need a /usr/include/boinc/std_fixes.h

and probably some other headers in /usr/include/boinc

(no /boinc directory)

--------------------
main.cpp:1:0: warning: this target does not support ‘-fsection-anchors’ [-fsection-anchors]
// Copyright 2003 Regents of the University of California
^
In file included from <command-line>:0:0:
./../sah_config.h:536:23: fatal error: std_fixes.h: No such file or directory
#include "std_fixes.h"
^
compilation terminated.
make[2]: *** [seti_cuda-main.o] Error 1
make[2]: Leaving directory `/home/guy/Desktop/sah_v7/sah_v7_opt/Xbranch/client'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/home/guy/Desktop/sah_v7/sah_v7_opt/Xbranch'
make: *** [all] Error 2
[root@localhost Xbranch]#

-----------------------
Any idea where I can get that directory with all header files?

I've seen that error before. I think it was because either I hadn't built the Boinc libs, or I was pointing at the wrong directory.
I'll run through the steps I use to build Boinc on Arm/x86_64, the Boinc libs, and apps (not that I've had much success with apps yet).

Claggy
jason_gee
Message 1517877 - Posted: 18 May 2014, 17:59:43 UTC - in response to Message 1517867.  
Last modified: 18 May 2014, 18:00:51 UTC

Up and running again on an older Ubuntu 12.04 LTS x64 machine (just finished doing all the updates). Going to walk through a quick clean checkout and build of stock mb7 here first to see if anything's up there; then, all going well, I'll try the same on Xbranch and commit any tweaks needed.
petri33
Volunteer tester
Joined: 6 Jun 02
Posts: 1668
Credit: 623,086,772
RAC: 156
Finland
Message 1517879 - Posted: 18 May 2014, 18:20:04 UTC - in response to Message 1517867.  
Last modified: 18 May 2014, 18:20:27 UTC

http://boinc.berkeley.edu/trac/wiki/SourceCodeGit


Any idea where I can get that directory with all header files?

To overcome Heisenbergs:
"You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones
Claggy
Message 1517882 - Posted: 18 May 2014, 18:23:41 UTC - in response to Message 1517872.  
Last modified: 18 May 2014, 18:33:08 UTC

To get the Software prerequisites to build Boinc (probably only need the ones in bold for building apps):

sudo apt-get install git make m4 libtool autoconf pkg-config automake g++ gcc
sudo apt-get install libcurl4-openssl-dev libssl-dev libwxgtk2.8-dev libsqlite3-dev
sudo apt-get install gettext docbook2x docbook-xml libxml2-utils zlib1g-dev libsm-dev libice-dev libxmu-dev libxi-dev
sudo apt-get install libx11-dev libnotify-dev freeglut3-dev libgtk2.0-dev libmysqlclient-dev python libfcgi-dev libjpeg8-dev
sudo apt-get install libxss-dev libxcb-util0-dev libxcb-dpms0-dev libxext-dev libfftw3-dev subversion libstdc++6-4.6-dev (or libstdc++6-4.7-dev or libstdc++6-4.8-dev)

http://boinc.berkeley.edu/trac/wiki/SoftwarePrereqsUnix
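
As a quick pre-check before the apt-get lines above, something like the following lists which build tools are still missing. This is only a sketch, assuming a POSIX shell; `missing_tools` is a made-up helper name, and it checks commands on the PATH only, not the -dev library packages.

```shell
# Hypothetical helper: print every named command that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        # command -v succeeds only if the shell can resolve the name
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Check the build tools from the first apt-get line, plus svn for the checkouts.
missing_tools git make m4 autoconf automake pkg-config g++ gcc svn
```

An empty result means every listed tool was found; anything printed still needs installing.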

How to build Boinc (for running from the home directory):

git clone git://boinc.berkeley.edu/boinc-v2.git boinc

cd boinc

git checkout client_release/7.2/7.2.47; git status

./_autosetup

./configure --disable-server --enable-client --with-boinc-alt-platform=arm-unknown-linux-gnueabihf CXXFLAGS="-O3 "
or for x86_64: ./configure --disable-server --enable-client CXXFLAGS="-O3 "

make

cd packages/generic/sea/

make

copy and paste boinc_7.2.47_armv6l-unknown-linux-gnueabihf.sh (or boinc_7.2.47_x86_64-pc-linux-gnu.sh) into home directory

sh boinc_7.2.47_armv6l-unknown-linux-gnueabihf.sh (or boinc_7.2.47_x86_64-pc-linux-gnu.sh) (you may need to change the permissions of the BOINC directory for it to run properly)

http://boinc.berkeley.edu/trac/wiki/CompileClient

How to build sample apps/libraries that are required for apps:

cd boinc (you need to have got the boinc source from above)

./_autosetup

./configure --disable-client --disable-manager --disable-server LDFLAGS=-static-libgcc

make

cd samples/example_app

ln -s `g++ -print-file-name=libstdc++.a`

make

http://boinc.berkeley.edu/trac/wiki/CompileAppLinux

To build Seti apps:

svn checkout https://setisvn.ssl.berkeley.edu/svn/seti_boinc

svn checkout https://setisvn.ssl.berkeley.edu/svn/branches

cd seti_boinc or branches

./_autosetup or sh ./_autosetup

./configure --disable-server LDFLAGS=-static-libgcc BOINCDIR=${HOME}/boinc (and lots of other options to make the app any good)

make
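
The app build steps above could be wrapped so that a failure stops the sequence and names the step that broke. A minimal sketch, assuming a POSIX shell; `run_step` and `build_app` are hypothetical names, the configure flags are just the ones quoted above, and a useful app would need more options.

```shell
# Hypothetical wrapper: echo a command, run it, and report which step failed.
run_step() {
    echo "==> $*"
    "$@" || { echo "FAILED: $*" >&2; return 1; }
}

# Hypothetical driver for the three steps above.
# $1 = app source directory (seti_boinc or a branch), $2 = Boinc source directory
build_app() {
    (
        cd "$1" || exit 1
        run_step sh ./_autosetup &&
        run_step ./configure --disable-server LDFLAGS=-static-libgcc BOINCDIR="$2" &&
        run_step make
    )
}

# Example call (not run here): build_app "$HOME/seti_boinc" "$HOME/boinc"
```

The subshell keeps the `cd` from changing the directory of the calling shell.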


If I've missed any steps, please correct me.

Claggy
jason_gee
Message 1517884 - Posted: 18 May 2014, 18:34:57 UTC
Last modified: 18 May 2014, 18:36:08 UTC

For stock multibeam, after making Boinc libs, then in seti_boinc (which is multibeam_V7) running
./_autosetup
./configure --disable-server --disable-graphics --enable-fast-math
make clean
make

*almost* works, BUT the FFTW configuration seems to be damaged, so a build isn't produced. Will have to do some digging, but I'm guessing that's probably collateral damage from Android beta juggling. Will probably query that in due course if I can't find the configure options (or they are broken).
Claggy
Message 1517887 - Posted: 18 May 2014, 18:45:53 UTC - in response to Message 1517884.  

I've managed to build a Stock 7.28 x86_64 app on my Ubuntu 12.04 host today using the above (it actually built two, a debug build as well).
I've already benched the debug build, and it was slow (the normal build didn't like the parameters, so I'm doing a bench of that on its own).

Claggy
Claggy
Message 1517889 - Posted: 18 May 2014, 18:48:37 UTC

Meanwhile, on one of my Parallellas, building AKv8 fails with the following (same as on the Pi):

In file included from main.cpp:87:0:
analyzeFuncs.h: At global scope:
analyzeFuncs.h:76:9: error: ‘__m128’ does not name a type
typedef __m128 vFloat;
^
analyzeFuncs.h:77:9: error: ‘__m128i’ does not name a type
typedef __m128i vUInt32;
^
analyzeFuncs.h:78:9: error: ‘__m128i’ does not name a type
typedef __m128i vSInt32;
^
analyzeFuncs.h:79:9: error: ‘__m128i’ does not name a type
typedef __m128i vUInt8;
^
analyzeFuncs.h:80:9: error: ‘__m128d’ does not name a type
typedef __m128d vDouble;
^
make[2]: *** [seti_boinc-main.o] Error 1
make[1]: *** [all-recursive] Error 1

Claggy
jason_gee
Message 1517891 - Posted: 18 May 2014, 19:00:26 UTC - in response to Message 1517887.  

Yeah, it probably defaults to using the Ooura FFT on Android (which is slow). To confirm, I'll have to see what happens if I try to force USE_FFTWF here. If it builds, I can probably take a look at the configure macros for easy fixes, once I've completely re-familiarised myself.
Josef W. Segur
Volunteer developer
Volunteer tester

Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 1517895 - Posted: 18 May 2014, 19:21:56 UTC

The detailed vectorizations Alex Kan did for Power PC and x86 (and later vectorized additions) would need to be refactored for ARM to get an optimized AKv8 build for Parallella or Pi, and suitable preprocessor defines added to enable that code. For most parts of the code, the original non-vectorized code derived from stock 5.13 is still in place so could be enabled at least temporarily for code sections not yet refactored. It will be a considerable amount of work, but probably worthwhile.
                                                                   Joe
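
One way to see why the `__m128` typedefs fail on ARM is to look at which SIMD feature macros the compiler predefines for its target, since the SSE types only exist on x86. The sketch below assumes GCC's `-dM -E` macro dump and a POSIX shell; `list_simd_macros` is a hypothetical helper, and the macro list is a small illustrative subset rather than anything taken from the AKv8 source.

```shell
# Hypothetical filter: read a compiler macro dump on stdin and print the
# names of any SIMD feature macros it contains.
list_simd_macros() {
    grep -E '^#define (__SSE__|__SSE2__|__SSE3__|__SSSE3__|__AVX__|__ALTIVEC__|__ARM_NEON__|__ARM_NEON) ' |
        awk '{ print $2 }'
}

# Typical use with GCC (-dM -E dumps the predefined macros for the target):
#   gcc -dM -E -x c /dev/null | list_simd_macros
```

If no SSE macro shows up for the target, any code path that uses `__m128` would need to sit behind a preprocessor guard with a non-vectorized or ARM-specific alternative.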
jason_gee
Message 1517927 - Posted: 18 May 2014, 20:32:51 UTC
Last modified: 18 May 2014, 20:48:20 UTC

Okey doke, I seem to have got 7.28 (stock multibeam) building here (untested binary) on Ubuntu 12.04 LTS x64, fully patched.

The biggest issue here was that the fftw3 package from Ubuntu's repository doesn't appear to put the FFTW libraries in the standard places, so they weren't being picked up by MB 7's configure script. Rather than dig for them to make links to an old lib, I just built FFTW to the standard location as per the FFTW docs instead. I did both the double-precision and single-precision float builds just for completeness.

Ignoring checking out / extracting, the sequence was:

#1) make and install fftw libraries and headers: ( location I used ~/seti/fftw)
./configure
make
sudo make install
make clean
./configure --enable-float --enable-type-prefix
make
sudo make install

#2) make and install boinc libs: ( location I used ~/seti/boinc)
./_autosetup
./configure --disable-server --disable-client --disable-manager --enable-optimize
make
sudo make install

#3) make seti client: ( location I used ~/seti/seti_boinc)
./_autosetup
./configure --disable-server --disable-graphics --enable-fast-math --enable-static
make


Binary produced in the client folder is called setiathome-7.28.x86_64-pc-linux-gnu
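
To see where a packaged FFTW actually landed before rebuilding it, a search along these lines might help. It is only a sketch: `find_fftw_libs` is a hypothetical helper, the directories in the example call are assumptions, and it relies on the `-maxdepth` option of GNU find. Ubuntu 12.04 keeps many libraries in multiarch subdirectories such as /usr/lib/x86_64-linux-gnu, which could be why the configure script missed them, though that is a guess.

```shell
# Hypothetical helper: list single-precision FFTW libraries found at most
# two levels below each given directory.
find_fftw_libs() {
    for dir in "$@"; do
        # Skip anything that is not an existing directory
        [ -d "$dir" ] || continue
        find "$dir" -maxdepth 2 -name 'libfftw3f*' 2>/dev/null
    done
}

# Example call (directories are assumptions, adjust for your system):
#   find_fftw_libs /usr/lib /usr/local/lib
```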


Some parts of the process will be simpler for Xbranch, because it doesn't require FFTW at all, and some more challenging, due to CUDA toolkit variations, some boincapi instability, and possible system header changes to look into if necessary. I'll wade through that in the next day or so, but I don't expect any particularly challenging roadblocks.
jason_gee
Message 1517928 - Posted: 18 May 2014, 20:41:26 UTC - in response to Message 1517895.  
Last modified: 18 May 2014, 20:41:56 UTC

Joe, Alex's 'hand' vectorisation there looks and feels largely heuristically based. Have you examined in any detail whether the bulk alignment macros could be generated instead of being hand copy-pasted, etc.?
Josef W. Segur
Message 1518049 - Posted: 19 May 2014, 5:42:40 UTC - in response to Message 1517928.  

I guess you're referring to the palignr-based pulse finding designed for Core 2, with macros which expand to provide 16 foldarrayby3, 64 foldarrayby4, and 256 foldarrayby5 functions. All of that is needed only because the palignr instruction requires an immediate operand for the amount of shifting it does. Alex's PPC code uses AltiVec shifting, which can work from a variable, so it doesn't need multiple forms. So, depending on what ARM has available, that approach might need either few or many functions.

As it happens I wrote an alternative pulse finding approach a few months ago which is both simpler and has tested out a little faster for x86. It only requires SSE so it works well on a broad range of Intel and AMD systems. I have no idea which approach will be best on ARM.
                                                                  Joe
Claggy
Message 1518110 - Posted: 19 May 2014, 8:18:35 UTC - in response to Message 1517986.  

One of these listed above probably has std_fixes.h in it.

No, none of them has std_fixes.h; the Boinc Git repository has it.
Curing that error takes two steps. First, get the Boinc source with:

git clone git://boinc.berkeley.edu/boinc-v2.git boinc

Second, point your configure at the correct location. For me, that is:

./configure --disable-server LDFLAGS=-static-libgcc BOINCDIR=${HOME}/boinc
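
To confirm that the directory you pass as BOINCDIR really contains the missing header, a check along these lines can be run first. A minimal sketch, assuming a POSIX shell with find and head; `find_header` is a hypothetical helper name.

```shell
# Hypothetical helper: print the first path under a directory tree that
# matches a header file name, or nothing if it is absent.
find_header() {
    # $1 = directory to search, $2 = header file name
    find "$1" -type f -name "$2" 2>/dev/null | head -n 1
}

# Example call (not run here): find_header "$HOME/boinc" std_fixes.h
```

No output means the header is not under that directory, so configure would be pointing at the wrong place.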

Claggy
Claggy
Message 1518129 - Posted: 19 May 2014, 8:47:25 UTC - in response to Message 1517887.  
Last modified: 19 May 2014, 8:57:45 UTC

My Stock 7.28 build was very slow compared to 7.01_x86_64 (not surprising, as I didn't try to optimise it); it ran at around 26 to 47% of the speed of 7.01. I suppose that makes it a perfect candidate to give to the project to get credit rebalanced, just a few more apps to de-optimise now ;-)

=====================================================================
Current WU: PG0009_v7.wu

----------------------------------------------------------------
Skipping default app setiathome_7.01_x86_64-pc-linux-gnu, displaying saved result(s)
Elapsed Time: ....................... 852 seconds
----------------------------------------------------------------
Running app with command : .......... setiathome-7.28.x86_64-pc-linux-gnu -standalone -verbose
./setiathome-7.28.x86_64-pc-linux-gnu -standalone -verbose 3238.00 sec 3229.78 sec 4.17 sec
Elapsed Time : ...................... 3238 seconds
Speed compared to default : ......... 26 %
-----------------
Comparing results
Result : Strongly similar, Q= 99.54%

----------------------------------------------------------------
Done with PG0009_v7.wu

====================================================================
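
The "Speed compared to default" figure above is consistent with dividing the reference time by the test time. A small sketch of that arithmetic, assuming POSIX shell integer arithmetic; `speed_percent` is a hypothetical name and not part of the benchmark script.

```shell
# Hypothetical helper: speed of a test run relative to a reference run,
# as a whole-number percentage (integer division truncates).
speed_percent() {
    # $1 = reference elapsed seconds, $2 = test elapsed seconds
    echo $(( $1 * 100 / $2 ))
}

speed_percent 852 3238   # prints 26
```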

The 7.28 stderr.txt:

shmget in attach_shmem: Invalid argument
19:35:18 (30748): Can't set up shared mem: -1. Will run in standalone mode.
setiathome_v7 7.28 Revision: 2294 g++ (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3
libboinc: BOINC 7.3.18

Work Unit Info:
...............
WU true angle range is : 0.008955
Optimal function choices:
--------------------------------------------------------
name timing error
--------------------------------------------------------
v_BaseLineSmooth (no other)

v_GetPowerSpectrum 0.001165 0.00000 test
v_vGetPowerSpectrum 0.001339 0.00000 test
v_vGetPowerSpectrum2 0.001106 0.00000 test
v_vGetPowerSpectrumUnrolled 0.000948 0.00000 test
v_vGetPowerSpectrumUnrolled2 0.000849 0.00000 test
v_avxGetPowerSpectrum faulted
v_vGetPowerSpectrumUnrolled2 0.000849 0.00000 choice

v_ChirpData 0.035283 0.00000 test
fpu_ChirpData 0.087126 0.00000 test
fpu_opt_ChirpData 0.036848 0.00000 test
v_vChirpData_x86_64 0.126647 0.03953 test
sse1_ChirpData_ak 0.062471 1.51106 test
sse1_ChirpData_ak8e 0.061625 1.51106 test
sse1_ChirpData_ak8h 0.056696 1.51106 test
sse2_ChirpData_ak 0.063761 0.00000 test
sse2_ChirpData_ak8 0.055760 0.00000 test
sse3_ChirpData_ak 0.061210 0.00000 test
sse3_ChirpData_ak8 0.054399 0.00000 test
avx_ChirpData_a faulted
avx_ChirpData_b faulted
avx_ChirpData_c faulted
avx_ChirpData_d faulted
v_ChirpData 0.035283 0.00000 choice

v_Transpose 0.036992 0.00000 test
v_Transpose2 0.031865 0.00000 test
v_Transpose4 0.022080 0.00000 test
v_Transpose8 0.038972 0.00000 test
v_pfTranspose2 0.027533 0.00000 test
v_pfTranspose4 0.016125 0.00000 test
v_pfTranspose8 0.029242 0.00000 test
v_vTranspose4 0.010056 0.00000 test
v_vTranspose4np 0.008566 0.00000 test
v_vTranspose4ntw 0.008977 0.00000 test
v_vTranspose4x8ntw 0.011023 0.00000 test
v_vTranspose4x16ntw 0.007414 0.00000 test
v_vpfTranspose8x4ntw 0.008975 0.00000 test
v_avxTranspose4x8ntw faulted
v_avxTranspose4x16ntw faulted
v_avxTranspose8x4ntw faulted
v_avxTranspose8x8ntw_a faulted
v_avxTranspose8x8ntw_b faulted
v_vTranspose4x16ntw 0.007414 0.00000 choice

FPU opt folding 0.015638 0.00000 test
ben SSE folding 0.018437 0.00000 test
AK SSE folding 0.012172 0.00000 test
BH SSE folding 0.015057 0.00000 test
JS AVX_a folding faulted
JS AVX_c folding faulted
AK SSE folding 0.012172 0.00000 choice

Test duration 14.72 seconds


Flopcounter: 2234230375307.252441

Spike count: 0
Autocorr count: 0
Pulse count: 1
Triplet count: 0
Gaussian count: 0
20:29:14 (30748): called boinc_finish(0)

The Stock 7.01 stderr.txt:

shmget in attach_shmem: Invalid argument
20:10:32 (13659): Can't set up shared mem: -1. Will run in standalone mode.
setiathome_v7 7.00 Revision: 1772 g++ (GCC) 4.4.6 20110731 (Red Hat 4.4.6-3)
libboinc: BOINC 7.1.0

Work Unit Info:
...............
WU true angle range is : 0.008955
Optimal function choices:
--------------------------------------------------------
name timing error
--------------------------------------------------------
v_BaseLineSmooth (no other)

v_GetPowerSpectrum 0.000198 0.00000 test
v_vGetPowerSpectrum 0.000112 0.00000 test
v_vGetPowerSpectrum2 0.000130 0.00000 test
v_vGetPowerSpectrumUnrolled 0.000094 0.00000 test
v_vGetPowerSpectrumUnrolled2 0.000121 0.00000 test
v_avxGetPowerSpectrum faulted
v_vGetPowerSpectrumUnrolled 0.000094 0.00000 choice

v_ChirpData 0.015045 0.00000 test
fpu_ChirpData 0.026866 0.00000 test
fpu_opt_ChirpData 0.017029 0.00000 test
v_vChirpData_x86_64 0.097592 0.03953 test
sse1_ChirpData_ak 0.012817 0.00000 test
sse1_ChirpData_ak8e 0.009766 0.00000 test
sse1_ChirpData_ak8h 0.010493 0.00000 test
sse2_ChirpData_ak 0.015373 0.00000 test
sse2_ChirpData_ak8 0.008895 0.00000 test
sse3_ChirpData_ak 0.015345 0.00000 test
sse3_ChirpData_ak8 0.008817 0.00000 test
avx_ChirpData_a faulted
avx_ChirpData_b faulted
avx_ChirpData_c faulted
avx_ChirpData_d faulted
sse3_ChirpData_ak8 0.008817 0.00000 choice

v_Transpose 0.028504 0.00000 test
v_Transpose2 0.015696 0.00000 test
v_Transpose4 0.008515 0.00000 test
v_Transpose8 0.013994 0.00000 test
v_pfTranspose2 0.016903 0.00000 test
v_pfTranspose4 0.008823 0.00000 test
v_pfTranspose8 0.013902 0.00000 test
v_vTranspose4 0.008670 0.00000 test
v_vTranspose4np 0.007855 0.00000 test
v_vTranspose4ntw 0.008813 0.00000 test
v_vTranspose4x8ntw 0.009319 0.00000 test
v_vTranspose4x16ntw 0.005367 0.00000 test
v_vpfTranspose8x4ntw 0.008859 0.00000 test
v_avxTranspose4x8ntw faulted
v_avxTranspose4x16ntw faulted
v_avxTranspose8x4ntw faulted
v_avxTranspose8x8ntw_a faulted
v_avxTranspose8x8ntw_b faulted
v_vTranspose4x16ntw 0.005367 0.00000 choice

FPU opt folding 0.002897 0.00000 test
ben SSE folding 0.002542 0.00000 test
AK SSE folding 0.002239 0.00000 test
BH SSE folding 0.002119 0.00000 test
JS AVX_a folding faulted
JS AVX_c folding faulted
BH SSE folding 0.002119 0.00000 choice

Test duration 7.81 seconds


Flopcounter: 2235616446600.232910

Spike count: 0
Autocorr count: 0
Pulse count: 1
Triplet count: 0
Gaussian count: 0
20:24:42 (13659): called boinc_finish
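
To compare the two logs without reading every row, the lines marked "choice" can be pulled out. A sketch assuming awk and the column layout shown above (name, timing, error, tag); `chosen_timings` is a hypothetical helper.

```shell
# Hypothetical helper: read a stderr.txt on stdin and print "timing name"
# for each routine the app selected (rows whose last field is "choice").
chosen_timings() {
    awk '$NF == "choice" {
        # The name may contain spaces, e.g. "AK SSE folding";
        # the last three fields are timing, error and the tag.
        name = $1
        for (i = 2; i <= NF - 3; i++) name = name " " $i
        print $(NF - 2), name
    }'
}

# Example call (not run here): chosen_timings < stderr.txt
```

Running it over both logs gives one line per stage, which makes the gap between the two builds easier to line up.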

The FFTW version in use is only FFTW 3.3, not the later FFTW 3.3.3 or 3.3.4:

(fftw-3.3 fftwf_wisdom #x08ac4c16 #x457005cc #xea102cf7 #xd7ff9038
(fftwf_dft_vrank_geq1_register 0 #x11bdd #x11bdd #x0 #x4bcaa768 #x30ebe029 #xace80c47 #x392adf52)
(fftwf_rdft_vrank_geq1_register 0 #x11bdd #x11bdd #x0 #x6f391ea1 #xf6b1cc12 #x4a05e243 #x8dfaa129)
(fftwf_codelet_n1bv_64_sse2 0 #x11bdd #x11bdd #x0 #x961ce71f #x59de6052 #x66f939c8 #xfcc9299f)
(fftwf_codelet_t1bv_64_sse2 0 #x11bdd #x11bdd #x0 #x889f271b #xfc322bc3 #x77dab9f2 #x358ba725)
(fftwf_dft_vrank_geq1_register 0 #x11bdd #x11bdd #x0 #x92ecda84 #x1dc77dac #xe062e745 #xf023d4d0)
(fftwf_dft_vrank_geq1_register 0 #x11bdd #x11bdd #x0 #x9d0ceefc #x925f1a75 #x6fd386ad #x89f85bf3)
(fftwf_dft_vrank_geq1_register 0 #x11bdd #x11bdd #x0 #x6c0133e1 #x5215998c #x76d48e7f #xa909d043)
(fftwf_codelet_t2bv_64_sse2 0 #x11bdd #x11bdd #x0 #xb9da9e1a #xb5ad84a3 #xd98a6c23 #x906c1255)
(fftwf_codelet_t2bv_64_sse2 0 #x11bdd #x11bdd #x0 #xcb762674 #x3c6fceba #x280bf4fb #x5b530e10)
(fftwf_codelet_t1bv_64_sse2 0 #x11bdd #x11bdd #x0 #xd86f0fcc #x804581bb #x5ccfdfba #x14108c49)
(fftwf_codelet_n2bv_16_sse2 0 #x11bdd #x11bdd #x0 #xb479c847 #x56630b2b #xda519e34 #x0a30b689)
(fftwf_codelet_t2bv_64_sse2 0 #x11bdd #x11bdd #x0 #x922d2663 #x4ade7ece #x1bc99685 #x94d5de16)
(fftwf_codelet_r2cf_32 0 #x11bdd #x11bdd #x0 #x1a689b9e #xca8b24d5 #x734a8afd #x606d5286)
(fftwf_codelet_t1bv_16_sse2 0 #x11bdd #x11bdd #x0 #xb070d958 #x23f0478f #x95da449c #x52df1d59)
(fftwf_codelet_n1bv_16_sse2 0 #x11bdd #x11bdd #x0 #x3c52ee85 #xcdd2dc65 #x5441298e #xffc88cb0)
(fftwf_reodft010e_r2hc_register 0 #x113c5 #x11bdd #x0 #x08507b7c #x13b10359 #x869608d2 #xc719a48b)
(fftwf_codelet_hf_64 1 #x11bcd #x11bdd #x0 #xd7ba8a81 #x269d1e2f #x8c825f0b #x2f53c532)
(fftwf_codelet_t1bv_64_sse2 0 #x11bdd #x11bdd #x0 #x5ff9bccd #xdc2a85f0 #x61c44a1b #xc9c85ea2)
(fftwf_codelet_t2bv_4_sse2 0 #x11bdd #x11bdd #x0 #xa33f803b #x1d3d7036 #x2bd76bc9 #x1385cd33)
(fftwf_dft_vrank_geq1_register 0 #x11bdd #x11bdd #x0 #xb2c3c9fa #xd38b8fd8 #xa222b9c2 #x396eb78b)
(fftwf_codelet_r2cf_64 1 #x11bdd #x11bdd #x0 #x88a820ec #x0d7b375a #x138fbac4 #xde79be2b)
(fftwf_codelet_t2bv_64_sse2 0 #x11bdd #x11bdd #x0 #x97fa91c1 #x58b39cce #x9fa121c9 #x61132994)
(fftwf_codelet_n1bv_16_sse2 0 #x11bdd #x11bdd #x0 #xc690c06b #x33906474 #xf19e0361 #xcb49aac4)
(fftwf_codelet_t2bv_8_sse2 0 #x11bdd #x11bdd #x0 #x93f7a20f #xc3b6d212 #x0dc177b9 #x4a0f2eec)
(fftwf_codelet_t1bv_4_sse2 0 #x11bdd #x11bdd #x0 #xd74fa760 #x1a26b430 #x25e78344 #x00a91678)
(fftwf_codelet_n2bv_16_sse2 0 #x11bdd #x11bdd #x0 #xc0ad9768 #xd03118b4 #x1bfbab73 #xbca89889)
(fftwf_codelet_t2bv_64_sse2 0 #x11bdd #x11bdd #x0 #xb9873a67 #x3d2576ec #xaf6939cd #xdf3de0d9)
(fftwf_codelet_t2bv_2_sse2 0 #x11bdd #x11bdd #x0 #xf06b0ac4 #x8ab72ac2 #x63b670d4 #xf83e4dff)
(fftwf_codelet_t3bv_16_sse2 0 #x11bdd #x11bdd #x0 #x951f7cf7 #xfd4266d3 #x48c39d15 #x59a8ae52)
(fftwf_codelet_t2bv_16_sse2 0 #x11bdd #x11bdd #x0 #x71b35416 #x033191ee #x08427f48 #x060813e0)
(fftwf_codelet_n2bv_64_sse2 0 #x11bdd #x11bdd #x0 #x173bfad6 #xfe0ac75c #x930d4dd6 #x56064877)
(fftwf_codelet_n1bv_16_sse2 0 #x11bdd #x11bdd #x0 #x5c16141e #x5a581937 #x6f57b62b #x47af443e)
(fftwf_codelet_n2bv_32_sse2 0 #x11bdd #x11bdd #x0 #x99e60d73 #x0bde6d2d #xf46fd4b1 #xc03f9218)
(fftwf_codelet_r2cf_64 0 #x11bdd #x11bdd #x0 #x6d656ee6 #x53393ec8 #xfa8dd6e0 #x15e5d458)
(fftwf_codelet_t3bv_8_sse2 0 #x11bdd #x11bdd #x0 #x257746e6 #xa31870a8 #xca5b8a3e #x7ba31d58)
(fftwf_dft_vrank_geq1_register 0 #x11bdd #x11bdd #x0 #x1c3789d0 #x88ba3c20 #xd03b591e #x3646e9bd)
(fftwf_codelet_t2bv_16_sse2 0 #x11bdd #x11bdd #x0 #xdecbd8fc #x48c91b31 #x1e3d2434 #xc732f759)
(fftwf_codelet_n2bv_8_sse2 0 #x11bdd #x11bdd #x0 #xbd7ff304 #x3b08865b #x47962b2b #xd1189ffa)
(fftwf_dft_vrank_geq1_register 0 #x11bdd #x11bdd #x0 #xd327ab6e #x6534e8d5 #xa9ed2dc7 #x5db381cb)
(fftwf_codelet_t1bv_8_sse2 0 #x11bdd #x11bdd #x0 #xfc12363b #xc099baef #x0c8f5564 #xd69b3056)
(fftwf_codelet_n2bv_16_sse2 0 #x11bdd #x11bdd #x0 #x2c80e542 #x63fb67a7 #xc4301a5d #x86e57661)
(fftwf_codelet_n1_8 0 #x11bdd #x11bdd #x0 #xb96dd905 #x8e03146d #xc67fe91d #xf994b076)
(fftwf_codelet_n2bv_32_sse2 0 #x11bdd #x11bdd #x0 #x7915b129 #x48243e82 #xe00b783f #xe6fb8a03)
(fftwf_dft_vrank_geq1_register 0 #x11bdd #x11bdd #x0 #x78a4a021 #x277ccb9f #x9b37cf7e #x1c9f910a)
(fftwf_codelet_t2bv_2_sse2 0 #x11bdd #x11bdd #x0 #x9d2a9389 #xa11bb061 #xea224368 #xb6d89bb4)
(fftwf_codelet_n1bv_64_sse2 0 #x11bdd #x11bdd #x0 #x8255687f #x0939e5f2 #xf8ff08f5 #x79a60721)
(fftwf_codelet_r2cfII_32 0 #x11bdd #x11bdd #x0 #xfd958223 #x4fcd2440 #x54c48ff5 #xe150f643)
(fftwf_codelet_hf_32 0 #x11bcd #x11bdd #x0 #xdae2400d #x25a291d8 #x9b06ff05 #xda505df7)
(fftwf_codelet_n1bv_16_sse2 0 #x11bdd #x11bdd #x0 #xc932d0f5 #x76a64631 #x75f502f8 #x0700dd0e)
(fftwf_codelet_r2cfII_64 0 #x11bdd #x11bdd #x0 #xbbbede56 #x008850ac #xd8a24bee #xd88e05c3)
(fftwf_codelet_t1bv_8_sse2 0 #x11bdd #x11bdd #x0 #x1958e352 #xe6d199d6 #x1c52689e #x6918a081)
(fftwf_codelet_t2bv_4_sse2 0 #x11bdd #x11bdd #x0 #xbc91c04d #x38665880 #xcf423043 #x62f00107)
(fftwf_codelet_n2bv_16_sse2 0 #x11bdd #x11bdd #x0 #x09828f7b #x31d0c35d #xa617d809 #x4ccb96bd)
)


Claggy
Wedge009
Volunteer tester
Joined: 3 Apr 99
Posts: 451
Credit: 431,396,357
RAC: 553
Australia
Message 1518176 - Posted: 19 May 2014, 13:04:48 UTC
Last modified: 19 May 2014, 13:17:37 UTC

Wow, a lot has been happening.

I am wondering: who used to do the public Linux builds? I have a feeling that, in this situation at least, I may be more useful as a tester rather than a developer - I don't mind keeping up with various builds. Getting a successful build is one thing, making an optimised build that's better than what I have now might be something else. Main reason I'm wanting to do this is that I'm quite surprised at the drop in performance migrating from WinXP, especially for MB CUDA (x41g vs x41zc). I've done switches between Windows and Linux before, but only on less powerful CPU/GPUs.

Out of the list of dependencies that Claggy had (at least in bold), I was only missing libfftw3-dev. I also noticed I'm using the master copy instead of the 7.2.47 release, so that may be complicating things as well for me, but I'm too tired to try again right now - going to bed after this.

If Guy is successful with his work, I'll probably make a Fedora 19 VM to try building there. Given that I seem to be such a novice at this, step-by-step instructions are most welcome at this point, especially since I'm more familiar with Debian-based systems than Red Hat.

I also noticed that what looks like blanking work is being done for AstroPulse and the commit notes suggest that various optimising switches are in flux, so that complicates things. Still, the prospect of minimal CPU usage on GPU applications even with high-blanking tasks sounds exciting.
Soli Deo Gloria
Claggy
Message 1518205 - Posted: 19 May 2014, 14:18:50 UTC - in response to Message 1518176.  
Last modified: 19 May 2014, 14:19:39 UTC

I am wondering: who used to do the public Linux builds?

Eric does the public builds. I think he is the best compiler guru here at getting the best out of the Stock builds. I, on the other hand, am a noob at compiling apps/clients; I've only been doing it for a few months, and that was my first attempt at building a Stock x86_64 app. I purposely didn't try to use any extra optimisations; all my attempts so far have been to just compile an Arm app, anything as long as it works, so don't be put off.

I've tried using --enable-fast-math, but it looks as if it'll be slower, just.
I'd love to know what Eric puts into his configure; I'll have to make do with what Debian does (suitably modified, hopefully).

Out of the list of dependencies that Claggy had (at least in bold), I was only missing libfftw3-dev. I also noticed I'm using the master copy instead of the 7.2.47 release, so that may be complicating things as well for me, but I'm too tired to try again right now - going to bed after this.

If you want to use 7.2.47 libs just checkout that tag:

git checkout client_release/7.2/7.2.47; git status

Claggy
 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.