Compute errors Linux "stock" app

rob smith (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer moderator
Volunteer tester

Joined: 7 Mar 03
Posts: 22158
Credit: 416,307,556
RAC: 380
United Kingdom
Message 1884938 - Posted: 19 Aug 2017, 18:30:57 UTC - in response to Message 1884934.  

As the OP can I remind folks that the issue I reported was with CPU tasks, NOT GPU, so please start another thread for similar GPU issues.
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?

Suzuki
Volunteer tester
Joined: 17 Sep 01
Posts: 318
Credit: 4,474,402
RAC: 1
United Kingdom
Message 1884940 - Posted: 19 Aug 2017, 18:40:27 UTC - in response to Message 1884938.  

> As the OP can I remind folks that the issue I reported was with CPU tasks, NOT GPU, so please start another thread for similar GPU issues.


My posts are not about GPU issues, so I am on-thread.

Steve.

Juha
Volunteer tester
Joined: 7 Mar 04
Posts: 388
Credit: 1,857,738
RAC: 0
Finland
Message 1884944 - Posted: 19 Aug 2017, 19:47:52 UTC - in response to Message 1884923.  

> does the app need rewriting or is there some setting that can be changed to work around this?


Did you try "vsyscall=emulate"? To try it once, you can add it to the kernel command line from Grub's boot menu: highlight a menu entry and press "e" to edit the command. "vsyscall=emulate" goes at the end of the "linux" line. (If the line ends in "--", remove that.)

Setting it permanently depends on what distro you run and what distro it is based on.

As far as I know, if "vsyscall=none" (vsyscall disabled) becomes the default setting in all distros, instead of something they are just testing, the app will have to be recompiled.
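
To see which vsyscall mode a particular boot actually asked for, something like the following could be used. It is only a minimal sketch: `vsyscall_mode` is a made-up helper name, not part of BOINC or any distro tool, and it looks only at the kernel command line, so "unset" just means the kernel's compiled-in default applies.

```shell
# Report the vsyscall mode requested on a kernel command line.
# $1 (optional): a command line string; defaults to the running kernel's.
vsyscall_mode() {
    cmdline="${1:-$(cat /proc/cmdline 2>/dev/null || true)}"
    mode="unset"
    # Walk the space-separated options; the last vsyscall= seen wins.
    for option in $cmdline; do
        case "$option" in
            vsyscall=*) mode="${option#vsyscall=}" ;;
        esac
    done
    printf '%s\n' "$mode"
}

vsyscall_mode 'root=/dev/sda1 ro quiet vsyscall=emulate'   # prints "emulate"
```

Called with no argument, it reads /proc/cmdline, which is a quick way to confirm that the edit made in the Grub menu took effect for the current boot.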

Suzuki
Message 1884946 - Posted: 19 Aug 2017, 19:53:28 UTC - in response to Message 1884944.  

> > does the app need rewriting or is there some setting that can be changed to work around this?
>
> Did you try "vsyscall=emulate"? To try it once, you can add it to the kernel command line from Grub's boot menu: highlight a menu entry and press "e" to edit the command. "vsyscall=emulate" goes at the end of the "linux" line. (If the line ends in "--", remove that.)
>
> Setting it permanently depends on what distro you run and what distro it is based on.
>
> As far as I know, if "vsyscall=none" (vsyscall disabled) becomes the default setting in all distros, instead of something they are just testing, the app will have to be recompiled.


I didn't - apologies. I shall try tomorrow, if I get chance, and report back.

Steve.

ML1
Volunteer moderator
Volunteer tester
Joined: 25 Nov 01
Posts: 20140
Credit: 7,508,002
RAC: 20
United Kingdom
Message 1885247 - Posted: 21 Aug 2017, 11:48:54 UTC
Last modified: 21 Aug 2017, 12:34:28 UTC

Still problems? Or all running ok now? (For what fix?)


Going back to the original error:

One of my Linux boxes threw a pile of errors yesterday evening. All are basically the same message:

<core_client_version>7.6.31</core_client_version>
<![CDATA[
<message>
process exited with code 193 (0xc1, -63)
</message>
<stderr_txt>
setiathome_v8 8.00 Revision: 3335 g++ (GCC) 4.4.7 20120313 (Red Hat 4.4.7-4)
libboinc: BOINC 7.7.0

[...]

Task details:

Computer: 8317875
Task: 5931085984
Work Unit: 2633071782
Date/Time 9 Aug 2017, 14:32:42 UTC (send)
Date/time 9 Aug 2017, 14:49:37 UTC (return)
Message: Error while computing
Run time: 140.98
CPU time: 139.94
Version: --- SETI@home v8 v8.05 i686-pc-linux-gnu
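
The three numbers in the quoted exit message are one value written three ways. A quick sketch of the arithmetic, assuming the negative figure is simply the same byte read as a signed 8-bit number (that reading is an assumption, and `exit_code_forms` is a made-up helper name):

```shell
# Print an exit code as decimal, hex, and signed 8-bit.
exit_code_forms() {
    code="$1"
    signed="$code"
    # Values above 127 wrap to negative when read as a signed byte.
    if [ "$code" -gt 127 ]; then
        signed=$((code - 256))
    fi
    printf '%d (0x%x, %d)\n' "$code" "$code" "$signed"
}

exit_code_forms 193   # prints "193 (0xc1, -63)"
```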

For a giggle I've updated one of my old machines on Gentoo to run a test with (abbreviated):

Linux 4.12.6-gentoo

Starting BOINC client version 7.8.1 for x86_64-pc-linux-gnu
Libraries: libcurl/7.55.0 OpenSSL/1.0.2k zlib/1.2.11
No usable GPUs found
Version change (7.6.33 -> 7.8.1)


[SETI@home] Starting task 12mr08af.12052.64154.9.36.24_1
Application version SETI@home v8 v8.05 i686-pc-linux-gnu


Using the standard Berkeley app, it's a few hours in and all is well.


(Nothing silly like a problem with overclocking or overheating? Or an old PSU?...)

Happy crunchin',
Martin
See new freedom: Mageia Linux
Take a look for yourself: Linux Format
The Future is what We all make IT (GPLv3)

rob smith
Message 1885249 - Posted: 21 Aug 2017, 12:18:24 UTC - in response to Message 1885247.  

Thanks for trying Martin.
The problem has "receded into the background" for now, having installed optimised apps (AVX).
The PSU is fairly new, CPU is running at stock settings.
I'll not be trying anything "new" on that one for a couple of weeks as I'm in "fly-by" mode, so no time to get a monitor, keyboard etc hooked up - and the machine the "right" side of the in-house router to be accessed easily (I'm relying on SETI's reporting system to tell me if anything goes wrong).

ML1
Message 1885250 - Posted: 21 Aug 2017, 12:36:48 UTC - in response to Message 1885249.  

No worries then :-)

Good luck and happy crunchin',
Martin

Suzuki
Message 1885348 - Posted: 21 Aug 2017, 20:51:25 UTC - in response to Message 1885250.  

> No worries then :-)

My machine is still throwing the errors on a stock, non-GPU app. I haven't had chance to try the vsyscall setting yet as I'm working away from the machine at present.

The initial problem seemed to be OS-specific. Is this not now the case?

Steve.

rob smith
Message 1885352 - Posted: 21 Aug 2017, 20:59:56 UTC

The initial problem was under Linux Mint 18.2, running stock, with stock clock speeds on the CPU.

Suzuki
Message 1885354 - Posted: 21 Aug 2017, 21:04:58 UTC - in response to Message 1885352.  

> The initial problem was under Linux Mint 18.2, running stock, with stock clock speeds on the CPU.


Sorry, I'd read it as a Linux box, rather than a specific Mint 18.2 issue. My Kali 4.11, running stock app & clock speeds, is having the same problem.

Steve.

ML1
Message 1885450 - Posted: 22 Aug 2017, 22:39:37 UTC - in response to Message 1885247.  
Last modified: 22 Aug 2017, 22:40:48 UTC

> [SETI@home] Starting task 12mr08af.12052.64154.9.36.24_1
> Application version SETI@home v8 v8.05 i686-pc-linux-gnu
>
> Using the standard Berkeley app, it's a few hours in and all is well.

And it completed fine:

<core_client_version>7.8.1</core_client_version>
<![CDATA[
<stderr_txt>
setiathome_v8 8.00 Revision: 3335 g++ (GCC) 4.4.7 20120313 (Red Hat 4.4.7-4)
libboinc: BOINC 7.7.0

Work Unit Info:
...............
WU true angle range is :  0.389075
Optimal function choices:
--------------------------------------------------------
                            name   timing   error
--------------------------------------------------------
                v_BaseLineSmooth (no other)
    v_vGetPowerSpectrumUnrolled2 0.000504 0.00000 
              sse2_ChirpData_ak8 0.035118 0.00000 
                 fftwf_transpose 0.012867 0.00000 
                  BH SSE folding 0.002697 0.00000

[...]

Spike count:    15
Autocorr count: 0
Pulse count:    2
Triplet count:  7
Gaussian count: 0
04:48:07 (21865): called boinc_finish(0)

</stderr_txt>
]]>


So... Good there are no errors.

Note that the line "setiathome_v8 8.00 Revision: 3335 g++ (GCC) 4.4.7 20120313 (Red Hat 4.4.7-4) libboinc: BOINC 7.7.0" reflects the compiler and BOINC library that were used at Berkeley to build the s@h app, not what is installed on my machine.


Happy crunchin',
Martin

ML1
Message 1885451 - Posted: 22 Aug 2017, 22:45:17 UTC - in response to Message 1885354.  
Last modified: 22 Aug 2017, 22:46:24 UTC

> > The initial problem was under Linux Mint 18.2, running stock, with stock clock speeds on the CPU.
>
> Sorry, I'd read it as a Linux box, rather than a specific Mint 18.2 issue. My Kali 4.11, running stock app & clock speeds, is having the same problem.

A curious one...

Do you know what kernel and glibc you are running?

To find out, use the commands in a terminal:

uname -a

file file /lib64/libc*



Or does a system update clear the problem?

Or... A problem with overclocking or overheating? Or an old PSU?...


Happy crunchin',
Martin

rob smith
Message 1885635 - Posted: 23 Aug 2017, 18:04:52 UTC
Last modified: 23 Aug 2017, 18:08:25 UTC

uname reports:
Linux linux-x64 4.8.0-53-generic #56-16.04.1 Ubuntu SMP Tue May 16 01:18:56 UTC 2017 x86_64 x86_64 GNU/Linux


and "file file /lib64/libc*" gives an error saying it can't find the command/file "file", or "lib64/libc*"
(/lib64 doesn't have anything called libc)

However, if I look in /lib32 I find that libc appears to be version 2.23

Suzuki
Message 1885739 - Posted: 24 Aug 2017, 8:02:01 UTC - in response to Message 1885451.  

> Do you know what kernel and glibc you are running?


My uname -a gives me:

Linux KALI 4.11.0-kali1-amd64 #1 SMP Debian 4.11.6-1kali1 (2017-06-21) x86_64 GNU/Linux


And I used "ldd --version" to get:

ldd (Debian GLIBC 2.24-12) 2.24


Any useful info in there?

Steve.

ML1
Message 1885867 - Posted: 24 Aug 2017, 23:42:56 UTC

Sorry for the typo, the instructions should have been:

uname -a

file /lib64/libc*
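
For anyone gathering the same details, here is a small sketch that prints the kernel release and, where it can, the glibc version in one go. `report_versions` is a hypothetical helper name; the second line assumes a glibc-based system (`getconf GNU_LIBC_VERSION` is a glibc feature) and falls back to a note otherwise. On Debian-style multiarch systems libc usually lives under /lib/x86_64-linux-gnu rather than /lib64, so the `file /lib64/libc*` glob may match nothing there.

```shell
# Print the running kernel release and the glibc version, if known.
report_versions() {
    printf 'kernel: %s\n' "$(uname -r)"
    # getconf prints something like "glibc 2.24" on glibc systems.
    if libc_version=$(getconf GNU_LIBC_VERSION 2>/dev/null); then
        printf 'libc: %s\n' "$libc_version"
    else
        printf 'libc: unknown (getconf GNU_LIBC_VERSION not available)\n'
    fi
}

report_versions
```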


And for both of you... You're recent enough with your versions that there should be no problem!

(eg: I'm on libc version 2.23)


So...

Is this WU specific?

Does Memtest86+ and "GIMPS Prime Torture Test" complete ok?

Or something else?


Any other clues?

Keep searchin',
Martin

Suzuki
Message 1885938 - Posted: 25 Aug 2017, 8:44:40 UTC - in response to Message 1885867.  
Last modified: 25 Aug 2017, 8:59:38 UTC


> Is this WU specific?
>
> Does Memtest86+ and "GIMPS Prime Torture Test" complete ok?
>
> Or something else?
>
> Any other clues?

I tried Memtest and my machine wouldn't boot into it - Grub was modified but it got no further when trying to boot into it. I don't have a USB pen or CDR to make any other bootable media.

GIMPS is running fine - it's pretty silent on its output but it is running happily. There's "Self-test 448k passed!" messages from time to time.

I don't think this is WU specific. This box went from merrily crunching lots of units to trashing *everything* it receives. Unlikely to be the PSU as there's no other issues - the same box crunches units when booted into Windows without any problems at all. There's nothing overclocked; everything's stock.

I haven't tried modifying the boot script with the vsyscall=emulate thing. Is that the next sensible step?

Steve.

Suzuki
Message 1885939 - Posted: 25 Aug 2017, 8:57:34 UTC - in response to Message 1884944.  

> Did you try "vsyscall=emulate"?


Just did that and, so far (we're 100 seconds into one unit), it is working. Previously, it wouldn't complete a second of computation before raising a computation error.

Thanks for the advice!

Steve.

Suzuki
Message 1885942 - Posted: 25 Aug 2017, 8:59:02 UTC - in response to Message 1885867.  



> Or something else?
>
> Any other clues?


Adding vsyscall=emulate to my boot script seems to have fixed this. :-)
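
Making the option stick across reboots is distro-specific. On Debian-based systems (Kali included) the usual place is the GRUB_CMDLINE_LINUX_DEFAULT line in /etc/default/grub, followed by running update-grub. The sketch below shows only the edit itself, run against a scratch copy; `add_vsyscall_emulate` is a hypothetical helper, and it assumes the variable sits on a single double-quoted line.

```shell
# Append vsyscall=emulate to GRUB_CMDLINE_LINUX_DEFAULT in a GRUB defaults file.
# $1: path to the file, e.g. a copy of /etc/default/grub
add_vsyscall_emulate() {
    grub_file="$1"
    # Leave the file alone if any vsyscall= option is already present.
    if grep -q '^GRUB_CMDLINE_LINUX_DEFAULT=.*vsyscall=' "$grub_file"; then
        return 0
    fi
    # Insert the option just before the closing quote; keep a .bak backup.
    sed -i.bak \
        's/^\(GRUB_CMDLINE_LINUX_DEFAULT="[^"]*\)"/\1 vsyscall=emulate"/' \
        "$grub_file"
}

# Demonstration on a scratch file rather than the real configuration.
scratch=$(mktemp)
printf 'GRUB_DEFAULT=0\nGRUB_CMDLINE_LINUX_DEFAULT="quiet splash"\n' > "$scratch"
add_vsyscall_emulate "$scratch"
grep '^GRUB_CMDLINE_LINUX_DEFAULT=' "$scratch"
# prints: GRUB_CMDLINE_LINUX_DEFAULT="quiet splash vsyscall=emulate"
```

Working on a copy first and comparing it with the original is a cheap safeguard, since a malformed /etc/default/grub can leave the boot menu in an awkward state.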

Steve.

ML1
Message 1886194 - Posted: 26 Aug 2017, 13:11:00 UTC - in response to Message 1885942.  
Last modified: 26 Aug 2017, 13:14:40 UTC

> > Or something else?
> >
> > Any other clues?
>
> Adding vsyscall=emulate to my boot script seems to have fixed this. :-)
>
> Steve.

Very good find! Thanks for finding that :-)

So... Why is that not a problem for my old system?!...

Taking a quick look for the old Gentoo setup I have gives:

$ zgrep -i vsyscall /proc/config.gz 
CONFIG_GENERIC_TIME_VSYSCALL=y
CONFIG_X86_VSYSCALL_EMULATION=y
# CONFIG_LEGACY_VSYSCALL_NATIVE is not set
CONFIG_LEGACY_VSYSCALL_EMULATE=y
# CONFIG_LEGACY_VSYSCALL_NONE is not set


So "CONFIG_X86_VSYSCALL_EMULATION=y" and "CONFIG_LEGACY_VSYSCALL_EMULATE=y" explain why there is no problem on my old system.
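
To pull out just the compiled-in legacy vsyscall default from a kernel config, a filter along these lines might do. It is only a sketch: `legacy_vsyscall_default` is a hypothetical name, and it assumes the `CONFIG_LEGACY_VSYSCALL_*` option names shown above, which may differ between kernel versions.

```shell
# Read kernel config text on stdin and print the enabled legacy
# vsyscall mode in lower case (e.g. "emulate" or "none").
legacy_vsyscall_default() {
    sed -n 's/^CONFIG_LEGACY_VSYSCALL_\([A-Z]*\)=y$/\1/p' |
        tr '[:upper:]' '[:lower:]'
}

# Fed with the config lines quoted above:
printf '%s\n' \
    'CONFIG_X86_VSYSCALL_EMULATION=y' \
    '# CONFIG_LEGACY_VSYSCALL_NATIVE is not set' \
    'CONFIG_LEGACY_VSYSCALL_EMULATE=y' \
    '# CONFIG_LEGACY_VSYSCALL_NONE is not set' |
    legacy_vsyscall_default   # prints "emulate"
```

On a live system the input could come from `zcat /proc/config.gz` where the kernel exposes it; many distro kernels instead ship the config as a file under /boot named after the kernel release.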

Considering the local exploits that the (very old) vsyscall page makes possible, I should update my kernel config to disable it by setting it to "NONE", as some recent distro kernels now do by default.

See: security things in Linux v4.4. Also, Einstein stumbled across the same problem: vsyscall is now disabled on latest Linux distros.

The latest Linux kernel is now up at version 4.14.


So... Does anyone know the procedure for getting the devs to compile an updated version of the Linux app that does not rely on the retired vsyscall?

I strongly suspect the only change needed is to compile against a more recent version of glibc... As Linux users update their systems, the old vsyscall is going to become more of a show stopper.


Happy crunchin',
Martin

ML1
Message 1887276 - Posted: 1 Sep 2017, 13:40:29 UTC - in response to Message 1886194.  

FYI:

Request for the code to be updated made over on:

VSYSCALL deprecated - s@h client recompile needed


Happy crunchin',
Martin