[Debian 10] No new tasks
Author | Message |
---|---|
Sam Joined: 6 Sep 08 Posts: 2 Credit: 898,695 RAC: 0 |
Hi! A few days ago I upgraded from Stretch to Buster. Since then, all my finished tasks are flagged with a "computing error" and BOINC does not download new tasks (although it still seems to communicate with the server). I purged and reinstalled BOINC, but it changed nothing. Is there something I can do to fix it? Thank you, Sam |
rob smith Joined: 7 Mar 03 Posts: 22436 Credit: 416,307,556 RAC: 380 |
I've grabbed the text of the stderr file for one of your failed tasks (all the others I've looked at show the same pattern). It looks to me as if it is failing during initialization (there's no run time or CPU time). This is supported by the <message> lines at the end of the report. This generally implies issues with the application... (Passing thought: is your anti-virus software intruding and blocking the applications from initializing properly?) Task: 8008579828; Name: 23jn10ab.30580.13973.12.39.11.vlar_1; Workunit: 3633387745; Created: 31 Aug 2019, 0:01:29 UTC; Sent: 31 Aug 2019, 5:59:43 UTC; Report deadline: 23 Oct 2019, 10:59:25 UTC; Received: 31 Aug 2019, 6:04:53 UTC; Server state: Over; Outcome: Computation error; Client state: Compute error; Exit status: 11 (0x0000000B) Unknown error code; Computer ID: 7936241; Run time: (blank); CPU time: (blank); Validate state: Invalid; Credit: 0.00; Device peak FLOPS: 3.99 GFLOPS; Application version: SETI@home v8 v8.00 x86_64-pc-linux-gnu; Peak disk usage: 0.01 MB; Stderr output: <core_client_version>7.14.2</core_client_version> <![CDATA[ <message> process got signal 11</message> <stderr_txt> SIGSEGV: segmentation violation </stderr_txt> ]]> Bob Smith Member of Seti PIPPS (Pluto is a Planet Protest Society) Somewhere in the (un)known Universe? |
Jord Joined: 9 Jun 99 Posts: 15184 Credit: 4,362,181 RAC: 3 |
Reinstalling BOINC won't fix computing errors, as the computation is done by the Seti science application. Your error is: <core_client_version>7.14.2</core_client_version> <![CDATA[ <message> process got signal 11</message> <stderr_txt> SIGSEGV: segmentation violation </stderr_txt> ]]> Signal 11 is SIGSEGV, segmentation violation: the invalid storage access signal. In any case, it's a problem with your operating system, especially since it worked before you updated your OS and hasn't worked since. So it's best to report it to their developers. |
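A quick way to confirm the reading of "signal 11" in the report above is to ask the shell for the signal's name. This is a minimal sketch, assuming a Linux x86_64 host (where signal 11 is SIGSEGV); sig_name is a hypothetical helper, not part of BOINC.

```shell
# Map a signal number from the task report to its name.
# Assumption: Linux x86_64, where signal 11 is SIGSEGV.
sig_name() {
    kill -l "$1"
}

sig_name 11   # prints SEGV
```

A shell that launches a program killed by this signal typically reports exit status 139, which is 128 + 11.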
Sam |
Hi Rob! I have no antivirus installed. But I read that AppArmor is now installed by default with Debian 10 (I don't know if it was on my 9). Maybe it's part of the problem? |
Radjin Joined: 2 May 00 Posts: 105 Credit: 14,928,529 RAC: 102 |
I am getting these errors as well on a new Debian Buster install. I see a number of references to adding GRUB_CMDLINE_LINUX="vsyscall=emulate" to /etc/default/grub, but this did not work for me. I have read multiple posts all over the web, split about 50/50 on whether it's the running app or an OS problem, with one post suggesting a hardware issue. With so many people running seti@home, you would think this would have been addressed many times. My Linux experience is moderate, but nothing I have tried works. Radjin~ |
Keith Myers Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
I thought the necessity for running the vsyscall kernel command line was at Einstein@home, and that was before they implemented the "use LIBC2.15" applications. I never found it necessary. I just run the normal apps. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Radjin |
"I thought the necessity for running the vsyscall kernel command line was at Einstein@home, and that was before they implemented the "use LIBC2.15" applications. I never found it necessary. I just run the normal apps." That is very possible, although I have read several posts about it, including one blog on setting up seti@home where they added the line to the file. It may be totally unnecessary. Either way, I tried it and it was no help. You mentioned that you thought it was probably a hardware issue with CPU or memory timing, I think. I'm betting you have a ton more experience with tweaking Linux than I do, but as I am stuck until I figure it out, I might as well research every possibility. The challenge is kind of entertaining and I get to meet people who are super geeks in this stuff. Most of what I have read leans toward some sort of software call that points to an unavailable or wrong memory location. I am way out of my league in diagnosing such things on Linux, but there must be someone out there who has resolved it; there are two people here who have the problem right now. The question is what steps we take to isolate the issue, and then how we resolve it. Radjin~ |
Keith Myers |
First, look at what is called out in the system log. On Ubuntu you have the Logs utility, and its Important category shows the faults detected on the system. A lot will be benign, but you are interested in anything to do with a kernel panic or memory issues. I don't know what utility Debian 10 has; hopefully something similar. Second, look at the syslog and kern.log files in /var/log. Both are text files you can read with a text editor. |
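The log check described above can also be done from a terminal. The following is a rough sketch, assuming a default Debian install where kernel messages are available through journalctl and /var/log/kern.log; find_segfaults is a hypothetical helper name, and reading the logs may need root or membership in the adm group.

```shell
# Filter kernel log text for segfault reports.
# Reads log lines on stdin and prints only the lines that mention a segfault.
# Hypothetical helper; "|| true" keeps the exit status at 0 when nothing matches.
find_segfaults() {
    grep -i 'segfault' || true
}

# Typical use (commented out because it depends on the local system):
#   journalctl -k --no-pager | find_segfaults
#   find_segfaults < /var/log/kern.log
```

If a science application really is crashing with SIGSEGV, the kernel normally logs one line per crash naming the process, so an empty result here is itself a useful data point.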
Radjin |
I tried strace with the help of a friend and below is the results: ~/BOINC$ strace -e file ./setiathome_x41p_V0.98b1_x86_64-pc-linux-gnu_cuda101 execve("./setiathome_x41p_V0.98b1_x86_64-pc-linux-gnu_cuda101", ["./setiathome_x41p_V0.98b1_x86_64"...], 0x7ffce3c97770 /* 20 vars */) = 0 access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory) openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3 openat(AT_FDCWD, "/lib/x86_64-linux-gnu/libpthread.so.0", O_RDONLY|O_CLOEXEC) = 3 openat(AT_FDCWD, "/lib/x86_64-linux-gnu/libdl.so.2", O_RDONLY|O_CLOEXEC) = 3 openat(AT_FDCWD, "/lib/x86_64-linux-gnu/librt.so.1", O_RDONLY|O_CLOEXEC) = 3 openat(AT_FDCWD, "/lib/x86_64-linux-gnu/libstdc++.so.6", O_RDONLY|O_CLOEXEC) = 3 openat(AT_FDCWD, "/lib/x86_64-linux-gnu/libm.so.6", O_RDONLY|O_CLOEXEC) = 3 openat(AT_FDCWD, "/lib/x86_64-linux-gnu/libgcc_s.so.1", O_RDONLY|O_CLOEXEC) = 3 openat(AT_FDCWD, "/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3 stat("stderr.txt", {st_mode=S_IFREG|0644, st_size=588, ...}) = 0 openat(AT_FDCWD, "stderr.txt", O_WRONLY|O_CREAT|O_APPEND, 0666) = 3 openat(AT_FDCWD, "init_data.xml", O_RDONLY) = -1 ENOENT (No such file or directory) stat("init_data.xml", 0x7ffc199f17b0) = -1 ENOENT (No such file or directory) openat(AT_FDCWD, "/etc/localtime", O_RDONLY|O_CLOEXEC) = 3 openat(AT_FDCWD, "boinc_lockfile", O_WRONLY|O_CREAT, 0664) = 3 stat("init_data.xml", 0x7ffc199f1490) = -1 ENOENT (No such file or directory) lstat("work_unit.sah", 0x7ffc199f11f0) = -1 ENOENT (No such file or directory) stat("work_unit.sah", 0x7ffc199f11c0) = -1 ENOENT (No such file or directory) stat("work_unit.sah", 0x7ffc199f1400) = -1 ENOENT (No such file or directory) --- SIGALRM {si_signo=SIGALRM, si_code=SI_KERNEL} --- --- SIGALRM {si_signo=SIGALRM, si_code=SI_KERNEL} --- --- SIGALRM {si_signo=SIGALRM, si_code=SI_KERNEL} --- --- SIGALRM {si_signo=SIGALRM, si_code=SI_KERNEL} --- --- SIGALRM {si_signo=SIGALRM, si_code=SI_KERNEL} --- --- SIGALRM {si_signo=SIGALRM, 
si_code=SI_KERNEL} --- --- SIGALRM {si_signo=SIGALRM, si_code=SI_KERNEL} --- --- SIGALRM {si_signo=SIGALRM, si_code=SI_KERNEL} --- --- SIGALRM {si_signo=SIGALRM, si_code=SI_KERNEL} --- --- SIGALRM {si_signo=SIGALRM, si_code=SI_KERNEL} --- --- SIGALRM {si_signo=SIGALRM, si_code=SI_KERNEL} --- --- SIGALRM {si_signo=SIGALRM, si_code=SI_KERNEL} --- --- SIGALRM {si_signo=SIGALRM, si_code=SI_KERNEL} --- --- SIGALRM {si_signo=SIGALRM, si_code=SI_KERNEL} --- --- SIGALRM {si_signo=SIGALRM, si_code=SI_KERNEL} --- --- SIGALRM {si_signo=SIGALRM, si_code=SI_KERNEL} --- --- SIGALRM {si_signo=SIGALRM, si_code=SI_KERNEL} --- --- SIGALRM {si_signo=SIGALRM, si_code=SI_KERNEL} --- --- SIGALRM {si_signo=SIGALRM, si_code=SI_KERNEL} --- openat(AT_FDCWD, "boinc_finish_called", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 4 --- SIGALRM {si_signo=SIGALRM, si_code=SI_KERNEL} --- stat("boinc_lockfile", {st_mode=S_IFREG|0644, st_size=0, ...}) = 0 unlink("boinc_lockfile") = 0 +++ exited with 0 +++ Radjin~ |
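When reading a trace like the one above, it can help to tally the failing calls by errno name: ENOENT means the file was not found, while a permission problem would normally appear as EACCES or EPERM. This is a small sketch that works on saved strace text; count_errnos and the trace.txt file name are hypothetical.

```shell
# Count failing syscalls in saved strace output, grouped by errno name.
# Reads strace text on stdin; prints "<count> <ERRNO>" for each errno seen.
# Hypothetical helper, not part of strace or BOINC.
count_errnos() {
    grep -o '= -1 [A-Z]*' | awk '{ print $3 }' | sort | uniq -c | awk '{ print $1, $2 }'
}

# Typical use (commented out because it needs the application binary):
#   strace -e trace=file -o trace.txt ./setiathome_x41p_V0.98b1_x86_64-pc-linux-gnu_cuda101
#   count_errnos < trace.txt
```

Note also that the trace above ends with "exited with 0", so this particular run, started by hand outside a BOINC slot directory, did not reproduce the signal 11 crash.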
Keith Myers |
That seems to be a permission problem on the slots directories for BOINC tasks. Since you are running a repository version of BOINC, permissions get complicated. For each GPU task, for example, the necessary files have to be copied into the slot that the GPU work unit is copied to: files like the OpenCL resource .CL file. The gcc compiler has to compile the compute kernel primitives to the compute cache in your /home directory. If you take a look at the very first GPU task you completed, you will see it compiling the compute kernel in the stderr.txt output for that task. It only has to do that for the very first task under each new GPU graphics driver, though. After that, the OpenCL .wisdom files are copied into each slot containing a new GPU task that is crunched. I just went to look for that task and see you blew your installation away again by upgrading the kernel, and once again you have no GPU detected. |
Radjin |
That last post was while running the Berkeley version you turned me on to. I did switch back to the repository version to solve a number of other issues, but the compute error persists. I never had my GPU installed; sorting that out is on the list. You and my friend both picked out issues from that scan like magic. I can see I have a long way to go to get to your level of understanding. Radjin~ |
Radjin |
How do I confirm that vsyscall=emulate is in effect? I tried a number of commands that said file not found. Radjin~ |
Keith Myers |
"How do I confirm that vsyscall=emulate is in effect? I tried a number of commands that said file not found." vsyscall is not a file. It is a kernel parameter. You would check it via: cat /usr/src/linux-headers-$(uname -r)/.config | grep VSYSCALL and the output would indicate whether it is active or not. On my system: cat /usr/src/linux-headers-$(uname -r)/.config | grep VSYSCALL CONFIG_GENERIC_TIME_VSYSCALL=y CONFIG_X86_VSYSCALL_EMULATION=y CONFIG_LEGACY_VSYSCALL_EMULATE=y # CONFIG_LEGACY_VSYSCALL_NONE is not set |
Keith Myers |
"That last post was while running the Berkeley version you turned me on to. I did switch back to the repository version to solve a number of other issues, but the compute error persists." Sorry, I confused who I was replying to. I am helping Fizz get his AMD RX 470 working in BOINC. At least that was successful. |
Radjin |
"How do I confirm that vsyscall=emulate is in effect? I tried a number of commands that said file not found." I tried that command, and the result was: cat: /usr/src/linux-headers-4.19.0-6-amd64/.config: No such file or directory. Once you put the line in /etc/default/grub, is there something more than a reboot needed to activate it? Radjin~ |
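The "No such file" error above only means the kernel headers package is not installed. The boot parameters the running kernel actually received are listed in /proc/cmdline, so that is a more direct place to look. This is a sketch under that assumption; has_vsyscall_emulate is a hypothetical helper, and the match is case-sensitive because kernel parameters are.

```shell
# Return success if the given kernel command line contains vsyscall=emulate.
# Hypothetical helper; matches the whole word only, and is case-sensitive.
has_vsyscall_emulate() {
    case " $1 " in
        *" vsyscall=emulate "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Check the running kernel's command line, if it is readable.
if [ -r /proc/cmdline ]; then
    if has_vsyscall_emulate "$(cat /proc/cmdline)"; then
        echo "vsyscall=emulate is on the running kernel command line"
    else
        echo "vsyscall=emulate is not on the running kernel command line"
    fi
fi

# Further checks (commented out because they depend on the local system):
#   grep VSYSCALL "/boot/config-$(uname -r)"   # compile-time defaults; Debian ships this file with the kernel image
#   grep vsyscall /proc/self/maps              # a [vsyscall] line suggests the legacy page is mapped
```

The /boot/config file shows only what the kernel was built with as its default; a parameter on the command line overrides that default at boot.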
Keith Myers |
From what I have read about the issue, you shouldn't even need to add that parameter as vsyscall=emulate is the default now in all modern kernels. Notice my output. I don't have the parameter in my grub kernel command line. That was only needed back when glibc was at the 2.11 level and gcc6 was the default compiler. |
Radjin |
I mostly agree, but as I don't see another way to eliminate the error, I thought it worth a try. I can always reverse the setting if it does not help. Grub seems the wrong place to put the line, and since I don't get the expected response, I am guessing that is true. Where would I put the kernel option? Radjin~ |
Keith Myers |
"I mostly agree, but as I don't see another way to eliminate the error, I thought it worth a try. I can always reverse the setting if it does not help." No. Grub is the correct place to put the kernel command parameter. GRUB_CMDLINE_LINUX_DEFAULT="VSYCALL=EMULATE" |
Radjin |
"I mostly agree, but as I don't see another way to eliminate the error, I thought it worth a try. I can always reverse the setting if it does not help." Awesome, thanks for the confirmation there. So the line above is a command to place it in grub? I gave it a shot and rebooted. Running: cat /usr/src/linux-headers-$(uname -r)/.config | grep VSYSCALL I get: cat: /usr/src/linux-headers-4.19.0-6-amd64/.config: No such file or directory Radjin~ |
Keith Myers |
Well, I see I typoed. But you don't do anything other than put vsyscall=emulate (lowercase) between the quotes of the already existing GRUB_CMDLINE_LINUX_DEFAULT="" line. Then save the file and issue sudo update-grub. |
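For anyone who prefers not to edit the file by hand, the step described above can be sketched as a small shell function. This is only an illustration: add_kernel_param is a hypothetical helper, it assumes GNU sed (for -i) and a file containing a single GRUB_CMDLINE_LINUX_DEFAULT="..." line, and it is best tried on a copy of /etc/default/grub first.

```shell
# Append a kernel parameter inside the quotes of GRUB_CMDLINE_LINUX_DEFAULT
# in the given file, unless that parameter is already present.
# Hypothetical helper; assumes GNU sed and a parameter without "/" characters.
add_kernel_param() {
    file=$1
    param=$2
    # Do nothing if the parameter is already on the line.
    if grep -q "^GRUB_CMDLINE_LINUX_DEFAULT=\".*$param" "$file"; then
        return 0
    fi
    # Insert the parameter just before the closing quote.
    sed -i "s/^GRUB_CMDLINE_LINUX_DEFAULT=\"\(.*\)\"/GRUB_CMDLINE_LINUX_DEFAULT=\"\1 $param\"/" "$file"
    # Tidy the leading space left behind when the list was empty.
    sed -i "s/^GRUB_CMDLINE_LINUX_DEFAULT=\" /GRUB_CMDLINE_LINUX_DEFAULT=\"/" "$file"
}

# Typical use, as root (commented out because it changes boot configuration):
#   cp /etc/default/grub /etc/default/grub.bak
#   add_kernel_param /etc/default/grub vsyscall=emulate
#   update-grub
#   reboot
```

After the reboot, the parameter should show up in /proc/cmdline if the change took effect.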