Kernel error on a Raspberry Pi Rev2 - what to do now
Author | Message |
---|---|
RonSC Joined: 23 Dec 16 Posts: 4 Credit: 6,145 RAC: 0 |
Hello, After setting up boinc on a Raspberry Pi Rev2, I started its first SaH work unit. Soon after, I observed a kernel error, as follows: Task: I decided to increase the RAM allocation to 224MB and restarted the RPi, but the same kernel error occurred again. Lastly, I removed a Gertboard plug-in card to reduce power demand. After booting up the RPi, the same kernel error happened again. The 3 kernel error occurrences revealed a pattern: they all happened a few minutes after a new work unit was started, and also after an RPi restart. The RPi is running a freshly installed and upgraded Raspbian Jessie version and kernel. Interestingly, this RPi-Rev2 has 512MB RAM and is having kernel errors, while my Raspbian-based RPi-Rev1 with 256MB RAM seems to be running error-free. Oddly, the average CPU loads given by /proc/loadavg are 0.0 (1 min), 0.0 (5 min), 0.0 (15 min), while get_tasks shows the work unit as executing. With about 420,000 secs of CPU time left (4.8 days), I think this task should be killed. My inclination is to do a --project URL detach (I log in to my headless RPi systems via ssh), but that means the problem remains unsolved :(. Do you need more information? I'll leave it running for now until I get your advice. Cheers, Ron |
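A compute-bound task that is really running normally pushes the 1-minute load average towards 1.0 per busy core, so the 0.0 values reported above alongside an "executing" task are one sign of a hung work unit. A minimal sketch of that check follows; the `load_state` helper and the 0.10 threshold are illustrative assumptions, not part of BOINC.

```shell
#!/bin/sh
# Sketch: flag a possibly hung task from the 1-minute load average.
# THRESHOLD is an arbitrary assumption; tune it for your machine.
THRESHOLD="${THRESHOLD:-0.10}"

# Print "idle" if the first field of a /proc/loadavg-style line is
# below THRESHOLD, otherwise print "busy".
load_state() {
    echo "$1" | awk -v t="$THRESHOLD" '{ if ($1 + 0 < t + 0) print "idle"; else print "busy" }'
}

# On a Linux host, report the live value.
if [ -r /proc/loadavg ]; then
    echo "current load: $(load_state "$(cat /proc/loadavg)")"
fi
```

An "idle" result while `boinccmd --get_tasks` still lists the task as executing matches the symptom described in the post.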
klammerj Joined: 3 Jan 08 Posts: 4 Credit: 500,895 RAC: 0 |
It said undefined instruction... maybe they got the binaries mixed up.. or wrong compilation flags... It's unlikely to be the amount of ram installed... |
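One way to test the guess above is to compare the CPU's architecture level with the level the science application was built for. The sketch below is only an illustration under stated assumptions: `arch_level` is a hypothetical helper, `APP` must be set by hand to the path of the application binary (the thread does not give it), and `readelf` comes from binutils.

```shell
#!/bin/sh
# Hypothetical helper: map `uname -m` output to an ARM architecture level.
arch_level() {
    case "$1" in
        armv6*) echo 6 ;;
        armv7*) echo 7 ;;
        armv8*|aarch64) echo 8 ;;
        *) echo unknown ;;
    esac
}

echo "CPU architecture level: $(arch_level "$(uname -m)")"

# APP is an assumption: set it yourself to the science application binary.
if [ -n "${APP:-}" ] && command -v readelf >/dev/null 2>&1; then
    # The ARM build attributes include a Tag_CPU_arch line (e.g. v6, v7).
    readelf -A "$APP" | grep Tag_CPU_arch
fi
```

If the binary reports a higher level than the CPU (for example v7 on an ARMv6 board), an undefined-instruction fault would be a plausible outcome, which fits the "wrong compilation flags" theory.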
RonSC Joined: 23 Dec 16 Posts: 4 Credit: 6,145 RAC: 0 |
Thanks for the diagnosis, klammerj. ITEM 1: I see that the "current cpu time" is stalled, so I assume the task will just sit there until its deadline. I would like to kill the offending task and start another. Is the following approach best? $ boinccmd --host RPi-Rev2 --project URL reset (the Boinccmd Tool Manual describes reset as: delete current work and get more). ITEM 2: Would I be correct to say that a standard system upgrade (apt-get), which might update boinc itself, would cause problems for the currently executing task? ITEM 3: If I need to do some system maintenance, I would like to pause project work. Which of the following methods would work best? $ boinccmd --host RPi-Rev2 --project URL suspend followed by $ boinccmd --host RPi-Rev2 --project URL resume, or should I do $ boinccmd --host RPi-Rev2 --project URL nomorework followed by $ boinccmd --host RPi-Rev2 --project URL allowmorework? And what is the difference? The Manual doesn't explain. Thanks, Ron |
Jord Joined: 9 Jun 99 Posts: 15184 Credit: 4,362,181 RAC: 3 |
Suspend and Resume just pause and resume the project; all of its tasks are affected. Nomorework tells the client not to fetch any more tasks from the project, and Allowmorework lets it fetch new work again. But I have checked your computers. The Raspberry Pi v1 is the single core, I think? It hasn't finished any tasks yet. The Raspberry V2 is the quad core? It seems to run the work fine; there are no errors. Work is being returned and validated. So there's no need for rash decisions here, and certainly not for resetting the project. All that does is throw away the work in cache. It may reset the application to crunch with, but it will then download the same one afterwards. |
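The two pairs of operations described above can be sketched as a small wrapper around the `boinccmd` commands quoted in the thread. This is a minimal illustration, not an official recipe: the function names, the `BOINC_HOST`, `PROJECT_URL` and `DRY_RUN` variables, and their defaults are assumptions, and `URL` stays a placeholder exactly as in the posts. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: pause BOINC project work around system maintenance.
# BOINC_HOST and PROJECT_URL are placeholders (assumptions).
BOINC_HOST="${BOINC_HOST:-RPi-Rev2}"
PROJECT_URL="${PROJECT_URL:-URL}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the command, 0 = run it

# Print the boinccmd invocation; execute it only when DRY_RUN=0.
project_op() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "boinccmd --host $BOINC_HOST --project $PROJECT_URL $1"
    else
        boinccmd --host "$BOINC_HOST" --project "$PROJECT_URL" "$1"
    fi
}

# Short maintenance: freeze all tasks of the project, continue later.
pause_project()  { project_op suspend; }
resume_project() { project_op resume; }

# Longer maintenance: finish cached tasks but fetch no new ones.
stop_fetching()  { project_op nomorework; }
start_fetching() { project_op allowmorework; }
```

For a quick reboot, `pause_project` before and `resume_project` after is enough; `stop_fetching` suits the case where the cache should drain first. A remote client may also require `--passwd`, which the commands in the thread omit.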
RonSC Joined: 23 Dec 16 Posts: 4 Credit: 6,145 RAC: 0 |
Hello, I don't know if the info below is useful, in general, for the project; please let me know if it is not. My id is RonSC, user id is 10419846, computer name is RPi-Rev1 (single core), computer id is 8179761. The task (name: 13fe09aa.27929.4162.10.37.195_1) is hung, i.e. current CPU time is not incrementing and the CPU load is 0.02 (15 min avg). To avoid wasting time waiting until the deadline, I plan to reset the task and resume processing as soon as I have posted this note. Hopefully, this info might help to resolve the problem with this task. Cheers, Ron |
RonSC Joined: 23 Dec 16 Posts: 4 Credit: 6,145 RAC: 0 |
Since my last post, my computer RPi-Rev2 has had 2 more kernel crashes of the same type on consecutive tasks: undefined instruction: 0 [#1] ARM. klammerj thought the first such error was probably caused by improper compilation options; that was a month ago, and it's plain the problem has not been fixed. I've come to the conclusion that I might as well retire RPi-Rev2 from Seti work, especially since it is a very slow processor, and maybe the Seti project shares this view, i.e. is it worth the effort required to support it? Cheers, Ron |
Karl Roos Joined: 19 Mar 01 Posts: 36 Credit: 206,258,788 RAC: 0 |
I'm running boinc & seti@home on a Raspberry Pi. It works great, but I can only get 2 of the 4 cores processing WUs. On my Android tablet there was an option in the app to run 2 cores or 4; I don't see the equivalent in the Pi interface. I have computing preferences set to not limit cores or processor %, but it still only runs 2 WUs on 2 of the cores. |