Message boards :
Number crunching :
Stock AP app having an issue with a batch of work?
Author | Message |
---|---|
HAL9000 Joined: 11 Sep 99 Posts: 6534 Credit: 196,805,888 RAC: 57 |
I noticed an invalid task pop up today, 837596801. That is not unexpected from time to time, but when I looked at the task I thought it looked funny.

Task | Computer | Sent | Time reported or deadline | Status | Run time (sec) | CPU time (sec) | Credit | Application |
---|---|---|---|---|---|---|---|---|
2113573897 | 4725574 | 14 Oct 2011, 10:42:54 UTC | 14 Oct 2011, 15:45:45 UTC | Error while computing | 0.00 | 0.00 | --- | Astropulse v505 v5.06 |
2113573898 | 6191261 | 14 Oct 2011, 10:42:55 UTC | 14 Oct 2011, 10:48:11 UTC | Error while computing | 1.07 | 0.00 | --- | Astropulse v505 v5.06 |
2114177255 | 6186874 | 14 Oct 2011, 21:22:36 UTC | 14 Oct 2011, 21:27:47 UTC | Error while downloading | 0.00 | 0.00 | --- | Astropulse v505 v5.05 |
2114467297 | 6067658 | 15 Oct 2011, 2:41:46 UTC | 15 Oct 2011, 2:46:52 UTC | Error while computing | 0.00 | 0.00 | --- | Astropulse v505 v5.06 |
2114779667 | 6185696 | 15 Oct 2011, 9:02:55 UTC | 15 Oct 2011, 9:09:03 UTC | Error while computing | 0.00 | 0.00 | --- | Astropulse v505 v5.05 |
2115128329 | 5012752 | 15 Oct 2011, 14:00:37 UTC | 24 Oct 2011, 5:57:12 UTC | Completed, can't validate | 63,245.10 | 63,240.96 | 0.00 | Astropulse v505 Anonymous platform (CPU) |
2115428038 | 6180271 | 15 Oct 2011, 20:23:45 UTC | 16 Oct 2011, 14:55:09 UTC | Error while computing | 1.03 | 0.00 | --- | Astropulse v505 v5.06 |

All of the stock app machines had an error with the task. So I looked at the 5 'Error while computing' tasks and found them all to have:

    <core_client_version>6.10.17</core_client_version>
    <![CDATA[
    <message>
    process got signal 11
    </message>
    <stderr_txt>
    In ap_gfx_main.cpp: in ap_graphics_init(): Starting client.
    </stderr_txt>
    ]]>

So I looked at the tasks for each of the machines, and it seems that all of the AP tasks on those machines are ending with 'Error while computing'. I then looked through my valid tasks and found several where there is a 3rd result with the 'Error while computing' status. Some of them were "process got signal 8" instead of 11. Admittedly, I don't have a clue what all the various application exit codes mean. These might be as relevant as the -9 overflow message, but I thought it seemed odd. Perhaps this is just an example of the Lunatics code handling things better than the stock app? Maybe there is just some wonky data out there? Who knows.

SETI@home classic workunits: 93,865 CPU time: 863,447 hours Join the [url=http://tinyurl.com/8y46zvu]BP6/VP6 User Group[ |
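For readers wondering about the numbers in "process got signal 11" and "process got signal 8": on Linux these are POSIX signal numbers, where 11 is SIGSEGV (invalid memory access) and 8 is SIGFPE (arithmetic error such as division by zero). A minimal sketch of looking them up with Python's standard `signal` module follows; the helper name `signal_name` is an illustrative assumption, not part of BOINC or the Astropulse app.

```python
import re
import signal


def signal_name(message: str) -> str:
    """Return the symbolic signal name for a 'process got signal N' message.

    Hypothetical helper for illustration only.
    """
    match = re.search(r"process got signal (\d+)", message)
    if match is None:
        raise ValueError(f"no signal number found in: {message!r}")
    number = int(match.group(1))
    try:
        # signal.Signals maps a number to the platform's signal enum member.
        return signal.Signals(number).name
    except ValueError:
        return f"unknown signal {number}"


for text in ("process got signal 11", "process got signal 8"):
    print(text, "->", signal_name(text))
```

The name only says how the process died, not why; a SIGSEGV in the stock app could come from the data, the build, or the host's libraries.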
Fred J. Verster Joined: 21 Apr 04 Posts: 3252 Credit: 31,903,643 RAC: 0 |
|
Richard Haselgrove Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874 |
"process got signal ..." is a characteristic error code for Linux machines only. If you look at the right-hand column of your screen-grab, four of the computing errors were with application v5.06 - the stock Linux app (the v5.05 error - on Windows Vista - was "too many exit(0)s", probably unrelated). I suppose the two questions are: 1) Why is stock app v5.06 still in use, when it has been so problematic for so long? (I seem to remember Urs Echternacht and others having trouble with it in early AP beta testing.) 2) Why did this WU end up being allocated to so many Linux hosts? |
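The version breakdown described here can be checked by tallying the task list from the first post. This is a minimal sketch with the rows transcribed by hand from that list; the tuple layout and variable names are illustrative assumptions.

```python
from collections import Counter

# (task ID, status, application version), transcribed from the task list
# in the first post of this thread.
results = [
    ("2113573897", "Error while computing", "v5.06"),
    ("2113573898", "Error while computing", "v5.06"),
    ("2114177255", "Error while downloading", "v5.05"),
    ("2114467297", "Error while computing", "v5.06"),
    ("2114779667", "Error while computing", "v5.05"),
    ("2115128329", "Completed, can't validate", "Anonymous platform (CPU)"),
    ("2115428038", "Error while computing", "v5.06"),
]

# Count only the computing errors, grouped by application version.
errors_by_version = Counter(
    version for _, status, version in results if status == "Error while computing"
)

for version, count in sorted(errors_by_version.items()):
    print(version, count)
```

The tally gives one computing error for v5.05 and four for v5.06, which matches the count in the post.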
HAL9000 Joined: 11 Sep 99 Posts: 6534 Credit: 196,805,888 RAC: 57 |
Richard Haselgrove wrote: "process got signal ..." is a characteristic error code for Linux machines only.
At first I saw Linux box, Linux box, Linux box... and started to think "Linux machines are broken". Then I saw the Windows machine and thought "ah OK, it is more than just Linux". The Windows box looks like it is just broken, as all of its tasks, MB & AP, are spitting out "too many exit(0)s". It was probably just a fluke that it happened to be in there. I have often thought that some kind of per-platform mechanism should be used on the back end, so that if a task gets errors on a specific platform, the server stops sending it to that platform. That might be in place to some extent, or it could be a total mess to do. It is just the little things like this that point out the holes in the current system that could be worked on, where this potentially valid data is probably going to the bit bucket. |
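The per-platform idea suggested here could look roughly like the sketch below. It is purely hypothetical: the class, its methods, the error threshold, and the platform labels are all assumptions made for illustration, and none of it reflects how the BOINC scheduler actually works.

```python
class PlatformErrorTracker:
    """Hypothetical per-workunit, per-platform error counter (not BOINC code)."""

    def __init__(self, max_errors=2):
        # Assumed threshold: stop sending after this many errors on a platform.
        self.max_errors = max_errors
        # Maps (workunit, platform) -> number of errors reported so far.
        self._errors = {}

    def record_error(self, workunit, platform):
        """Note that a task for this workunit errored out on this platform."""
        key = (workunit, platform)
        self._errors[key] = self._errors.get(key, 0) + 1

    def may_send(self, workunit, platform):
        """Return True while the platform is still under the error threshold."""
        return self._errors.get((workunit, platform), 0) < self.max_errors


tracker = PlatformErrorTracker(max_errors=2)
tracker.record_error("example_wu", "linux")
tracker.record_error("example_wu", "linux")

print(tracker.may_send("example_wu", "linux"))    # False: two Linux errors recorded
print(tracker.may_send("example_wu", "windows"))  # True: no Windows errors recorded
```

A real scheduler would also have to decide what counts as a platform (operating system, bitness, app version) and when to reset the counts, which may be part of the "total mess" mentioned in the post.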
Fred J. Verster Joined: 21 Apr 04 Posts: 3252 Credit: 31,903,643 RAC: 0 |
Richard Haselgrove wrote: "process got signal ..." is a characteristic error code for Linux machines only.
Why were so many channels 'bad'? According to the server page, as of 24 Oct 2011 | 17:00:08 UTC, channels ended in error: 0 MB, 19 AstroPulse. Too much RFI? Well, I'm guessing; RFI and radar blanking are present almost all the time, so maybe it is just a bad series of channels? |
Claggy Joined: 5 Jul 99 Posts: 4654 Credit: 47,537,079 RAC: 4 |
Richard Haselgrove wrote: "process got signal ..." is a characteristic error code for Linux machines only.
I thought Urs Echternacht had a fix for the signal 11 problem and that it was in the repository, but a new app was never built. I couldn't find a post about it the last time I looked. Claggy |
arkayn Joined: 14 May 99 Posts: 4438 Credit: 55,006,323 RAC: 0 |
|
Wiggo Joined: 24 Jan 00 Posts: 34744 Credit: 261,360,520 RAC: 489 |
I've also noticed that plenty of Linux hosts have been erroring out AP work. Cheers. |
JohnDK Joined: 28 May 00 Posts: 1222 Credit: 451,243,443 RAC: 1,127 |
Wiggo wrote: I've also noticed that plenty of Linux hosts have been erroring out AP work.
Same here: nearly all the AP errors I've seen come from wingmen running Linux. |
David Anderson (not *that* DA) Joined: 5 Dec 09 Posts: 215 Credit: 74,008,558 RAC: 74 |
Each time I upgrade to latest Ubuntu Linux (at 11.10 now, every 6 months there is a new release) I turn on AP, get 4 or so AP failures with signal 11 (with no AP successes), and turn it off again. Running stock apps, not optimized apps. No app_info.xml file. MultiBeam works fine -- I get an error or two (signal 11 or whatever) four or five times a year (between 2 machines). One machine x86, the other machine x86_64. |
Khangollo Joined: 1 Aug 00 Posts: 245 Credit: 36,410,524 RAC: 0 |
Wiggo wrote: I've also noticed that plenty of Linux hosts have been erroring out AP work.
I'm noticing this a lot, and noticed it on my machines too when I was still running stock. The stock 64-bit AP application fails on newer Linux distributions (glibc incompatibility?) and the admins don't care enough to *finally* remove it. The 32-bit app works just fine on 64-bit distros. I know Urs has built an AP app that does run just fine on Linux, but it is only available as an optimized app. And it works awesomely and much faster. There is no reason to run stock :) |
Woofie Joined: 11 Jan 12 Posts: 4 Credit: 68,135 RAC: 0 |
Hi, I'm new here and I have a similar problem with AP on my PC running 32-bit Gentoo. Maybe I'm not so good at searching, but where can I find this AP app from Urs? Can someone point me to it? Thanks |
Claggy Joined: 5 Jul 99 Posts: 4654 Credit: 47,537,079 RAC: 4 |
Woofie wrote: Hi, I'm new here and I have a similar problem with AP on my PC running 32-bit Gentoo. Maybe I'm not so good at searching, but where can I find this AP app from Urs? Can someone point me to it? Thanks
At the top of this sub-forum there are a number of sticky threads. One of them is the Optimised Apps Release News thread, which has the links. Here they are anyway: Lunatics downloads, where the latest apps are available; and Crunchers Anonymous, where the latest apps, installers (for Windows and Mac) and SSE-bitness packages are available. Claggy |
Woofie Joined: 11 Jan 12 Posts: 4 Credit: 68,135 RAC: 0 |
Thanks, Claggy, and sorry for the noob question. I'll try searching better next time :) |
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.