Astropulse always crashes out.



Message boards : Number crunching : Astropulse always crashes out.

Mike Davies
Joined: 10 Apr 10
Posts: 2
Credit: 114,668
RAC: 2
United Kingdom
Message 1179716 - Posted: 21 Dec 2011, 14:12:33 UTC

For some months now, I've noticed that Astropulse work-units give a computation error almost immediately. I had assumed that someone was aware, but maybe they are not, so this is a "heads up".

Here are some of my latest results:

http://setiathome.berkeley.edu/result.php?resultid=2221711451
http://setiathome.berkeley.edu/result.php?resultid=2221682555

The links should be all you need to discover my environment etc.

Sten-Arne
Volunteer tester
Joined: 1 Nov 08
Posts: 3339
Credit: 19,520,731
RAC: 18,545
Sweden
Message 1179719 - Posted: 21 Dec 2011, 14:19:42 UTC - in response to Message 1179716.
Last modified: 21 Dec 2011, 14:29:51 UTC

Yes, that seems to be the rule rather than the exception when running AP on Linux. The "process got signal 11" error can often be seen when running APs on Linux. I know there was a solution to that, and I'm sure someone who knows about Linux will help you solve that issue.

Edit: I have had hundreds of wingmen on my APs running Linux and erroring out, so the WU has to be sent out again to a third wingman. I do remember, though, that there was a solution to the problem, but since I am not running Linux, I did not pay attention to what it was.

Claggy
Project donor
Volunteer tester
Joined: 5 Jul 99
Posts: 4066
Credit: 32,857,043
RAC: 6,988
United Kingdom
Message 1179722 - Posted: 21 Dec 2011, 14:31:26 UTC - in response to Message 1179719.
Last modified: 21 Dec 2011, 14:40:20 UTC

The solution is to run the Optimised Linux apps. They are a lot faster and a lot less prone to erroring out. Check out Arkayn's site for downloads:

Crunchers Anonymous Downloads

Be aware that once you go onto Anonymous Platform, the Setiathome apps won't get updated by the project. You'll have to do that yourself when new Optimised apps are available, or revert back to the Stock apps.
(We're probably only a few months away from the rollout of Setiathome v7 and Astropulse v6.)

Claggy

James Sotherden
Project donor
Joined: 16 May 99
Posts: 8671
Credit: 32,891,724
RAC: 56,498
United States
Message 1179727 - Posted: 21 Dec 2011, 14:39:28 UTC
Last modified: 21 Dec 2011, 14:42:06 UTC

exit139

Here is a link that has info on exit 139 for linux.
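
For anyone wondering how "exit 139" relates to the "process got signal 11" message mentioned earlier in the thread: by shell convention an exit status above 128 means 128 plus the signal number, and 139 - 128 = 11, which is SIGSEGV on Linux. A minimal sketch of that arithmetic follows. The function name `describe_exit_status` is made up for illustration, and it assumes the reported status follows the shell convention.

```python
import signal


def describe_exit_status(status: int) -> str:
    """Describe a shell-style exit status.

    Assumption: statuses above 128 follow the shell convention
    of 128 + signal number.
    """
    if status > 128:
        signum = status - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Signal number not known on this platform.
            name = "unknown signal"
        return f"killed by signal {signum} ({name})"
    return f"exited with code {status}"


if __name__ == "__main__":
    print(describe_exit_status(139))
```

So the two error messages are most likely the same segmentation fault reported in two different ways.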
____________

Old James

Khangollo
Joined: 1 Aug 00
Posts: 245
Credit: 36,410,524
RAC: 0
Slovenia
Message 1179730 - Posted: 21 Dec 2011, 14:44:52 UTC
Last modified: 21 Dec 2011, 14:48:28 UTC

This has been happening for years now...

The stock x86_64 version of the Linux AP app is broken and segfaults (SIGSEGV) on any semi-recent glibc.
You probably don't have (all of) the 32-bit compatibility libs (such as glibc), so the 32-bit version, which I think the scheduler prefers for Linux, cannot be run.

It's a shame that the admins don't at least remove the 64-bit version. It's 8 MB of wasted bandwidth.
Don't they run any statistics, checking error/invalid counts per platform, like they do at Einstein?

Anyway, run the optimized applications, as already suggested.
They are more than 2x faster!
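
A quick way to see what a host looks like in these terms is to print the machine type, the C library version Python detects, and whether a 32-bit loader is present. This is only a rough sketch: `/lib/ld-linux.so.2` is the conventional path of the 32-bit x86 dynamic loader on Linux, but treating its presence as "32-bit compatibility libs installed" is an assumption, and the function name `environment_report` is made up for illustration.

```python
import os
import platform

# Conventional location of the 32-bit x86 dynamic loader on Linux.
# Assumption: its presence is a rough hint that 32-bit binaries can start.
LOADER_32BIT = "/lib/ld-linux.so.2"


def environment_report(loader_path: str = LOADER_32BIT) -> dict:
    """Collect basic facts relevant to running 32-bit vs 64-bit apps."""
    libc_name, libc_version = platform.libc_ver()
    return {
        "machine": platform.machine(),  # e.g. x86_64 or i686
        "libc": f"{libc_name} {libc_version}".strip() or "unknown",
        "has_32bit_loader": os.path.exists(loader_path),
    }


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

A missing loader does not prove anything on its own, but on an x86_64 host it would be consistent with the 32-bit app being unable to start.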

Mike Davies
Joined: 10 Apr 10
Posts: 2
Credit: 114,668
RAC: 2
United Kingdom
Message 1179767 - Posted: 21 Dec 2011, 18:12:50 UTC - in response to Message 1179730.

Thanks for all the replies. I'll investigate the optimised versions.

Cosmic_Ocean
Joined: 23 Dec 00
Posts: 2245
Credit: 8,574,909
RAC: 4,360
United States
Message 1179807 - Posted: 21 Dec 2011, 20:07:46 UTC

I had an issue with that way back when we went from AP_v5 to AP_v505. All of my tasks would instantly error out with a segmentation violation. The solution was to upgrade my GLIBC to at least 3.6, I believe, as I only had something like 3.1 at the time.

I forget the full details, but bottom line is it looks to me like a GLIBC issue, based on my past experience.
____________

Linux laptop uptime: 1484d 22h 42m
Ended due to UPS failure, found 14 hours after the fact

BilBg
Volunteer tester
Joined: 27 May 07
Posts: 2632
Credit: 5,977,591
RAC: 3,956
Bulgaria
Message 1179917 - Posted: 22 Dec 2011, 6:06:26 UTC - in response to Message 1179767.

You may find some useful discussion here:
http://setiathome.berkeley.edu/forum_thread.php?id=66123


____________



- ALF - "Find out what you don't do well ..... then don't do it!" :)

Woofie
Joined: 11 Jan 12
Posts: 4
Credit: 7,159
RAC: 2
Czech Republic
Message 1185886 - Posted: 17 Jan 2012, 8:05:03 UTC
Last modified: 17 Jan 2012, 8:08:19 UTC

I noticed that I get "signal 11" when I suspend my computer or hit the snooze button in BOINC Manager. Computing runs without problems until one of these events happens; maybe other events trigger it too. I'm running BOINC v. 6.12.42 on 32-bit Gentoo.

Wiggo
Joined: 24 Jan 00
Posts: 6748
Credit: 92,684,803
RAC: 76,198
Australia
Message 1190305 - Posted: 31 Jan 2012, 10:09:23 UTC - in response to Message 1185886.

Personally I wish that APs were stopped from being given to Linux hosts altogether, as I've only ever seen 1 Linux host that returns good results; all the rest have just trashed them. As a result this impacts on the connection due to resends (1 AP that I did recently was sent out 7 times because 4 separate Linux hosts trashed it and 1 Win machine timed out on it, before it was finally put to rest).

Cheers.

ignorance is no excuse
Joined: 4 Oct 00
Posts: 9529
Credit: 44,433,274
RAC: 0
Korea, North
Message 1190357 - Posted: 31 Jan 2012, 13:36:02 UTC - in response to Message 1190305.

Mine always returns good results because I use the optimized apps. Prior to the Opt app install, the APs were always trashed. Thanks again to the Lunatics folks for their diligent work.
____________
In a rich man's house there is no place to spit but his face.
Diogenes Of Sinope

End terrorism by building a school


Copyright © 2014 University of California