Astropulse always crashes out.

Mike Davies

Joined: 10 Apr 10
Posts: 2
Credit: 122,944
RAC: 0
United Kingdom
Message 1179716 - Posted: 21 Dec 2011, 14:12:33 UTC

For some months now, I've noticed that Astropulse work-units give a computation error almost immediately. I had assumed that someone was aware, but maybe they are not, so this is a "heads up".

Here are some of my latest results:

http://setiathome.berkeley.edu/result.php?resultid=2221711451
http://setiathome.berkeley.edu/result.php?resultid=2221682555

The links should be all you need to discover my environment etc.

Tutankhamon "Communist"
Volunteer tester
Joined: 1 Nov 08
Posts: 6081
Credit: 37,635,846
RAC: 16,153
Sweden
Message 1179719 - Posted: 21 Dec 2011, 14:19:42 UTC - in response to Message 1179716.
Last modified: 21 Dec 2011, 14:29:51 UTC

Yes, that seems to be the rule rather than the exception when running AP on Linux: the "process got signal 11" error shows up often there. I know there was a solution to it, and I'm sure someone who knows Linux will help you solve the issue.

Edit: I have had hundreds of wingmen running Linux error out on my APs, so the WU had to be sent out again to a third wingman. I do remember that there was a solution to the problem, but since I am not running Linux, I did not pay attention to what it was.

Claggy
Project Donor
Volunteer tester

Joined: 5 Jul 99
Posts: 4622
Credit: 46,337,052
RAC: 2,686
United Kingdom
Message 1179722 - Posted: 21 Dec 2011, 14:31:26 UTC - in response to Message 1179719.
Last modified: 21 Dec 2011, 14:40:20 UTC

The solution is to run the Optimised Linux apps. They are a lot faster and a lot less prone to erroring out. Check out Arkayn's site for downloads:

Crunchers Anonymous Downloads

Be aware that once you go onto Anonymous Platform, the project no longer updates the Setiathome apps for you. You'll have to update them yourself when new Optimised apps become available, or revert to the Stock apps.
(We're probably only a few months away from the rollout of Setiathome v7 and Astropulse v6.)

Claggy

James Sotherden
Project Donor
Joined: 16 May 99
Posts: 10133
Credit: 65,678,303
RAC: 35,700
United States
Message 1179727 - Posted: 21 Dec 2011, 14:39:28 UTC
Last modified: 21 Dec 2011, 14:42:06 UTC

exit139

Here is a link that has info on exit 139 for linux.
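
For anyone reading later without the link: by shell convention an exit status above 128 means the process was killed by a signal, and the signal number is the status minus 128. So 139 - 128 = 11, which is SIGSEGV on Linux, the same "signal 11" mentioned earlier in the thread. A minimal Python sketch of that arithmetic (the function name `decode_exit_status` is just for illustration, not part of BOINC or the science apps):

```python
import signal


def decode_exit_status(status: int) -> str:
    """Describe a shell-style exit status (128 + N means 'killed by signal N')."""
    if status > 128:
        signum = status - 128
        try:
            # Look up the symbolic name for this signal number on this platform.
            name = signal.Signals(signum).name
        except ValueError:
            name = "unknown signal"
        return f"killed by signal {signum} ({name})"
    return f"exited normally with code {status}"


print(decode_exit_status(139))
```

This only decodes the number. It says nothing about why the app segfaulted.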


Old James

Khangollo
Joined: 1 Aug 00
Posts: 245
Credit: 36,410,524
RAC: 0
Slovenia
Message 1179730 - Posted: 21 Dec 2011, 14:44:52 UTC
Last modified: 21 Dec 2011, 14:48:28 UTC

This has been happening for years now...

The stock x86_64 version of Linux AP is broken and segfaults (SIGSEGV) on any semi-recent glibc.
You probably don't have all the 32-bit compatibility libs (such as glibc) installed, so the 32-bit version, which I think the scheduler prefers for Linux, cannot be run.
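
If you are not sure whether a given application binary is the 32-bit or the 64-bit build, the fifth byte of the ELF header tells you (1 means 32-bit, 2 means 64-bit). A rough Python sketch, assuming you point it at a file yourself; the helper name `elf_class` is made up for this example, and the demo simply inspects the Python interpreter since the actual path of the AP binary depends on your BOINC install:

```python
import sys

ELF_MAGIC = b"\x7fELF"


def elf_class(header: bytes) -> str:
    """Report the ELF class from the first five bytes of a file."""
    if len(header) < 5 or header[:4] != ELF_MAGIC:
        return "not an ELF file"
    # Byte 4 (EI_CLASS) is 1 for 32-bit objects and 2 for 64-bit objects.
    return {1: "32-bit", 2: "64-bit"}.get(header[4], "unknown class")


if __name__ == "__main__":
    # Demo only: inspect the running Python interpreter.
    with open(sys.executable, "rb") as f:
        print(sys.executable, "->", elf_class(f.read(5)))
```

A 32-bit result on a 64-bit system means the binary needs the 32-bit compatibility libraries to start at all.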

It's a shame that the admins don't at least remove the 64-bit version. It's 8 MB of wasted bandwidth.
Don't they run any statistics checking error/invalid counts per platform, like they do at Einstein?

Anyway, run the optimized applications, as already suggested.
They are more than 2x faster!


Mike Davies

Joined: 10 Apr 10
Posts: 2
Credit: 122,944
RAC: 0
United Kingdom
Message 1179767 - Posted: 21 Dec 2011, 18:12:50 UTC - in response to Message 1179730.

Thanks for all the replies. I'll investigate the optimised versions.

Cosmic_Ocean
Joined: 23 Dec 00
Posts: 2871
Credit: 10,620,806
RAC: 326
United States
Message 1179807 - Posted: 21 Dec 2011, 20:07:46 UTC

I had an issue like that way back when we went from AP_v5 to AP_v505. All of my tasks would instantly error out with a segmentation violation. The solution was to upgrade my GLIBC to at least 3.6, I believe it was, as I only had something like 3.1 at the time.

I forget the full details, but bottom line is it looks to me like a GLIBC issue, based on my past experience.
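
Whether an outdated C library is really the cause isn't confirmed in this thread, but checking which version is installed is a quick first step. A minimal Python sketch using the standard library's `platform.libc_ver()` (the wrapper name `libc_info` is only for illustration):

```python
import platform


def libc_info() -> tuple:
    """Return (library name, version) for the C library Python is linked against."""
    lib, version = platform.libc_ver()
    # On systems where detection fails, both fields come back as empty strings.
    return (lib or "unknown", version or "unknown")


lib, version = libc_info()
print(f"C library: {lib} {version}")
```

Compare the reported version with whatever the application build requires before deciding to upgrade anything.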



BilBg
Volunteer tester
Joined: 27 May 07
Posts: 3654
Credit: 8,593,301
RAC: 1,323
Bulgaria
Message 1179917 - Posted: 22 Dec 2011, 6:06:26 UTC - in response to Message 1179767.

You may find some useful discussion here:
http://setiathome.berkeley.edu/forum_thread.php?id=66123






Woofie

Joined: 11 Jan 12
Posts: 4
Credit: 10,688
RAC: 0
Czech Republic
Message 1185886 - Posted: 17 Jan 2012, 8:05:03 UTC
Last modified: 17 Jan 2012, 8:08:19 UTC

I noticed that I get "signal 11" when I suspend my computer or hit the snooze button in BOINC Manager. Computing runs without problems until one of these events happens; other events may trigger it too. I'm running BOINC v. 6.12.42 on 32-bit Gentoo.

Wiggo "Socialist"
Joined: 24 Jan 00
Posts: 10507
Credit: 135,234,633
RAC: 37,090
Australia
Message 1190305 - Posted: 31 Jan 2012, 10:09:23 UTC - in response to Message 1185886.

Personally I wish that APs were stopped from being given to Linux hosts altogether, as I've only ever seen 1 Linux host that returns good results; all the rest have just trashed them. As a result this impacts on the connection due to resends (1 AP that I did recently was sent out 7 times because 4 separate Linux hosts trashed it and 1 Win machine timed out on it before it was finally put to rest).

Cheers.


ignorance is no excuse
Joined: 4 Oct 00
Posts: 9529
Credit: 44,433,321
RAC: 0
Korea, North
Message 1190357 - Posted: 31 Jan 2012, 13:36:02 UTC - in response to Message 1190305.

Mine always returns good results because I use the optimized apps. Prior to the Opt app install, the APs were always trashed. Thanks again to the Lunatics folks for their diligent work.



