Astropulse Errors-Optimized version 5


Message boards : Number crunching : Astropulse Errors-Optimized version 5

Adam Alexander
Joined: 24 Dec 07
Posts: 3
Credit: 1,004,509
RAC: 0
United States
Message 843402 - Posted: 22 Dec 2008, 2:47:34 UTC - in response to Message 843389.

I've got an AP 5.0 wu running that has time to completion as 450 hours on a Core 2 Quad Q6700. Is that normal? I've never noticed a SETI work unit that took that long to crunch.

No, it's because you've been running CUDA setiathome_enhanced work and the Duration Correction Factor is high. Crunch time will probably be a tenth or less of that estimate. Astropulse does take considerably longer than setiathome_enhanced though, even on hosts running both on the CPU.
Joe


Thanks. I noticed after I posted that just under 4 hours in, the WU is 12% complete, so I thought there must be something like that going on.
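As a rough check on the numbers in this exchange, the arithmetic below extrapolates linearly from the reported progress (just under 4 hours at 12%) and compares it with a tenth of the 450-hour estimate. Treating progress as linear in time is an assumption, and this says nothing about how BOINC computes the Duration Correction Factor internally.

```latex
T_{\text{total}} \approx \frac{4\ \text{h}}{0.12} \approx 33\ \text{h},
\qquad
\frac{450\ \text{h}}{10} = 45\ \text{h} > 33\ \text{h}
```

So the observed progress is consistent with a run time of "a tenth or less" of the displayed estimate.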

Adam Alexander
Joined: 24 Dec 07
Posts: 3
Credit: 1,004,509
RAC: 0
United States
Message 843646 - Posted: 22 Dec 2008, 14:48:36 UTC - in response to Message 843402.

I've got an AP 5.0 wu running that has time to completion as 450 hours on a Core 2 Quad Q6700. Is that normal? I've never noticed a SETI work unit that took that long to crunch.

No, it's because you've been running CUDA setiathome_enhanced work and the Duration Correction Factor is high. Crunch time will probably be a tenth or less of that estimate. Astropulse does take considerably longer than setiathome_enhanced though, even on hosts running both on the CPU.
Joe


Thanks. I noticed after I posted that just under 4 hours in, the WU is 12% complete, so I thought there must be something like that going on.


The work unit finished with a compute error. Here's the message:

<core_client_version>6.4.5</core_client_version>
<![CDATA[
<message>
- exit code -202 (0xffffff36)
</message>
<stderr_txt>
In ap_gfx_main.cpp: in ap_graphics_init(): Starting client.
In ap_client_main.cpp: in mainloop(): at dm_chunk_large 896
In ap_client_main.cpp: in mainloop(): at dm_chunk_large 1024
In ap_client_main.cpp: in mainloop(): at dm_chunk_large 1152
In ap_client_main.cpp: in mainloop(): at dm_chunk_large 1280
In ap_client_main.cpp: in mainloop(): at dm_chunk_large 1408
In ap_client_main.cpp: in mainloop(): at dm_chunk_large 1536
In ap_client_main.cpp: in mainloop(): at dm_chunk_large 1664
In ap_client_main.cpp: in mainloop(): at dm_chunk_large 1792
In ap_client_main.cpp: in mainloop(): at dm_chunk_large 1920
In ap_client_main.cpp: in mainloop(): at dm_chunk_large 2048
In ap_gfx_main.cpp: in ap_graphics_init(): Starting client.
In ap_client_main.cpp: in mainloop(): at dm_chunk_large 2048
In ap_client_main.cpp: in mainloop(): at dm_chunk_large 2176
In ap_client_main.cpp: in mainloop(): at dm_chunk_large 2304
In ap_client_main.cpp: in mainloop(): at dm_chunk_large 2432
In ap_gfx_main.cpp: in ap_graphics_init(): Starting client.
In ap_client_main.cpp: in mainloop(): at dm_chunk_large 2432
In ap_client_main.cpp: in mainloop(): at dm_chunk_large 2560
In ap_gfx_main.cpp: in ap_graphics_init(): Starting client.
In ap_client_main.cpp: in mainloop(): at dm_chunk_large 2560
In ap_client_main.cpp: in mainloop(): at dm_chunk_large 2688
In ap_gfx_main.cpp: in ap_graphics_init(): Starting client.
In ap_client_main.cpp: in mainloop(): at dm_chunk_large 2688
In ap_client_main.cpp: in mainloop(): at dm_chunk_large 2816
In ap_client_main.cpp: in mainloop(): at dm_chunk_large 2944
In ap_gfx_main.cpp: in ap_graphics_init(): Starting client.
In ap_client_main.cpp: in mainloop(): at dm_chunk_large 2944
In ap_client_main.cpp: in mainloop(): at dm_chunk_large 3072
No heartbeat from core client for 30 sec - exiting
In ap_gfx_main.cpp: in ap_graphics_init(): Starting client.
boinc_graphics_make_shmem failed: 0

</stderr_txt>
]]>

What happened, and how do I stop it from happening again? This WU took up 21871.03 seconds of processor time, so I'd rather this didn't happen again.
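For reading a stderr dump like the one above, a small sketch can help: it converts the signed exit code to the 32-bit hex form shown in the message, counts how often the application started, and reports the last `dm_chunk_large` value logged. The helper names are hypothetical, and the patterns are guesses based only on the lines quoted in this post, not on any documented Astropulse log format.

```python
import re

def exit_code_hex(code: int) -> str:
    """Show a signed exit code as an unsigned 32-bit hex value."""
    return f"0x{code & 0xFFFFFFFF:08x}"

def summarize_stderr(text: str) -> dict:
    """Count client starts and find the last dm_chunk_large value in a stderr dump."""
    starts = len(re.findall(r"ap_graphics_init\(\): Starting client", text))
    chunks = [int(m) for m in re.findall(r"at dm_chunk_large (\d+)", text)]
    return {
        "starts": starts,
        "last_chunk": chunks[-1] if chunks else None,
        "heartbeat_lost": "No heartbeat from core client" in text,
    }

# Excerpt copied from the tail of the log quoted in the post.
sample = """\
In ap_gfx_main.cpp: in ap_graphics_init(): Starting client.
In ap_client_main.cpp: in mainloop(): at dm_chunk_large 2944
In ap_client_main.cpp: in mainloop(): at dm_chunk_large 3072
No heartbeat from core client for 30 sec - exiting
In ap_gfx_main.cpp: in ap_graphics_init(): Starting client.
boinc_graphics_make_shmem failed: 0
"""

print(exit_code_hex(-202))
print(summarize_stderr(sample))
```

This only summarizes what the log says (how many starts, how far the run got, whether a heartbeat loss was logged); it does not explain why the heartbeat stopped.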

Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 3390
Credit: 46,311,954
RAC: 10,067
Russia
Message 843873 - Posted: 22 Dec 2008, 23:54:43 UTC - in response to Message 843646.

Sorry, but you are NOT running the optimized AP V5 application. You are running the stock app, and this thread is about the optimized application. I recommend installing the current optimized app version, as recommended in the first post you cited, and trying AP again.

arkayn (Project donor)
Volunteer tester
Joined: 14 May 99
Posts: 3617
Credit: 48,443,012
RAC: 38,901
United States
Message 845881 - Posted: 28 Dec 2008, 6:21:00 UTC

A couple of units processed with the AP optimized client for OSX.

http://setiathome.berkeley.edu/result.php?resultid=1091949303
http://setiathome.berkeley.edu/result.php?resultid=1093419139

No, I did not process them, just reporting.
____________

Josef W. Segur (Project donor)
Volunteer developer
Volunteer tester
Joined: 30 Oct 99
Posts: 4225
Credit: 1,041,551
RAC: 361
United States
Message 846039 - Posted: 28 Dec 2008, 16:55:25 UTC - in response to Message 845881.

A couple of units processed with the AP optimized client for OSX.

http://setiathome.berkeley.edu/result.php?resultid=1091949303
http://setiathome.berkeley.edu/result.php?resultid=1093419139

No, I did not process them, just reporting.

Neither of those used the optimized AP application; it looks like a direct unoptimized port. The host has since converted to the optimized app, see for instance
http://setiathome.berkeley.edu/result.php?resultid=1100332648
Joe

arkayn (Project donor)
Volunteer tester
Joined: 14 May 99
Posts: 3617
Credit: 48,443,012
RAC: 38,901
United States
Message 846041 - Posted: 28 Dec 2008, 17:08:54 UTC - in response to Message 846039.

I finally saw the optimized output text for OSX.

I was just posting info from the MacNN site; the poster has always run the optimized MB client.
____________

[seti.international] Dirk Sadowski (Project donor)
Volunteer tester
Joined: 6 Apr 07
Posts: 7051
Credit: 59,888,948
RAC: 21,881
Germany
Message 847528 - Posted: 1 Jan 2009, 1:15:54 UTC


I'm not up to date on Astropulse.

When AP V4.3x and V5.x results were compared, no rig got credits. Is that still the case?
Maybe a script is running to grant the credits?

Is it safe to run AP now?

____________
BR



>Das Deutsche Cafe. The German Cafe.<

Fred J. Verster
Volunteer tester
Joined: 21 Apr 04
Posts: 3235
Credit: 31,648,506
RAC: 4,968
Netherlands
Message 847555 - Posted: 1 Jan 2009, 1:55:42 UTC - in response to Message 847528.
Last modified: 1 Jan 2009, 2:09:14 UTC

Hi Crunchers, first of all, best wishes for 2009. Over the last 10 days I have updated 3 quads from BOINC 5.10.45 to 6.4.5 and added the V5 optimized app for AP (plus an altered app_info.xml file).
I was already running optimized MB (AK SSSE3 x64).
I have NOT seen an error yet, hope it stays that way ;).

Here is an example; it takes my (fastest) host more than 10 hours:
AP WU
I know, it's not an error :)
____________

Josef W. Segur (Project donor)
Volunteer developer
Volunteer tester
Joined: 30 Oct 99
Posts: 4225
Credit: 1,041,551
RAC: 361
United States
Message 847662 - Posted: 1 Jan 2009, 5:21:15 UTC - in response to Message 847528.

I'm not up to date on Astropulse.

When AP V4.3x and V5.x results were compared, no rig got credits. Is that still the case?
Maybe a script is running to grant the credits?

For clean WUs where the V5.00 radar blanking isn't needed, V5.00 gives the same results as V4.3x; even some cases where a modest amount of blanking is done can produce strongly similar results. I don't know if the script is still being run periodically, but almost all hosts will be using V5.00 now. The few who set up V4.3x with an app_info.xml and stopped paying attention are just wasting their host's CPU time now.

Is it safe to run AP now?

Either stock or optimized 5.00 should produce results which validate. Absolute safety of credits cannot be expected; you might be paired with someone running an obsolete BOINC version, and that can be particularly painful on AP work.
Joe

Tom95134
Joined: 27 Nov 01
Posts: 213
Credit: 3,332,888
RAC: 1,000
United States
Message 848244 - Posted: 2 Jan 2009, 16:44:52 UTC

My system updated yesterday to Version 5.00 and then downloaded WU 389655491. The Time To Completion on this WU is 199 hrs, which seems very strange.

My system is a Pentium 4 3.00GHz. I've never seen this kind of TTC before.

Any recommendations? Should I just keep running or abort and get a new WU?
____________

Fred W
Volunteer tester
Joined: 13 Jun 99
Posts: 2524
Credit: 11,954,210
RAC: 0
United Kingdom
Message 848246 - Posted: 2 Jan 2009, 16:54:53 UTC - in response to Message 848244.

My system updated yesterday to Version 5.00 and then downloaded WU 389655491. The Time To Completion on this WU is 199 hrs, which seems very strange.

My system is a Pentium 4 3.00GHz. I've never seen this kind of TTC before.

Any recommendations? Should I just keep running or abort and get a new WU?

Is that the first AstroPulse WU you have snagged? Processing time on AP is fairly linear (much better than the initial "estimate" for assessing likely run-time) so do your own estimate from 20 x the 5% completed or 10 x the 10% completed time. Bear in mind that if it runs to completion it is worth about 750 creds.

F.
____________
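The rule of thumb above (20 x the time at 5%, 10 x the time at 10%) is plain linear extrapolation. A minimal sketch, assuming progress really is linear as the post says; the function name and the sample figures are illustrative only, not taken from any BOINC API or from this host:

```python
def estimate_run_time(elapsed_hours: float, percent_done: float) -> tuple[float, float]:
    """Return (projected total, projected remaining) hours from progress so far."""
    if not 0 < percent_done <= 100:
        raise ValueError("percent_done must be in (0, 100]")
    total = elapsed_hours * 100.0 / percent_done
    return total, total - elapsed_hours

# Hypothetical reading: 2.5 hours elapsed at 5% complete.
total, remaining = estimate_run_time(2.5, 5.0)
print(f"total ~{total:.0f} h, remaining ~{remaining:.1f} h")
```

Taking a second reading later (say at 10%) and comparing the two projections is a cheap way to see whether the linear assumption holds on a given host.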

Tom95134
Joined: 27 Nov 01
Posts: 213
Credit: 3,332,888
RAC: 1,000
United States
Message 848293 - Posted: 2 Jan 2009, 18:33:12 UTC - in response to Message 848246.

I've had AP workunits before, but nothing with this kind of TTC. It kind of knocked me off my chair when I saw it. I'll keep an eye on it for the day to see where it's going.

Thanks for the reply.

Tom
____________

rkasparek
Joined: 18 May 99
Posts: 1
Credit: 1,933,004
RAC: 0
United States
Message 849518 - Posted: 5 Jan 2009, 2:39:40 UTC

I am a relative 'newbie' to all of this so if my question has been asked and I simply did not understand the answer, please excuse my ignorance.

I did not get credit for my Astro Pulse unit at
http://setiathome.berkeley.edu/result.php?resultid=1098316491

I did receive credit for a second one however. Can anyone explain?
____________

tullio (Project donor)
Joined: 9 Apr 04
Posts: 3650
Credit: 368,797
RAC: 261
Italy
Message 849555 - Posted: 5 Jan 2009, 5:28:24 UTC - in response to Message 849518.

I am a relative 'newbie' to all of this so if my question has been asked and I simply did not understand the answer, please excuse my ignorance.

I did not get credit for my Astro Pulse unit at
http://setiathome.berkeley.edu/result.php?resultid=1098316491

I did receive credit for a second one however. Can anyone explain?

Two of the results were crunched with app 5.0 and one with 4.36, so evidently the results did not agree and the WU was sent to another wingman. When he finishes you'll get your credit.
Tullio
____________

[seti.international] Dirk Sadowski (Project donor)
Volunteer tester
Joined: 6 Apr 07
Posts: 7051
Credit: 59,888,948
RAC: 21,881
Germany
Message 849581 - Posted: 5 Jan 2009, 7:02:20 UTC - in response to Message 847662.
Last modified: 5 Jan 2009, 7:03:22 UTC

Is it safe to run AP now?

Either stock or optimized 5.00 should produce results which validate. Absolute safety of credits cannot be expected; you might be paired with someone running an obsolete BOINC version, and that can be particularly painful on AP work.
Joe


You mean <= BOINC V5.2.5 ?

And then I would get 0 Credits?
____________
BR



>Das Deutsche Cafe. The German Cafe.<

335deezl
Volunteer tester
Joined: 8 Jan 07
Posts: 15
Credit: 12,207,926
RAC: 0
United States
Message 849584 - Posted: 5 Jan 2009, 7:26:14 UTC

How's this for getting ripped:

Task ID 1113139473
Name 01dc08ac.19476.25021.14.8.231_1
Workunit 390919178
Created 4 Jan 2009 18:04:46 UTC
Sent 4 Jan 2009 23:09:02 UTC
Received 5 Jan 2009 0:32:23 UTC
Server state Over
Outcome Success
Client state Done
Exit status 0 (0x0)
Computer ID 4739973
Report deadline 11 Jan 2009 23:09:02 UTC
CPU time 709.0713
stderr out

<core_client_version>6.4.5</core_client_version>
<![CDATA[
<stderr_txt>
Unrecognized XML in parse_init_data_file: computation_deadline
Skipping: 1231708141.136000
Skipping: /computation_deadline
Unrecognized XML in parse_init_data_file: computation_deadline
Skipping: 1231708141.136000
Skipping: /computation_deadline
Windows optimized S@H Enhanced application by Alex Kan
Version info: SSE4.1 (Intel, Core 2-optimized v8-nographics) V5.13 by Alex Kan
SSE4.1 Win32 Build 41 , Ported by : Jason G, Raistmer, JDWhale

CPUID: Intel(R) Core(TM)2 Quad CPU Q9450 @ 2.66GHz
Speed: 4 x 2995 MHz
Cache: L1=64K L2=6144K
Features: MMX SSE SSE2 SSE3 SSSE3 SSE4.1

Work Unit Info:
...............
Credit multiplier is : 2.85
WU true angle range is : 2.713406

Flopcounter: 5215356389065.445300

Spike count: 3
Pulse count: 0
Triplet count: 0
Gaussian count: 0
called boinc_finish

</stderr_txt>
]]>

Validate state Valid
Claimed credit 14.3606629702566
Granted credit 0
application version 6.03

335deezl
Volunteer tester
Joined: 8 Jan 07
Posts: 15
Credit: 12,207,926
RAC: 0
United States
Message 849587 - Posted: 5 Jan 2009, 7:29:07 UTC

For comparison, here's another unit, which I was given credit for. Note how consistent the run time is across both WUs:

Task ID 1113139369
Name 01dc08ac.19476.25021.14.8.167_0
Workunit 390919113
Created 4 Jan 2009 18:04:45 UTC
Sent 4 Jan 2009 23:09:01 UTC
Received 5 Jan 2009 0:32:23 UTC
Server state Over
Outcome Success
Client state Done
Exit status 0 (0x0)
Computer ID 4739973
Report deadline 11 Jan 2009 23:09:01 UTC
CPU time 709.4925
stderr out

<core_client_version>6.4.5</core_client_version>
<![CDATA[
<stderr_txt>
Unrecognized XML in parse_init_data_file: computation_deadline
Skipping: 1231708140.136000
Skipping: /computation_deadline
Unrecognized XML in parse_init_data_file: computation_deadline
Skipping: 1231708140.136000
Skipping: /computation_deadline
Windows optimized S@H Enhanced application by Alex Kan
Version info: SSE4.1 (Intel, Core 2-optimized v8-nographics) V5.13 by Alex Kan
SSE4.1 Win32 Build 41 , Ported by : Jason G, Raistmer, JDWhale

CPUID: Intel(R) Core(TM)2 Quad CPU Q9450 @ 2.66GHz
Speed: 4 x 2993 MHz
Cache: L1=64K L2=6144K
Features: MMX SSE SSE2 SSE3 SSSE3 SSE4.1

Work Unit Info:
...............
Credit multiplier is : 2.85
WU true angle range is : 2.713406

Flopcounter: 5215357230755.445300

Spike count: 1
Pulse count: 0
Triplet count: 3
Gaussian count: 0
called boinc_finish

</stderr_txt>
]]>

Validate state Valid
Claimed credit 14.3606629702566
Granted credit 14.3606629702566
application version 6.03

[seti.international] Dirk Sadowski (Project donor)
Volunteer tester
Joined: 6 Apr 07
Posts: 7051
Credit: 59,888,948
RAC: 21,881
Germany
Message 849590 - Posted: 5 Jan 2009, 7:57:49 UTC
Last modified: 5 Jan 2009, 8:11:23 UTC

@ HSS Global Server

Nothing to worry about..

wuid=390919178

It's a little bug in the server.

If one result is waiting for the 'wingman', it's pending.
If two results are waiting for a third 'wingman', the earlier pending credit is shown as 0 for now.

You will get your Credits.


BTW.
Wrong thread.. ;-)
Your problem is AK v8.0 related.. not Astropulse..

AK V8 ported release ap. issues, install, questions etc.

For the next time.. :-)
____________
BR



>Das Deutsche Cafe. The German Cafe.<

335deezl
Volunteer tester
Joined: 8 Jan 07
Posts: 15
Credit: 12,207,926
RAC: 0
United States
Message 849726 - Posted: 5 Jan 2009, 16:40:31 UTC - in response to Message 849590.

Oops! Apologies to all....thanks for the reply, though. :)

Blurf...can you delete my misposts, plz? Thx.

Josef W. Segur (Project donor)
Volunteer developer
Volunteer tester
Joined: 30 Oct 99
Posts: 4225
Credit: 1,041,551
RAC: 361
United States
Message 849800 - Posted: 5 Jan 2009, 19:38:00 UTC - in response to Message 849581.

Is it safe to run AP now?

Either stock or optimized 5.00 should produce results which validate. Absolute safety of credits cannot be expected; you might be paired with someone running an obsolete BOINC version, and that can be particularly painful on AP work.
Joe

You mean <= BOINC V5.2.5 ?

And then I would get 0 Credits?

Maybe with any version of BOINC the application can occasionally fail to find the shared memory through which it reports CPU time and fpops_cumulative, though it does seem to be more prevalent on BOINC 4.x, particularly with Mac systems. Then the application runs in standalone mode, but BOINC does sense when it finishes and can upload the result file. It's the lack of either time or fpops which causes 0 credit claims in those cases.

The <= BOINC V5.2.5 cases will usually have the CPU time, and the claims will differ from yours using fpops_cumulative. The difference may go either way; someone is going to get less than claimed, maybe you, maybe the wingman.
Joe


Copyright © 2014 University of California