(why is) ap_validate (synergy) not running



Message boards : Number crunching : (why is) ap_validate (synergy) not running

terencewee*
Joined: 10 Oct 09
Posts: 53
Credit: 7,022,510
RAC: 0
Malaysia
Message 1136855 - Posted: 6 Aug 2011, 17:07:14 UTC
Last modified: 6 Aug 2011, 17:27:00 UTC

Completed AP WUs are piling up (11k+ at present). Any reason why ap_validate (1/2/3) is not running?

Could it be because all the AP "tapes" are completed?

I see AP assimilators are running, but the queue is 0.

Just trying to understand better.

Thanks in advance.

terencewee*
Sicituradastra.

Richard Haselgrove
Project donor
Volunteer tester
Joined: 4 Jul 99
Posts: 8634
Credit: 51,631,146
RAC: 49,034
United Kingdom
Message 1136883 - Posted: 6 Aug 2011, 17:50:54 UTC

I imagine that the short answer is "because it's the weekend".

Matt Lebofsky did post recently (message 1112205):

There are some broken astropulse results clogging one of the validators (which is why it shows up on red on the status page). We'll have to figure out an automated way to detect these results and push them through (it's a real pain to do by hand).

We might be suffering a recurrence of that - I don't know if they had any luck in working out what exactly was 'broken' about the results and where they were coming from - or even if they had time to look.

And BTW - I think that Joe Segur explained recently that only ap_validate3 was active at the moment - 1 and 2 are left over from earlier Astropulse runs, and shouldn't have any work to do this long after the event.

arkayn
Project donor
Volunteer tester
Joined: 14 May 99
Posts: 3692
Credit: 48,733,891
RAC: 6,024
United States
Message 1136891 - Posted: 6 Aug 2011, 18:03:02 UTC
Last modified: 6 Aug 2011, 18:03:42 UTC

And don't forget that Synergy is running on half of its normal RAM right now as well.

The short story is we just plucked 48GB of memory out of synergy (back-end compute server) and added it to oscar (the main science database server).


Firehawk
Volunteer tester
Joined: 21 May 99
Posts: 1731
Credit: 258,889,717
RAC: 3,298
Brazil
Message 1137442 - Posted: 7 Aug 2011, 23:24:57 UTC

Bah, that sucks. I run only AP on CPU (I thought it's better paid and keeps fewer units in cache), and when this happens my RAC just stalls.

terencewee*
Joined: 10 Oct 09
Posts: 53
Credit: 7,022,510
RAC: 0
Malaysia
Message 1137487 - Posted: 8 Aug 2011, 2:57:43 UTC

Thanks, guys. Now I know what to expect during "the weekend". :D


@Firehawk:
No kidding. I lost 7th place in the recent SETI challenge over at BOINCstats because ap_validate was not running.

terencewee*
Sicituradastra.

rob smith
Project donor
Volunteer tester
Joined: 7 Mar 03
Posts: 8535
Credit: 59,508,784
RAC: 87,234
United Kingdom
Message 1137551 - Posted: 8 Aug 2011, 7:24:40 UTC - in response to Message 1137442.

Bah, that sucks. I run only AP on CPU (I thought it's better paid and keeps fewer units in cache), and when this happens my RAC just stalls.


Don't forget that, even if the assimilators are running, you won't get credit until your wingman has completed his processing of that WU. And that assumes he gets it back in time and the two results validate against each other. If either of these conditions "fails", then you have to wait for it to be sent out to someone else to process, return...

(Is there a limit on how often a WU can be sent out before it's declared "dead"?)
____________
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?

Mike
Project donor
Volunteer tester
Joined: 17 Feb 01
Posts: 24558
Credit: 33,898,109
RAC: 24,169
Germany
Message 1137559 - Posted: 8 Aug 2011, 8:30:52 UTC - in response to Message 1137551.

Bah, that sucks. I run only AP on CPU (I thought it's better paid and keeps fewer units in cache), and when this happens my RAC just stalls.


Don't forget that, even if the assimilators are running, you won't get credit until your wingman has completed his processing of that WU. And that assumes he gets it back in time and the two results validate against each other. If either of these conditions "fails", then you have to wait for it to be sent out to someone else to process, return...

(Is there a limit on how often a WU can be sent out before it's declared "dead"?)


Yes, it's 5/10/10.

Max 5 errors and 10 results in total.
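
To make those two limits concrete, here is a rough Python sketch of the check. It is not the project's actual server code: the names `WorkunitLimits` and `can_resend` are made up for illustration, only the values 5 and 10 come from the post above, and whether a WU dies on reaching a limit or only on exceeding it is an assumption (this sketch stops resending once a limit is reached).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkunitLimits:
    # Hypothetical container; only the two values come from the thread.
    max_errors: int = 5   # "max 5 errors"
    max_total: int = 10   # "10 in total"


def can_resend(error_count: int, total_sent: int,
               limits: WorkunitLimits = WorkunitLimits()) -> bool:
    """Return True while the WU may still go out to another host.

    Assumption: the WU is treated as "dead" as soon as either
    counter reaches its limit.
    """
    return error_count < limits.max_errors and total_sent < limits.max_total


if __name__ == "__main__":
    # Two results sent so far, no errors: still eligible.
    print(can_resend(error_count=0, total_sent=2))
    # Five errored results: the error limit is reached.
    print(can_resend(error_count=5, total_sent=6))
```

So a WU whose wingmen keep timing out or erroring does not circulate forever; under these assumptions it stops after the fifth error or the tenth copy, whichever comes first.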



Copyright © 2014 University of California