(why is) ap_validate (synergy) not running


Message boards : Number crunching : (why is) ap_validate (synergy) not running

terencewee*
Joined: 10 Oct 09
Posts: 53
Credit: 7,022,510
RAC: 0
Malaysia
Message 1136855 - Posted: 6 Aug 2011, 17:07:14 UTC
Last modified: 6 Aug 2011, 17:27:00 UTC

Completed AP WUs are piling up (11k+ at present). Is there any reason why ap_validate (1/2/3) is not running?

Could it be because all the AP "tapes" are completed?

I see AP assimilators are running, but the queue is 0.

Just trying to understand better.

Thanks in advance.

terencewee*
Sic itur ad astra.

Richard Haselgrove (Project donor)
Volunteer tester
Joined: 4 Jul 99
Posts: 8435
Credit: 47,908,333
RAC: 57,515
United Kingdom
Message 1136883 - Posted: 6 Aug 2011, 17:50:54 UTC

I imagine that the short answer is "because it's the weekend".

Matt Lebofsky did post recently (message 1112205):

There are some broken astropulse results clogging one of the validators (which is why it shows up on red on the status page). We'll have to figure out an automated way to detect these results and push them through (it's a real pain to do by hand).

We might be suffering a recurrence of that - I don't know if they had any luck in working out what exactly was 'broken' about the results and where they were coming from - or even if they had time to look.

And BTW - I think that Joe Segur explained recently that only ap_validate3 was active at the moment - 1 and 2 are left over from earlier Astropulse runs, and shouldn't have any work to do this long after the event.

arkayn (Project donor)
Volunteer tester
Joined: 14 May 99
Posts: 3615
Credit: 48,193,170
RAC: 35,828
United States
Message 1136891 - Posted: 6 Aug 2011, 18:03:02 UTC
Last modified: 6 Aug 2011, 18:03:42 UTC

And don't forget that Synergy is running on half of its normal RAM right now as well.

The short story is we just plucked 48GB of memory out of synergy (back-end compute server) and added it to oscar (the main science database server).

____________

Firehawk
Volunteer tester
Joined: 21 May 99
Posts: 1724
Credit: 255,473,395
RAC: 142,418
Brazil
Message 1137442 - Posted: 7 Aug 2011, 23:24:57 UTC

Bah, that sucks. I run only AP on CPU (I thought it's better paid and needs fewer units in cache), and when this happens my RAC just stalls.
____________

terencewee*
Joined: 10 Oct 09
Posts: 53
Credit: 7,022,510
RAC: 0
Malaysia
Message 1137487 - Posted: 8 Aug 2011, 2:57:43 UTC

Thanks, guys. Now I know what to expect during "the weekend". :D


@Firehawk:
No kidding. I lost 7th place in the recent SETI Challenge over at BOINCstats because ap_validate was not running.

terencewee*
Sic itur ad astra.

rob smith (Project donor)
Volunteer tester
Joined: 7 Mar 03
Posts: 8254
Credit: 54,397,557
RAC: 73,978
United Kingdom
Message 1137551 - Posted: 8 Aug 2011, 7:24:40 UTC - in response to Message 1137442.

Bah, that sucks. I run only AP on CPU (I thought it's better paid and needs fewer units in cache), and when this happens my RAC just stalls.


Don't forget that, even if the assimilators are running, you won't get credit until your wingman has completed his processing of that WU. And that assumes he gets it back in time and the two results validate against each other. If either of these conditions "fails", you have to wait for it to be sent out to someone else to process, return...

(Is there a limit on how often a WU can be sent out before it's declared "dead"?)
____________
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?

Mike (Project donor)
Volunteer tester
Joined: 17 Feb 01
Posts: 23661
Credit: 32,360,893
RAC: 24,802
Germany
Message 1137559 - Posted: 8 Aug 2011, 8:30:52 UTC - in response to Message 1137551.

(Is there a limit on how often a WU can be sent out before it's declared "dead"?)


Yes, it's 5/10/10

max 5 errors and 10 in total.

____________

msattler (Project donor)
Volunteer tester
Joined: 9 Jul 00
Posts: 38676
Credit: 572,989,998
RAC: 553,171
United States
Message 1137627 - Posted: 8 Aug 2011, 16:30:11 UTC - in response to Message 1137551.



(Is there a limit on how often a WU can be sent out before it's declared "dead"?)

Yes....

max # of error/total/success tasks 5, 10, 5

It is listed in the WU details for every WU sent out.
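
For anyone who wants to see how three limits like these could interact, here is a rough Python sketch. It is only an illustration, not the project's server code: the class name, the function name `gives_up`, and the rule that a WU is abandoned once any count exceeds its limit are all assumptions made for this example. The thread does not say whether the real check is "exceeds" or "reaches", so treat the comparison as a guess.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class WorkunitLimits:
    # Values quoted above (error/total/success = 5, 10, 5).
    # The field names are illustrative assumptions.
    max_error_results: int = 5
    max_total_results: int = 10
    max_success_results: int = 5


def gives_up(errors: int, total: int, successes: int,
             limits: Optional[WorkunitLimits] = None) -> bool:
    """Return True if the WU would be abandoned rather than resent.

    Assumed rule: the WU is declared "dead" as soon as any one of the
    three counts exceeds its limit.
    """
    limits = limits or WorkunitLimits()
    return (errors > limits.max_error_results
            or total > limits.max_total_results
            or successes > limits.max_success_results)


if __name__ == "__main__":
    # Two errors and two successes out of four copies: still resendable.
    print(gives_up(errors=2, total=4, successes=2))
    # Six errored copies: over the assumed error limit.
    print(gives_up(errors=6, total=6, successes=0))
```

The point of keeping three separate counters is that a WU can fail in different ways: too many outright errors, too many copies sent in total, or too many "successful" results that never agree with each other.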
____________
*********************************************
Embrace your inner kitty...ya know ya wanna!

I have met a few friends in my life.
Most were cats.


Copyright © 2014 University of California