(why is) ap_validate (synergy) not running |
Message boards : Number crunching : (why is) ap_validate (synergy) not running
| Author | Message |
|---|---|
|
Completed AP WUs are piling up (11k+ at present). Any reason why ap_validate (1/2/3) is not running? | |
| ID: 1136855 · | |
|
I imagine that the short answer is "because it's the weekend". There are some broken Astropulse results clogging one of the validators (which is why it shows up in red on the status page). We'll have to figure out an automated way to detect these results and push them through (it's a real pain to do by hand). We might be suffering a recurrence of that. I don't know if they had any luck in working out what exactly was 'broken' about the results and where they were coming from, or even if they had time to look. And BTW, I think that Joe Segur explained recently that only ap_validate3 is active at the moment; 1 and 2 are left over from earlier Astropulse runs and shouldn't have any work to do this long after the event. | |
| ID: 1136883 · | |
|
And don't forget that Synergy is running on half of its normal RAM right now as well. The short story is we just plucked 48 GB of memory out of synergy (back-end compute server) and added it to oscar (the main science database server). | |
| ID: 1136891 · | |
|
Bah, that sux. I run just AP on CPU (I thought it's better paid and keeps fewer units in the cache), and when this happens, my RAC just stalls. | |
| ID: 1137442 · | |
|
Thanks, guys. Now I know what to expect during "the weekend". :D | |
| ID: 1137487 · | |
Don't forget that, even if the assimilators are running, you won't get credit until your wingman has completed his processing of that WU. And that assumes he gets it back in time and the two results validate against each other. If either of these conditions "fails", you have to wait for it to be sent out to someone else to process and return. (Is there a limit on how often a WU can be sent out before it's declared "dead"?) ____________ Bob Smith Member of Seti PIPPS (Pluto is a Planet Protest Society) Somewhere in the (un)known Universe? | |
| ID: 1137551 · | |
Yes, it's 5/10/10: max 5 errors and 10 in total. | |
| ID: 1137559 · | |
Yes... max # of error/total/success tasks: 5, 10, 5. It is listed in the WU details for every WU sent out. ____________ ****** "Ask not, what your kitty can do for you. Ask what you can do for your kitty." As it is kitten, so shall it be done. | |
| ID: 1137627 · | |
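For anyone trying to picture how the wingman rule and the 5, 10, 5 limits fit together, here is a rough Python sketch. It is not the project's actual code: the `WorkUnit` class, the `next_step` function, the quorum of 2, and the choice to treat a limit as hit only when it is exceeded (rather than merely reached) are all assumptions made for illustration. Only the three limit values come from the posts above.

```python
from dataclasses import dataclass

# Limits quoted in the thread as "max # of error/total/success tasks 5, 10, 5".
MAX_ERROR_TASKS = 5
MAX_TOTAL_TASKS = 10
MAX_SUCCESS_TASKS = 5
# Assumption: two matching results (yours and your wingman's) are enough.
QUORUM = 2


@dataclass
class WorkUnit:
    """Hypothetical bookkeeping for one WU; not a real BOINC structure."""
    errors: int = 0       # tasks that errored out or missed the deadline
    successes: int = 0    # tasks returned without error
    agreeing: int = 0     # size of the largest group of matching results
    in_progress: int = 0  # tasks sent out and not yet returned

    @property
    def total(self) -> int:
        return self.errors + self.successes + self.in_progress


def next_step(wu: WorkUnit) -> str:
    """Return what happens next to a WU under the assumed rules."""
    if wu.agreeing >= QUORUM:
        return "validated: credit granted"
    # Assumption: a limit counts as hit only once it is exceeded.
    if wu.errors > MAX_ERROR_TASKS:
        return "dead: too many errors"
    if wu.successes > MAX_SUCCESS_TASKS:
        return "dead: too many results without agreement"
    if wu.total > MAX_TOTAL_TASKS:
        return "dead: too many tasks in total"
    if wu.agreeing + wu.in_progress < QUORUM:
        return "send another task"
    return "waiting for wingman"
```

With these assumptions, `next_step(WorkUnit(successes=1, agreeing=1, in_progress=1))` gives "waiting for wingman", and a WU whose wingman errored out, `WorkUnit(errors=1, successes=1, agreeing=1)`, gives "send another task".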
| Copyright © 2013 University of California |