Panic Mode On (109) Server Problems?
Author | Message |
---|---|
Grant (SSSF) Joined: 19 Aug 99 Posts: 13822 Credit: 208,696,464 RAC: 304 |
"I'd say the administrators need to shorten up the cron job interval on the deleters purge task so that we could maintain a higher average RTS buffer quantity. Or if the purge is threshold based, to lower it." I think it's just a question of I/O congestion. The deleters run all the time; however, with the current rate of work return and the current rate of WU splitting required to keep that rate of return going, there's so much I/O contention that the deleters can't keep up. The I/O contention eventually gets bad enough that the output of the splitters falls away, but the deleters still can't keep up with the load, so the backlog continues to grow. In time the deleters are able to catch up & clear the backlog, and their reduced level of I/O then allows the splitters to crank back up again, until the delete backlog & load reach that trigger point & the splitters slow down again. Rinse and repeat. The combination of results returned per hour, in progress, awaiting deletion & required splitter output is resulting in a huge amount of I/O, more than the servers can actually meet. So you end up with these moving trigger points where one function slows down and the other speeds up, then that one slows down & the first one speeds up again. And back & forth they go. That's my speculation based on minimal facts. Grant Darwin NT |
Keith Myers Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
OK, I'm sure I read in some other post in recent days that the deleters and purgers don't run continuously. Now I have to find that post. [Edit] Found it: Message 1913582, by Rob Smith. Seti@Home classic workunits: 20,676 CPU time: 74,226 hours. A proud member of the OFA (Old Farts Association) |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14666 Credit: 200,643,578 RAC: 874 |
Grant (SSSF) wrote: "I think it's just a question of I/O congestion." Keith Myers wrote: "OK, I'm sure I read in some other post in recent days that the deleters and purgers don't run continuously. Now I have to find that post." That's probably a difference between Main and Beta. Beta certainly doesn't purge the database continuously - Eric likes to keep older tasks visible for comparison and retrospective bug-hunting. Main, on the other hand, needs to clear the decks within 24 hours or we're swamped. |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13822 Credit: 208,696,464 RAC: 304 |
Keith Myers wrote: "OK, I'm sure I read in some other post in recent days that the deleters and purgers don't run continuously. Now I have to find that post." Got me curious too. When things have been working well, the number of WUs awaiting Validation, Assimilation and Deletion is generally around 0, occasionally 1-3 (emphasis on when everything is working OK). So even if they don't run all the time, they run when there is something to do, which is pretty much all the time (especially with 145k results being returned per hour). Looking at AP, where the return rate is less than 1 per minute at the moment, the WUs awaiting Validation, Assimilation & Deletion are around 1, with periods of 0 & a few periods of 2 or 3. It could be they run all the time, and those values of 1-3 are what's there at the time the data is read, before the WU is processed. Or it could be as you say: they don't run all the time, only when there is work to be done. Either way, it means the MB WU Validators/Deleters/Assimilators are running (effectively) all the time with 40/s there to be processed, as the values there were usually around (or very close to) 0. Grant Darwin NT |
Bernie Vine Joined: 26 May 99 Posts: 9954 Credit: 103,452,613 RAC: 328 |
Panic Mode On (110) Server Problems? Now open for business |
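
As a rough check on the figures quoted in the thread: 145k results per hour works out to about 40 per second, which is where the "40/s" comes from. The Python sketch below does that conversion and also shows how quickly a backlog would build if the deleters cleared work even slightly slower than it arrives. The deleter rate of 35/s is purely a hypothetical number picked for illustration and does not come from the thread or from the project's servers, and the function names are made up for this example.

```python
# Figures quoted in the thread.
MB_RESULTS_PER_HOUR = 145_000   # "145k results being returned per hour"
AP_RESULTS_PER_MINUTE = 1       # upper bound: "less than 1 per minute"

# Hypothetical value, NOT from the thread: an assumed deleter throughput
# slightly below the arrival rate, used only to illustrate backlog growth.
ASSUMED_DELETER_RATE = 35.0     # results per second


def per_second(count, period_seconds):
    """Convert a count over a period into an average rate per second."""
    return count / period_seconds


def backlog_after(arrival_rate, service_rate, seconds, start=0.0):
    """Backlog after `seconds` at constant rates (never below zero)."""
    return max(0.0, start + (arrival_rate - service_rate) * seconds)


mb_rate = per_second(MB_RESULTS_PER_HOUR, 3600)
ap_rate = per_second(AP_RESULTS_PER_MINUTE, 60)
backlog = backlog_after(mb_rate, ASSUMED_DELETER_RATE, 3600)

print(f"MB return rate: {mb_rate:.1f} results/s")
print(f"AP return rate: under {ap_rate:.3f} results/s")
print(f"Backlog after one hour at the assumed deleter rate: {backlog:.0f}")
```

With the assumed 35/s, the backlog after an hour is 145,000 - 126,000 = 19,000 results. This is a constant-rate toy model only; it ignores the feedback Grant describes, where splitter output drops as I/O contention rises.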