Panic Mode On (113) Server Problems?
Author | Message |
---|---|
Stephen "Heretic" Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
Wow, it's up to 166k/hr return rate now, but the system seems to be handling it fine. . . Yes, these Blc11 tasks are a particularly noisy lot. Still, there are quite a few of the slow Blc04, Blc06 and Blc16 files left to split, so that is the safety backup. Hopefully we will survive the weekend. . . PS: the result return rate is up to 190K per hour, and it hasn't fallen over yet. Stephen :) |
Unixchick Joined: 5 Mar 12 Posts: 815 Credit: 2,361,516 RAC: 22 |
The results in the last hour is now up to 191K. The ready-to-send is just below 400K and falling. There is a high level of connections, and the results needing db purge is slightly high but not bad at 3.3 million. We still have 6533 channels to split (around 1000 are not blc11). |
Stephen "Heretic" Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
"The results in the last hour is now up to 191K." . . That is close to a day of WUs as a reserve ... Stephen :) |
Unixchick Joined: 5 Mar 12 Posts: 815 Credit: 2,361,516 RAC: 22 |
"The results in the last hour is now up to 191K." Yes. We have about a day of non-blc11 files... and not all the blc11 files are bad. We should have enough data; our main problem is that it can't be split fast enough. My initial guess (back of the envelope, with not enough data) is that we have about 4-5 hours before the RTS runs dry. Hopefully someone can throw an Arecibo file in to be split. Edited to add: with more data it still looks like we are dropping 15K in RTS every 10 minutes. |
Keith Myers Joined: 29 Apr 01 Posts: 13161 Credit: 1,160,866,277 RAC: 1,873 |
Looks like the splitters have finally kicked into overdrive. But not soon enough. Just about all out of tasks (35K) in the RTS buffer. Return rate at 197K/hour. At least an Arecibo file just got loaded, which should slow down the return rate eventually. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874 |
A splitting rate of 59.0769/sec (currently shown) is 212,676/hour. We're gaining already. |
Keith Myers Joined: 29 Apr 01 Posts: 13161 Credit: 1,160,866,277 RAC: 1,873 |
The splitting rate has halved. We're stagnant in rate climb and RTS buffer quantity. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
betreger Joined: 29 Jun 99 Posts: 11360 Credit: 29,581,041 RAC: 66 |
Unless something changes soon we're screwed. |
Brent Norman Joined: 1 Dec 99 Posts: 2786 Credit: 685,657,289 RAC: 835 |
It will probably be 2-3 more hours before the majority of people start crunching the Arecibo tasks, which will slow the return rate, and let the system recover. |
Tom M Joined: 28 Nov 02 Posts: 5124 Credit: 276,046,078 RAC: 462 |
Has everyone else been seeing really short tasks on both the gpu and cpu sides? Tom A proud member of the OFA (Old Farts Association). |
Unixchick Joined: 5 Mar 12 Posts: 815 Credit: 2,361,516 RAC: 22 |
"Has everyone else been seeing really short tasks on both the gpu and cpu sides?" Yup. We are panicking... Will the RTS run out? Will the amount to purge from the db get too large? Can the system handle the load... |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13722 Credit: 208,696,464 RAC: 304 |
Well, it looks like the new system tipping point is around 150-160k/hr or so (the resolution of the graphs makes it difficult to see). That's where the WUs-awaiting-deletion started to spike, and the splitter output choked. WUs-awaiting-deletion are starting to clear out, although the Results-awaiting-purge are still going through the roof. It's looking like the number of WUs in the Ready-to-send cache is now playing into the splitter output, WU-awaiting-deletion, Received-last-hour balancing act. With a buffer of around 30k the splitters crank up again; on hitting around 78k they die again. And again: 30k, off they go; 78k, now they don't. Hopefully the next batch of Arecibo work will have plenty of VLARs in it to take some of the load off of the servers. Grant Darwin NT |
petri33 Joined: 6 Jun 02 Posts: 1668 Credit: 623,086,772 RAC: 156 |
The return rate has been climbing steadily. It is now over 142k from ~90k before the BLC11 tapes. A 50% increase is something a (relatively) new release of the Linux CUDA app can do. I doubt that there are enough of us using Linux CUDA V0.97b2 yet to affect the return rate. To overcome Heisenbergs: "You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones |
Brent Norman Joined: 1 Dec 99 Posts: 2786 Credit: 685,657,289 RAC: 835 |
This has been related to the BLC11 files, not the apps. Fast overflows of CPU and GPU tasks on all platforms. |
Unixchick Joined: 5 Mar 12 Posts: 815 Credit: 2,361,516 RAC: 22 |
Looks like the RTS is going up again (146K+). Not sure how much time we have before it starts to fall again, but I hope the Arecibo file that was put into the mix helps for a while. I'm also still concerned about the results waiting for db purge, as it is over 4 million now and is usually in the 2 million range. |
Stephen "Heretic" Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
"A splitting rate of 59.0769/sec (current show) is 212,676/hour. We're gaining already." . . Just barely! I think the road runner would be feeling the coyote's hot breath at this distance ... :) Stephen :) |
Stephen "Heretic" Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
"The return rate has been climbing steadily. It is now over 142k from ~90k before the BLC11 tapes." . . Nope, but a very, very noisy set of Blc11 tapes sure can ... I have had cache loads that were nearly 90% noise bombs ... the cache was gone just like that ... (clicks fingers) ... Stephen :( |
Brent Norman Joined: 1 Dec 99 Posts: 2786 Credit: 685,657,289 RAC: 835 |
I see they just loaded an Arecibo file, that should help the servers out for a bit. |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13722 Credit: 208,696,464 RAC: 304 |
"I see they just loaded an Arecibo file, that should help the servers out for a bit." I hope so. There's been a big spike in AP WU-awaiting-deletion, along with a steady climb in MB WU-awaiting-deletion, and the splitter output has fallen off a cliff. It had been steadily declining from around 55/s down to around 40/s, then it plummeted to 10/s as those other backlogs hit their current peaks. The result is that the Ready-to-send buffer, after hitting a peak of about 160k and then declining slightly, has started falling like a stone. Grant Darwin NT |
Unixchick Joined: 5 Mar 12 Posts: 815 Credit: 2,361,516 RAC: 22 |
The RTS continues to slowly climb (228k at the moment), but the results waiting for db purge is at 4.8 million and also climbing. I'm betting we get a long outage on Tuesday to clean all this up. |
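
For anyone who wants to check the back-of-the-envelope numbers quoted in the thread, here is a minimal Python sketch. It only redoes the arithmetic from the posts: the 59.0769/sec splitting rate, the 197K/hour return rate, and a ready-to-send (RTS) buffer of just under 400K dropping 15K every 10 minutes. The function names are made up for illustration and are not part of any BOINC or SETI@home tool, and the 400K starting figure is rounded up from "just below 400K".

```python
def per_second_to_per_hour(rate_per_sec: float) -> float:
    """Convert a per-second rate (as shown on the status page) to per-hour."""
    return rate_per_sec * 3600


def hours_until_empty(buffer_size: float, drop_per_interval: float,
                      interval_minutes: float) -> float:
    """Estimate hours until a buffer runs dry at a constant drain rate."""
    if drop_per_interval <= 0 or interval_minutes <= 0:
        raise ValueError("drop and interval must be positive")
    drop_per_hour = drop_per_interval * 60 / interval_minutes
    return buffer_size / drop_per_hour


# Figures taken from the posts above (400K is a rounded assumption).
split_per_hour = per_second_to_per_hour(59.0769)      # splitter output
return_per_hour = 197_000                             # reported return rate
net_gain_per_hour = split_per_hour - return_per_hour  # positive means RTS grows
rts_hours_left = hours_until_empty(400_000, 15_000, 10)

print(f"Splitting rate: {split_per_hour:,.0f} per hour")
print(f"Net gain over returns: {net_gain_per_hour:,.0f} per hour")
print(f"RTS runs dry in about {rts_hours_left:.1f} hours at 15K per 10 minutes")
```

The drain estimate lands inside the 4-5 hour guess, and the net gain of the splitters over the return rate is small compared with either figure, which fits the "just barely" reply. Both calculations assume the rates stay constant, which the thread shows they did not.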
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.