Panic Mode On (41) Server problems
Author | Message |
---|---|
Grant (SSSF) Send message Joined: 19 Aug 99 Posts: 13736 Credit: 208,696,464 RAC: 304 |
Time for the first panic of the new system. Network traffic has dropped way down (wasn't even pegged the way it was previously). Result creation rate shows as 0, splitters shown as all down. Grant Darwin NT |
kittyman Send message Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004 |
I would imagine da boyz in da lab are doing some tuning of the setup. I am sure there will be a lot of configuration issues in the coming weeks that will have to be adjusted as they learn the new hardware and get it tweaked to perfection. See Matt's last tech post. "Freedom is just Chaos, with better lighting." Alan Dean Foster |
kittyman Send message Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004 |
And they have Oscar offline now as well, so they are playing with things a bit. Hopefully not a big showstopper. "Freedom is just Chaos, with better lighting." Alan Dean Foster |
kittyman Send message Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004 |
No panic. Matt just posted this in tech news...... "In case anybody is wondering - we're trying to increase various settings like I mentioned at the top of the thread, and this is leading to the predictably unexpected snags. No worries - we've proven we can fall back to this morning's settings without much ado, but we're leaving splitters/assimilators off for now in case we can figure this out quickly. - Matt" "Freedom is just Chaos, with better lighting." Alan Dean Foster |
Wiggo Send message Joined: 24 Jan 00 Posts: 34746 Credit: 261,360,520 RAC: 489 |
All but 2 AP splitters back to green now. Cheers. |
Dave Send message Joined: 29 Mar 02 Posts: 778 Credit: 25,001,396 RAC: 0 |
Surely someone should rename this thread to "Panic Mode Off (41) Server problems"... :D. |
Allie in Vancouver Send message Joined: 16 Mar 07 Posts: 3949 Credit: 1,604,668 RAC: 0 |
Everything save one ap splitter running now. 438 MB ready to send. Such riches to behold! @ Dave: there will always be the occasional interruption to the work flow and always some folks will panic about it. Nature of the beast. ;o) Pure mathematics is, in its way, the poetry of logical ideas. Albert Einstein |
kittyman Send message Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004 |
Splitters down..... More playtime for Matt? He was having fun ramping up the RAM on Oscar yesterday....LOL. "Freedom is just Chaos, with better lighting." Alan Dean Foster |
Grant (SSSF) Send message Joined: 19 Aug 99 Posts: 13736 Credit: 208,696,464 RAC: 304 |
Up again, although I notice that the splitting rate is quite low: no Ready to Send buffer developing. Grant Darwin NT |
-BeNt- Send message Joined: 17 Oct 99 Posts: 1234 Credit: 10,116,112 RAC: 0 |
It's because from my understanding as of right now they are splitting and sending on demand. Traveling through space at ~67,000mph! |
kittyman Send message Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004 |
"It's because from my understanding as of right now they are splitting and sending on demand." Well, not quite. It's just that work is being sent out as fast as it can be split. Once the caches start to fill, or current limits on cache are reached, ready to send will start to build. But given the magnitude of the big outage, that may take some time to happen. My rigs are currently having no problem getting enough to build a little cache and keep them all in production. "Freedom is just Chaos, with better lighting." Alan Dean Foster |
DJStarfox Send message Joined: 23 May 01 Posts: 1066 Credit: 1,226,053 RAC: 2 |
Looks like Oscar still needs some tweaking. |
-BeNt- Send message Joined: 17 Oct 99 Posts: 1234 Credit: 10,116,112 RAC: 0 |
"It's because from my understanding as of right now they are splitting and sending on demand." Yeah mine are sending in results to Seti almost instantly when they are finished and downloading a new WU along with it. My cache has been at 300+ on both my machines all day today. Right now it's saying there is 439/3 results in the ready to send box along with a ~13 second creation rate. And as soon as I typed that I checked, read refreshed, the server stats page and it's empty again. Ah well everything seems good on this end to keep my machines partly happy for awhile. Now if I could get some cpu WU's everything would be great! I'm sure there are some people out there that still can't get work units but over time it will get better, we are only what, one day in from the long downtime. Half full guys, half full. Traveling through space at ~67,000mph! |
Grant (SSSF) Send message Joined: 19 Aug 99 Posts: 13736 Credit: 208,696,464 RAC: 304 |
Still not splitting very fast & yet the network pipe is less than 2/3 full. Before the outage the system was able to saturate the network connection & still build a Ready to Send buffer. I think still more tweaking is required. And the assimilator queue continues to grow. Grant Darwin NT |
soft^spirit Send message Joined: 18 May 99 Posts: 6497 Credit: 34,134,168 RAC: 0 |
I have to disagree. Before the outage the bandwidth was saturated with errors and failures, repeated handshakes to try again. It is not staying saturated because work is getting THROUGH!!! Janice |
Pappa Send message Joined: 9 Jan 00 Posts: 2562 Credit: 12,301,681 RAC: 0 |
"I have to disagree. Before the outage the bandwidth was saturated with errors and failures, repeated handshakes to try again. It is not staying saturated because work is getting THROUGH!!!" I have to agree! Before the outage there were over 4,000,000 results in the field (which represents cache size etc). Work is being sent as fast as it can be split and sent. Everyone knew that it would take a week+ to get things back to where they were before the outage. So we are now looking at what the New Balance will be. Yes, that will take time to establish. Regards Please consider a Donation to the Seti Project. |
Grant (SSSF) Send message Joined: 19 Aug 99 Posts: 13736 Credit: 208,696,464 RAC: 304 |
And I disagree. Previously network traffic was maxed out by downloads. The splitters were still able to produce enough work for the Ready to Send buffer to build up. At present the network traffic is only 60Mb/s, yet the Ready to Send buffer is only 151 Work Units; it used to have a limit of 200,000. So far the highest it's been is 1,000. Grant Darwin NT |
Blake Bonkofsky Send message Joined: 29 Dec 99 Posts: 617 Credit: 46,383,149 RAC: 0 |
Coming back from a weekly outage, typically there were 300k+ WU's ready to be sent, resulting in immediately clogged pipes, server back-off was only 10sec instead of 5 minutes, and people weren't COMPLETELY out of work like I'm sure 99% of the userbase was two days ago. Now, EVERYONE needs work, so whatever WU's are available are immediately being snatched up. If you look at the server status, it has been averaging 300-500 DB requests per second since the project went live again. At 25 WU/sec creation rate, if even just 1/10 (30-50) of those requests are for work, those 25 WU's are going to be gone within the same second they are created. Throw in the fact that the team is still working on optimizing the servers, and I'd say we are in pretty good shape. The project has been running on these machines now for the last week, and has done so without ANY unexpected downtime. Sure, there have been a few glitches here and there as they have been trying various settings, but they have been quickly fixed and things brought back up. |
Highlander Send message Joined: 5 Oct 99 Posts: 167 Credit: 37,987,668 RAC: 16 |
I'm only wondering about the result creation rate: actually I never saw it go over 25/s; before the outage, it was able to reach 40/s. I hope this can be changed through fine-tuning of the new servers. - Performance is not a simple linear function of the number of CPUs you throw at the problem. - |
Blake Bonkofsky Send message Joined: 29 Dec 99 Posts: 617 Credit: 46,383,149 RAC: 0 |
"I'm only wondering about the result creation rate: actually I never saw it go over 25/s; before the outage, it was able to reach 40/s. I hope this can be changed through fine-tuning of the new servers." I'm sure they'll be able to get that tuned up. This hardware is far superior to the previous setup, but being brand new, it will take some fine-tuning to get it to really run like it should. |
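
Blake Bonkofsky's back-of-the-envelope estimate above (25 WU/s created, 300-500 DB requests per second, roughly one request in ten asking for work) can be sketched in a few lines of Python. This is only an illustration of that arithmetic, not how the SETI@home scheduler actually works: the function names are made up for the example, and it assumes each work request takes exactly one work unit, which the thread does not state.

```python
def work_demand(requests_per_sec, one_in_n):
    """Work units requested per second, assuming one request in
    `one_in_n` asks for work and takes exactly one WU (an assumption)."""
    return requests_per_sec / one_in_n


def buffer_after(seconds, creation_rate, demand_rate, start=0.0):
    """Ready to Send buffer after `seconds`, stepping one second at a time.
    The buffer can never go below zero."""
    buffer = start
    for _ in range(seconds):
        buffer = max(0.0, buffer + creation_rate - demand_rate)
    return buffer


if __name__ == "__main__":
    creation_rate = 25  # WU/s, the figure quoted in the thread
    for requests in (300, 500):  # DB requests per second, from the thread
        demand = work_demand(requests, 10)
        left = buffer_after(60, creation_rate, demand)
        print(f"{requests} req/s: demand {demand} WU/s, buffer after 60 s: {left}")
```

Under these assumptions demand (30-50 WU/s) exceeds creation (25 WU/s), so the buffer stays empty, which matches what people are seeing on the server status page. A buffer only starts to build once demand falls below the creation rate, for example as caches fill.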