Panic Mode On (60) Server problems?
Author | Message |
---|---|
Grant (SSSF) Send message Joined: 19 Aug 99 Posts: 13731 Credit: 208,696,464 RAC: 304 |
I don't know how the Scheduler is going at the moment Uploads finally cleared, and the Scheduler requests are still timing out. Grant Darwin NT |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874 |
I don't know how the Scheduler is going at the moment Maybe time to do a tracert test on the scheduler from your side of the planet. I got a scheduler response in 45 seconds, half an hour ago: that's slow, but nothing like as slow as yesterday. |
JohnDK Send message Joined: 28 May 00 Posts: 1222 Credit: 451,243,443 RAC: 1,127 |
Just had 6 timeouts in a row, the 7th got through sending lost tasks. |
Claggy Send message Joined: 5 Jul 99 Posts: 4654 Credit: 47,537,079 RAC: 4 |
I don't know how the Scheduler is going at the moment I suggested putting the scheduler on the campus link, then there would be no contention with downloads on a maxed-out Hurricane link (there was no response to the suggestion). Claggy |
Grant (SSSF) Send message Joined: 19 Aug 99 Posts: 13731 Credit: 208,696,464 RAC: 304 |
I don't know how the Scheduler is going at the moment My other system's had a couple of timeouts, but most times it gets a response. But it's got 6.10.58 so it tries every 5 min; sooner or later you'll get a response. The one with 6.12.33 has been backing off anything from 30 min to 4 hours. Even with a better hit/miss ratio than 6.10.58, it's not going to get much work. Grant Darwin NT |
rob smith Send message Joined: 7 Mar 03 Posts: 22190 Credit: 416,307,556 RAC: 380 |
That sounds like an NIH response Bob Smith Member of Seti PIPPS (Pluto is a Planet Protest Society) Somewhere in the (un)known Universe? |
Grant (SSSF) Send message Joined: 19 Aug 99 Posts: 13731 Credit: 208,696,464 RAC: 304 |
Here's some pings. Ping statistics for 208.68.240.13: Packets: Sent = 4, Received = 2, Lost = 2 (50% loss), Approximate round trip times in milli-seconds: Minimum = 286ms, Maximum = 290ms, Average = 288ms Ping statistics for 208.68.240.16: Packets: Sent = 4, Received = 3, Lost = 1 (25% loss), Approximate round trip times in milli-seconds: Minimum = 282ms, Maximum = 311ms, Average = 292ms Ping statistics for 208.68.240.18: Packets: Sent = 4, Received = 0, Lost = 4 (100% loss), Ping statistics for 208.68.240.20: Packets: Sent = 4, Received = 1, Lost = 3 (75% loss), Approximate round trip times in milli-seconds: Minimum = 278ms, Maximum = 278ms, Average = 278ms EDIT- just noticed my good machine was getting Scheduler responses in less than a minute, so I did a manual update. Reported & got some work in about 1:30. Grant Darwin NT |
Grant (SSSF) Send message Joined: 19 Aug 99 Posts: 13731 Credit: 208,696,464 RAC: 304 |
And here's a tracert Tracing route to boinc2.ssl.berkeley.edu [208.68.240.13] over a maximum of 30 hops: 1 1 ms <1 ms <1 ms home.gateway.home.gateway [192.168.1.254] 2 56 ms 55 ms 55 ms lo5000.lns2.adl1.adnap.net.au [122.49.191.12] 3 55 ms 55 ms 54 ms g3-32.cor2.adl1.adnap.net.au [219.90.143.121] 4 55 ms 55 ms 54 ms te1-3.cor2.adl4.adnap.net.au [219.90.143.242] 5 55 ms 56 ms 55 ms vlan369.55drc76fg.optus.net.au [59.154.0.49] 6 236 ms 235 ms 236 ms 203.208.192.241 7 241 ms 240 ms 240 ms xe-0-0-0-0.plapx-cr2.ix.singtel.com [203.208.183.161] 8 237 ms 238 ms 236 ms paix.he.net [198.32.176.20] 9 241 ms 240 ms 240 ms 64.71.140.42 10 278 ms 278 ms 279 ms 208.68.243.254 11 * 294 ms 280 ms boinc2.ssl.berkeley.edu [208.68.240.13] Trace complete. Tracing route to setiboincdata.ssl.berkeley.edu [208.68.240.16] over a maximum of 30 hops: 1 <1 ms <1 ms <1 ms home.gateway.home.gateway [192.168.1.254] 2 55 ms 55 ms 54 ms lo5000.lns2.adl1.adnap.net.au [122.49.191.12] 3 54 ms 54 ms 54 ms g3-32.cor2.adl1.adnap.net.au [219.90.143.121] 4 54 ms 54 ms 55 ms te1-3.cor2.adl4.adnap.net.au [219.90.143.242] 5 55 ms 55 ms 55 ms vlan369.55drc76fg.optus.net.au [59.154.0.49] 6 239 ms 240 ms 240 ms 203.208.192.241 7 244 ms 243 ms 244 ms POS3-2.sngtp-ar2.ix.singtel.com [203.208.182.205] 8 250 ms 249 ms 249 ms paix.he.net [198.32.176.20] 9 244 ms 265 ms 244 ms 64.71.140.42 10 282 ms * 282 ms 208.68.243.254 11 * 282 ms * setiboincdata.ssl.berkeley.edu [208.68.240.16] 12 283 ms 282 ms 282 ms setiboincdata.ssl.berkeley.edu [208.68.240.16] Trace complete. 
Tracing route to boinc2.ssl.berkeley.edu [208.68.240.18] over a maximum of 30 hops: 1 <1 ms <1 ms <1 ms home.gateway.home.gateway [192.168.1.254] 2 55 ms 55 ms 55 ms lo5000.lns2.adl1.adnap.net.au [122.49.191.12] 3 54 ms 55 ms 55 ms g3-32.cor2.adl1.adnap.net.au [219.90.143.121] 4 54 ms 55 ms 55 ms te1-3.cor2.adl4.adnap.net.au [219.90.143.242] 5 55 ms 56 ms 55 ms vlan369.55drc76fg.optus.net.au [59.154.0.49] 6 240 ms 359 ms 239 ms 203.208.192.241 7 244 ms 244 ms 244 ms POS3-2.sngtp-ar2.ix.singtel.com [203.208.182.205] 8 240 ms 248 ms 248 ms paix.he.net [198.32.176.20] 9 245 ms 244 ms 245 ms 64.71.140.42 10 * 282 ms 282 ms 208.68.243.254 11 * * * Request timed out. 12 * * * Request timed out. 13 * * * Request timed out. 14 * * * Request timed out. 15 * * * Request timed out. 16 * * * Request timed out. 17 * * * Request timed out. 18 * * * Request timed out. 19 * * * Request timed out. 20 * * * Request timed out. 21 * * * Request timed out. 22 * * * Request timed out. 23 * * * Request timed out. 24 * * * Request timed out. 25 * * * Request timed out. 26 * * * Request timed out. 27 * * * Request timed out. 28 * * * Request timed out. 29 * * * Request timed out. 30 * * * Request timed out. Trace complete. Tracing route to setiboinc.ssl.berkeley.edu [208.68.240.20] over a maximum of 30 hops: 1 <1 ms <1 ms <1 ms home.gateway.home.gateway [192.168.1.254] 2 55 ms 54 ms 55 ms lo5000.lns2.adl1.adnap.net.au [122.49.191.12] 3 55 ms 55 ms 57 ms g3-32.cor2.adl1.adnap.net.au [219.90.143.121] 4 54 ms 54 ms 55 ms te1-3.cor2.adl4.adnap.net.au [219.90.143.242] 5 56 ms 55 ms 55 ms vlan369.55drc76fg.optus.net.au [59.154.0.49] 6 236 ms 237 ms 235 ms 203.208.192.241 7 253 ms 240 ms 240 ms xe-0-0-0-0.plapx-cr2.ix.singtel.com [203.208.183.161] 8 239 ms 237 ms 236 ms paix.he.net [198.32.176.20] 9 240 ms 241 ms 241 ms 64.71.140.42 10 241 ms 240 ms 240 ms 208.68.243.254 11 241 ms 241 ms 241 ms setiboinc.ssl.berkeley.edu [208.68.240.20] Trace complete. 
EDIT- .18 & .13 are the download servers aren't they? In which case .18 must be providing most of the bandwidth, or it's broken, if the ping & tracert are anything to go by. Grant Darwin NT |
Grant (SSSF) Send message Joined: 19 Aug 99 Posts: 13731 Credit: 208,696,464 RAC: 304 |
And down. No outbound network traffic. Grant Darwin NT |
Khangollo Send message Joined: 1 Aug 00 Posts: 245 Credit: 36,410,524 RAC: 0 |
|
kittyman Send message Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004 |
Looks like mebbe Vader hit the skids. "Freedom is just Chaos, with better lighting." Alan Dean Foster |
Sutaru Tsureku Send message Joined: 6 Apr 07 Posts: 7105 Credit: 147,663,825 RAC: 5 |
YES - UL & scheduler NO - DL .. - server reachable, at least for my machine. - Best regards! - Sutaru Tsureku, team seti.international founder. - Optimize your PC for higher RAC. - SETI@home needs your help. - |
Donald L. Johnson Send message Joined: 5 Aug 02 Posts: 8240 Credit: 14,654,533 RAC: 20 |
More like the organizational politics problem that Eric has mentioned a time or two. Donald Infernal Optimist / Submariner, retired |
Dave Send message Joined: 29 Mar 02 Posts: 778 Credit: 25,001,396 RAC: 0 |
NO - DL NJMT*. * Not just me then. |
Claggy Send message Joined: 5 Jul 99 Posts: 4654 Credit: 47,537,079 RAC: 4 |
I only suggested it to the forum, and there was no response from the forum to it. Claggy |
rob smith Send message Joined: 7 Mar 03 Posts: 22190 Credit: 416,307,556 RAC: 380 |
Ah, I thought you'd suggested it directly to the men on the hill. Sounds a good idea, but a bit of thought would be needed to make sure that issues such as responses from "wrong" IP addresses weren't going to cause problems at the client end of things. And it shouldn't be too difficult to implement. Bob Smith Member of Seti PIPPS (Pluto is a Planet Protest Society) Somewhere in the (un)known Universe? |
rob smith Send message Joined: 7 Mar 03 Posts: 22190 Credit: 416,307,556 RAC: 380 |
It's going to be a "fun filled feeding frenzy" when the tyre kicker gets in. I've got about 80 downloads stalled, and the number is growing steadily. Interesting looking at the server status page - some bits of Vader are still showing as working (download server 2), but much of the rest is dead. This I don't understand - I'd have thought that if Vader was truly down for the count then all its processes would be out. But both download servers are apparently running, yet we are getting no deliveries. Must be more than just a "simple" server crash :-( That yellow fluff ball must be one mean critter... Bob Smith Member of Seti PIPPS (Pluto is a Planet Protest Society) Somewhere in the (un)known Universe? |
Blake Bonkofsky Send message Joined: 29 Dec 99 Posts: 617 Credit: 46,383,149 RAC: 0 |
Indeed. I've already got about 700 WU's trying to download, and that number is going to keep climbing if the DL server isn't fixed soon. In fact, my caches will be mostly dry within 24 hours. With current limits in place, that's 3200 WU's across my 3 machines. 3200 tasks that have already been assigned and are just waiting to download. Tasks are still being assigned, so I would think that the RAID system is still functioning. I'm betting a network connection got unplugged or failed. Both DL servers aren't even responding to pings, and traces fail, but the upload server and scheduler are both working just fine. If the RAID had died, work creation would cease, and the servers would still be reachable. Ready to send has been holding around 200,000. Hopefully it's nothing serious and we are back in business without too much trouble. |
AndyJ Send message Joined: 17 Aug 02 Posts: 248 Credit: 27,380,797 RAC: 0 |
This is the flight deck, Captain Server speaking. As you may have noticed, we have started our descent, so buckle up as things may get a little bumpy on the way down. We hope to get you on the ground real soon, as we will simply fall out of the sky on Tuesday. Thank you for flying Seti R.A.C, Have a great day. :-) Regards, A |
kittyman Send message Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004 |
Indeed. I've already got about 700 WU's trying to download, and that number is going to keep climbing if the DL server isn't fixed soon. In fact, my caches will be mostly dry within 24 hours. With current limits in place, that's 3200 WU's across my 3 machines. 3200 tasks that have already been assigned and are just waiting to download. Yeah, very interesting. Overnight, the rigs have been issued tasks to fill the caches back up to the current limits. Of course, no downloads are possible right now, so they can't process any of it. Gonna be dicey getting those downloads through when the connection is reestablished. "Freedom is just Chaos, with better lighting." Alan Dean Foster |
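
For anyone wanting to tally the ping results Grant posted earlier in the thread, here is a minimal Python sketch. The summary strings are copied from his post; the names `PING_SUMMARIES`, `loss_percent` and `rank_by_loss` are made up for illustration. It recomputes the loss percentage from the Sent/Lost counts and sorts the four addresses from worst to best.

```python
import re

# Summary lines copied from the ping output posted in the thread.
PING_SUMMARIES = {
    "208.68.240.13": "Packets: Sent = 4, Received = 2, Lost = 2 (50% loss)",
    "208.68.240.16": "Packets: Sent = 4, Received = 3, Lost = 1 (25% loss)",
    "208.68.240.18": "Packets: Sent = 4, Received = 0, Lost = 4 (100% loss)",
    "208.68.240.20": "Packets: Sent = 4, Received = 1, Lost = 3 (75% loss)",
}

# Matches the counts in a Windows-style ping summary line.
SUMMARY_RE = re.compile(r"Sent = (\d+), Received = (\d+), Lost = (\d+)")


def loss_percent(summary: str) -> float:
    """Return packet loss as a percentage, recomputed from the raw counts."""
    match = SUMMARY_RE.search(summary)
    if match is None:
        raise ValueError(f"unrecognised ping summary: {summary!r}")
    sent, _received, lost = (int(group) for group in match.groups())
    if sent == 0:
        raise ValueError("no packets were sent")
    return 100.0 * lost / sent


def rank_by_loss(summaries: dict) -> list:
    """Return (loss_percent, host) pairs sorted from worst to best."""
    pairs = [(loss_percent(text), host) for host, text in summaries.items()]
    return sorted(pairs, reverse=True)


if __name__ == "__main__":
    for loss, host in rank_by_loss(PING_SUMMARIES):
        print(f"{host}: {loss:.0f}% loss")
```

With only four packets per address the percentages are coarse, so this is better read as a ranking than a measurement. It puts .18 at the bottom, which matches what Grant saw.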
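
One rough way to read the first tracert in the thread (the route to 208.68.240.13) is to look for the hop where latency rises the most. The sketch below does that in Python. The samples are transcribed by hand from Grant's post, with `<1 ms` recorded as 0 and `*` as `None`; `TRACE_TO_13`, `hop_medians` and `largest_jump` are illustrative names, not part of any tool.

```python
from statistics import median

# (hop number, three round-trip samples in ms) from the trace to 208.68.240.13.
# None marks a probe that timed out ("*"); "<1 ms" is recorded as 0.
TRACE_TO_13 = [
    (1, [1, 0, 0]),
    (2, [56, 55, 55]),
    (3, [55, 55, 54]),
    (4, [55, 55, 54]),
    (5, [55, 56, 55]),
    (6, [236, 235, 236]),
    (7, [241, 240, 240]),
    (8, [237, 238, 236]),
    (9, [241, 240, 240]),
    (10, [278, 278, 279]),
    (11, [None, 294, 280]),
]


def hop_medians(trace):
    """Median RTT per hop, ignoring timed-out probes (None if all timed out)."""
    result = []
    for hop, samples in trace:
        answered = [s for s in samples if s is not None]
        result.append((hop, median(answered) if answered else None))
    return result


def largest_jump(trace):
    """Return (from_hop, to_hop, increase_ms) for the biggest latency rise."""
    answered = [(hop, rtt) for hop, rtt in hop_medians(trace) if rtt is not None]
    best = None
    for (hop_a, rtt_a), (hop_b, rtt_b) in zip(answered, answered[1:]):
        delta = rtt_b - rtt_a
        if best is None or delta > best[2]:
            best = (hop_a, hop_b, delta)
    return best


if __name__ == "__main__":
    print(largest_jump(TRACE_TO_13))
```

On this data the biggest rise falls between hop 5, the last .net.au hop, and hop 6. That suggests most of the round-trip time is picked up long before the Berkeley end, while the dropped probes only appear at the last hop.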
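
Grant's earlier point about retry intervals comes down to simple arithmetic, sketched here in Python. The 5 minute, 30 minute and 4 hour figures are from his post; the function name is hypothetical, and nothing here reflects how the BOINC client actually schedules its backoff.

```python
MINUTES_PER_DAY = 24 * 60


def attempts_per_day(interval_minutes: float) -> float:
    """Scheduler requests made in 24 hours at a fixed retry interval."""
    if interval_minutes <= 0:
        raise ValueError("interval must be positive")
    return MINUTES_PER_DAY / interval_minutes


if __name__ == "__main__":
    # Intervals quoted in the thread: every 5 min (6.10.58),
    # and backoffs between 30 min and 4 hours (6.12.33).
    for label, minutes in [("5 min", 5), ("30 min", 30), ("4 hours", 240)]:
        print(f"{label}: {attempts_per_day(minutes):.0f} attempts/day")
```

At a fixed 5 minutes that is 288 tries a day, against somewhere between 6 and 48 for a client sitting at the backoff intervals Grant describes, so even a better hit rate per request leaves far fewer chances to get work.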
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.