Panic Mode On (9) Server problems
Author | Message |
---|---|
Fred J. Verster Joined: 21 Apr 04 Posts: 3252 Credit: 31,903,643 RAC: 0 |
I know the possibility of DNS attacks was floated in the past, but was discounted at the time... might another new analysis of network traffic be in order at this time? I was curious about the traffic too, as I've been studying some networking subjects lately. When I started, traffic was at a much more humble rate, around 20Mbps on the cricket graphs. Switching to the long-term view recently, however, seems to tell the grim truth: bandwidth utilisation is scaling proportionally with Moore's law... Hi Mark and Jason, not too much trouble in getting WUs UP- or DOWNloaded; it only goes in big 'chunks', like 20 to 40 WUs at a time. On all hosts, 'large' numbers are waiting to UPload. But never mind that. Eventually, they get UPloaded. {EDIT/ADD} We don't have to panic, though ;) |
jason_gee Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0 |
But is it all due to crunching, or server data transfer, or what? Well, surely some peak periods would represent recovery and/or bulk moving of server data about. However, the general trend, and I'm looking at the yearly cricket graph here, looks somewhere between linear and exponential growth, and has gone from ~20Mbps to 60Mbps+ [in about a year?]. Assuming from the hardware donations threads that upgrading internal network speeds is going to happen gradually, that still leaves that 100Mbps link 'up the hill'. If that connection is fibre of some sort, replacing the transceivers at either end would, I reckon, cost big bucks :(. I hope we'll be alright for a while longer before that capacity is required, but then look at the general hardware performance increases of the last few years: they seem pretty staggering, and set to continue. "Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to Live By: The Computer Science of Human Decisions. |
zoom3+1=4 Joined: 30 Nov 03 Posts: 65746 Credit: 55,293,173 RAC: 49 |
I know the possibility of DNS attacks was floated in the past, but was discounted at the time... might another new analysis of network traffic be in order at this time? I was curious about the traffic too, as I've been studying some networking subjects lately. When I started, traffic was at a much more humble rate, around 20Mbps on the cricket graphs. Switching to the long-term view recently, however, seems to tell the grim truth: bandwidth utilization is scaling proportionally with Moore's law... I know that, but speculating is still fun. :) The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's |
Josef W. Segur Joined: 30 Oct 99 Posts: 4504 Credit: 1,414,761 RAC: 0 |
But is it all due to crunching, or server data transfer, or what? Well, surely some peak periods would represent recovery and/or bulk moving of server data about. However, the general trend, and I'm looking at the yearly cricket graph here, looks somewhere between linear and exponential growth, and has gone from ~20Mbps to 60Mbps+ [in about a year?]. One of Eric's posts estimated the upgrade cost at over 30K USD, definitely not pocket change. As to the blip, I've noticed it at recurring periods on the Cricket graph. That is, the traffic in to SSL jumps up to about 12 MBits/sec at intervals of about 3 hours and remains high for about half an hour. If I try to upload or get new work during those high periods, there are a lot of HTTP 500 errors (Internal Server Error) and occasionally 503 or even 403. When the rate is below 10 MBits/sec those errors are fairly rare. I think the 90+ MBits/sec output rates recently are mostly the project uploading raw data to the NERSC HPSS at LBNL. There were a couple of periods when the project was down but there was still a steady ~35 MBits/sec flowing out. That would be the max rate of that flow, though Matt said it was NICE'd, so it probably doesn't get that much when WUs are also being downloaded. Still, those blips at ~3 hour intervals are suspiciously close to how long it takes to upload 50 GB at 35 MBits/sec. Joe |
zoom3+1=4 Joined: 30 Nov 03 Posts: 65746 Credit: 55,293,173 RAC: 49 |
OK, I'm seeing trouble with a tracert I've run; earlier my WUs were getting connect failures and HTTP errors. Microsoft Windows [Version 5.2.3790] (C) Copyright 1985-2003 Microsoft Corp. C:\Documents and Settings\Administrator.PC1>tracert setiathome.berkeley.com Tracing route to setiathome.berkeley.com [208.254.26.139] over a maximum of 30 hops: 1 25 ms 25 ms 25 ms L100.DSL-35.LSANCA.verizon-gni.net [71.105.32.1] 2 27 ms 27 ms 26 ms G9-0-2035.LCR-10.LSANCA.verizon-gni.net [130.81.136.16] 3 28 ms 28 ms 28 ms P15-3.LCR-01.TAMPFL.verizon-gni.net [130.81.28.74] 4 30 ms 29 ms 30 ms 0.so-6-0-0.XT2.LAX9.ALTER.NET [152.63.10.157] 5 106 ms 105 ms 105 ms 0.so-5-0-2.XT2.DCA6.ALTER.NET [152.63.0.197] 6 107 ms 108 ms 107 ms so-0-0-0.ur2.iad6.web.wcom.net [157.130.59.74] 7 * * * Request timed out. 8 * * * Request timed out. 9 * * * Request timed out. 10 * * * Request timed out. 11 * * * Request timed out. 12 * * * Request timed out. 13 * * * Request timed out. 14 * * * Request timed out. 15 * * * Request timed out. 16 * * * Request timed out. 17 * * * Request timed out. 18 * * * Request timed out. 19 * * * Request timed out. 20 * * * Request timed out. 21 * * * Request timed out. 22 * * * Request timed out. 23 * * * Request timed out. 24 * * * Request timed out. 25 * * * Request timed out. 26 * * * Request timed out. 27 * * * Request timed out. 28 * * * Request timed out. 29 * * * Request timed out. 30 * * * Request timed out. Trace complete. C:\Documents and Settings\Administrator.PC1> I also tried to ping Berkeley and the ping timed out 4 times. The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's |
Careface Joined: 6 Jun 03 Posts: 128 Credit: 16,561,684 RAC: 0 |
I'm getting rather worried now.. my "best" cruncher (laptop) is almost out of work, for some reason it isn't requesting any more, and whenever I try to upload the last 3 or so days of work, the uploads get to 100% and then fail.. is this a server problem, do you think? 19/09/2008 8:00:41 p.m.|SETI@home|Sending scheduler request: Requested by user. Requesting 0 seconds of work, reporting 0 completed tasks 19/09/2008 8:01:00 p.m.||Project communication failed: attempting access to reference site 19/09/2008 8:01:00 p.m.|SETI@home|Temporarily failed upload of 02ap08ad.19170.23385.14.8.119_2_0: http error 19/09/2008 8:01:00 p.m.|SETI@home|Backing off 3 hr 50 min 48 sec on upload of 02ap08ad.19170.23385.14.8.119_2_0 19/09/2008 8:01:00 p.m.|SETI@home|Temporarily failed upload of 18au08ab.20784.12752.9.8.246_1_0: http error 19/09/2008 8:01:00 p.m.|SETI@home|Backing off 3 hr 58 min 31 sec on upload of 18au08ab.20784.12752.9.8.246_1_0 19/09/2008 8:01:00 p.m.|SETI@home|Started upload of 04ap08ac.18298.2936.12.8.211_2_0 19/09/2008 8:01:00 p.m.|SETI@home|Started upload of 18au08ab.10585.16842.10.8.44_1_0 19/09/2008 8:01:02 p.m.||Access to reference site succeeded - project servers may be temporarily down. EDIT: My other cruncher can upload/download WUs fine.. figured it could be the LAN connection (as my laptop is using wireless, and desktop is using wired), but I tried using both wireless and wired, and still no dice on the laptop.. *cries* |
zoom3+1=4 Joined: 30 Nov 03 Posts: 65746 Credit: 55,293,173 RAC: 49 |
I'm getting rather worried now.. my "best" cruncher (laptop) is almost out of work, for some reason it isn't requesting any more, and whenever I try to upload the last 3 or so days of work, the uploads get to 100% and then fail.. is this a server problem, do you think? Don't know, as mine are all wired. It looks like my farm's RAC is going up and down like the teeth on a (crosscut) saw blade. The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's |
Fred J. Verster Joined: 21 Apr 04 Posts: 3252 Credit: 31,903,643 RAC: 0 |
I'm getting rather worried now.. my "best" cruncher (laptop) is almost out of work, for some reason it isn't requesting any more, and whenever I try to upload the last 3 or so days of work, the uploads get to 100% and then fail.. is this a server problem, do you think? As of 'now' (UTC=08:30), ready-to-report WUs are uploaded. 20-9-2008 10:07:23|SETI@home|Temporarily failed upload of 20au08ad.14308.11524.6.8.208_1_0: HTTP error 20-9-2008 10:07:23|SETI@home|Backing off 1 min 0 sec on upload of 20au08ad.14308.11524.6.8.208_1_0 20-9-2008 10:07:25||Internet access OK - project servers may be temporarily down. 20-9-2008 10:08:24|SETI@home|Started upload of 20au08ad.14308.11524.6.8.208_1_0 20-9-2008 10:08:59|SETI@home|Finished upload of 20au08ad.14308.11524.6.8.208_1_0 20-9-2008 10:28:27|SETI@home|Computation for task 20au08aa.13418.13978.6.8.147_1 finished 20-9-2008 10:28:27|Einstein@Home|Resuming task h1_0818.10_S5R4__542_S5R4a_0 using einstein_S5R4 version 604 20-9-2008 10:28:30|SETI@home|Started upload of 20au08aa.13418.13978.6.8.147_1_0 Then Internet access 'blocks' for a few hours and then goes on. A 'tracert' looks like this now: 1 <1 ms <1 ms <1 ms SX551xxxxxx [192.168.2.1] 2 7 ms 7 ms 7 ms 195.190.249.32 3 10 ms 10 ms 9 ms iawxsrt-dc2-bb21-ge-5-0-0.328.wxs.nl [213.75.1.213] 4 25 ms 10 ms 10 ms 208.49.200.129 5 101 ms 94 ms 94 ms 0.ge-4-0-0.BR3.NYC4.ALTER.NET [204.255.169.125] 6 93 ms 93 ms 93 ms 0.ge-5-0-0.XL3.NYC4.ALTER.NET [152.63.3.109] 7 94 ms 93 ms 93 ms 0.so-4-0-3.XT1.DCA6.ALTER.NET [152.63.1.117] 8 96 ms 96 ms 96 ms so-0-0-0.ur1.iad6.web.wcom.net [157.130.59.70] 9 * * * Request timed out. 10 * * * Request timed out. 11 * * * Request timed out. 12 * ? |
zoom3+1=4 Joined: 30 Nov 03 Posts: 65746 Credit: 55,293,173 RAC: 49 |
Looks like the server could use some brief attention: 9/21/2008 2:28:04 PM|SETI@home|[file_xfer] Temporarily failed download of 22mr08ac.7857.14387.16.8.31: HTTP error 9/21/2008 2:28:04 PM|SETI@home|Backing off 4 min 2 sec on download of file 22mr08ac.7857.14387.16.8.31 9/21/2008 2:28:43 PM|SETI@home|[file_xfer] Started download of file 21au08ab.1813.25021.15.8.98 9/21/2008 2:29:03 PM|SETI@home|[file_xfer] Finished download of file 21au08ab.1813.25021.15.8.98 9/21/2008 2:29:03 PM|SETI@home|[file_xfer] Throughput 17615 bytes/sec 9/21/2008 2:29:46 PM|SETI@home|Computation for task 20au08ac.7620.11524.8.8.27_0 finished 9/21/2008 2:29:46 PM|SETI@home|Starting 20au08ab.7342.15614.11.8.64_1 9/21/2008 2:29:46 PM|SETI@home|Starting task 20au08ab.7342.15614.11.8.64_1 using setiathome_enhanced version 528 9/21/2008 2:29:48 PM|SETI@home|[file_xfer] Started upload of file 20au08ac.7620.11524.8.8.27_0_0 9/21/2008 2:30:24 PM||Project communication failed: attempting access to reference site 9/21/2008 2:30:24 PM|SETI@home|[file_xfer] Temporarily failed upload of 20au08ac.7620.11524.8.8.27_0_0: connect() failed 9/21/2008 2:30:24 PM|SETI@home|Backing off 1 min 0 sec on upload of file 20au08ac.7620.11524.8.8.27_0_0 9/21/2008 2:30:25 PM||Access to reference site succeeded - project servers may be temporarily down. 9/21/2008 2:31:12 PM||Project communication failed: attempting access to reference site 9/21/2008 2:31:12 PM|SETI@home|[file_xfer] Temporarily failed upload of 20au08ac.7620.11115.8.8.223_0_0: HTTP error The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's |
MAKS Joined: 29 Sep 03 Posts: 2 Credit: 61,096 RAC: 0 |
Same problems here (near Dresden, Germany). BOINC can't keep the connection up long enough to fully upload/download the packets. Pinging the servers works fine one second and gets stuck the next. How about a subproject SENSS (Search for Exterior Network SETI-Servers)? |
zoom3+1=4 Joined: 30 Nov 03 Posts: 65746 Credit: 55,293,173 RAC: 49 |
OK, I'm getting HTTP errors and some of my PCs can't get in touch with the server to upload, report, and probably download WUs. Someone needs to nudge the server awake, I think. The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's |
arkayn Joined: 14 May 99 Posts: 4438 Credit: 55,006,323 RAC: 0 |
|
zoom3+1=4 Joined: 30 Nov 03 Posts: 65746 Credit: 55,293,173 RAC: 49 |
Looks like everything is back to normal, as Matt killed that rogue process. Just what we needed, a rogue process on the loose. It deserved to die. ;) The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's |
zoom3+1=4 Joined: 30 Nov 03 Posts: 65746 Credit: 55,293,173 RAC: 49 |
I don't know what's going on, but uploads don't seem to be working here. And now PC3 can't upload, not just PC4. Something's borked. The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's |
arkayn Joined: 14 May 99 Posts: 4438 Credit: 55,006,323 RAC: 0 |
|
zoom3+1=4 Joined: 30 Nov 03 Posts: 65746 Credit: 55,293,173 RAC: 49 |
I think the scheduler needs some coffee, as it's not responding right now. 10/2/2008 11:28:09 AM|SETI@home|Reason: scheduler request failed 10/2/2008 11:28:22 AM|SETI@home|[file_xfer] Started upload of file 16au08aa.23831.9070.4.8.14_1_0 10/2/2008 11:28:38 AM|SETI@home|[file_xfer] Finished upload of file 16au08aa.23831.9070.4.8.14_1_0 10/2/2008 11:28:38 AM|SETI@home|[file_xfer] Throughput 1885 bytes/sec 10/2/2008 11:29:11 AM||Time passed...reporting result(s) now. 10/2/2008 11:29:11 AM|SETI@home|Sending scheduler request: To report completed tasks 10/2/2008 11:29:11 AM|SETI@home|Reporting 4 tasks 10/2/2008 11:29:33 AM||Project communication failed: attempting access to reference site 10/2/2008 11:29:35 AM||Access to reference site succeeded - project servers may be temporarily down. 10/2/2008 11:29:37 AM|SETI@home|Scheduler request failed: couldn't connect to server 10/2/2008 11:29:37 AM|SETI@home|Deferring communication for 1 min 42 sec 10/2/2008 11:29:37 AM|SETI@home|Reason: scheduler request failed 10/2/2008 11:30:28 AM|SETI@home|Sending scheduler request: Requested by user 10/2/2008 11:30:28 AM|SETI@home|Reporting 4 tasks 10/2/2008 11:30:49 AM||Project communication failed: attempting access to reference site 10/2/2008 11:30:50 AM||Access to reference site succeeded - project servers may be temporarily down. 10/2/2008 11:30:53 AM|SETI@home|Scheduler request failed: couldn't connect to server 10/2/2008 11:30:53 AM|SETI@home|Deferring communication for 5 min 19 sec 10/2/2008 11:30:53 AM|SETI@home|Reason: scheduler request failed The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's |
Matt Lebofsky Joined: 1 Mar 99 Posts: 1444 Credit: 957,058 RAC: 0 |
Coffee applied. It was actually the file upload handlers whining, I think. I moved them all back to bruno (where they are working fine) - up until just now half of them were going to anakin (for redundancy purposes), but for some reason anakin started barfing on them. - Matt -- BOINC/SETI@home network/web/science/development person -- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude |
zoom3+1=4 Joined: 30 Nov 03 Posts: 65746 Credit: 55,293,173 RAC: 49 |
Coffee applied. It was actually the file upload handlers whining, I think. I moved them all back to bruno (where they are working fine) - up until just now half of them were going to anakin (for redundancy purposes), but for some reason anakin started barfing on them. Maybe anakin was havin an allergy attack? ;) Thanks Matt. The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's |
Josef W. Segur Joined: 30 Oct 99 Posts: 4504 Credit: 1,414,761 RAC: 0 |
I think it's fascinating that 22mr08aa still has the same two active channels as it had three weeks ago on 11 September, and 23mr08aa one. We've been remarkably lucky that the other splitters have had enough mid-range work to meet demand, thank the ALFALFA project for that. Joe |
skildude Joined: 4 Oct 00 Posts: 9541 Credit: 50,759,529 RAC: 60 |
I've had trouble sending results back for the last few days. In a rich man's house there is no place to spit but his face. Diogenes Of Sinope |
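
Joe's observation earlier in the thread, that the ~3 hour spacing of the blips is close to the time needed to upload 50 GB at 35 MBits/sec, can be checked with a quick calculation. The sketch below is only an illustration: the function name `transfer_hours` is made up here, and it assumes decimal units (1 GB = 10^9 bytes, 1 Mbit = 10^6 bits) and a steady transfer rate, neither of which the thread states.

```python
def transfer_hours(size_gb: float, rate_mbits: float) -> float:
    """Hours needed to move size_gb gigabytes at rate_mbits megabits/sec.

    Assumes decimal units (1 GB = 1e9 bytes, 1 Mbit = 1e6 bits) and a
    constant rate; both are assumptions, not figures from the thread.
    """
    bits = size_gb * 1e9 * 8              # gigabytes -> bits
    seconds = bits / (rate_mbits * 1e6)   # bits / (bits per second)
    return seconds / 3600                 # seconds -> hours


if __name__ == "__main__":
    # The 50 GB and 35 MBits/sec figures come from Joe's post.
    print(f"{transfer_hours(50, 35):.2f} hours")  # prints "3.17 hours"
```

Under those assumptions the result is a little over three hours, which is consistent with the interval Joe describes, though it does not by itself show that the raw data uploads are what cause the blips.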
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.