Panic Mode On (95) Server Problems?

Author	Message
cliff Send message Joined: 16 Dec 07 Posts: 625 Credit: 3,590,440 RAC: 0	Message 1637160 - Posted: 4 Feb 2015, 3:02:47 UTC And guess what? Another bout of 'cannot connect to server' is occurring:-/ Regards Cliff, Been there, Done that, Still no damm T shirt! ID: 1637160 ·

HAL9000 Volunteer tester Send message Joined: 11 Sep 99 Posts: 6534 Credit: 196,805,888 RAC: 57	Message 1637161 - Posted: 4 Feb 2015, 3:07:39 UTC - in response to Message 1637147. I am actually a bit surprised that with the bandwidth being used as heavily as it is work seems to be flowing OK. (MB, of course). So far it looks like about 2 TB of data has gone out of the connection over the past 5 hours. Perhaps there is a large chunk of something getting moved to the lab for some purpose. It's moving out of the lab. They seem to be off loading data from the system. Could be a backup because they are going to try something special or they may be reading the data out so they can create a new data base and then write the data back. Some news would be nice. Large spikes in the blue line, bits out, is the data going from the lab to the colo. Such as loading new data sets for splitting and such. Currently the green line, bits in or how much we are downloading, is running 800-900 Mb. Which is much much higher than we see when there is AP data going out. With that high of a level being sustained for 8+ hours now it would be hard to imagine that there is data going outside the campus. SETI@home classic workunits: 93,865 CPU time: 863,447 hours Join the BP6/VP6 User Group: http://tinyurl.com/8y46zvu ID: 1637161 ·

Dena Wiltsie Volunteer tester Send message Joined: 19 Apr 01 Posts: 1628 Credit: 24,230,968 RAC: 26	Message 1637165 - Posted: 4 Feb 2015, 3:28:46 UTC - in response to Message 1637161. I am actually a bit surprised that with the bandwidth being used as heavily as it is work seems to be flowing OK. (MB, of course). So far it looks like about 2 TB of data has gone out of the connection over the past 5 hours. Perhaps there is a large chunk of something getting moved to the lab for some purpose. It's moving out of the lab. They seem to be off loading data from the system. Could be a backup because they are going to try something special or they may be reading the data out so they can create a new data base and then write the data back. Some news would be nice. Large spikes in the blue line, bits out, is the data going from the lab to the colo. Such as loading new data sets for splitting and such. Currently the green line, bits in or how much we are downloading, is running 800-900 Mb. Which is much much higher than we see when there is AP data going out. With that high of a level being sustained for 8+ hours now it would be hard to imagine that there is data going outside the campus. The cricket program lives in the switch and is looking at the lab. Blue is data going into the lab and green is data leaving the lab. The cricket graph monitors only one communication line. ID: 1637165 ·

Cosmic_Ocean Send message Joined: 23 Dec 00 Posts: 3027 Credit: 13,516,867 RAC: 13	Message 1637184 - Posted: 4 Feb 2015, 4:44:22 UTC - in response to Message 1637165. The cricket program lives in the switch and is looking at the lab. Blue is data going into the lab and green is data leaving the lab. The cricket graph monitors only one communication line. Semantics: not the lab, the switch port that is connected to the rack that all the servers are in at the co-lo. When they send data to the servers (tapes to be split and sent out), it comes from the lab up the hill, typically, and they tend to be limited to 300Mbit. Whether that is a self-imposed limit or otherwise, I don't know. But what I DO know is that ~825Mbit * 9.5 hours (at this moment) = 3.20 TiB that has left the servers and gone... somewhere. Again, I would assume/presume it is going up the hill and TO the lab, but why no average limit of ~300Mbit? I'm kind of thinking they may be copying the AP database from the server in the co-lo and sending it up to the lab to do some offline testing and prototyping for repairs, and once they find what will work to fix it.. do that on the server in the co-lo. Theory B) the tapes that have been split and processed are now being sent to off-site storage archives. They don't really get rid of any tapes, as far as I've read over the years. They all seem to just go to off-site archives in case they are ever needed again, or when the well starts to run dry. Linux laptop: record uptime: 1511d 20h 19m (ended due to the power brick giving-up) ID: 1637184 ·

Speedy Volunteer tester Send message Joined: 26 Jun 04 Posts: 1643 Credit: 12,921,799 RAC: 89	Message 1637207 - Posted: 4 Feb 2015, 6:03:07 UTC - in response to Message 1637184. But what I DO know is that ~825Mbit * 9.5 hours (at this moment) = 3.20 TiB that has left the servers and gone... somewhere. Again, I would assume/presume it is going up the hill and TO the lab, but why no average limit of ~300Mbit? Can you please provide your calculations on how you got 825 x 9.5 = 3.2? When I do the same calculation I get roughly 7800.5 This is over double what you got ID: 1637207 ·

Cosmic_Ocean Send message Joined: 23 Dec 00 Posts: 3027 Credit: 13,516,867 RAC: 13	Message 1637218 - Posted: 4 Feb 2015, 6:41:47 UTC - in response to Message 1637207. But what I DO know is that ~825Mbit * 9.5 hours (at this moment) = 3.20 TiB that has left the servers and gone... somewhere. Again, I would assume/presume it is going up the hill and TO the lab, but why no average limit of ~300Mbit? Can you please provide your calculations on how you got 825 x 9.5 = 3.2? When I do the same calculation I get roughly 7800.5 This is over double what you got 825,000,000 bits / 8 = 103,125,000 bytes per second. 103,125,000 * 3600 (seconds in one hour) * 9.5 hours = 3,526,875,000,000 bytes. 3,526,875,000,000 / 1024 (KiB) / 1024 (MiB) / 1024 (GiB) / 1024 (TiB) = 3.2076 (without explanations of each step: 825000000/8*3600*9.5/1024/1024/1024/1024 = 3.2076) I think a lot of people forget or overlook the fact that the cricket graphs are in megabits, which is often written as Mb, or Mbit or mbit. Little 'b' is supposed to mean bits, big 'B' is supposed to mean bytes. Then you get into that whole fiasco of base-10 or powers of 2. That's where the little 'i' between the letters comes into play. 1 MB != 1 MiB, and so forth, and that's how you end up with a 1 TB hard drive that formats out to 931 GiB. There's still just over a trillion bytes, but 1,000,000,000,000 != 1,099,511,627,776 (2^40, or 1024*1024*1024*1024). Taking 10^12 (1 TB) and dividing it by 2^30 (GiB) = 931.322 GiB. Linux laptop: record uptime: 1511d 20h 19m (ended due to the power brick giving-up) ID: 1637218 ·
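The arithmetic in the post above can be checked with a short script. This is only a sketch: the function name `transferred_bytes` is made up for illustration, and the inputs (825 Mbit/s sustained for 9.5 hours) are the estimates read off the cricket graph in the thread, not measured values.

```python
def transferred_bytes(rate_mbit_per_s: float, hours: float) -> float:
    """Bytes moved at a sustained rate given in megabits per second."""
    bits_per_second = rate_mbit_per_s * 1_000_000   # mega = 10^6 for line rates
    bytes_per_second = bits_per_second / 8          # 8 bits per byte
    return bytes_per_second * 3600 * hours          # 3600 seconds per hour

total_bytes = transferred_bytes(825, 9.5)
tib = total_bytes / 2**40    # binary units: 1 TiB = 1024^4 bytes
tb = total_bytes / 10**12    # decimal units: 1 TB = 10^12 bytes

print(f"{total_bytes:,.0f} bytes = {tib:.4f} TiB = {tb:.4f} TB")
print(f"A 10^12-byte drive is {10**12 / 2**30:.3f} GiB")
```

The same byte count comes out as roughly 3.21 TiB or 3.53 TB depending on which unit is used, which accounts for the two different answers given in the replies.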

Grant (SSSF) Volunteer tester Send message Joined: 19 Aug 99 Posts: 13731 Credit: 208,696,464 RAC: 304	Message 1637221 - Posted: 4 Feb 2015, 6:46:11 UTC - in response to Message 1637207. But what I DO know is that ~825Mbit * 9.5 hours (at this moment) = 3.20 TiB that has left the servers and gone... somewhere. Again, I would assume/presume it is going up the hill and TO the lab, but why no average limit of ~300Mbit? Can you please provide your calculations on how you got 825 x 9.5 = 3.2? When I do the same calculation I get roughly 7800.5 This is over double what you got I get about 3.53TB. 825Mb/s is roughly 103MB/s. 103MB/s*60*60*9.5 is roughly 3.53TB Grant Darwin NT ID: 1637221 ·

Speedy Volunteer tester Send message Joined: 26 Jun 04 Posts: 1643 Credit: 12,921,799 RAC: 89	Message 1637228 - Posted: 4 Feb 2015, 7:19:59 UTC Last modified: 4 Feb 2015, 7:22:42 UTC Thanks for your answers. In my eyes the cricket graph gives a false impression of how fast data is traveling in and out of the servers. On a side note, the actual speed that data travels is about the same speed that data travels onto a USB 3 external hard drive, assuming that the speed stated while transferring files is correct and is in fact not slower than what it says. ID: 1637228 ·

Cruncher-American Send message Joined: 25 Mar 02 Posts: 1513 Credit: 370,893,186 RAC: 340	Message 1637277 - Posted: 4 Feb 2015, 9:14:59 UTC WELL DONE, Cosmic Ocean. Best and most succinct explanation of why a 1TB hard drive shows up as 931GB on your computer. The reason for this is that early computers didn't have floating point hardware or even divide instructions in some cases, so to do the division you had to do software (ugh) which was s-l-o-w and it was easier (and much faster) to shift 10 bits over than to actually divide by 1000. ID: 1637277 ·

Belthazor Volunteer tester Send message Joined: 6 Apr 00 Posts: 219 Credit: 10,373,795 RAC: 13	Message 1637290 - Posted: 4 Feb 2015, 10:11:57 UTC I suppose they grab an Astropulse database ID: 1637290 ·

Cruncher-American Send message Joined: 25 Mar 02 Posts: 1513 Credit: 370,893,186 RAC: 340	Message 1637347 - Posted: 4 Feb 2015, 13:16:03 UTC - in response to Message 1637290. I suppose they grab an Astropulse database That's an interesting idea. Maybe they are copying the data elsewhere to try to figure out what to do. That makes some sense, I think. ID: 1637347 ·

HAL9000 Volunteer tester Send message Joined: 11 Sep 99 Posts: 6534 Credit: 196,805,888 RAC: 57	Message 1637358 - Posted: 4 Feb 2015, 14:26:51 UTC - in response to Message 1637228. Thanks for your answers. In my eyes the cricket graph gives a false impression of how fast data is traveling in and out of the servers. On a side note, the actual speed that data travels is about the same speed that data travels onto a USB 3 external hard drive, assuming that the speed stated while transferring files is correct and is in fact not slower than what it says. Well the information we see on the graph isn't so much false. Just more that we don't see the separation between what is moving off the campus & what is intra-campus traffic. USB 3.0 is designed for 5Gb/s or 625MB/s. The slower ~120MB/s with an external hard drive would just be due to the limits of a single mechanical hard drive. It would be neat if they could design a low cost portable hard drive with much higher i/o closer to that of an SSD. Maybe a small RAID configuration made of 1.8" drives or something weird like that. SETI@home classic workunits: 93,865 CPU time: 863,447 hours Join the BP6/VP6 User Group: http://tinyurl.com/8y46zvu ID: 1637358 ·
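The 625MB/s figure above is the 5 Gb/s signalling rate divided by 8. As a rough sketch of how that compares with the other numbers in the thread: USB 3.0 uses 8b/10b line coding, so the payload ceiling is closer to 500 MB/s, and the ~120 MB/s drive speed and ~825 Mbit/s graph reading are simply the estimates quoted by the posters. The variable names are illustrative.

```python
USB3_LINE_RATE_BITS = 5_000_000_000   # 5 Gb/s signalling rate

# Raw figure quoted in the post: bits per second divided by 8.
raw_mb_per_s = USB3_LINE_RATE_BITS / 8 / 1_000_000

# 8b/10b coding sends 10 line bits for every 8 data bits,
# so only 80% of the line rate carries payload.
payload_mb_per_s = USB3_LINE_RATE_BITS * 8 / 10 / 8 / 1_000_000

# Figures taken from the thread (estimates, not measurements).
hdd_mb_per_s = 120                         # single mechanical drive
graph_mb_per_s = 825 * 1_000_000 / 8 / 1_000_000   # 825 Mbit/s in MB/s

print(f"USB 3.0 raw:      {raw_mb_per_s:.0f} MB/s")
print(f"USB 3.0 payload:  {payload_mb_per_s:.0f} MB/s")
print(f"External HDD:     {hdd_mb_per_s} MB/s")
print(f"Cricket graph:    {graph_mb_per_s:.1f} MB/s")
```

On these numbers the sustained transfer on the graph is in the same range as a single external hard drive, which matches the comparison made earlier in the thread.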

Dena Wiltsie Volunteer tester Send message Joined: 19 Apr 01 Posts: 1628 Credit: 24,230,968 RAC: 26	Message 1637365 - Posted: 4 Feb 2015, 14:52:00 UTC - in response to Message 1637277. WELL DONE, Cosmic Ocean. Best and most succinct explanation of why a 1TB hard drive shows up as 931GB on your computer. The reason for this is that early computers didn't have floating point hardware or even divide instructions in some cases, so to do the division you had to do software (ugh) which was s-l-o-w and it was easier (and much faster) to shift 10 bits over than to actually divide by 1000. The reason is more basic than that. The selling number is the total storage ability of the drive. The number your system reports to you is after sector headers and inter record gaps have been removed. The reason for this is because soft sectored drives allow you to change the number of sectors per track changing the usable area. Most of the time, smaller sectors are desired because sectors must be transferred intact. Large sectors require more RAM to hold so small sectors tie up less RAM. Now you probably know more about hard drives than you ever wanted to know. By the way, a shift right 10 is a divide by 1024. ID: 1637365 ·
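The closing remark about shifting can be illustrated directly. A minimal sketch, using 10^12 bytes (a nominal "1 TB") as an example value: for non-negative integers, a right shift by 10 bits is the same as integer division by 1024, not by 1000.

```python
n = 10**12  # a nominal "1 TB" in bytes, used only as an example value

# A right shift by 10 bits is integer division by 2^10 = 1024.
assert n >> 10 == n // 1024

kib = n >> 10   # bytes -> KiB
mib = n >> 20   # bytes -> MiB
gib = n >> 30   # bytes -> GiB (fraction discarded)

print(f"{n:,} bytes >> 10 = {kib:,} KiB")
print(f"{n:,} bytes // 1000 = {n // 1000:,} kB")
print(f"{n:,} bytes >> 30 = {gib} GiB")
```

Shifting three times by 10 bits gives the familiar 931 figure, while dividing by 1000 three times gives exactly 1000.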

Cherokee150 Send message Joined: 11 Nov 99 Posts: 192 Credit: 58,513,758 RAC: 74	Message 1637388 - Posted: 4 Feb 2015, 16:48:15 UTC - in response to Message 1637218. Well done, Cosmic_Ocean! I've been doing these binary mathematical calculations since the late 1960s, and both your explanation and your calculations are perfect and, possibly more importantly, succinct, which makes it easy for people not versed in binary to understand. Only one last step was omitted, most likely to avoid complicating the explanation. That is scientific rounding. Since the next digit in the result is 7 (3.20767), the final result of 3.2076 is more accurately displayed as 3.2077. Sadly, I have learned that, over the last decade or two, many people (at least in the United States) were either not taught, or do not remember, their lesson on scientific rounding. Therefore, omitting that final part of the equation removes the chance of muddying the effect of such an excellent presentation of the correct steps to determine terabytes from megabits. Once again, VERY well done, Cosmic_Ocean!!! ID: 1637388 ·
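The rounding point can be shown with the same figures. This is a sketch only: it contrasts cutting the result off after four decimals (which gives the 3.2076 quoted earlier) with rounding to four decimals (which gives 3.2077).

```python
import math

total_bytes = 825_000_000 / 8 * 3600 * 9.5   # figures from the thread
tib = total_bytes / 2**40                     # about 3.20767

truncated = math.floor(tib * 10_000) / 10_000  # drop everything past 4 decimals
rounded = round(tib, 4)                        # round to 4 decimals

print(f"full value: {tib:.6f}")
print(f"truncated:  {truncated}")
print(f"rounded:    {rounded}")
```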

kittyman Volunteer tester Send message Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004	Message 1637412 - Posted: 4 Feb 2015, 17:39:39 UTC Hmmm... I see the bandwidth continues unabated. One heck of a stats dump....LOL. "Freedom is just Chaos, with better lighting." Alan Dean Foster ID: 1637412 ·

Josef W. Segur Volunteer developer Volunteer tester Send message Joined: 30 Oct 99 Posts: 4504 Credit: 1,414,761 RAC: 0	Message 1637423 - Posted: 4 Feb 2015, 18:07:43 UTC The usual restriction to ~300 Mbps is based on having a single 1Gbps fiber up the hill to SSL, so I had some doubts that was where the data was going. Tracing through the routers seems to indicate they've gotten permission from other users at SSL to use most of the available bandwidth.
Leaving inr-211: tengigabitethernet5_5: 128.32.255.127: Tier2 side B to inr-202 te6/1
Arriving at inr-202: tengigabitethernet6_1: 128.32.255.126: Tier2 side B to inr-211 t5/5
Leaving inr-202: tengigabitethernet6_9: 128.32.0.94: inr-304-sut t6/4
Arriving at inr-304: tengigabitethernet6_4: 128.32.0.95: inr-202-reccev t6/9
Leaving inr-304: gigabitethernet8_34: : sslringsut1fes g0/4, SSL_P2P_169.229.0.216/29
vlan591: 169.229.0.217: SSL Firewall Transit net
Those last links indicate slightly under 800 Mbps, but there are a lot more hours now. I haven't recalculated the total size. Joe ID: 1637423 ·

Grant (SSSF) Volunteer tester Send message Joined: 19 Aug 99 Posts: 13731 Credit: 208,696,464 RAC: 304	Message 1637433 - Posted: 4 Feb 2015, 18:31:30 UTC - in response to Message 1637423. AP is still dead in the water, and although they are keeping up, over the last few hours the MB splitters have been struggling. And just look at all that green on the Cricket graphs... Grant Darwin NT ID: 1637433 ·

kittyman Volunteer tester Send message Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004	Message 1637438 - Posted: 4 Feb 2015, 18:39:42 UTC - in response to Message 1637433. AP is still dead in the water, and although they are keeping up, over the last few hours the MB splitters have been struggling. And just look at all that green on the Cricket graphs... Well, knock on wood, but so far the kitties have been able to keep the crunchers' caches full of those MB shorties. "Freedom is just Chaos, with better lighting." Alan Dean Foster ID: 1637438 ·

Dena Wiltsie Volunteer tester Send message Joined: 19 Apr 01 Posts: 1628 Credit: 24,230,968 RAC: 26	Message 1637460 - Posted: 4 Feb 2015, 20:13:52 UTC My RAC has dropped from 28,000 to 18,000. The only good thing is it will stop at 5,000 as long as I can get MB records. My decision to turn off one of the GPUs instead of the CPUs has turned out to be the correct decision. ID: 1637460 ·

Cruncher-American Send message Joined: 25 Mar 02 Posts: 1513 Credit: 370,893,186 RAC: 340	Message 1637495 - Posted: 4 Feb 2015, 21:25:25 UTC - in response to Message 1637365. The reason is more basic than that. The selling number is the total storage ability of the drive. The number your system reports to you is after sector headers and inter record gaps have been removed. The reason for this is because soft sectored drives allow you to change the number of sectors per track changing the usable area. Most of the time, smaller sectors are desired because sectors must be transferred intact. Large sectors require more RAM to hold so small sectors tie up less RAM. Now you probably know more about hard drives than you ever wanted to know. By the way, a shift right 10 is a divide by 1024. I believe you are incorrect on that. That basic formatting is built into the size calculation. The drive capacity was always #sectors * 512. If you look at some drive labels, they give the number of sectors ("LBA: nnnnnnn"), so you can multiply it out yourself. NO manufacturer gives drive capacity any other way, to my knowledge. (I assume this applies for the newer 4K sector drives as well, but I don't know that). ID: 1637495 ·
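The "sectors * 512" calculation described above can be tried with any LBA count printed on a drive label. A minimal sketch: the sector count below is an assumed example value of the kind commonly seen on nominal 1 TB drives, not a number taken from this thread, and `capacity_from_lba` is a hypothetical helper name.

```python
SECTOR_SIZE = 512  # bytes per logical sector, as stated in the post

def capacity_from_lba(sector_count: int, sector_size: int = SECTOR_SIZE) -> int:
    """Drive capacity in bytes from the LBA count on the label."""
    return sector_count * sector_size

# Assumed example value for a nominal 1 TB drive (not from the thread).
lba_count = 1_953_525_168

capacity_bytes = capacity_from_lba(lba_count)
print(f"{capacity_bytes:,} bytes")
print(f"{capacity_bytes / 10**12:.3f} TB (decimal, as sold)")
print(f"{capacity_bytes / 2**30:.2f} GiB (binary, as many systems report)")
```

With this example count the label capacity works out to just over 10^12 bytes, and the gap to the roughly 931 GiB shown by the operating system comes from the unit conversion alone.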

©2024 University of California

SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.