Panic Mode On (95) Server Problems?

Message boards : Number crunching : Panic Mode On (95) Server Problems?

Profile cliff
Joined: 16 Dec 07
Posts: 625
Credit: 3,590,440
RAC: 0
United Kingdom
Message 1637160 - Posted: 4 Feb 2015, 3:02:47 UTC

And guess what?
Another bout of 'cannot connect to server' is occurring:-/

Regards
Cliff,
Been there, Done that, Still no damm T shirt!
ID: 1637160 · Report as offensive
Profile HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6533
Credit: 196,805,888
RAC: 57
United States
Message 1637161 - Posted: 4 Feb 2015, 3:07:39 UTC - in response to Message 1637147.  

I am actually a bit surprised that with the bandwidth being used as heavily as it is work seems to be flowing OK.
(MB, of course).

So far it looks like about 2 TB of data has gone out of the connection over the past 5 hours. Perhaps there is a large chunk of something getting moved to the lab for some purpose.

It's moving out of the lab. They seem to be offloading data from the system. Could be a backup because they are going to try something special, or they may be reading the data out so they can create a new database and then write the data back. Some news would be nice.

Large spikes in the blue line (bits out) are data going from the lab to the colo, such as new data sets being loaded for splitting. Currently the green line (bits in, or how much we are downloading) is running at 800-900 Mbit/s, which is much, much higher than we see when there is AP data going out.
With a level that high being sustained for 8+ hours now, it is hard to imagine that the data is going outside the campus.
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the BP6/VP6 User Group today!
ID: 1637161 · Report as offensive
Dena Wiltsie
Volunteer tester

Joined: 19 Apr 01
Posts: 1628
Credit: 24,230,968
RAC: 26
United States
Message 1637165 - Posted: 4 Feb 2015, 3:28:46 UTC - in response to Message 1637161.  

I am actually a bit surprised that with the bandwidth being used as heavily as it is work seems to be flowing OK.
(MB, of course).

So far it looks like about 2 TB of data has gone out of the connection over the past 5 hours. Perhaps there is a large chunk of something getting moved to the lab for some purpose.

It's moving out of the lab. They seem to be offloading data from the system. Could be a backup because they are going to try something special, or they may be reading the data out so they can create a new database and then write the data back. Some news would be nice.

Large spikes in the blue line (bits out) are data going from the lab to the colo, such as new data sets being loaded for splitting. Currently the green line (bits in, or how much we are downloading) is running at 800-900 Mbit/s, which is much, much higher than we see when there is AP data going out.
With a level that high being sustained for 8+ hours now, it is hard to imagine that the data is going outside the campus.

The cricket program lives in the switch and is looking at the lab. Blue is data going into the lab and green is data leaving the lab. The cricket graph monitors only one communication line.
ID: 1637165 · Report as offensive
Cosmic_Ocean
Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 1637184 - Posted: 4 Feb 2015, 4:44:22 UTC - in response to Message 1637165.  

The cricket program lives in the switch and is looking at the lab. Blue is data going into the lab and green is data leaving the lab. The cricket graph monitors only one communication line.

Semantics: not the lab, the switch port that is connected to the rack that all the servers are in at the co-lo.

When they send data to the servers (tapes to be split and sent out), it comes from the lab up the hill, typically, and they tend to be limited to 300Mbit. Whether that is a self-imposed limit or otherwise, I don't know.

But what I DO know is that ~825Mbit * 9.5 hours (at this moment) = 3.20 TiB that has left the servers and gone... somewhere. Again, I would assume/presume it is going up the hill and TO the lab, but why no average limit of ~300Mbit?

I'm kind of thinking they may be copying the AP database from the server in the co-lo and sending it up to the lab to do some offline testing and prototyping for repairs and, once they find what will work to fix it, do that on the server in the co-lo.

Theory B) the tapes that have been split and processed are now being sent to off-site storage archives. They don't really get rid of any tapes, as far as I've read over the years. They all seem to just go to off-site archives in case they are ever needed again, or when the well starts to run dry.
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving-up)
ID: 1637184 · Report as offensive
Speedy
Volunteer tester
Joined: 26 Jun 04
Posts: 1638
Credit: 12,921,799
RAC: 89
New Zealand
Message 1637207 - Posted: 4 Feb 2015, 6:03:07 UTC - in response to Message 1637184.  


But what I DO know is that ~825Mbit * 9.5 hours (at this moment) = 3.20 TiB that has left the servers and gone... somewhere. Again, I would assume/presume it is going up the hill and TO the lab, but why no average limit of ~300Mbit?


Can you please provide your calculations on how you got 825 x 9.5 = 3.2? When I do the same calculation I get roughly 7800.5. This is over double what you got.
ID: 1637207 · Report as offensive
Cosmic_Ocean
Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 1637218 - Posted: 4 Feb 2015, 6:41:47 UTC - in response to Message 1637207.  


But what I DO know is that ~825Mbit * 9.5 hours (at this moment) = 3.20 TiB that has left the servers and gone... somewhere. Again, I would assume/presume it is going up the hill and TO the lab, but why no average limit of ~300Mbit?


Can you please provide your calculations on how you got 825 x 9.5 = 3.2? When I do the same calculation I get roughly 7800.5. This is over double what you got.

825,000,000 bits / 8 = 103,125,000 bytes per second.

103,125,000 * 3600 (seconds in one hour) * 9.5 hours = 3,526,875,000,000 bytes.

3,526,875,000,000 / 1024 (KiB) / 1024 (MiB) / 1024 (GiB) / 1024 (TiB) = 3.2076

(without explanations of each step:
825000000/8*3600*9.5/1024/1024/1024/1024 = 3.2076)
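
The steps above can be sketched in Python. This is only an illustration of the arithmetic in this post: the function name `mbit_rate_to_tib` and its parameters are made up for the example, and it assumes the rate stays constant for the whole period.

```python
def mbit_rate_to_tib(mbit_per_s: float, hours: float) -> float:
    """Convert a sustained rate in megabits per second into TiB transferred."""
    bytes_per_second = mbit_per_s * 1_000_000 / 8   # decimal megabits -> bytes
    total_bytes = bytes_per_second * 3600 * hours   # 3600 seconds in one hour
    return total_bytes / 1024**4                    # bytes -> TiB (2^40 bytes)


if __name__ == "__main__":
    # Figures from the post: ~825 Mbit/s sustained for 9.5 hours
    print(f"{mbit_rate_to_tib(825, 9.5):.4f} TiB")
```

Keeping the bits-to-bytes step and the 1024-based step on separate lines makes it easier to see where a factor of 8 or a power-of-2 prefix gets lost.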

I think a lot of people forget or overlook the fact that the cricket graphs are in megabits, which is often written as Mb, or Mbit or mbit. Little 'b' is supposed to mean bits, big 'B' is supposed to mean bytes. Then you get into that whole fiasco of base-10 or powers of 2. That's where the little 'i' between the letters comes into play.

1 MB != 1 MiB, and so forth, and that's how you end up with a 1 TB hard drive that formats out to 931 GiB. There's still just over a trillion bytes, but 1,000,000,000,000 != 1,099,511,627,776 (2^40, or 1024*1024*1024*1024). Taking 10^12 (1 TB) and dividing it by 2^30 (GiB) = 931.322 GiB.
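
A minimal sketch of the drive-size arithmetic, using only the figures quoted above. The constant and function names are hypothetical and chosen just for this example.

```python
TB = 10**12    # decimal terabyte, as printed on the drive label
GIB = 2**30    # binary gibibyte
TIB = 2**40    # binary tebibyte


def decimal_tb_to_gib(terabytes: float) -> float:
    """Express a capacity given in decimal TB as binary GiB."""
    return terabytes * TB / GIB


print(round(decimal_tb_to_gib(1), 3))   # GiB in a "1 TB" drive
print(TIB - TB)                         # bytes by which 1 TiB exceeds 1 TB
```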
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving-up)
ID: 1637218 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13399
Credit: 208,696,464
RAC: 304
Australia
Message 1637221 - Posted: 4 Feb 2015, 6:46:11 UTC - in response to Message 1637207.  


But what I DO know is that ~825Mbit * 9.5 hours (at this moment) = 3.20 TiB that has left the servers and gone... somewhere. Again, I would assume/presume it is going up the hill and TO the lab, but why no average limit of ~300Mbit?


Can you please provide your calculations on how you got 825 x 9.5 = 3.2? When I do the same calculation I get roughly 7800.5. This is over double what you got.

I get about 3.53TB.

825Mb/s is roughly 103MB/s.
103MB/s*60*60*9.5 is roughly 3.53TB
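
For anyone wondering why this comes out as 3.53 while the other answer is 3.2, here is a rough sketch: the same byte count is divided by powers of 10 for TB and by powers of 2 for TiB. The variable names are made up for the example and the rate is the approximate figure read off the graph.

```python
RATE_MBIT_PER_S = 825   # approximate reading from the cricket graph
HOURS = 9.5

total_bytes = RATE_MBIT_PER_S * 1_000_000 / 8 * 3600 * HOURS

decimal_tb = total_bytes / 10**12   # TB: powers of 10
binary_tib = total_bytes / 2**40    # TiB: powers of 2

print(f"{decimal_tb:.2f} TB")
print(f"{binary_tib:.2f} TiB")
```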
Grant
Darwin NT
ID: 1637221 · Report as offensive
Speedy
Volunteer tester
Joined: 26 Jun 04
Posts: 1638
Credit: 12,921,799
RAC: 89
New Zealand
Message 1637228 - Posted: 4 Feb 2015, 7:19:59 UTC
Last modified: 4 Feb 2015, 7:22:42 UTC

Thanks for your answers. In my eyes the cricket graph gives a false impression of how fast data is traveling in and out of the servers. On a side note, the actual speed that data travels is about the same as the speed of writing to a USB 3 external hard drive, assuming that the speed stated while transferring files is correct and is in fact not slower than what it says.
ID: 1637228 · Report as offensive
Cruncher-American (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)

Joined: 25 Mar 02
Posts: 1513
Credit: 370,893,186
RAC: 340
United States
Message 1637277 - Posted: 4 Feb 2015, 9:14:59 UTC

WELL DONE, Cosmic Ocean. Best and most succinct explanation of why a 1TB hard drive shows up as 931GB on your computer.

The reason for this is that early computers didn't have floating point hardware, or even divide instructions in some cases, so you had to do the division in software (ugh), which was s-l-o-w, and it was easier (and much faster) to shift 10 bits over than to actually divide by 1000.
ID: 1637277 · Report as offensive
Profile Belthazor
Volunteer tester
Joined: 6 Apr 00
Posts: 219
Credit: 10,373,795
RAC: 13
Russia
Message 1637290 - Posted: 4 Feb 2015, 10:11:57 UTC

I suppose they are grabbing the Astropulse database
ID: 1637290 · Report as offensive
Cruncher-American (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)

Joined: 25 Mar 02
Posts: 1513
Credit: 370,893,186
RAC: 340
United States
Message 1637347 - Posted: 4 Feb 2015, 13:16:03 UTC - in response to Message 1637290.  

I suppose they are grabbing the Astropulse database


That's an interesting idea. Maybe they are copying the data elsewhere to try to figure out what to do. That makes some sense, I think.
ID: 1637347 · Report as offensive
Profile HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6533
Credit: 196,805,888
RAC: 57
United States
Message 1637358 - Posted: 4 Feb 2015, 14:26:51 UTC - in response to Message 1637228.  

Thanks for your answers. In my eyes the cricket graph gives a false impression of how fast data is traveling in and out of the servers. On a side note, the actual speed that data travels is about the same as the speed of writing to a USB 3 external hard drive, assuming that the speed stated while transferring files is correct and is in fact not slower than what it says.

Well, the information we see on the graph isn't so much false. It's more that we don't see the separation between what is moving off the campus & what is intra-campus traffic.
USB 3.0 is designed for 5Gb/s or 625MB/s. The slower ~120MB/s with an external hard drive would just be due to the limits of a single mechanical hard drive. It would be neat if they could design a low cost portable hard drive with much higher i/o, closer to that of an SSD. Maybe a small RAID configuration made of 1.8" drives or something weird like that.
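
A quick sketch of how those rates compare once everything is in megabytes per second. The helper name is hypothetical. One thing not stated above: USB 3.0 SuperSpeed uses 8b/10b line coding, so the 625MB/s figure is the raw signalling rate and the payload ceiling is lower.

```python
def gbit_to_mbyte_per_s(gbit_per_s: float) -> float:
    """Convert gigabits per second to (decimal) megabytes per second."""
    return gbit_per_s * 1000 / 8


raw_mb_per_s = gbit_to_mbyte_per_s(5)        # USB 3.0 raw signalling rate
usable_mb_per_s = raw_mb_per_s * 8 / 10      # after 8b/10b coding overhead
cricket_mb_per_s = 825 / 8                   # ~825 Mbit/s seen on the graph

print(raw_mb_per_s, usable_mb_per_s, cricket_mb_per_s)
```

So the traffic on the graph is in the same range as what a single mechanical drive can sustain, and well below what the USB 3.0 link itself allows.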
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the BP6/VP6 User Group today!
ID: 1637358 · Report as offensive
Dena Wiltsie
Volunteer tester

Joined: 19 Apr 01
Posts: 1628
Credit: 24,230,968
RAC: 26
United States
Message 1637365 - Posted: 4 Feb 2015, 14:52:00 UTC - in response to Message 1637277.  

WELL DONE, Cosmic Ocean. Best and most succinct explanation of why a 1TB hard drive shows up as 931GB on your computer.

The reason for this is that early computers didn't have floating point hardware, or even divide instructions in some cases, so you had to do the division in software (ugh), which was s-l-o-w, and it was easier (and much faster) to shift 10 bits over than to actually divide by 1000.

The reason is more basic than that. The selling number is the total storage ability of the drive. The number your system reports to you is after sector headers and inter record gaps have been removed. The reason for this is because soft sectored drives allow you to change the number of sectors per track changing the usable area. Most of the time, smaller sectors are desired because sectors must be transferred intact. Large sectors require more RAM to hold so small sectors tie up less RAM. Now you probably know more about hard drives than you ever wanted to know.
By the way, a shift right 10 is a divide by 1024.
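
To illustrate the shift point, a small sketch (the function name is made up for this example, and it assumes non-negative integers):

```python
def shift_divide(value: int, times: int = 1) -> int:
    """Divide a non-negative integer by 1024 `times` times, using shifts only."""
    return value >> (10 * times)


total_bytes = 3_526_875_000_000   # byte count worked out earlier in the thread

print(shift_divide(total_bytes, 3))    # whole GiB, via a 30-bit shift
print(total_bytes // 1024**3)          # same result with integer division
print(total_bytes // 1000**3)          # whole decimal GB, a different number
```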
ID: 1637365 · Report as offensive
Cherokee150

Joined: 11 Nov 99
Posts: 192
Credit: 58,513,758
RAC: 74
United States
Message 1637388 - Posted: 4 Feb 2015, 16:48:15 UTC - in response to Message 1637218.  

Well done, Cosmic_Ocean!
I've been doing these binary mathematical calculations since the late 1960s, and both your explanation and your calculations are perfect and, possibly more importantly, succinct, which makes it easy for people not versed in binary to understand.

Only one last step was omitted, most likely to avoid complicating the explanation. That is scientific rounding. Since the next digit in the result is 7 (3.20767), the final result of 3.2076 is more accurately displayed as 3.2077.

Sadly, I have learned that, over the last decade or two, many people (at least in the United States) were either not taught, or do not remember, their lesson on scientific rounding. Therefore, omitting that final part of the equation removes the chance of muddying the effect of such an excellent presentation of the correct steps to determine terabytes from megabits.

Once again, VERY well done, Cosmic_Ocean!!!
ID: 1637388 · Report as offensive
kittyman (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1637412 - Posted: 4 Feb 2015, 17:39:39 UTC

Hmmm...
I see the bandwidth continues unabated.
One heck of a stats dump....LOL.
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 1637412 · Report as offensive
Josef W. Segur
Volunteer developer
Volunteer tester

Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 1637423 - Posted: 4 Feb 2015, 18:07:43 UTC

The usual restriction to ~300 Mbps is based on having a single 1Gbps fiber up the hill to SSL, so I had some doubts that was where the data was going. Tracing through the routers seems to indicate they've gotten permission from other users at SSL to use most of the available bandwidth.

Leaving inr-211:
tengigabitethernet5_5: 128.32.255.127: Tier2 side B to inr-202 te6/1
Arriving at inr-202:
tengigabitethernet6_1: 128.32.255.126: Tier2 side B to inr-211 t5/5
Leaving inr-202:
tengigabitethernet6_9: 128.32.0.94: inr-304-sut t6/4
Arriving at inr-304:
tengigabitethernet6_4: 128.32.0.95: inr-202-reccev t6/9
Leaving inr-304:
gigabitethernet8_34: : sslringsut1fes g0/4, SSL_P2P_169.229.0.216/29
vlan591: 169.229.0.217: SSL Firewall Transit net

Those last links indicate slightly under 800 Mbps, but there are a lot more hours now. I haven't recalculated the total size.
                                                                  Joe
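
For anyone who wants to redo the total, here is a rough sketch. It leans on several assumptions that are not confirmed anywhere in the thread: that the transfer began about 9.5 hours before the 4:44:22 UTC post that gave that figure, that it has run without interruption since, and that the rate has averaged about 800 Mbit/s the whole time. The variable names are made up for the example.

```python
from datetime import datetime, timedelta

# Timestamps taken from the posts in this thread (UTC)
earlier_post = datetime(2015, 2, 4, 4, 44, 22)    # post that quoted ~9.5 hours elapsed
this_post = datetime(2015, 2, 4, 18, 7, 43)

assumed_start = earlier_post - timedelta(hours=9.5)   # assumption, see above
hours = (this_post - assumed_start).total_seconds() / 3600

ASSUMED_RATE_MBIT_PER_S = 800                     # "slightly under 800 Mbps"
total_bytes = ASSUMED_RATE_MBIT_PER_S * 1_000_000 / 8 * 3600 * hours

print(f"{hours:.1f} hours")
print(f"{total_bytes / 10**12:.2f} TB")
print(f"{total_bytes / 2**40:.2f} TiB")
```

Since the real rate varied and was a bit under 800 Mbit/s, the result should be read as a ballpark upper estimate only.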
ID: 1637423 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13399
Credit: 208,696,464
RAC: 304
Australia
Message 1637433 - Posted: 4 Feb 2015, 18:31:30 UTC - in response to Message 1637423.  

AP is still dead in the water, and although they are keeping up, over the last few hours the MB splitters have been struggling.

And just look at all that green on the Cricket graphs...
Grant
Darwin NT
ID: 1637433 · Report as offensive
kittyman (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1637438 - Posted: 4 Feb 2015, 18:39:42 UTC - in response to Message 1637433.  

AP is still dead in the water, and although they are keeping up, over the last few hours the MB splitters have been struggling.

And just look at all that green on the Cricket graphs...

Well, knock on wood, but so far the kitties have been able to keep the crunchers' caches full of those MB shorties.
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 1637438 · Report as offensive
Dena Wiltsie
Volunteer tester

Joined: 19 Apr 01
Posts: 1628
Credit: 24,230,968
RAC: 26
United States
Message 1637460 - Posted: 4 Feb 2015, 20:13:52 UTC

My RAC has dropped from 28,000 to 18,000. The only good thing is it will stop at 5,000 as long as I can get MB records. My decision to turn off one of the GPUs instead of the CPUs has turned out to be the correct decision.
ID: 1637460 · Report as offensive
Cruncher-American (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)

Joined: 25 Mar 02
Posts: 1513
Credit: 370,893,186
RAC: 340
United States
Message 1637495 - Posted: 4 Feb 2015, 21:25:25 UTC - in response to Message 1637365.  

The reason is more basic than that. The selling number is the total storage ability of the drive. The number your system reports to you is after sector headers and inter record gaps have been removed. The reason for this is because soft sectored drives allow you to change the number of sectors per track changing the usable area. Most of the time, smaller sectors are desired because sectors must be transferred intact. Large sectors require more RAM to hold so small sectors tie up less RAM. Now you probably know more about hard drives than you ever wanted to know.
By the way, a shift right 10 is a divide by 1024.


I believe you are incorrect on that. That basic formatting is built into the size calculation. The drive capacity was always #sectors * 512. If you look at some drive labels, they give the number of sectors ("LBA: nnnnnnn"), so you can multiply it out yourself. NO manufacturer gives drive capacity any other way, to my knowledge. (I assume this applies for the newer 4K sector drives as well, but I don't know that).
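
A sketch of the multiply-it-out check. The sector count below is an assumption used purely for illustration (a value commonly seen on drives sold as "1 TB"); it does not come from this thread, and the function name is made up.

```python
def capacity_bytes(sector_count: int, sector_size: int = 512) -> int:
    """Capacity as the number of addressable sectors times the sector size."""
    return sector_count * sector_size


example_lba_count = 1_953_525_168   # illustrative value, not from the thread
size = capacity_bytes(example_lba_count)

print(size)                          # bytes
print(f"{size / 10**12:.3f} TB")     # what the label advertises
print(f"{size / 2**30:.1f} GiB")     # what many operating systems display
```

With these numbers the advertised decimal capacity and the smaller-looking binary figure both come from the same sector count, without any formatting overhead being subtracted.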
ID: 1637495 · Report as offensive


 
©2022 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.