Monolith (Jun 14 2011) |
Message boards : Technical News : Monolith (Jun 14 2011)
| Author | Message |
|---|---|
|
Usual outage day. Project goes down, we squeeze and copy databases, project comes back up. It seems the MySQL replica is oddly unable to keep up anymore. I think the cause is our ridiculously consistent heavy load lately, which keeps the databases busier than normal. Anybody have any theories about what is causing the ridiculously consistent heavy load? What's also a little strange is that the CPU/IO load on jocelyn is low... so what's the bottleneck? I'd have to guess network, but it's copying the logs from the master faster than it's executing the SQL within those logs. So...? | |
| ID: 1117102 · | |
Anybody have any theories about what is causing the ridiculously consistent heavy load? Yes, you've been splitting practically nothing but "shorties" - very high angle range tasks, from a basketweave survey at Arecibo. Hang on, I'll get you the reference. Edit - try my message 1112964. That covers most of it. | |
| ID: 1117103 · | |
|
Unable to upload results, | |
| ID: 1117115 · | |
|
Thanks for the update Matt, | |
| ID: 1117120 · | |
|
Bear with it Eaglescouter, mine tried and got a "can't connect to server", then turned around a minute later and got right through. It's catch as catch can right now as everybody fills up after the outage. | |
| ID: 1117121 · | |
|
| |
| ID: 1117129 · | |
|
Matt, thanks for the news! | |
| ID: 1117154 · | |
|
Break a leg! Or is that only for actors? | |
| ID: 1117156 · | |
It sure is nice seeing the network graph for the whole lab going from a baseline of ~50 Mbits/sec to ~250 Mbits/sec when we started that procedure. Too bad we're still currently stuck using the HE connection for our uploads/downloads. Maybe someday that'll change. Thanks for the update Matt, keep up the good work. I'm glad someone/something opened the flood gates even though it may not last long; downloads that were usually moving at 3.67Kb - 15Kb and taking hours just shot up to 88Kb - 347Kb and take minutes. | |
| ID: 1117170 · | |
|
I would have thought that Matt and the rest of the project staff KNEW they were sending out nothing but shorties. Guess not! | |
| ID: 1117185 · | |
|
My pet theory - re-try times are too short for the current "shorty storm". In previous existences I've found that the re-try rate can be very sensitive to the time-out time, small changes in that can have very substantial changes in overall throughput of a system. | |
| ID: 1117233 · | |
I would have thought that Matt and the rest of the project staff KNEW they were sending out nothing but shorties. Guess not! Of course we know shorties are a major problem, but some other numbers just aren't adding up... - Matt ____________ -- BOINC/SETI@home network/web/science/development person -- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude | |
| ID: 1117406 · | |
I would have thought that Matt and the rest of the project staff KNEW they were sending out nothing but shorties. Guess not! No chance some viral meanie has crept into the works? ____________ ****** "Ask not, what your kitty can do for you. Ask what you can do for your kitty." As it is kitten, so shall it be done. | |
| ID: 1117410 · | |
I would have thought that Matt and the rest of the project staff KNEW they were sending out nothing but shorties. Guess not! What numbers would those be, Matt? Maybe we can help, looking at it from this end? | |
| ID: 1117425 · | |
I would have thought that Matt and the rest of the project staff KNEW they were sending out nothing but shorties. Guess not! But of course not! They're running *nix which eradicated viruses long ago, when the Earth was still cooling. | |
| ID: 1117426 · | |
|
Did you find the bottleneck? I just got a herd of downloads and they are coming at me fast and furious! | |
| ID: 1117441 · | |
|
[snip] But of course not! They're running *nix which eradicated viruses long ago, when the Earth was still cooling. *nix is not immune to virii, but few people write viruses for *nix as the damage would be limited - and not as many people are P----d off at Linix or Unix due to them being almost free of cost, as opposed to M$ Windoze... but we're getting off topic... ____________ . | |
| ID: 1117498 · | |
|
Something wrong with this batch of work units. I'm getting a ton of -9s. Was afraid it might be me but they are starting to validate against all types of other machines. | |
| ID: 1117504 · | |
[snip] Very far off topic. And the plural of virus is viruses. And Windows is spelled with an "ows", much like Linux isn't spelled with an "s" as in Linsux. And I don't think that people are pissed off with Windows because it's not free. People don't write viruses for Linux because it's not worth putting that much effort into breaking something with such a small user base. | |
| ID: 1117548 · | |
Bear with it Eaglescouter, mine tried and got a "can't connect to server", then turned around a minute later and got right through. It's catch as catch can right now as everybody fills up after the outage. I'm still here. Today my machines are unable to upload completed work. "Project servers may be temporarily down" ____________ It's not too many computers, it's a lack of circuit breakers for this room. But we can fix it :) | |
| ID: 1117567 · | |
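
The retry-timing theory raised earlier in the thread (that retry times are too short for the "shorty storm") can be illustrated with a toy model. The sketch below is not BOINC's actual retry policy, and every name and number in it is a hypothetical chosen for illustration. It only counts how many retry attempts a group of identical clients would send during an outage window, comparing a fixed retry interval with an exponential backoff.

```python
def retry_times(base_delay, window, factor=1.0):
    """Return the times (in seconds) at which one client retries within `window`.

    factor == 1.0 models a fixed retry interval; factor > 1.0 models
    exponential backoff. All parameters are illustrative assumptions,
    not values taken from the BOINC client.
    """
    times = []
    delay = base_delay
    t = base_delay  # first retry happens one delay after the initial failure
    while t <= window:
        times.append(t)
        delay *= factor  # grow the delay (no-op when factor is 1.0)
        t += delay
    return times


def total_attempts(clients, base_delay, window, factor=1.0):
    """Total retry attempts sent by `clients` identical, always-failing clients."""
    return clients * len(retry_times(base_delay, window, factor))


if __name__ == "__main__":
    # Hypothetical scenario: 1000 clients retrying during a 10-minute outage.
    for label, delay, factor in [
        ("fixed 30 s", 30, 1.0),
        ("fixed 60 s", 60, 1.0),
        ("60 s, doubling", 60, 2.0),
    ]:
        print(label, total_attempts(1000, delay, 600, factor))
```

In this model, halving a fixed retry interval doubles the number of requests hitting the server, while doubling the delay after each failure cuts the count sharply. Whether the real system behaves this way depends on the client's actual backoff rules, which this thread does not describe.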
| Copyright © 2013 University of California |