Red Shift (Mar 01 2011)
Author | Message |
---|---|
Matt Lebofsky Send message Joined: 1 Mar 99 Posts: 1444 Credit: 957,058 RAC: 0 |
Happy March to one and all. Haven't had much to write about lately, but here's a roundup. We had our usual weekly maintenance outage today, during which we took care of all kinds of stuff besides the usual mysql database compression/backup. Early this morning I noticed the replica mysql server had some broken tables, which led me to discover a drive had failed on that system last night - a 73GB fibre channel drive. Not a big deal, as we have tons of these kicking around from older servers at this point. This was easy enough to hot swap, though I got lost in some internal closet networking updates as this disk array is only accessible via telnet. And then the mysql daemon on the replica freaked out a little bit when the new drive was introduced, so I had to reboot the system, re-fix broken tables, etc. etc. etc. The replica is still catching up (will be for a while). Today we also moved synergy off the probably-flaky UPS. Yeah, I know we should have done this earlier, but just hadn't gotten around to it. If anything this gave us one more data point in the form of yet another automatic biweekly reboot on Sunday around 3pm (a couple days ago). Now that the UPS is out of the equation, we have to wait 2 weeks to see if this was indeed the problem. What else... we moved a lot more bits from ptolemy onto thumper. You may notice some general speedups on the website or elsewhere. We hope. And Jeff and I tackled a ton of timing tests for the science database on oscar. We're finding all the bottlenecks and finding ways around them. The good news is the database select throughput has gone from 100 spikes/second to 17,000 spikes/second. However, these are under optimal conditions. In reality we'll have to deal with many of the aforementioned bottlenecks. Also: gowron is back to being the main workunit server (the full transition is far from complete, though). That's been my day so far. How's your day?
- Matt -- BOINC/SETI@home network/web/science/development person -- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude |
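To put the throughput figures in Matt's post in perspective, here is a minimal sketch of the arithmetic. Only the two rates (100 and 17,000 spikes/second) come from the post. The spike count used for the scan-time estimate is a made-up illustration, and the function names are hypothetical. As the post notes, the higher rate was measured under optimal conditions, so real-world times would likely be longer.

```python
# Rates quoted in the post (best-case timing tests on the science database).
OLD_RATE = 100       # spikes/second before tuning
NEW_RATE = 17_000    # spikes/second after tuning

# Assumption for illustration only: this count is NOT from the post.
HYPOTHETICAL_SPIKES = 1_000_000_000


def speedup(old_rate: float, new_rate: float) -> float:
    """Ratio of the new select throughput to the old one."""
    return new_rate / old_rate


def scan_hours(n_spikes: int, rate: float) -> float:
    """Hours needed to select n_spikes rows at a constant rate (spikes/second)."""
    return n_spikes / rate / 3600


if __name__ == "__main__":
    print(f"speedup: {speedup(OLD_RATE, NEW_RATE):.0f}x")
    print(f"old scan time: {scan_hours(HYPOTHETICAL_SPIKES, OLD_RATE):.1f} h")
    print(f"new scan time: {scan_hours(HYPOTHETICAL_SPIKES, NEW_RATE):.1f} h")
```

The ratio is the only number here that follows directly from the post. The scan times scale linearly with whatever spike count is plugged in.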
soft^spirit Send message Joined: 18 May 99 Posts: 6497 Credit: 34,134,168 RAC: 0 |
Keep in mind UPS's do need new batteries from time to time. ;) Janice |
Claggy Send message Joined: 5 Jul 99 Posts: 4654 Credit: 47,537,079 RAC: 4 |
Thanks for the update Matt, Claggy |
Thomas Arnold Send message Joined: 14 May 99 Posts: 56 Credit: 61,046,144 RAC: 0 |
Thank you as always for the update Matt. Man Alive you all have a lot on your plate. It is always fascinating to read the stuff you do to keep us happily crunching away. I do want to point out something on the Server status page (like you don't have enough things on the to-do list.) At the bottom of the page there are definitions/explanations for Tasks ready to send, Tasks in progress, etc. but under the Data Distribution State at the top they are referred to as Results ready to send, etc. I think the Tasks terminology is spot on but the Results reference muddies the waters. Thanks again to you and everyone at the Lab. Kind Regards, Tom |
Gary Charpentier Send message Joined: 25 Dec 00 Posts: 31013 Credit: 53,134,872 RAC: 32 |
Thanks for the update and please insist Eric get the Beta Status page fixed before V7 work hits the masses. |
Joel Send message Joined: 31 Oct 08 Posts: 104 Credit: 4,838,348 RAC: 13 |
Thanks for the update, and good job keeping everything in order over there! Since the big issues a few weeks ago, things have been looking pretty good. The weekly outages have been short, which is much appreciated by this hobbyist... |
zoom3+1=4 Send message Joined: 30 Nov 03 Posts: 66353 Credit: 55,293,173 RAC: 49 |
Thanks for the update Matt, Me I just have to pack for a move, Which is being covered in My thread in My sig. Savoir-Faire is everywhere! The T1 Trust, T1 Class 4-4-4-4 #5550, America's First HST |
Black Squirrel Prime Send message Joined: 29 Jul 07 Posts: 8 Credit: 15,317,965 RAC: 0 |
Thanks for the update Matt, Just replaced 2 of mine over the weekend - the UPS software was sensing something initiating shutdowns randomly. |
David S Send message Joined: 4 Oct 99 Posts: 18352 Credit: 27,761,924 RAC: 12 |
I once had a problem where I was trying to communicate with some other device altogether via a serial cable and somehow the computer kept interpreting this as a shutdown command coming from the UPS. I think I disabled the UPS software until I was done with the other thing. Thanks for the update and all your hard work, Matt. As for me, SSDD. I do notice, however, that my computer hasn't communicated with the project in about 30 hours now. This seems unusual, but I'm sitting on 20 WUs (and no Einstein WUs), so I won't worry about it for another day. David David Sitting on my butt while others boldly go, Waiting for a message from a small furry creature from Alpha Centauri. |
Bernie Vine Send message Joined: 26 May 99 Posts: 9958 Credit: 103,452,613 RAC: 328 |
Keep in mind UPS's do need new batteries from time to time. ;) Around every 2 years. Also keep in mind that at least one well-known UPS manufacturer sets its default self-test interval to "14 days". I know from experience (my company has 150+ sites in the UK, all with one or more UPS) that sometimes the self test can cause the UPS to fail, without any actual error in the log. |
GiftedPlacebo Send message Joined: 17 May 99 Posts: 3 Credit: 3,332,514 RAC: 0 |
Keep in mind UPS's do need new batteries from time to time. ;) Indeed. Assuming you have machines with redundant power supplies, I like to split machines over multiple UPS's. You can still set up the UPS software to send shutdown notices for "real" power failure events, but when you have a self-test induced power off, it is instant and no shutdown messages are sent (in my experience). It's not foolproof, but it has saved me many times when I've had a UPS fail on our distributed file servers. |
Tom95134 Send message Joined: 27 Nov 01 Posts: 216 Credit: 3,790,200 RAC: 0 |
Keep in mind UPS's do need new batteries from time to time. ;) And they really need to be exercised about once a month with a fairly deep discharge about twice a year. |
Tom95134 Send message Joined: 27 Nov 01 Posts: 216 Credit: 3,790,200 RAC: 0 |
Keep in mind UPS's do need new batteries from time to time. ;) That's very interesting. I've never had one "burp" the attached equipment due to a test cycle. Even when it is a deep (80~90%) test cycle. All our UPS are APC. |
ivan Send message Joined: 5 Mar 01 Posts: 783 Credit: 348,560,338 RAC: 223 |
Keep in mind UPS's do need new batteries from time to time. ;) I recently had to replace the battery in an APC Smart-UPS 720. It was failing its 14-day self-test... Function: Automatic Self-test. Factory Default: Every 14 days (336 hours). User Selectable Choices: Every 7 days (168 hours), On Startup Only, No Self test. Description: Set the interval at which the UPS will execute a self-test. "During the self-test, the UPS briefly operates the connected equipment on battery." |
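The 14-day (336-hour) self-test default quoted above lines up suspiciously well with the "automatic biweekly reboot" Matt described. As a rough sketch of how one might check a reboot log against that cadence: the function name, the tolerance, and the timestamps in the usage example are all assumptions made up for illustration, not data from the lab.

```python
from datetime import datetime, timedelta

# Factory default self-test interval quoted above: every 14 days (336 hours).
SELF_TEST_INTERVAL = timedelta(days=14)


def matches_self_test_cadence(reboots, tolerance=timedelta(hours=2)):
    """Return True if every gap between consecutive reboots is within
    `tolerance` of a whole multiple of the 14-day self-test interval.

    `reboots` is an iterable of datetime objects; the 2-hour default
    tolerance is an arbitrary assumption.
    """
    times = sorted(reboots)
    if len(times) < 2:
        return False  # not enough data points to judge a cadence
    for earlier, later in zip(times, times[1:]):
        gap = later - earlier
        cycles = round(gap / SELF_TEST_INTERVAL)  # nearest whole number of intervals
        if cycles < 1 or abs(gap - cycles * SELF_TEST_INTERVAL) > tolerance:
            return False
    return True


if __name__ == "__main__":
    # Hypothetical reboot times, roughly "Sunday around 3pm" two weeks apart.
    hypothetical_reboots = [
        datetime(2011, 1, 30, 15, 5),
        datetime(2011, 2, 13, 14, 58),
        datetime(2011, 2, 27, 15, 2),
    ]
    print(matches_self_test_cadence(hypothetical_reboots))
```

A match would only be circumstantial evidence. As Matt says, the real test is whether the reboots stop once the UPS is out of the equation.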
Swibby Bear Send message Joined: 1 Aug 01 Posts: 246 Credit: 7,945,093 RAC: 0 |
Wow! I am frequently amazed at the interesting stuff posted on these forums. Thanks for all the helpful info. Whit |
SockGap Send message Joined: 16 Apr 07 Posts: 14 Credit: 7,700,416 RAC: 0 |
Assuming you have machines with redundant power supplies, I like to split machines over multiple UPS's. Where I work we were told to not put the redundant power supplies on different phases - something about having 415 volts of potential energy if something goes wrong. With one phase you have 240 volts that will give you a nasty kick. When you have two phases interacting you get 415 volts and that is a lot more likely to kill you. I have no idea if it's the same with multiple UPSs - but in theory they are changing the phase and therefore you could get more of a jolt out of two of them. You'd still have to be pretty unlucky to have something go wrong with two power supplies at once. I've never had one "burp" the attached equipment due to a test cycle. Even when it is a deep (80~90%) test cycle. All our UPS are APC. I deal with a few dozen APC UPSs at work and I've seen a faulty battery drop the load during a self test a few times... It seemed to have more to do with the batteries - the ones in some of our hotter cupboards had "dried out" (or at least expanded and cracked the plastic battery case) and were not working at all... |
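For anyone wondering where the 415 volt figure comes from: in a balanced three-phase supply the phases are 120 degrees apart, so the voltage between two phases is sqrt(3) times the phase-to-neutral voltage. A quick sketch of that calculation follows; the function name is just for illustration. It applies to two phases of the same three-phase supply, and it says nothing about whether the outputs of two separate UPSs have any fixed phase relationship to each other.

```python
import math


def line_to_line_voltage(phase_voltage: float) -> float:
    """Voltage between two phases of a balanced three-phase supply
    (phases 120 degrees apart), given the phase-to-neutral voltage."""
    return math.sqrt(3) * phase_voltage


if __name__ == "__main__":
    # 240 V phase-to-neutral, as in the post above.
    print(f"{line_to_line_voltage(240):.1f} V")
```

With 240 V per phase this comes out just under 416 V, which is the figure conventionally quoted as "415 V".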
GiftedPlacebo Send message Joined: 17 May 99 Posts: 3 Credit: 3,332,514 RAC: 0 |
Assuming you have machines with redundant power supplies, I like to split machines over multiple UPS's. All the best practice information I've read suggests putting redundant power supplies on separate UPS and even separate power grids. I think if multiple power supplies failed in such a fashion that there was 415V flowing into the system, your bigger concern would be putting out the fire rather than server maintenance =) But now I'm intrigued, as I've never heard that warning before. Off to Google! |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14679 Credit: 200,643,578 RAC: 874 |
I've certainly seen equipment killed by a three-phase grounding fault generating 415v. Fortunately, the main victim was a sacrificial surge protector - the telephone PBX behind it was saved. And that was just equipment plugged into a standard UK 13A ring main - in a medium-sized office block, with, I guess, different phases on different floors. Somebody working on the installation connected, or more likely disconnected, the wrong wire. When I had a couple of redundant PSU servers to look after, knowing that they only need one to run (and in an environment where if the power went out, nobody would need to access the servers anyway), I plugged one PSU into a UPS, and the other direct into the mains. Didn't seem to do any harm. |
Fred J. Verster Send message Joined: 21 Apr 04 Posts: 3252 Credit: 31,903,643 RAC: 0 |
Thanks Matt, for the update. The last power outage I witnessed was 26 years ago, when a 10 kV/500 V/380 V 3-phase transformer exploded! Not a big one, though. Since then a lot has changed. Only the 400 kV 3-phase lines are above ground; every 10 kV line has been put underground. I remember using an 'antenna' to feed a few fluorescent lights, close to the 1000 kW TV transmitter, which has now been out of use for at least 15 years. Power outages are also very rare and no one I know uses a UPS. But the Netherlands is becoming one big city, at least the west part of it, close to the sea. They already call it the 'Randstad'; from Rotterdam to Amsterdam, it is already a city with big green (houses) in between. And it's a beautiful day, lots of sunshine and about 7C. (But it still freezes at night.) |
Dena Wiltsie Send message Joined: 19 Apr 01 Posts: 1628 Credit: 24,230,968 RAC: 26 |
Keep in mind UPS's do need new batteries from time to time. ;) Lead acid should not be cycled if possible because it shortens its life. Some designs are better able to withstand cycling than others but they all age when discharged. Most other battery types do last longer if you cycle them. Battery University Deep cycle batteries |