Red Shift (Mar 01 2011)

Profile Matt Lebofsky
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Mar 99
Posts: 1444
Credit: 957,058
RAC: 0
United States
Message 1082770 - Posted: 1 Mar 2011, 23:15:19 UTC

Happy March to one and all. Haven't had much to write about lately, but here's a roundup.

We had our usual weekly maintenance outage today during which we took care of all kinds of stuff besides the usual mysql database compression/backup. Early this morning I noticed the replica mysql server had some broken tables, which led me to discover a drive had failed on that system last night - a 73GB fibre channel drive. Not a big deal, as we have tons of these kicking around from older servers at this point. This was easy enough to hot swap, though I got lost in some internal closet networking updates as this disk array is only accessible via telnet. And then the mysql daemon on the replica freaked out a little bit when the new drive was introduced, so I had to reboot the system, re-fix broken tables, etc. etc. etc. The replica is still catching up (will be for a while).

Today we also moved synergy off the probably-flaky UPS. Yeah, I know we should have done this earlier, but we just hadn't gotten around to it. If anything, the delay gave us one more data point in the form of yet another automatic biweekly reboot on Sunday around 3pm (a couple days ago). Now that the UPS is out of the equation, we have to wait 2 weeks to see if this was indeed the problem.

What else... we moved a lot more bits from ptolemy onto thumper. You may notice some general speedups on the website or elsewhere. We hope. And Jeff and I tackled a ton of timing tests for the science database on oscar. We're finding all the bottlenecks and finding ways around them. The good news is the database select throughput has gone from 100 spikes/second to 17,000 spikes/second. However these are under optimal conditions. In reality we'll have to deal with many of the aforementioned bottlenecks. Also: gowron is back to being the main workunit server (the full transition is far from complete, though).
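For a sense of scale, the jump quoted above (100 to 17,000 spikes/second) is a 170x speedup. Here is a minimal sketch of that arithmetic. The table size of one billion spikes is purely an assumption for illustration, not a figure from the post, and the rates are the optimal-conditions numbers, so real scans would likely be slower.

```python
# Rates quoted in the post, in spikes per second (optimal conditions).
OLD_RATE = 100
NEW_RATE = 17_000


def speedup(old_rate, new_rate):
    """Return how many times faster new_rate is than old_rate."""
    return new_rate / old_rate


def scan_hours(n_spikes, rate):
    """Return the hours needed to select n_spikes rows at a steady rate."""
    return n_spikes / rate / 3600


# Hypothetical table size, chosen only to make the comparison concrete.
HYPOTHETICAL_SPIKES = 1_000_000_000

print(f"speedup: {speedup(OLD_RATE, NEW_RATE):.0f}x")
print(f"old scan: {scan_hours(HYPOTHETICAL_SPIKES, OLD_RATE):.1f} hours")
print(f"new scan: {scan_hours(HYPOTHETICAL_SPIKES, NEW_RATE):.1f} hours")
```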

That's been my day so far. How's your day?

- Matt

-- BOINC/SETI@home network/web/science/development person
-- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude
ID: 1082770 · Report as offensive
Profile soft^spirit
Joined: 18 May 99
Posts: 6497
Credit: 34,134,168
RAC: 0
United States
Message 1082777 - Posted: 1 Mar 2011, 23:28:15 UTC - in response to Message 1082770.  

Keep in mind UPS's do need new batteries from time to time. ;)
Janice
ID: 1082777 · Report as offensive
Claggy
Volunteer tester

Joined: 5 Jul 99
Posts: 4654
Credit: 47,537,079
RAC: 4
United Kingdom
Message 1082781 - Posted: 1 Mar 2011, 23:34:24 UTC - in response to Message 1082770.  

Thanks for the update Matt,

Claggy
ID: 1082781 · Report as offensive
Thomas Arnold
Volunteer tester

Joined: 14 May 99
Posts: 56
Credit: 61,046,144
RAC: 0
United States
Message 1082782 - Posted: 1 Mar 2011, 23:54:44 UTC - in response to Message 1082770.  


Thank you as always for the update Matt. Man Alive you all have a lot on your plate. It is always fascinating to read the stuff you do to keep us happily crunching away.

I do want to point out something on the Server status page (like you don't have enough things on the to do list.)

At the bottom of the page there are definitions/explanations for Tasks ready to send, Tasks in progress, etc., but under the Data Distribution State at the top they are referred to as Results ready to send, etc. I think the Tasks terminology is spot on, but the Results reference muddies the waters.

Thanks again to you and everyone at the Lab.

Kind Regards,

Tom
ID: 1082782 · Report as offensive
Profile Gary Charpentier
Volunteer tester
Joined: 25 Dec 00
Posts: 30974
Credit: 53,134,872
RAC: 32
United States
Message 1082803 - Posted: 2 Mar 2011, 1:29:17 UTC

Thanks for the update and please insist Eric get the Beta Status page fixed before V7 work hits the masses.

ID: 1082803 · Report as offensive
Profile Joel

Joined: 31 Oct 08
Posts: 104
Credit: 4,838,348
RAC: 13
United States
Message 1082889 - Posted: 2 Mar 2011, 8:57:05 UTC

Thanks for the update, and good job keeping everything in order over there! Since the big issues a few weeks ago, things have been looking pretty good. The weekly outages have been short, which is much appreciated by this hobbyist...
ID: 1082889 · Report as offensive
Profile zoom3+1=4
Volunteer tester
Joined: 30 Nov 03
Posts: 66283
Credit: 55,293,173
RAC: 49
United States
Message 1082893 - Posted: 2 Mar 2011, 9:23:46 UTC

Thanks for the update, Matt. Me, I just have to pack for a move, which is being covered in my thread in my sig.
Savoir-Faire is everywhere!
The T1 Trust, T1 Class 4-4-4-4 #5550, America's First HST

ID: 1082893 · Report as offensive
Profile Black Squirrel Prime

Joined: 29 Jul 07
Posts: 8
Credit: 15,317,965
RAC: 0
United States
Message 1082954 - Posted: 2 Mar 2011, 15:35:13 UTC - in response to Message 1082781.  

Thanks for the update Matt,

Claggy


Just replaced 2 of mine over the weekend - the UPS software was sensing something initiating shutdowns randomly.
ID: 1082954 · Report as offensive
David S
Volunteer tester
Joined: 4 Oct 99
Posts: 18352
Credit: 27,761,924
RAC: 12
United States
Message 1082960 - Posted: 2 Mar 2011, 16:13:28 UTC - in response to Message 1082954.  


Just replaced 2 of mine over the weekend - the UPS software was sensing something initiating shutdowns randomly.

I once had a problem where I was trying to communicate with some other device altogether via a serial cable and somehow the computer kept interpreting this as a shutdown command coming from the UPS. I think I disabled the UPS software until I was done with the other thing.

Thanks for the update and all your hard work, Matt. As for me, SSDD. I do notice, however, that my computer hasn't communicated with the project in about 30 hours now. This seems unusual, but I'm sitting on 20 WUs (and no Einstein WUs), so I won't worry about it for another day.

David
David
Sitting on my butt while others boldly go,
Waiting for a message from a small furry creature from Alpha Centauri.

ID: 1082960 · Report as offensive
Profile Bernie Vine
Volunteer moderator
Volunteer tester
Joined: 26 May 99
Posts: 9958
Credit: 103,452,613
RAC: 328
United Kingdom
Message 1083034 - Posted: 2 Mar 2011, 20:57:00 UTC - in response to Message 1082777.  

Keep in mind UPS's do need new batteries from time to time. ;)

Around every 2 years.

Also keep in mind that at least one well-known UPS manufacturer sets its default self-test to "14 days". I know from experience (my company has 150+ sites in the UK, all with one or more UPS) that sometimes the self-test can cause the UPS to fail, without any actual error in the log.
ID: 1083034 · Report as offensive
GiftedPlacebo
Joined: 17 May 99
Posts: 3
Credit: 3,332,514
RAC: 0
United States
Message 1083069 - Posted: 2 Mar 2011, 22:03:11 UTC - in response to Message 1083034.  

Keep in mind UPS's do need new batteries from time to time. ;)

Around every 2 years.

Also keep in mind that at least one well-known UPS manufacturer sets its default self-test to "14 days". I know from experience (my company has 150+ sites in the UK, all with one or more UPS) that sometimes the self-test can cause the UPS to fail, without any actual error in the log.


Indeed. Assuming you have machines with redundant power supplies, I like to split machines over multiple UPS's. You can still set up the UPS software to send shutdown notices for "real" power failure events, but when you have a self-test induced power off, it is instant and no shutdown messages are sent (in my experience). It's not foolproof, but it has saved me many times when I've had a UPS fail on our distributed file servers.

ID: 1083069 · Report as offensive
Tom95134

Joined: 27 Nov 01
Posts: 216
Credit: 3,790,200
RAC: 0
United States
Message 1083075 - Posted: 2 Mar 2011, 22:50:33 UTC - in response to Message 1082777.  

Keep in mind UPS's do need new batteries from time to time. ;)

And they really need to be exercised about once a month with a fairly deep discharge about twice a year.
ID: 1083075 · Report as offensive
Tom95134

Joined: 27 Nov 01
Posts: 216
Credit: 3,790,200
RAC: 0
United States
Message 1083076 - Posted: 2 Mar 2011, 22:54:23 UTC - in response to Message 1083034.  

Keep in mind UPS's do need new batteries from time to time. ;)

Around every 2 years.

Also keep in mind that at least one well-known UPS manufacturer sets its default self-test to "14 days". I know from experience (my company has 150+ sites in the UK, all with one or more UPS) that sometimes the self-test can cause the UPS to fail, without any actual error in the log.

That's very interesting. I've never had one "burp" the attached equipment due to a test cycle. Even when it is a deep (80~90%) test cycle. All our UPS are APC.

ID: 1083076 · Report as offensive
Profile ivan
Volunteer tester
Joined: 5 Mar 01
Posts: 783
Credit: 348,560,338
RAC: 223
United Kingdom
Message 1083094 - Posted: 3 Mar 2011, 0:07:59 UTC - in response to Message 1083076.  

Keep in mind UPS's do need new batteries from time to time. ;)

Around every 2 years.

Also keep in mind that at least one well-known UPS manufacturer sets its default self-test to "14 days". I know from experience (my company has 150+ sites in the UK, all with one or more UPS) that sometimes the self-test can cause the UPS to fail, without any actual error in the log.

That's very interesting. I've never had one "burp" the attached equipment due to a test cycle. Even when it is a deep (80~90%) test cycle. All our UPS are APC.


I recently had to replace the battery in an APC Smart-UPS 720. It was failing its 14-day self-test...

Function: Automatic Self-test
Factory Default: Every 14 days (336 hours)
User Selectable Choices: Every 7 days (168 hours), On Startup Only, No Self test
Description: Set the interval at which the UPS will execute a self-test.

"During the self-test, the UPS briefly operates the connected equipment on battery."
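The 336-hour default lines up neatly with the biweekly reboot described in the first post. Below is a rough sketch of that schedule. It assumes the last event was Sunday 27 Feb 2011 at 15:00 (inferred from "Sunday around 3pm (a couple days ago)", so the exact timestamp is a guess), it ignores time zones and daylight saving, and the thread does not say whether the UPS in question actually uses this default.

```python
from datetime import datetime, timedelta

# Factory default self-test interval from the APC table above: 14 days.
SELF_TEST_INTERVAL = timedelta(hours=336)


def next_self_tests(last_event, count=3, interval=SELF_TEST_INTERVAL):
    """Return the next `count` self-test times after `last_event`."""
    return [last_event + interval * i for i in range(1, count + 1)]


# Assumed timestamp of the most recent unexplained reboot (naive local time).
last_reboot = datetime(2011, 2, 27, 15, 0)

for when in next_self_tests(last_reboot):
    print(when.strftime("%Y-%m-%d %H:%M"))
```

If reboots keep landing on these dates with the UPS removed, the self-test is probably not the cause.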
ID: 1083094 · Report as offensive
Swibby Bear

Joined: 1 Aug 01
Posts: 246
Credit: 7,945,093
RAC: 0
United States
Message 1083112 - Posted: 3 Mar 2011, 1:57:54 UTC

Wow! I am frequently amazed at the interesting stuff posted on these forums. Thanks for all the helpful info.

Whit
ID: 1083112 · Report as offensive
SockGap

Joined: 16 Apr 07
Posts: 14
Credit: 7,700,416
RAC: 0
Australia
Message 1083180 - Posted: 3 Mar 2011, 11:39:38 UTC - in response to Message 1083069.  

Assuming you have machines with redundant power supplies, I like to split machines over multiple UPS's.


Where I work we were told to not put the redundant power supplies on different phases - something about having 415 volts of potential energy if something goes wrong. With one phase you have 240 volts that will give you a nasty kick. When you have two phases interacting you get 415 volts and that is a lot more likely to kill you. I have no idea if it's the same with multiple UPSs - but in theory they are changing the phase and therefore you could get more of a jolt out of two of them. You'd still have to be pretty unlucky to have something go wrong with two power supplies at once.
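The 240 V and 415 V figures are related by ordinary three-phase arithmetic: with phases 120 degrees apart, the phase-to-phase voltage is sqrt(3) times the phase-to-neutral voltage. Here is a small sketch of that calculation, assuming an ideal balanced supply at a nominal 240 V. Real installations vary, and this says nothing about how any particular UPS shifts its output phase.

```python
import cmath
import math


def line_to_line(v_phase, angle_deg=120.0):
    """Return the voltage magnitude between two equal-magnitude phasors.

    v_phase is the phase-to-neutral RMS voltage; angle_deg is the phase
    separation between the two conductors.
    """
    a = cmath.rect(v_phase, 0.0)
    b = cmath.rect(v_phase, math.radians(angle_deg))
    return abs(a - b)


# Two different phases of a nominal 240 V supply.
print(round(line_to_line(240.0), 1))

# Same result from the sqrt(3) shortcut.
print(round(math.sqrt(3) * 240.0, 1))
```

With both supplies on the same phase (an angle of 0), the same function gives 0 V between them, which is the reasoning behind the single-phase advice.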

I've never had one "burp" the attached equipment due to a test cycle. Even when it is a deep (80~90%) test cycle. All our UPS are APC.


I deal with a few dozen APC UPSs at work and I've seen a faulty battery drop the load during a self test a few times... It seemed to have more to do with the batteries - the ones in some of our hotter cupboards had "dried out" (or at least expanded and cracked the plastic battery case) and were not working at all...
ID: 1083180 · Report as offensive
GiftedPlacebo
Joined: 17 May 99
Posts: 3
Credit: 3,332,514
RAC: 0
United States
Message 1083201 - Posted: 3 Mar 2011, 14:48:48 UTC - in response to Message 1083180.  

Assuming you have machines with redundant power supplies, I like to split machines over multiple UPS's.


Where I work we were told to not put the redundant power supplies on different phases - something about having 415 volts of potential energy if something goes wrong. With one phase you have 240 volts that will give you a nasty kick. When you have two phases interacting you get 415 volts and that is a lot more likely to kill you. I have no idea if it's the same with multiple UPSs - but in theory they are changing the phase and therefore you could get more of a jolt out of two of them. You'd still have to be pretty unlucky to have something go wrong with two power supplies at once.

I've never had one "burp" the attached equipment due to a test cycle. Even when it is a deep (80~90%) test cycle. All our UPS are APC.


I deal with a few dozen APC UPSs at work and I've seen a faulty battery drop the load during a self test a few times... It seemed to have more to do with the batteries - the ones in some of our hotter cupboards had "dried out" (or at least expanded and cracked the plastic battery case) and were not working at all...


All the best practice information I've read suggests putting redundant power supplies on separate UPS and even separate power grids. I think if multiple power supplies failed in such a fashion that there was 415V flowing into the system, your bigger concern would be putting out the fire rather than server maintenance =) But now I'm intrigued, as I've never heard that warning before. Off to Google!

ID: 1083201 · Report as offensive
Richard Haselgrove
Volunteer tester

Joined: 4 Jul 99
Posts: 14677
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1083205 - Posted: 3 Mar 2011, 14:58:55 UTC

I've certainly seen equipment killed by a three-phase grounding fault generating 415v. Fortunately, the main victim was a sacrificial surge protector - the telephone PBX behind it was saved. And that was just equipment plugged into a standard UK 13A ring main - in a medium-sized office block, with, I guess, different phases on different floors. Somebody working on the installation connected, or more likely disconnected, the wrong wire.

When I had a couple of redundant PSU servers to look after, knowing that they only need one to run (and in an environment where if the power went out, nobody would need to access the servers anyway), I plugged one PSU into a UPS, and the other direct into the mains. Didn't seem to do any harm.
ID: 1083205 · Report as offensive
Profile Fred J. Verster
Volunteer tester
Joined: 21 Apr 04
Posts: 3252
Credit: 31,903,643
RAC: 0
Netherlands
Message 1083226 - Posted: 3 Mar 2011, 17:14:50 UTC - in response to Message 1083221.  

Thanks, Matt, for the update. The last power outage I witnessed
was 26 years ago, when a 10 kV/500 V/380 V 3-phase transformer
exploded! Not a big one, though.
After this, a lot has been changed. Only 400 kV 3-phase is above
ground; every 10 kV line has been put underground.
I remember using an 'antenna' to feed a few fluorescent lights,
close to the 1000 kW TV transmitter, which has now been out of use
for at least 15 years.

Power outages are also very rare, and no one I know uses a UPS.
But the Netherlands is becoming one big city, at least the western
part of it, close to the sea.
They already call it the 'Randstad': from Rotterdam to Amsterdam
is already a city with big green (houses) in between.

And it's a beautiful day, lots of sunshine and about 7C.
(But it still freezes at night.)



ID: 1083226 · Report as offensive
Dena Wiltsie
Volunteer tester

Joined: 19 Apr 01
Posts: 1628
Credit: 24,230,968
RAC: 26
United States
Message 1083245 - Posted: 3 Mar 2011, 18:21:29 UTC - in response to Message 1083075.  

Keep in mind UPS's do need new batteries from time to time. ;)

And they really need to be exercised about once a month with a fairly deep discharge about twice a year.

Lead-acid batteries should not be cycled if possible, because cycling shortens their life. Some designs are better able to withstand cycling than others, but they all age when discharged. Most other battery types do last longer if you cycle them.
Battery University
Deep cycle batteries
ID: 1083245 · Report as offensive


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.