Recovery (Nov 10 2009)

Message boards : Technical News : Recovery (Nov 10 2009)

Matt Lebofsky
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Mar 99
Posts: 1441
Credit: 210,886
RAC: 0
United States
Message 946418 - Posted: 10 Nov 2009, 22:58:34 UTC

Today's Tuesday, which means we had our normal weekly maintenance outage, and we're recovering from it now. Beyond the usual database compression, backup, and log rotation tasks, we also took care of the following:

1. Replaced the faulty drive on thumper (the primary science database server). This system is on Sun Service, so such hardware failures are trivial to handle: a drive fails, we call Sun, they send us a new drive right away, we plop it in, we send back the old drive, done. However, there are nagging problems on thumper at the OS/database level that still require our attention (a corrupt row in the Astropulse signal database and that funky root/RAID configuration that can only be fixed during a clean OS install).

2. Upgraded MySQL on both the master and replica servers (mork and jocelyn) to version 5.1.37. This was finally made available in the Fedora distros and, from what I've been told, may fix those unload/reload formatting bugs. While we were at it, we yum'ed up pretty much everything.

3. Rebooted mork and ptolemy to pick up crash-dump parameters for the kernels. We were going to install debug versions of the kernels but Jeff was having odd results with that while testing one on his desktop, so we're holding off for now. Rebooted jocelyn to pick up a new kernel as well.

That's about it for the outage. Recovery will continue for a while. I'm rebuilding the replica MySQL database right now using the dump from today. When that's finished, we'll start up the replica (maybe tomorrow morning).

Speaking of tomorrow morning, it's a holiday (Veterans Day), so I won't be up at the lab (probably just doing the usual "check in from home every few hours and tweak this and that").

- Matt

-- BOINC/SETI@home network/web/science/development person
-- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude

Claggy
Project Donor
Volunteer tester
Joined: 5 Jul 99
Posts: 4609
Credit: 45,686,526
RAC: 3,912
United Kingdom
Message 946436 - Posted: 11 Nov 2009, 0:35:29 UTC - in response to Message 946418.

Thanks for all the work, and the update, Matt.


Joined: 25 Aug 01
Posts: 77
Credit: 386,336
RAC: 0
United States
Message 946647 - Posted: 12 Nov 2009, 2:54:30 UTC - in response to Message 946436.

Thanks for the info.

Enjoy your holiday!

Be Blessed & Be A Blessing,

ron tillman

Geek@Play
Volunteer tester
Joined: 31 Jul 01
Posts: 2467
Credit: 86,146,931
RAC: 0
United States
Message 947736 - Posted: 16 Nov 2009, 23:42:36 UTC

Is the BOINC database server really crashing on a daily basis now as indicated on Seti's main page?

Grant (SSSF)
Joined: 19 Aug 99
Posts: 6864
Credit: 84,712,964
RAC: 27,623
Message 947807 - Posted: 17 Nov 2009, 5:44:51 UTC - in response to Message 947736.

Is the BOINC database server really crashing on a daily basis now as indicated on Seti's main page?

No, that note is left over from the day it did crash (Friday, I think it was).
Darwin NT.

PKII
Joined: 28 May 07
Posts: 151
Credit: 1,994,676
RAC: 1,480
United States
Message 947816 - Posted: 17 Nov 2009, 6:52:43 UTC

My unit says it was recorded March 23 2007. Are we going through old data again, or is this something else?

Gundolf Jahn
Joined: 19 Sep 00
Posts: 3184
Credit: 380,029
RAC: 31
Message 947826 - Posted: 17 Nov 2009, 8:20:53 UTC - in response to Message 947816.

If you advance two days in technical news, you'll find Message 947169.

