Outage (Jun 02 2009)

Message boards : Technical News : Outage (Jun 02 2009)

Matt Lebofsky
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Mar 99
Posts: 1384
Credit: 74,079
RAC: 0
United States
Message 902975 - Posted: 2 Jun 2009, 23:29:04 UTC

Had the weekly outage today. The normal database/compression/cleanup stuff went by the book, but we took the time to address some other hardware issues. First and foremost, we replaced the failed drive on thumper. I was griping yesterday about how this means we have to reboot, which in turn forces us to resync the root RAID devices. Well, that's happening now. I also upgraded the kernel on worf. That sort of went well, except that upon coming back online one of the spare drives was marked as failed. We're dealing with that now.

Coming out of these weekly outages has gotten painful given our increased rate of traffic lately and the web queries that continue to clobber us. I try to aim these at the replica, which helps, but right after outages the replica is effectively offline for many hours while it is still busy recreating the giant tables. So I have to temporarily aim those web queries at the master, which makes recovery even slower. We gotta figure this all out: come up with a better weekly backup/reorg policy, or get that new replica server up and running sooner rather than later. We did order drives for it; they should be here later in the week.
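
A minimal sketch of the query routing described above, assuming each server exposes a simple health status. The `DbStatus` class, the `pick_read_target` function, and the server names are hypothetical and only illustrate the idea; none of this is taken from the actual SETI@home setup.

```python
from dataclasses import dataclass


@dataclass
class DbStatus:
    """Hypothetical health snapshot for one database server."""
    name: str
    online: bool
    rebuilding_tables: bool = False


def pick_read_target(master: DbStatus, replica: DbStatus) -> str:
    """Return the name of the server that should take read-only web queries."""
    # Prefer the replica so the master is spared the web-query load.
    if replica.online and not replica.rebuilding_tables:
        return replica.name
    # Replica unavailable (e.g. still recreating tables after the outage):
    # fall back to the master, accepting that its recovery will be slower.
    if master.online:
        return master.name
    raise RuntimeError("no database available for web queries")


# Right after an outage: the replica is up but still rebuilding its tables.
master = DbStatus("master", online=True)
replica = DbStatus("replica", online=True, rebuilding_tables=True)
print(pick_read_target(master, replica))  # master
```

The fallback branch is where the cost shows up: every query sent to the master during recovery competes with the recovery work itself.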

- Matt

____________
-- BOINC/SETI@home network/web/science/development person
-- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude

Johnney Guinness
Volunteer tester
Joined: 11 Sep 06
Posts: 2950
Credit: 2,306,418
RAC: 0
Ireland
Message 902989 - Posted: 2 Jun 2009, 23:41:22 UTC
Last modified: 2 Jun 2009, 23:44:04 UTC

Matt, thank you!

You are the only one who faithfully keeps us up to date about what's going on. Without you, Matt, we would never know what was happening with our favourite project :) Matt, you give me hope, and isn't that what keeps this project alive? Hope!

John.
____________

Ageless
Joined: 9 Jun 99
Posts: 12130
Credit: 2,525,019
RAC: 653
Netherlands
Message 902992 - Posted: 2 Jun 2009, 23:43:30 UTC - in response to Message 902975.
Last modified: 2 Jun 2009, 23:46:39 UTC

So Commander Worf and Chancellor Gowron are named, but the greatest General 'still alive' (Martok) isn't?

Hu'tegh! (qoH vuvbe' SuS) ;-)

(You do know that Worf killed Gowron, don't you? So perhaps Gowron is having his revenge in your server closet.)

Are those web queries still bots reading stats?
Since the replica database is effectively offline for a couple of hours after the outage, wouldn't it be a good idea to also disable the stats during that time? Or does that not matter for the bots?
____________
Jord

Loving awareness is free.

Rob
Joined: 4 Sep 07
Posts: 57
Credit: 2,444,762
RAC: 0
Canada
Message 903047 - Posted: 3 Jun 2009, 1:21:16 UTC - in response to Message 902975.


Thank you, sir. Your updates and constant efforts are greatly appreciated.

zpm
Volunteer tester
Joined: 25 Apr 08
Posts: 284
Credit: 1,177,499
RAC: 3,394
United States
Message 903107 - Posted: 3 Jun 2009, 4:23:48 UTC - in response to Message 903047.

Getting another server to help with the load is really the only solution.
____________

I recommend Secunia PSI: http://secunia.com/vulnerability_scanning/personal/
Go Georgia Tech.

Speedy
Volunteer tester
Joined: 26 Jun 04
Posts: 623
Credit: 4,804,187
RAC: 6,554
New Zealand
Message 903179 - Posted: 3 Jun 2009, 8:58:27 UTC

Thanks, team, for all the hard work. One thing to note is that the outage notice is still showing on the front page.

____________

Live in NZ y not join Smile City?

Aurora Borealis
Volunteer tester
Joined: 14 Jan 01
Posts: 2975
Credit: 4,817,003
RAC: 1,299
Canada
Message 903204 - Posted: 3 Jun 2009, 11:42:48 UTC - in response to Message 903179.

Thanks, team, for all the hard work. One thing to note is that the outage notice is still showing on the front page.

If you're talking about the news notice, that's a permanent notice.
If you can't view the front page, flush your browser cache.

Jet
Joined: 25 Sep 07
Posts: 12
Credit: 1,586,013
RAC: 0
Ukraine
Message 903220 - Posted: 3 Jun 2009, 12:24:56 UTC - in response to Message 903204.

Unfortunately, flushing the browser cache doesn't help. The weekly outage notice is still on the front page.

1mp0£173
Volunteer tester
Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 903272 - Posted: 3 Jun 2009, 16:35:03 UTC - in response to Message 903107.

Getting another server to help with the load is really the only solution.

There is always this pesky problem about money.
____________

Speedy
Volunteer tester
Joined: 26 Jun 04
Posts: 623
Credit: 4,804,187
RAC: 6,554
New Zealand
Message 903326 - Posted: 3 Jun 2009, 20:42:10 UTC - in response to Message 903204.

Thanks, team, for all the hard work. One thing to note is that the outage notice is still showing on the front page.

If you're talking about the news notice, that's a permanent notice.
If you can't view the front page, flush your browser cache.

Everything is sorted on the front page. Thanks.
____________

Live in NZ y not join Smile City?


Copyright © 2014 University of California