Hot Stuff Coming Through (Dec 21 2009)


Message boards : Technical News : Hot Stuff Coming Through (Dec 21 2009)

Matt Lebofsky
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Mar 99
Posts: 1389
Credit: 74,079
RAC: 0
United States
Message 957941 - Posted: 21 Dec 2009, 22:52:47 UTC

Regarding the science database issues over the past couple of months, let me recap: this is the informix/science database, not the mysql/user database. We noticed that one of the tables in astropulse got corrupted. No big deal - we lost a couple of rows out of 80 million. In the process of fixing this we noticed that the astropulse portion of the database hadn't been replicating properly to the secondary informix/science database. So this whole project was to fix two things: the corrupted table, and the broken replication. Ultimately we learned a lot along the way cleaning this up ourselves, but each iteration has been sloooow, and a lot of time was lost trying certain things which seemed obvious, but didn't work like we expected. We're nearing the final stages of this (we hope). One silver lining is that we'll get to test recovery of the secondary from our weekly database backup - these backups are 1.2 terabytes in size at this point, so we don't test this procedure often.

So... there have been upload issues starting yesterday. Not sure why; maybe we were just being blitzed more than normal. We tweaked our configuration this morning so that the scheduling server is now also handling 25% of the upload load. Maybe that'll help push the clog through.
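
For anyone curious what a 25% split might look like, here is a minimal sketch of a weighted rotation. It is only an illustration under assumptions: the host names, the 3:1 weights, and the `assign_uploads` helper are hypothetical and are not the project's actual configuration or mechanism.

```python
import itertools

# Hypothetical host names and weights (assumptions, not the real setup).
# A 3:1 ratio sends one upload in four to the scheduling server, i.e. 25%.
UPLOAD_WEIGHTS = {"upload-server": 3, "scheduling-server": 1}


def build_rotation(weights):
    """Expand integer weights into one rotation of host names."""
    rotation = []
    for name, weight in weights.items():
        rotation.extend([name] * weight)
    return rotation


def assign_uploads(n_requests, weights=UPLOAD_WEIGHTS):
    """Assign n_requests uploads to hosts by cycling through the rotation."""
    rotation = itertools.cycle(build_rotation(weights))
    return [next(rotation) for _ in range(n_requests)]


if __name__ == "__main__":
    assignments = assign_uploads(100)
    for host in UPLOAD_WEIGHTS:
        print(host, assignments.count(host))
```

A fixed rotation like this keeps the split exact over any multiple of four requests; a real setup might instead do the weighting in DNS or in a load balancer.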

In worse news, our server closet temperature shot up way too high this weekend. Machines were running 10 degrees (Celsius) hotter than normal, well beyond spec, and in some cases in the "danger zone." This isn't good. We're hoping somebody on campus will come up today and inspect our a/c system, but given it's the holidays we might not get anybody until the new year, in which case... we'll have to shut everything down for a while. We shall see.

- Matt

____________
-- BOINC/SETI@home network/web/science/development person
-- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude

Claggy
Project donor
Volunteer tester
Joined: 5 Jul 99
Posts: 4088
Credit: 33,018,447
RAC: 6,708
United Kingdom
Message 957943 - Posted: 21 Dec 2009, 23:00:53 UTC - in response to Message 957941.

Thanks for the update Matt,

Claggy

B-Man
Volunteer tester
Joined: 11 Feb 01
Posts: 253
Credit: 147,366
RAC: 0
United States
Message 957948 - Posted: 21 Dec 2009, 23:17:49 UTC

Matt, I hope the repair techs come in soon. How soon will you need to make the decision to shut down? Please keep the main page up with a note if you need to shut down.
____________

Gary Charpentier
Project donor
Volunteer tester
Joined: 25 Dec 00
Posts: 12506
Credit: 6,809,638
RAC: 5,696
United States
Message 957969 - Posted: 22 Dec 2009, 0:51:53 UTC - in response to Message 957941.

In worse news, our server closet temperature shot up way too high this weekend. Machines were running 10 degree (Celsius) hotter than normal, and well beyond spec and in some cases the "danger zone." This isn't good. We're hoping somebody on campus will come up today and inspect our a/c system, but given it's the holidays we might not get anybody until the new year, in which case... we'll have to shut everything down for a while. We shall see.

Just make sure someone didn't set some fancy automatic thermostat to shut off over the weekends!

Thanks for the update

____________

RottenMutt
Joined: 15 Mar 01
Posts: 992
Credit: 207,654,737
RAC: 0
United States
Message 957979 - Posted: 22 Dec 2009, 2:02:23 UTC

get a large fan...
____________

T. Bell
Joined: 27 Mar 06
Posts: 20
Credit: 229,303
RAC: 0
Australia
Message 957981 - Posted: 22 Dec 2009, 2:31:22 UTC

Damn! Servers bit hot under the collar?

I noticed the upload server was down for a bit and wasn't quite sure why. It seems to be up again after the tweaks, but now it says the scheduling server is disabled. I know ya set it to handle some of the uploads, but would this normally disable the entire scheduling server?

Once again cheers for the update. They always answer any questions regarding the server status.

-- Tom :-)
____________


Copyright © 2014 University of California