Breathing (Jan 25 2011)


Message boards : Technical News : Breathing (Jan 25 2011)

Author Message
Profile Matt Lebofsky
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Mar 99
Posts: 1391
Credit: 74,079
RAC: 10
United States
Message 1070578 - Posted: 25 Jan 2011, 23:56:04 UTC

Progress. We had our regular weekly outage (mysql backup/compression) during which we continued fixing older problems and tackled newer stuff.

To update the bruno status: I think I solved all its disk problems. I "shredded" the root drives - some lingering vestige of a former partition on there was making the Fedora installer go nuts. Then I was able to successfully get a new OS on there and boot it up. I then managed to upgrade the firmware on the 3ware raid card, which seems to have removed its penchant for making drives go missing upon regular system reboots. So... without any need for additional hardware we got the old bruno ready to assume its old duties again. Meanwhile synergy has been doing a good job pretending to be bruno. By next week sometime we'll be back to where we were.

Meanwhile, I finally had a moment to add the memory recently donated for synergy - so it's up to a full 96GB of RAM (just like oscar and carolyn).

There was some hardware shuffling in the closet, so both oscar and carolyn were shut down during the outage, which means both databases need to flood their caches for a while before the project gets back up to speed. During the oscar reboot I set the data partition to mount with the "noatime" flag - this may help i/o a little bit. Also still messing with raid configuration on those systems. We may see additional performance improvements over time.
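
For anyone wanting to confirm that a partition really is mounted with "noatime", here is a minimal Python sketch. It assumes the Linux /proc/mounts layout (device, mount point, filesystem type, comma-separated options). The device names and the /data mount point in the sample are made up for illustration and are not the actual oscar layout.

```python
def mount_options(mounts_text, mount_point):
    """Return the option set for mount_point, or None if it is not listed.

    mounts_text is expected in /proc/mounts format. If a mount point
    appears more than once, the last entry is used.
    """
    found = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[1] == mount_point:
            found = set(fields[3].split(","))
    return found


def has_noatime(mounts_text, mount_point):
    """True if mount_point is listed and mounted with the noatime option."""
    opts = mount_options(mounts_text, mount_point)
    return opts is not None and "noatime" in opts


# Hypothetical sample; on a real system read the text from /proc/mounts.
SAMPLE = """\
/dev/sda1 / ext4 rw,relatime 0 0
/dev/md0 /data ext4 rw,noatime 0 0
"""

if __name__ == "__main__":
    print(has_noatime(SAMPLE, "/data"))
    print(has_noatime(SAMPLE, "/"))
```

On a live machine the same check could be run with `has_noatime(open("/proc/mounts").read(), "/data")`, again with the mount point swapped for the real one.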

We're also aggressively working on the ptolemy/thumper transformations, which means getting everything currently on thumper off of it so we can reformat that whole system and have it take over ptolemy's duties (all internal use). I was hoping to do this partition by partition, but long ago we decided to make all these partitions on top of a single LVM volume group, which means removing partitions requires a major song and dance - unless we just blow it all away at once. I choose the latter.

- Matt

____________
-- BOINC/SETI@home network/web/science/development person
-- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude

SMW
Project donor
Joined: 16 May 99
Posts: 21
Credit: 11,418,674
RAC: 3,354
United States
Message 1070581 - Posted: 26 Jan 2011, 0:02:38 UTC

Matt,
Thanks for keeping us up to date:)
____________
"It is better to be hated for what you are than to be loved for what you are not"
- Andre Gide (1869-1951)

Claggy
Project donor
Volunteer tester
Joined: 5 Jul 99
Posts: 4247
Credit: 34,972,525
RAC: 21,932
United Kingdom
Message 1070583 - Posted: 26 Jan 2011, 0:05:15 UTC - in response to Message 1070578.

Thanks for the update Matt, congrats for making progress on Bruno,

Claggy

Profile Frizz
Volunteer tester
Joined: 17 May 99
Posts: 271
Credit: 5,852,934
RAC: 0
New Zealand
Message 1070585 - Posted: 26 Jan 2011, 0:11:06 UTC - in response to Message 1070583.

Hrm ... is it only me - or do others have problems to report as well?
____________
Petition against 1366x768 glare displays: http://www.facebook.com/home.php?sk=group_153240404724993

Richard Haselgrove
Project donor
Volunteer tester
Joined: 4 Jul 99
Posts: 8824
Credit: 53,572,275
RAC: 46,956
United Kingdom
Message 1070608 - Posted: 26 Jan 2011, 0:53:20 UTC - in response to Message 1070585.

Hrm ... is it only me - or do others have problems to report as well?

They were slow at first for me, but seem to be picking up steadily as the server DB caches refill.

Big Bang
Joined: 7 Jan 10
Posts: 670
Credit: 28,481
RAC: 0
Message 1070620 - Posted: 26 Jan 2011, 1:54:05 UTC

Thank you Matt. I always keep a good thought for you antipodeans slaving away on fractured HAL over there. Much appreciated. I'll have a beer on this, our auspicious ORSTRAYLYA Day, in your honour. Bewdy mate.

Profile Todd Hebert
Volunteer tester
Joined: 16 Jun 00
Posts: 647
Credit: 217,127,962
RAC: 0
United States
Message 1070684 - Posted: 26 Jan 2011, 4:29:10 UTC

Glad that you got the RAID card and array going again. And in the long run it was a good healthy test to see if the GPU Users server was up to the task of a heavy throughput role in the event of a disaster.

There may have been fewer spindles but the drives were also SAS-2 so twice the thru-put. For heavy transactions we only use 15k RPM drives in RAID 10 or 50 and add in SSD cache from Intel or Adaptec - works awesome!

Excellent work Matt!
Todd

Profile Corvid
Joined: 31 Oct 05
Posts: 12
Credit: 4,913,477
RAC: 1,166
United States
Message 1070928 - Posted: 26 Jan 2011, 21:41:19 UTC

Thanks for the update Matt,

It's amazing how much a firmware update can help. You think everything was at least working with the old firmware, but then when it gets updated a lot of sometimes seemingly unrelated problems just go away.

Profile Zapped Sparky
Project donor
Volunteer moderator
Volunteer tester
Joined: 30 Aug 08
Posts: 9393
Credit: 1,336,499
RAC: 803
United Kingdom
Message 1070932 - Posted: 26 Jan 2011, 21:57:29 UTC

I'm glad you've solved Bruno's issues. Hopefully it didn't involve too much hair removal :)

Robert Ribbeck
Joined: 7 Jun 02
Posts: 644
Credit: 5,283,174
RAC: 0
United States
Message 1071147 - Posted: 27 Jan 2011, 15:39:39 UTC

Any chance of fixing the "User of the Day" on the main page?
It's been stuck on the same guy for a week.

Copyright © 2014 University of California