Tuesday Edition (Aug 26 2008)

Message boards : Technical News : Tuesday Edition (Aug 26 2008)

Matt Lebofsky
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Mar 99
Posts: 1444
Credit: 957,058
RAC: 0
United States
Message 802336 - Posted: 26 Aug 2008, 22:53:45 UTC

Ah, yes - here we go again - the regular Tuesday outage for MySQL database backup/compression and other tasks better suited to happen during "quiescent" time.

For example, this week we replaced the failed drive in the workunit storage server with a new drive. That was painless. We also spent a bunch of time experimenting with the new-ish RAID server. I say "new-ish" as it's new to us, but it is an old system. For example, it can't handle logical volumes greater than 2TB. However, today we confirmed that (a) it can handle single physical drives of at least 750GB, and (b) it can handle physical volumes greater than 2TB (i.e. three 750GB drives, 2.25TB raw, put together to make a 1.5TB RAID5).
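The numbers in (b) can be checked with a little arithmetic. The sketch below is only an illustration, not anything from the project's own tooling: it assumes identical drives, decimal units (1TB = 1000GB), and the usual RAID5 rule that one drive's worth of space goes to parity. The function name and the limit constant are made up for the example.

```python
def raid5_capacities(drive_count: int, drive_gb: int) -> tuple[int, int]:
    """Return (raw_gb, usable_gb) for a RAID5 set of identical drives."""
    if drive_count < 3:
        raise ValueError("RAID5 needs at least three drives")
    raw_gb = drive_count * drive_gb            # total physical space in the set
    usable_gb = (drive_count - 1) * drive_gb   # one drive's worth is used for parity
    return raw_gb, usable_gb


# Assumption: the 2TB logical-volume limit is taken as 2000GB (decimal units).
LOGICAL_LIMIT_GB = 2000

raw_gb, usable_gb = raid5_capacities(3, 750)
print(f"raw: {raw_gb}GB, usable: {usable_gb}GB")
print("raw space exceeds 2TB:", raw_gb > LOGICAL_LIMIT_GB)
print("logical volume fits under 2TB:", usable_gb <= LOGICAL_LIMIT_GB)
```

So three 750GB drives give 2250GB of raw space but a 1500GB logical volume, which stays under the limit.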

We also tested that this system is keeping up pretty well doing a continual backup of our upload directory. That is, we're doing a constant rsync with the upload directory to keep a "hot backup" around on a separate system. We didn't have the bandwidth/storage capacity to do this ourselves before (and daily backups to tape were too expensive).
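For anyone curious what a "constant rsync" might look like, here is a minimal sketch. It is not the script the project actually runs: the paths, the host name, and the pause length are all hypothetical, and the only rsync option used is the standard `-a` (archive mode).

```python
import subprocess
import time


def build_rsync_command(source_dir: str, backup_target: str) -> list[str]:
    """Build an rsync command that mirrors the contents of source_dir."""
    # -a (archive) recurses and preserves permissions and timestamps.
    # The trailing slash tells rsync to copy the directory's contents
    # rather than the directory itself.
    return ["rsync", "-a", source_dir.rstrip("/") + "/", backup_target]


def mirror_forever(source_dir: str, backup_target: str, pause_seconds: int = 60) -> None:
    """Run rsync over and over so the backup never falls far behind."""
    command = build_rsync_command(source_dir, backup_target)
    while True:
        result = subprocess.run(command)
        if result.returncode != 0:
            # Keep looping; a later pass will pick up whatever was missed.
            print(f"rsync exited with code {result.returncode}, retrying")
        time.sleep(pause_seconds)


# Hypothetical usage (both paths are placeholders):
# mirror_forever("/data/upload", "backuphost:/backup/upload")
```

Each pass only transfers files that are new or changed since the last one, which is what makes a loop like this cheap enough to run all the time.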

Anyway.. the extended length of the outage today was mostly due to revamping the way we're doing the backups. We're working to include better query blocking (to ensure the database is totally update-free) and figure out the best way to maximize our time, thus ultimately shortening these outages.

- Matt

-- BOINC/SETI@home network/web/science/development person
-- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude
Dr. C.E.T.I.
Joined: 29 Feb 00
Posts: 16019
Credit: 794,685
RAC: 0
United States
Message 802339 - Posted: 26 Aug 2008, 22:57:34 UTC
Last modified: 26 Aug 2008, 22:58:09 UTC

. . . nice work Matt - and that goes out to each of you @ Berkeley as well
BOINC Wiki . . .

Science Status Page . . .
Edward Lee Michau
Joined: 31 Jul 06
Posts: 138
Credit: 9,640,846
RAC: 0
United States
Message 802348 - Posted: 26 Aug 2008, 23:22:21 UTC

Great work Matt. Glad to see you back. Hope your vacation was super.

Ed
BarryAZ
Joined: 1 Apr 01
Posts: 2580
Credit: 16,982,517
RAC: 0
United States
Message 802351 - Posted: 26 Aug 2008, 23:42:42 UTC - in response to Message 802336.  

Matt, thanks for the update. A suggestion: update the weekly Tuesday notice. Instead of 3 to 4 hours, have it be either 5 to 6 hours (which seems to be more accurate), or 3 to 6 hours (which allows for the relatively rare situation where nothing extra is scheduled and everything runs smoothly). It is all about setting expectations.

DJStarfox
Joined: 23 May 01
Posts: 1066
Credit: 1,226,053
RAC: 2
United States
Message 802394 - Posted: 27 Aug 2008, 1:37:36 UTC - in response to Message 802336.  
Last modified: 27 Aug 2008, 1:38:51 UTC

No one made this comment yet... I just wanted to say great work getting the assimilation queue down to less than 100k. From what I've seen, it's been a backlog for a while until now. That should help storage space a little too, I'd imagine.

PS And I hope donations from Bluff's drive last week make some kind of difference we can see.
[B^S] madmac
Volunteer tester
Joined: 9 Feb 04
Posts: 1175
Credit: 4,754,897
RAC: 0
United Kingdom
Message 802466 - Posted: 27 Aug 2008, 7:00:41 UTC

Thank you for this information. Now, I have a problem: I have only a small knowledge of computers. So I was wondering if we can have this in layman's language, so that people like myself can truly understand what a great job you are doing and what you are doing every week to improve this wonderful system.
Keith T.
Volunteer tester
Joined: 23 Aug 99
Posts: 962
Credit: 537,293
RAC: 9
United Kingdom
Message 802681 - Posted: 28 Aug 2008, 0:32:32 UTC - in response to Message 802466.  

@ madmac

I think Matt already posts mostly in understandable language; after all, this is the "Technical News" forum.

May I suggest that if you don't understand a particular technical term, you could try looking it up on a site like http://www.whatis.com or http://www.wikipedia.org or Google etc.

Regards, Keith
Sir Arthur C Clarke 1917-2008

©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.