Wednesday Mound (Feb 06 2008)

Message boards : Technical News : Wednesday Mound (Feb 06 2008)

Profile Matt Lebofsky
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Mar 99
Posts: 1444
Credit: 957,058
RAC: 0
United States
Message 708978 - Posted: 6 Feb 2008, 23:04:24 UTC

Recovery from yesterday's outage wasn't so bad after all, but we're hitting another wall. Well, not a wall as much as a mound. That mound is our science database server, thumper. Those watching the status page may have noticed it's having a harder and harder time keeping up with making work (the ready-to-send queue is hardly ever full) and keeping up with assimilation (the ready-to-assimilate queue is hardly ever empty - in fact, it's been growing slowly over the past 24 hours).

Of course, it's not the database load - thumper has almost 50 Terabytes of storage on it, so it also serves as our raw data buffer (where we keep all the data images for the splitters to chew on) as well as database backup storage (where we write/archive a 500GB data file every week). In short, we're hitting disk I/O limits on thumper. I fear making the "vertical" splitter (which acts on many raw data files simultaneously to reduce the impact of hitting too much noise on a single file) has reduced any benefit of disk caching to zero. Since we're basically keeping up now, I whittled our number of splitters from 10 to 6 - hopefully this will help. I don't want to revert to non-vertical splitting just yet - we'll have greater problems if we do. Bob may also employ some different Informix checkpointing parameters to reduce the impact of long checkpoints, which are blocking science database traffic about 25% of the time. We're pretty much in wait-and-see mode on that.
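
To illustrate the caching point, here is a toy model, not a description of thumper's actual cache. It assumes a small LRU block cache with fixed readahead; the function names, cache size, readahead depth, and file counts are all made up for illustration. It compares reading files one at a time against reading one block from each file in turn, which is roughly what a "vertical" access pattern looks like to the disk.

```python
from collections import OrderedDict

def hit_rate(accesses, cache_blocks, readahead):
    """Fraction of block reads served from a toy LRU cache with readahead."""
    cache = OrderedDict()  # keys are (file_id, block_no), oldest entry first
    hits = 0
    for file_id, block in accesses:
        key = (file_id, block)
        if key in cache:
            hits += 1
            cache.move_to_end(key)  # mark as most recently used
            continue
        # Miss: fetch this block plus the next few blocks of the same file.
        for b in range(block, block + readahead):
            cache[(file_id, b)] = True
            cache.move_to_end((file_id, b))
        # Evict least recently used blocks until the cache fits again.
        while len(cache) > cache_blocks:
            cache.popitem(last=False)
    return hits / len(accesses)

def one_file_at_a_time(n_files, blocks_per_file):
    """Read every block of file 0, then every block of file 1, and so on."""
    return [(f, b) for f in range(n_files) for b in range(blocks_per_file)]

def vertical(n_files, blocks_per_file):
    """Read block 0 of every file, then block 1 of every file, and so on."""
    return [(f, b) for b in range(blocks_per_file) for f in range(n_files)]

# Hypothetical sizes: 16 files of 64 blocks, a 64-block cache, 8-block readahead.
sequential_rate = hit_rate(one_file_at_a_time(16, 64), cache_blocks=64, readahead=8)
vertical_rate = hit_rate(vertical(16, 64), cache_blocks=64, readahead=8)
print(sequential_rate, vertical_rate)
```

In this model the one-file-at-a-time pattern gets most of its reads from readahead, while the vertical pattern evicts each file's readahead before coming back to it, so the hit rate collapses. Real caches and RAID layouts are far more complicated, so treat this only as intuition for why touching many files at once can defeat caching.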

Jeff and I are more or less done hammering out the current set of kinks in our data pipeline from Arecibo to your computer. This will all be automated shortly. We also just threw a very short chunk of data from last week (28ja08aa) into the splitter queue. It's already being split, actually. This contains radar blanking data. We're going to process it once without the blanker logic, and again with. It's a data-beta-test. We want to really make sure it works before processing dozens of whole files. I'll try to remember to throw up some before/after plots comparing the two runs once they are complete.
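
As a rough picture of what a with/without comparison looks like, here is a minimal sketch. The real blanker logic isn't described in this thread, so everything here is an assumption: the function names, the per-sample `radar_on` flag, the zero fill value, and the peak-power statistic are hypothetical stand-ins, and the sample values are invented.

```python
def apply_blanking(samples, radar_on, fill=0.0):
    """Return a copy of samples with radar-flagged entries replaced by fill.

    Hypothetical stand-in: the actual blanker may replace flagged data
    with something other than a constant.
    """
    if len(samples) != len(radar_on):
        raise ValueError("samples and radar_on must be the same length")
    return [fill if flagged else s for s, flagged in zip(samples, radar_on)]

def peak_power(samples):
    """Largest squared amplitude, used here as a crude 'did radar leak in' check."""
    return max(s * s for s in samples)

# Invented data: two strong samples in the middle coincide with the radar flag.
samples = [0.1, -0.2, 9.0, -8.5, 0.3, 0.1]
radar_on = [False, False, True, True, False, False]

without_blanker = peak_power(samples)
with_blanker = peak_power(apply_blanking(samples, radar_on))
print(without_blanker, with_blanker)
```

Running the same data through both paths and comparing a summary statistic is the idea behind the two runs; the before/after plots would play the role of the final print.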

- Matt

-- BOINC/SETI@home network/web/science/development person
-- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude
ID: 708978
Profile [KWSN]John Galt 007
Volunteer tester
Joined: 9 Nov 99
Posts: 2444
Credit: 25,086,197
RAC: 0
United States
Message 708982 - Posted: 6 Feb 2008, 23:23:03 UTC

Thanks for the update...hope your weather is better than ours in Wisconsin. Milwaukee could get 12-18 inches of snow...
Clk2HlpSetiCty:::PayIt4ward

ID: 708982
Profile Dr. C.E.T.I.
Joined: 29 Feb 00
Posts: 16019
Credit: 794,685
RAC: 0
United States
Message 709049 - Posted: 7 Feb 2008, 0:44:37 UTC

Thanks for the Post Matt . . . it's Much Appreciated Sir!
BOINC Wiki . . .

Science Status Page . . .
ID: 709049
Jamenjaw

Joined: 1 Jul 05
Posts: 11
Credit: 217,684
RAC: 0
United States
Message 709109 - Posted: 7 Feb 2008, 2:26:27 UTC

Good to hear (see?) things are going somewhat better for you guys.

And to John: COULD get 12-18? In my neck of the woods we're at 18+ already :D
ID: 709109
Profile KWSN THE Holy Hand Grenade!
Volunteer tester
Joined: 20 Dec 05
Posts: 3187
Credit: 57,163,290
RAC: 0
United States
Message 709135 - Posted: 7 Feb 2008, 3:19:39 UTC - in response to Message 709109.  

Good to hear (see?) things are going somewhat better for you guys.

And to John: COULD get 12-18? In my neck of the woods we're at 18+ already :D


Yes, we (or at least I...) out here on the left coast have heard about the current weather in the Midwest. I heard that it was below zero in Iowa, for example (Des Moines): it came up while I was discussing a truck (a big rig) that I may buy. (The truck is in California, but the sales staff at the company is in Iowa...)

The S.F. Bay Area is experiencing a mild "cold snap" in that temperatures are below normal, even for this time of year.
.

Hello, from Albany, CA!...
ID: 709135
Profile KWSN THE Holy Hand Grenade!
Volunteer tester
Joined: 20 Dec 05
Posts: 3187
Credit: 57,163,290
RAC: 0
United States
Message 709136 - Posted: 7 Feb 2008, 3:23:06 UTC

Matt, I know you are busy, but...

The "future plans" page has a number of typos and misspellings on it - can someone run a spell-checker on it?
.

Hello, from Albany, CA!...
ID: 709136
DJStarfox

Joined: 23 May 01
Posts: 1066
Credit: 1,226,053
RAC: 2
United States
Message 709188 - Posted: 7 Feb 2008, 6:51:11 UTC - in response to Message 708978.  
Last modified: 7 Feb 2008, 6:51:29 UTC

Is there another storage device you can use instead of thumper for the splitter scratch disk?

It could also be that you're asking the 50TB volume to do both fast random I/O (r/w database files) and fast sequential I/O, and those don't play well together for high performance. Splitters do a lot of sequential I/O, correct? Ideally, they would be on separate RAID arrays. Is this on the hardware donation list?
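
A back-of-the-envelope way to see the random-versus-sequential conflict is to count head movement on a single imaginary disk. This is only a sketch under loud assumptions: one disk head, positions as plain integers, no RAID striping, no request reordering, and made-up request counts. The names below are hypothetical.

```python
import random

def total_seek_distance(requests, start=0):
    """Sum of head movement needed to service requests in the given order."""
    position, distance = start, 0
    for target in requests:
        distance += abs(target - position)
        position = target
    return distance

rng = random.Random(0)  # fixed seed so the run is repeatable
sequential = list(range(1000))                               # streaming reads
scattered = [rng.randrange(1_000_000) for _ in range(1000)]  # database-like lookups

# Same requests, but the two workloads take turns on one shared disk.
mixed = [r for pair in zip(sequential, scattered) for r in pair]

seq_only = total_seek_distance(sequential)
separate = seq_only + total_seek_distance(scattered)  # each workload on its own disk
shared = total_seek_distance(mixed)                   # both workloads on one disk
print(seq_only, separate, shared)
```

On its own the sequential stream barely moves the head, but once it is interleaved with scattered requests every sequential read pays for a long seek there and back. That is the intuition behind wanting separate arrays for the two kinds of traffic.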
ID: 709188
Profile Matt Lebofsky
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Mar 99
Posts: 1444
Credit: 957,058
RAC: 0
United States
Message 709330 - Posted: 7 Feb 2008, 17:53:34 UTC

Replies:

I fixed the spelling errors on the plans page. I hate bad spelling. Thanks for pointing that out.

Unfortunately thumper is the only available contiguous space of that size - after all the RAIDing and file system overhead the raw data storage on it is 8TB. You're right - we're fully aware the RAID/LVM configuration on the thing isn't optimal, but there's no immediate hardware solution, and even if there were, it's gonna be long and painful to move all the data onto new storage. We'll bite the bullet at some point - in the meantime we're doing more database performance tweaking. Stay tuned.

- Matt
-- BOINC/SETI@home network/web/science/development person
-- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude
ID: 709330
P . P . L .
Volunteer tester

Joined: 7 Jun 03
Posts: 86
Credit: 161,216
RAC: 0
Australia
Message 709375 - Posted: 7 Feb 2008, 20:54:08 UTC

Hi Matt.

I keep getting HTTP errors when trying to upload, but downloads are OK.

Is that something you're working on, or some fallout from yesterday?

pete.


ID: 709375



 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.