Wednesday Mound (Feb 06 2008)
Author | Message |
---|---|
Matt Lebofsky Joined: 1 Mar 99 Posts: 1444 Credit: 957,058 RAC: 0 |
Recovery from yesterday's outage wasn't so bad after all, but we're hitting another wall. Well, not a wall as much as a mound. That mound is our science database server, thumper. Those watching the status page may have noticed it's having a harder and harder time keeping up with making work (ready-to-send queue is hardly ever full) and keeping up with assimilation (ready-to-assimilate queue is hardly ever empty - in fact, it's been growing slowly over the past 24 hours). Of course, it's not the database load - thumper has almost 50 Terabytes of storage on it, so it also serves as our raw data buffer (where we keep all the data images for the splitters to chew on) as well as database backup storage (where we write/archive a 500GB data file every week). In short, we're hitting disk I/O limits on thumper. I fear making the "vertical" splitter (which acts on many raw data files simultaneously to reduce the impact of hitting too much noise on a single file) has reduced any benefit of disk caching to zero. Since we're basically keeping up now, I whittled our number of splitters from 10 to 6 - hopefully this will help. I don't want to revert to non-vertical splitting just yet - we'll have greater problems if we do. Bob may also employ some different Informix checkpointing parameters to reduce the impact of long checkpoints blocking science database traffic about 25% of the time. We're pretty much in wait-and-see mode on that. Jeff and I are more or less done hammering out the current set of kinks in our data pipeline from Arecibo to your computer. This will all be automated shortly. We also just threw a very short chunk of data into the splitter queue from last week (28ja08aa). It's already being split, actually. This contains radar blanking data. We're going to process it once without the blanker logic, and again with. It's a data-beta-test. We want to really make sure it works before processing dozens of whole files. 
I'll try to remember to throw up some before/after plots comparing the two runs once they are complete. - Matt -- BOINC/SETI@home network/web/science/development person -- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude |
[KWSN]John Galt 007 Joined: 9 Nov 99 Posts: 2444 Credit: 25,086,197 RAC: 0 |
Thanks for the update...hope your weather is better than ours in Wisconsin. Milwaukee could get 12-18 inches of snow... Clk2HlpSetiCty:::PayIt4ward |
Dr. C.E.T.I. Joined: 29 Feb 00 Posts: 16019 Credit: 794,685 RAC: 0 |
Thanks for the Post Matt . . . it's Much Appreciated Sir! BOINC Wiki . . . Science Status Page . . . |
Jamenjaw Joined: 1 Jul 05 Posts: 11 Credit: 217,684 RAC: 0 |
good to hear(see?) things are going somewhat better for you guys. and to john: COULD get 12-18? in my neck of the woods we're at 18+ already :D |
KWSN THE Holy Hand Grenade! Joined: 20 Dec 05 Posts: 3187 Credit: 57,163,290 RAC: 0 |
good there hear(see?) things are going somewhat better for you guys. Yes, we, (or at least I...) out here on the left coast, have heard about the current weather in the mid-west. I heard that it was below zero in Iowa, for example: (Des Moines) it came up while I was discussing a truck (a big rig) that I may buy. (the truck is in California, but the sales staff at the company is in Iowa...) The S.F. Bay Area is experiencing a mild "cold snap" in that temperatures are below normal, even for this time of year. . Hello, from Albany, CA!... |
KWSN THE Holy Hand Grenade! Joined: 20 Dec 05 Posts: 3187 Credit: 57,163,290 RAC: 0 |
Matt, I know you are busy, but... the "future plans" page has a number of typos and mis-spellings on it - can someone run a spell-checker on it? . Hello, from Albany, CA!... |
DJStarfox Joined: 23 May 01 Posts: 1066 Credit: 1,226,053 RAC: 2 |
Is there another storage device you can use instead of thumper for the splitter scratch disk? Could be also that you're asking the 50TB volume to do both fast random I/O (r/w database files) and fast sequential I/O; and those don't play well together for high performance. Splitters write a lot of sequential I/O, correct? Ideally, they would be separate RAID arrays. Is this on the hardware donation list? |
Matt Lebofsky Joined: 1 Mar 99 Posts: 1444 Credit: 957,058 RAC: 0 |
Replies: I fixed the spelling errors on the plans page. I hate bad spelling. Thanks for pointing that out. Unfortunately thumper is the only available contiguous space of that size - after all the RAIDing and file system overhead the raw data storage on it is 8TB. But you're right - we're fully aware the RAID/LVM configuration on the thing isn't optimal, but there's no immediate hardware solution and even if there was it's gonna be long and painful to move all the data around onto new storage. We'll bite the bullet at some point - in the meantime we're doing more database performance tweaking now. Stay tuned. - Matt -- BOINC/SETI@home network/web/science/development person -- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude |
P . P . L . Joined: 7 Jun 03 Posts: 86 Credit: 161,216 RAC: 0 |
Hi Matt. I keep getting HTTP errors when trying to U/L, but D/L are o.k. Is that something you're working on, or some fallout from yesterday? pete. |
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.