Hemiola (Aug 12 2010)


Message boards : Technical News : Hemiola (Aug 12 2010)

Profile Matt Lebofsky
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Mar 99
Posts: 1389
Credit: 74,079
RAC: 0
United States
Message 1024344 - Posted: 12 Aug 2010, 20:58:48 UTC

Wrapping up the weekly "extended outage." Jeff's actually out today, but will be back to turn the servers on tomorrow (i.e. Friday, when I'm usually out).

I finally got around to testing a drive on mork (the mysql server) that the RAID card deemed "failed" at some point, but maybe that was a transient problem as it seems fine now. Nevertheless I went through the rigamarole of pulling that drive, putting a new one in, testing it, making it a new hot spare, etc.

That's all good, but the week has been tainted by mork issues in general. It had one of its regular mystery crashes on Tuesday (followed by a long recovery). Then last night, and again this morning, the RAID mirror of two solid state drives (where we keep the innodb logs) started going flaky on us. The partition would just disappear, sending mysql into fits. We were able to quickly recover, but we're abandoning the solid state drives for now. Honestly, they weren't adding all that much to the i/o picture because we were cautious about how we were implementing them. Now I'm glad we were cautious. The upshot of all the above is that we've had to recover the replica from the weekly backup as many as four times so far. What a pain. The latest replica recovery is happening as I type this. All I hope is that all systems are normal and stable by tomorrow.

Everything else is fine. In fact, more than fine, as a set of very generous participants donated $6000 towards a new server that will become the new science database server. THANK YOU!! We're still spec'ing out said server, but will go ahead sooner rather than later now that we don't have to set up a funding drive!

Meanwhile I'm still chipping away at various data analysis projects, and Jeff's been fighting with data synchronization issues that have been creeping in more and more lately. We also had a "design meeting" regarding where to go with the public involvement of candidate selection. I'm finding some plug-n-play visualization utilities online, but pretty much I'm finding (like always) that it might just be easier and better if I do it all myself with tools I already know. However, some improvements go beyond that scope, so I'm digging into AJAX, which is good stuff to know, I guess.

- Matt

____________
-- BOINC/SETI@home network/web/science/development person
-- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude

Profile Ageless
Joined: 9 Jun 99
Posts: 12324
Credit: 2,626,601
RAC: 951
Netherlands
Message 1024346 - Posted: 12 Aug 2010, 21:07:11 UTC - in response to Message 1024344.

Shouldn't you name that new server after the benefactors? Or is MRJHJT too difficult to pronounce in the office? ;-)
____________
Jord

Fighting for the correct use of the apostrophe, together with Weird Al Yankovic

Profile Scarecrow
Joined: 15 Jul 00
Posts: 4385
Credit: 459,193
RAC: 1
United States
Message 1024348 - Posted: 12 Aug 2010, 21:09:07 UTC - in response to Message 1024346.

Ageless wrote:
MRJHJT

Oddly enough, that's the noise a solid state drive makes when it augers in.

DJStarfox
Joined: 23 May 01
Posts: 1044
Credit: 559,966
RAC: 582
United States
Message 1024351 - Posted: 12 Aug 2010, 21:15:53 UTC - in response to Message 1024344.

With $6k, would be nice to squeeze a real RAID card out for the new server.

Profile [seti.international] Dirk Sadowski
Project donor
Volunteer tester
Joined: 6 Apr 07
Posts: 7101
Credit: 60,867,954
RAC: 17,251
Germany
Message 1024352 - Posted: 12 Aug 2010, 21:19:19 UTC - in response to Message 1024344.

Matt, thanks for the news!

____________
BR

SETI@home Needs your Help ... $10 & U get a Star!

Team seti.international

Das Deutsche Cafe. The German Cafe.

Claggy
Project donor
Volunteer tester
Joined: 5 Jul 99
Posts: 4139
Credit: 33,461,439
RAC: 21,450
United Kingdom
Message 1024353 - Posted: 12 Aug 2010, 21:20:17 UTC - in response to Message 1024344.

Thanks for the update Matt,

Claggy

Profile Bill Walker
Joined: 4 Sep 99
Posts: 3395
Credit: 2,143,076
RAC: 2,187
Canada
Message 1024354 - Posted: 12 Aug 2010, 21:20:44 UTC

So, if Hocket referred to the way you guys share work at Berkeley, does Hemiola refer to the days between server failures lately? (1 2 3, 1 2 3, 1 2, 1 2, 1 2)

Oh, and thanks for the update.
____________

Profile Gary Charpentier
Project donor
Volunteer tester
Joined: 25 Dec 00
Posts: 12705
Credit: 7,197,618
RAC: 15,740
United States
Message 1024362 - Posted: 12 Aug 2010, 21:38:06 UTC

Thanks for the update.

____________

Profile soft^spirit
Joined: 18 May 99
Posts: 6374
Credit: 28,631,148
RAC: 3
United States
Message 1024369 - Posted: 12 Aug 2010, 22:15:21 UTC

As one of the donations (yet to be delivered.. plans are in progress)..
I would vote for a name like "Planters", 'cause it's from a bunch of nuts.
____________

Janice

Profile soft^spirit
Joined: 18 May 99
Posts: 6374
Credit: 28,631,148
RAC: 3
United States
Message 1024377 - Posted: 12 Aug 2010, 22:45:47 UTC - in response to Message 1024344.

Matt.. if I might share.. from my experiences of keeping together antiques that were often poorly "refurbished"..

many problems clear permanently upon "re-seating" unplugging, and plugging back in. Other times taking things out, some surprise drops loose(seen or unseen 50/50).. and are then magically "fixed". Whether they were dirty connections, a bit of dust, someones raisinette.. does not really matter as long as they clear. a bad connection invisible to the eye might nearly need "bumped".. and could be gone forever.

We came up with things such as "pencil test".. where while monitoring the signal we tapped the outside case and see if it had effects. And some of the equipment was old enough to even contain mercury relays, where the mercury would vaporize, re-solidify in obscure pieces, and refuse to work until we "bounced" (hold edge of component 3-4" above anti-static surface, drop and catch on first bounce, re-insert) to clear.

These are also good reasons why "fault tolerance" is a good(although expensive) principle.

On the reports going back.. all of these were jotted down as "re-seat to clear."

Because if we told the truth, the whole truth, and nothing but the truth... it would have been the Salem Witch trials all over again.
____________

Janice

zoom314
Project donor
Joined: 30 Nov 03
Posts: 46492
Credit: 36,845,024
RAC: 5,224
United States
Message 1024378 - Posted: 12 Aug 2010, 22:47:29 UTC - in response to Message 1024369.

soft^spirit wrote:
As one of the donations (yet to be delivered.. plans are in progress)..
I would vote for a name like "Planters" Cause its from a bunch of nuts.

Planters sounds good to me too.

@ Matt: Thanks for the update on Mork's Odyssey.
____________
My Facebook, War Commander, 2015

John McLeod VII
Volunteer developer
Volunteer tester
Joined: 15 Jul 99
Posts: 24681
Credit: 522,659
RAC: 40
United States
Message 1024427 - Posted: 13 Aug 2010, 1:17:46 UTC - in response to Message 1024378.

soft^spirit wrote:
As one of the donations (yet to be delivered.. plans are in progress)..
I would vote for a name like "Planters" Cause its from a bunch of nuts.

zoom314 wrote:
Planters sound good to Me too.

@ Matt: Thanks for the update on Morks Odyssey.

How about Bedlam? As in a house full of fruits, nuts and flakes.
____________


BOINC WIKI

zoom314
Project donor
Joined: 30 Nov 03
Posts: 46492
Credit: 36,845,024
RAC: 5,224
United States
Message 1024449 - Posted: 13 Aug 2010, 3:03:51 UTC - in response to Message 1024427.

soft^spirit wrote:
As one of the donations (yet to be delivered.. plans are in progress)..
I would vote for a name like "Planters" Cause its from a bunch of nuts.

zoom314 wrote:
Planters sound good to Me too.

@ Matt: Thanks for the update on Morks Odyssey.

John McLeod VII wrote:
How about Bedlam? As in a house full of fruits, nuts and flakes.

I'm sure someone will find a name somewhere.
____________
My Facebook, War Commander, 2015

Profile Jack Zhang
Volunteer tester
Joined: 2 Jul 06
Posts: 206
Credit: 6,111,036
RAC: 942
Canada
Message 1024492 - Posted: 13 Aug 2010, 8:58:57 UTC

I hear SSD talk in this news post...

Avoid Kingston and consumer OCZ products when it comes to SSDs. Intel is only good if it's SLC memory and if there was an SSD move made, that SSD must have a supercapacitor to handle Server IOs per second. Pretty much the only choice when it comes to Server SSDs is the Sandforce SF-1500 controller chips with supercapacitor.
____________
What if Fiction was Fact and Fact was Fiction and vice versa?

Profile Helli
Project donor
Volunteer tester
Joined: 15 Dec 99
Posts: 704
Credit: 90,011,759
RAC: 78,562
Germany
Message 1024514 - Posted: 13 Aug 2010, 12:34:53 UTC - in response to Message 1024346.

Ageless wrote:
Shouldn't you name that new server after the benefactors? Or is MRJHJT too difficult to pronounce in the office? ;-)


Well, I don't believe that we would find a word that represents the six sponsors.

But I would love to see a sticker on the server with something written on it like "Mainly sponsored by Mark, Richard, Josef, Helli, John and T.A." ;-)
A picture in the SETI@home Photo Album would also be fine, so we can say years later: "Hey, look, 1/6 of this rig was sponsored by me". :-)

Only my 2c. :-)

Helli
____________
A loooong time ago: My first Credits

Josef W. Segur
Project donor
Volunteer developer
Volunteer tester
Joined: 30 Oct 99
Posts: 4298
Credit: 1,067,168
RAC: 1,010
United States
Message 1024557 - Posted: 13 Aug 2010, 14:42:37 UTC - in response to Message 1024369.

soft^spirit wrote:
As one of the donations (yet to be delivered.. plans are in progress)..
I would vote for a name like "Planters" Cause its from a bunch of nuts.

I never doubted your pledge for August 28, and believe the project should be considering $7000 as donated to the cause.

Stretching the allusion to peanuts a bit further, perhaps Carver would be an interesting name possibility.
Joe

Profile soft^spirit
Joined: 18 May 99
Posts: 6374
Credit: 28,631,148
RAC: 3
United States
Message 1024563 - Posted: 13 Aug 2010, 15:02:37 UTC - in response to Message 1024557.

Honestly, until a couple of posts ago, it never occurred to me that it was not considered part of the 6K. There was a donation of 1K announced after the goal was reached.. ahh well.

In any case.. add it however they want. "Hardware" is the only stipulation to it.
____________

Janice

Profile KWSN THE Holy Hand Grenade!
Volunteer tester
Joined: 20 Dec 05
Posts: 1958
Credit: 10,429,780
RAC: 8,343
United States
Message 1024565 - Posted: 13 Aug 2010, 15:04:24 UTC - in response to Message 1024377.

soft^spirit wrote:
Matt.. if I might share.. from my experiences of keeping together antiques that were often poorly "refurbished"..

many problems clear permanently upon "re-seating" unplugging, and plugging back in. Other times taking things out, some surprise drops loose(seen or unseen 50/50).. and are then magically "fixed". Whether they were dirty connections, a bit of dust, someones raisinette.. does not really matter as long as they clear. a bad connection invisible to the eye might nearly need "bumped".. and could be gone forever.

We came up with things such as "pencil test".. where while monitoring the signal we tapped the outside case and see if it had effects. And some of the equipment was old enough to even contain mercury relays, where the mercury would vaporize, re-solidify in obscure pieces, and refuse to work until we "bounced" (hold edge of component 3-4" above anti-static surface, drop and catch on first bounce, re-insert) to clear.

These are also good reasons why "fault tolerance" is a good(although expensive) principle.

On the reports going back.. all of these were jotted down as "re-seat to clear."

Because if we told the truth, the whole truth, and nothing but the truth... it would have been the Salem Witch trials all over again.


One thing that used to work on CRT terminals, back in the '80s, was to give them a "slap upside the screen". Some terminals would come back to life for a time after the slap. Location (and force) was brand-dependent, and with one of the brands, there were two methods that worked, depending on symptom: the slap, directed at the upper right of the CRT, and lifting the front of the CRT about an inch, and dropping. IBM 3278's were pretty reliable, but when they went, they could (sometimes...) be brought back by slapping the back right corner, and picking up the back about .5 inch, and dropping...

____________
.

Profile Bill Walker
Joined: 4 Sep 99
Posts: 3395
Credit: 2,143,076
RAC: 2,187
Canada
Message 1024574 - Posted: 13 Aug 2010, 15:24:57 UTC - in response to Message 1024565.

soft^spirit wrote:
Matt.. if I might share.. from my experiences of keeping together antiques that were often poorly "refurbished"..

many problems clear permanently upon "re-seating" unplugging, and plugging back in. Other times taking things out, some surprise drops loose(seen or unseen 50/50).. and are then magically "fixed". Whether they were dirty connections, a bit of dust, someones raisinette.. does not really matter as long as they clear. a bad connection invisible to the eye might nearly need "bumped".. and could be gone forever.

We came up with things such as "pencil test".. where while monitoring the signal we tapped the outside case and see if it had effects. And some of the equipment was old enough to even contain mercury relays, where the mercury would vaporize, re-solidify in obscure pieces, and refuse to work until we "bounced" (hold edge of component 3-4" above anti-static surface, drop and catch on first bounce, re-insert) to clear.

These are also good reasons why "fault tolerance" is a good(although expensive) principle.

On the reports going back.. all of these were jotted down as "re-seat to clear."

Because if we told the truth, the whole truth, and nothing but the truth... it would have been the Salem Witch trials all over again.


KWSN THE Holy Hand Grenade! wrote:
One thing that used to work on CRT terminals, back in the '80s, was to give them a "slap upside the screen". Some terminals would come back to life for a time after the slap. Location (and force) was brand-dependent, and with one of the brands, there were two methods that worked, depending on symptom: the slap, directed at the upper right of the CRT, and lifting the front of the CRT about an inch, and dropping. IBM 3278's were pretty reliable, but when they went, they could (sometimes...) be brought back by slapping the back right corner, and picking up the back about .5 inch, and dropping...


As a (mostly) mechanical engineer, it does my heart good to see my electronic colleagues adopting the time-honoured and tested ways of the mech eng.
____________

Speedy
Volunteer tester
Joined: 26 Jun 04
Posts: 678
Credit: 5,918,151
RAC: 4,255
New Zealand
Message 1024911 - Posted: 14 Aug 2010, 5:33:56 UTC
Last modified: 14 Aug 2010, 5:34:46 UTC

What great news re the $6k donation. From Staycation (Jul 01 2010):

"Data wise, we were able to get back to merging our various spike tables together full bore"
How far through merging the spike tables are you now?

The BOINC replica database is shown as running on the left-hand side of the Server Status page, yet beside "Replica seconds behind master" it says Offline. Is it still recovering after its various crashes throughout the week?

Thanks so much for the update
____________

Live in NZ y not join Smile City?


Copyright © 2014 University of California