The Owl in Daylight (Aug 24 2011)


Message boards : Technical News : The Owl in Daylight (Aug 24 2011)

Profile Matt Lebofsky
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Mar 99
Posts: 1384
Credit: 74,079
RAC: 0
United States
Message 1144668 - Posted: 24 Aug 2011, 20:34:46 UTC

I'm still here, but this is probably my last tech news item for a long while. Eric/Jeff will try to keep you up to date on the nerdy behind the scenes stuff while I'm gone. They are equally (if not far more) qualified to do so.

So.. regarding this current dearth of workunits. We had a routine drive swap on thumper (our file server, where we keep all the raw data among other things) after one drive started showing signs of impending failure. This unexpectedly caused three problems: 1. the drive swap confused the RAID and we couldn't easily get it out of degraded state, 2. this somehow in turn corrupted the xfs filesystem on said RAID, causing us to lose our on-line cache of raw data, and 3. other systems couldn't mount this filesystem anymore, even after it seemed to be in a stable enough state.

Tie all that together, and you can't make workunits. The good news is we didn't really lose any data, as it's all archived elsewhere, so the weekend was spent copying a lot of raw data back onto systems in our lab. Anyway the long and the short of it is after the dust settled it was easy to un-degrade the RAID (though once again I'm annoyed by the wonky/unpredictable nature of linux software RAID). That took a day to resync. Then I spent a day copying everything off the xfs-corrupted filesystem, made a fresh new reformatted partition, and just started copying everything back. I also kicked all the other machines enough to start mounting this new, remade partition.
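For readers wondering what a "degraded" Linux software RAID looks like from the command line, here is a minimal sketch. It is not taken from thumper or from anything the lab runs: the sample text, array names, and the helper `degraded_arrays` are all illustrative assumptions. It relies only on the usual `/proc/mdstat` layout, where each array has a member map such as `[UU_U]` and an underscore marks a missing or failed device.

```python
import re

# Illustrative text in the usual /proc/mdstat layout (made up, not from thumper).
SAMPLE_MDSTAT = """\
Personalities : [raid1] [raid6] [raid5] [raid4]
md0 : active raid5 sdd1[3] sdc1[2] sdb1[1] sda1[0]
      2930279808 blocks level 5, 64k chunk, algorithm 2 [4/3] [UU_U]
      [=>...................]  recovery =  8.5% (83286144/976759936) finish=120.3min speed=123456K/sec

md1 : active raid1 sdf1[1] sde1[0]
      976759936 blocks [2/2] [UU]

unused devices: <none>
"""


def degraded_arrays(mdstat_text):
    """Return {array_name: member_map} for arrays with a missing member.

    Hypothetical helper: it only looks for an underscore in the [UU_U]-style
    member map and does not interpret resync or recovery progress.
    """
    degraded = {}
    current = None
    for line in mdstat_text.splitlines():
        # Array header lines start with the md device name, e.g. "md0 : active ..."
        header = re.match(r"^(md\d+)\s*:", line)
        if header:
            current = header.group(1)
            continue
        # Member map: one character per slot, "U" = up, "_" = missing or failed.
        status = re.search(r"\[([U_]+)\]", line)
        if current and status and "_" in status.group(1):
            degraded[current] = status.group(1)
    return degraded


if __name__ == "__main__":
    print(degraded_arrays(SAMPLE_MDSTAT))
```

On a real Linux host the same function could be fed the contents of `/proc/mdstat` (for example `degraded_arrays(open("/proc/mdstat").read())`); whether that matches the exact output on any given kernel version is an assumption worth checking first.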

All you really need to know is: it's all looking pretty good, and we'll start making workunits again probably by sometime tomorrow morning, if not sooner.

Meanwhile everything else is pretty much fine. I'm actually mostly busy helping Dan/Eric cobble together a spate of NASA grant proposals. Keep your fingers crossed on those.

- Matt
____________
-- BOINC/SETI@home network/web/science/development person
-- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude

Profile Gary Charpentier
Volunteer tester
Joined: 25 Dec 00
Posts: 11732
Credit: 5,969,877
RAC: 0
United States
Message 1144679 - Posted: 24 Aug 2011, 20:52:58 UTC
Last modified: 24 Aug 2011, 20:53:17 UTC

Thanks for the update, fantastic luck on the grants, and have a blast with the gigs.
Oh set a nag on Eric and Jeff's calendar to post something.
____________

tbret
Volunteer tester
Joined: 28 May 99
Posts: 2390
Credit: 164,184,283
RAC: 69,249
United States
Message 1144681 - Posted: 24 Aug 2011, 20:55:15 UTC - in response to Message 1144668.
Last modified: 24 Aug 2011, 20:57:29 UTC

Break a leg on both the proposals and on stage. Drop us a line in the cafe once in a while, will you?

I *really* appreciate your taking the time to explain what the issues are. For us out here in the cold, sometimes it feels like sitting outside a surgical suite waiting for some word on a long and difficult surgery being done to a loved one.

Just a word of encouragement or an update on progress goes a long, long way toward relieving anxiety out in the hall where we have nothing but last week's newspaper to look at ...and that's regardless of the reputation of the surgeon.

We know we're in good hands, but please encourage Jeff and Eric to check in. They aren't as accustomed to doing it as you are.

[edit] and an added thanks for conquering that RAID you hate.

rob smith
Volunteer moderator
Joined: 7 Mar 03
Posts: 7668
Credit: 44,756,681
RAC: 75,233
United Kingdom
Message 1144682 - Posted: 24 Aug 2011, 21:00:36 UTC

Matt,
Many thanks for your update on the state of play.

Slightly off topic - where can I find your European dates?
____________
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?

Claggy
Volunteer tester
Joined: 5 Jul 99
Posts: 3963
Credit: 31,848,141
RAC: 10,784
United Kingdom
Message 1144686 - Posted: 24 Aug 2011, 21:07:15 UTC - in response to Message 1144668.

Thanks for the update Matt, hope the NASA grant proposals come good, and have a good tour,

Claggy

Profile Matt Lebofsky
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Mar 99
Posts: 1384
Credit: 74,079
RAC: 0
United States
Message 1144688 - Posted: 24 Aug 2011, 21:13:59 UTC - in response to Message 1144682.

Slightly off topic - where can I find your European dates?


That tour hit some booking snags, so it's still in flux. But I know we're slated to play the Airwaves Festival (in Iceland on October 13th) and the Supersonic Festival in Birmingham, UK (on October 21st). And a bunch of France/UK in between those dates. Just keep checking http://www.webofmimicry.com for (hopefully current) details.

- Matt
____________
-- BOINC/SETI@home network/web/science/development person
-- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude

Profile [seti.international] Dirk Sadowski
Volunteer tester
Joined: 6 Apr 07
Posts: 6969
Credit: 57,077,207
RAC: 22,649
Germany
Message 1144694 - Posted: 24 Aug 2011, 21:19:45 UTC - in response to Message 1144668.

Thanks for the news!


Do you maybe have news about the damaged router?
A few members still can't connect to the S@h server (my machine for about 2 days now).


- Best regards! - Sutaru Tsureku, team seti.international founder. - Optimize your PC for higher RAC. - SETI@home needs your help. -
____________
BR



>Das Deutsche Cafe. The German Cafe.<

Profile Matt Lebofsky
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Mar 99
Posts: 1384
Credit: 74,079
RAC: 0
United States
Message 1144709 - Posted: 24 Aug 2011, 21:43:42 UTC - in response to Message 1144694.

Do you maybe have news about the damaged router?


We sent a message to the donor of the router (and the rack space) requesting a plan about what to do next with it, and haven't yet gotten a response. Still, I don't think it's damaged as much as not having enough memory to deal with full-pipe traffic, which normally hasn't been the case. I'd rather we focus on reducing the traffic first before replacing/upgrading hardware. Plans are being enacted to do this (including a better splitter to throw away noisy workunits if it can spot them during creation time).

- Matt
____________
-- BOINC/SETI@home network/web/science/development person
-- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude

Profile Khangollo
Joined: 1 Aug 00
Posts: 245
Credit: 36,410,524
RAC: 0
Slovenia
Message 1144714 - Posted: 24 Aug 2011, 22:09:31 UTC
Last modified: 24 Aug 2011, 22:11:42 UTC

This time it happened with no high levels of traffic (at least bandwidth-wise). All traffic was briefly interrupted for 15 minutes and since then some of us can not reach any of the servers. There is a visible drop in incoming traffic, so right now there might be a lot of us being "blacklisted". I made a snapshot of Cricket graph just in case it might be of any help (I was just monitoring my Boinc the moment it happened).

____________

Profile Jeff Mercer
Joined: 14 Aug 08
Posts: 90
Credit: 152,055
RAC: 674
United States
Message 1144746 - Posted: 24 Aug 2011, 23:27:51 UTC

Thanks for all the info, Matt. Good luck with everything you are doing! I've backed out of the project for a while, but I check in every few days to see how things are going. I'll be back later to continue crunching, but right now there's just too much going on. Hope everything gets fixed and that the project continues on! Enjoy your music!

OzzFan
Volunteer tester
Joined: 9 Apr 02
Posts: 13307
Credit: 27,846,955
RAC: 15,733
United States
Message 1144765 - Posted: 25 Aug 2011, 0:37:02 UTC

Have fun on tour Matt.

Profile Zapped Sparky
Volunteer moderator
Volunteer tester
Joined: 30 Aug 08
Posts: 5625
Credit: 1,120,350
RAC: 1,623
United Kingdom
Message 1144825 - Posted: 25 Aug 2011, 4:14:50 UTC

I'll keep my fingers crossed for the grant proposals, and for the RAID as well.

rob smith
Volunteer moderator
Joined: 7 Mar 03
Posts: 7668
Credit: 44,756,681
RAC: 75,233
United Kingdom
Message 1144844 - Posted: 25 Aug 2011, 5:36:04 UTC

Thanks Matt, for your hard work, and hopefully I'll be able to snag a gig somewhere, now I know where to look :)




(Hopefully your babies won't cry too much while you're away)
____________
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?

Profile S@NL Etienne Dokkum
Volunteer tester
Joined: 11 Jun 99
Posts: 155
Credit: 12,793,334
RAC: 24,737
Netherlands
Message 1144853 - Posted: 25 Aug 2011, 5:45:46 UTC

Hey Matt,

have a good tour !!! Hope you come to Holland too. Then we'll try to come and see...

I have complete confidence that the RAID issues will be sorted out !

What would the NASA grant mean for the entire project?


____________

Profile [seti.international] Dirk Sadowski
Volunteer tester
Joined: 6 Apr 07
Posts: 6969
Credit: 57,077,207
RAC: 22,649
Germany
Message 1144962 - Posted: 25 Aug 2011, 13:49:00 UTC - in response to Message 1144709.

Do you maybe have news about the damaged router?

We sent a message to the donor of the router (and the rack space) requesting a plan about what to do next with it, and haven't yet gotten a response. Still, I don't think it's damaged as much as not having enough memory to deal with full-pipe traffic, which normally hasn't been the case. I'd rather we focus on reducing the traffic first before replacing/upgrading hardware. Plans are being enacted to do this (including a better splitter to throw away noisy workunits if it can spot them during creation time).

- Matt


It's now the 2nd time that my machine can't connect.

The first time was 09 Aug 2011, 14:46 UTC, and it lasted ~1 1/2 days.
This was before the weekly maintenance.
So I guess something went wrong during the maintenance.
Then the next day I guess someone was in the lab, because IIRC the website was not reachable for some time. After that, the website and the server were reachable again.

This time there has been no contact with the server since 22 Aug 2011, ~21:30 UTC.

Maybe the router needs a reboot once a week until the problem is solved?


- Best regards! - Sutaru Tsureku, team seti.international founder. - Optimize your PC for higher RAC. - SETI@home needs your help. -
____________
BR



>Das Deutsche Cafe. The German Cafe.<

Profile kepan
Joined: 17 Sep 99
Posts: 7
Credit: 27,368,474
RAC: 0
Sweden
Message 1145021 - Posted: 25 Aug 2011, 16:34:05 UTC

Hello
It seems that I'm not alone in being unable to upload workunits. I'm still crunching but will soon be out of work, with a lot to upload.
/Per (Sweden)
____________

Profile Matt Lebofsky
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Mar 99
Posts: 1384
Credit: 74,079
RAC: 0
United States
Message 1145066 - Posted: 25 Aug 2011, 17:52:28 UTC

Despite better judgement (being late in the week, and I'm the only computer geek at the lab today and nobody else will be in until Monday) I did just reboot the router. Maybe that'll fix some of y'alls connections.

- Matt
____________
-- BOINC/SETI@home network/web/science/development person
-- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude

msattler
Volunteer tester
Joined: 9 Jul 00
Posts: 37300
Credit: 498,907,038
RAC: 503,116
United States
Message 1145067 - Posted: 25 Aug 2011, 17:54:12 UTC - in response to Message 1145066.

Despite better judgement (being late in the week, and I'm the only computer geek at the lab today and nobody else will be in until Monday) I did just reboot the router. Maybe that'll fix some of y'alls connections.

- Matt

Thanks for giving it the ol' college try.
Hopefully Eric will be able to attempt that RAM upgrade in the near future with positive results.
____________
******************
Crunching Seti, loving all of God's kitties.

I have met a few friends in my life.
Most were cats.

Profile Sirius B
Volunteer tester
Joined: 26 Dec 00
Posts: 9295
Credit: 1,360,871
RAC: 1,550
United Kingdom
Message 1145079 - Posted: 25 Aug 2011, 18:16:27 UTC

Thanks for the info Matt.....if you do make it to the Supersonic Festival, I'll look forward to seeing you there as it's "just up the road from me"....
____________

Terror Australis
Volunteer tester
Joined: 14 Feb 04
Posts: 1626
Credit: 200,788,926
RAC: 66,301
Australia
Message 1145080 - Posted: 25 Aug 2011, 18:23:00 UTC - in response to Message 1145066.

Despite better judgement (being late in the week, and I'm the only computer geek at the lab today and nobody else will be in until Monday) I did just reboot the router. Maybe that'll fix some of y'alls connections.

- Matt

Thanks for trying, Matt, but still no joy here.

T.A.


Copyright © 2014 University of California