Advance (Jun 25 2009)



Message boards : Technical News : Advance (Jun 25 2009)

Matt Lebofsky
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Mar 99
Posts: 1391
Credit: 74,079
RAC: 10
United States
Message 911379 - Posted: 25 Jun 2009, 20:59:16 UTC

Fallout continues from the outage on Tuesday. Turns out the minor corruption in various MyISAM tables is messing up replication. Every so often a duplicate entry appears on the replica queue, which is easy to remove but requires human intervention. This is causing the replica to fall further and further behind. I'm loath to give up on it, though, as that means being forced to point all queries, including non-essential ones, at the master. And that'll break everything.
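A toy model can make the "further and further behind" effect concrete. This is only a sketch and not the project's actual replication setup: the function name, the one-event-per-tick write rate, the two-events-per-tick catch-up rate, and the fix delay are all assumptions chosen for illustration. The replica stops on a duplicate entry and resumes only after a person has cleared it.

```python
def replica_lag(events, duplicate_at, fix_delay):
    """Return the replica's lag (in events) after each tick.

    events:       number of ticks; the master writes one event per tick
    duplicate_at: set of 0-based event indexes that hit a duplicate entry
    fix_delay:    ticks a human needs to notice and clear a stuck event
    All three are hypothetical parameters, not measured values.
    """
    applied = 0          # events the replica has processed so far
    blocked_until = 0    # tick at which a stuck replica may resume
    cleared = set()      # duplicates already removed by hand
    lag = []
    for tick in range(1, events + 1):
        if tick >= blocked_until:
            # Assumed catch-up rate: at most two events per tick.
            for _ in range(2):
                if applied >= tick:
                    break  # never run ahead of the master
                if applied in duplicate_at and applied not in cleared:
                    # Replication halts until someone intervenes.
                    cleared.add(applied)
                    blocked_until = tick + fix_delay
                    break
                applied += 1
        lag.append(tick - applied)
    return lag


print(replica_lag(10, {3}, 4))
```

In this model a single duplicate costs several ticks of lag that take just as long to work off, so if duplicates arrive faster than the replica can recover between fixes, the lag keeps ratcheting upward.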

We also had to fall back to using two download servers, but we did so using simple DNS round-robin load balancing. Obviously this wasn't working out so well. DNS rollout/caching is never balanced (we saw this several times before, especially during the feeder mod polarity issues a year or two ago). So this morning we fell all the way back to using "pound" - which forces exactly 50% of all incoming connections to go to the first server, and the rest to the second one. This immediately broke the current download log jam, though of course we're still maxed out bandwidth-wise as I write this paragraph.
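The difference between the two approaches can be sketched in a few lines. This is a simplified illustration, not a model of how DNS resolvers or pound actually work internally: it assumes each client resolves the name once, caches the answer, and then sends all of its requests to that one server, while the balancer alternates on every connection. The function names and the per-client request counts are made up for the example.

```python
def dns_round_robin(requests_per_client, servers=2):
    """DNS hands out servers in rotation, once per client lookup.

    Each client caches its answer, so all of that client's requests
    land on the same server no matter how many it sends.
    """
    load = [0] * servers
    for i, requests in enumerate(requests_per_client):
        load[i % servers] += requests
    return load


def per_connection_alternation(total_requests, servers=2):
    """A balancer in front of the servers alternates every connection."""
    load = [0] * servers
    for n in range(total_requests):
        load[n % servers] += 1
    return load


# Hypothetical mix of heavy and light clients.
clients = [100, 1, 50, 2, 80, 3]
print(dns_round_robin(clients))                  # [230, 6]
print(per_connection_alternation(sum(clients)))  # [118, 118]
```

Under these assumptions the DNS rotation is perfectly fair per lookup, yet the servers end up badly unbalanced because the heavy clients happen to share one address. Alternating per connection evens out the load regardless of which client a request comes from.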

Seems like there are a lot of frustrated people on these threads. There's no right or wrong way to feel about these outages. We're kind of a special case. At the core we're an academic project with no deadlines - normally nobody gets hurt if science is delayed a day or a month or a decade. On the other hand, we're forced to be "professional" since we're asking for various forms of support from many thousands of people, and you can't have that large a number of people involved without some sort of professional grade management and public relations. It's a daily puzzle marrying the two completely separate worlds.

- Matt

____________
-- BOINC/SETI@home network/web/science/development person
-- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude

aplayer
Joined: 26 Apr 00
Posts: 13
Credit: 12,618,297
RAC: 0
United States
Message 911403 - Posted: 25 Jun 2009, 21:35:37 UTC - in response to Message 911379.

Thanks for the news and good luck with marrying 2 worlds. lol.

Bill Walker
Joined: 4 Sep 99
Posts: 3459
Credit: 2,215,081
RAC: 1,019
Canada
Message 911410 - Posted: 25 Jun 2009, 21:54:08 UTC

It all depends on your point of view, Matt. I subscribe to the This Is Science, Really, school, so I'm still here, coming up on 10 years, and I don't complain about the occasional outage (well, not much, anyway).

It is kind of Darwinian, in a way. Those who subscribe to the This Is A Service, And It's Not Very Reliable, school wind up drifting away, to the next fad or passion of the day. Goodbye and good luck to the trendies, I'll stay here for a bit longer. Can't wait to see what happens next.
____________

Sten-Arne
Volunteer tester
Joined: 1 Nov 08
Posts: 3769
Credit: 21,495,442
RAC: 15,554
Sweden
Message 911435 - Posted: 25 Jun 2009, 22:40:44 UTC - in response to Message 911379.


Seems like there are a lot of frustrated people on these threads. There's no right or wrong way to feel about these outages. We're kind of a special case. At the core we're an academic project with no deadlines - normally nobody gets hurt if science is delayed a day or a month or a decade. On the other hand, we're forced to be "professional" since we're asking for various forms of support from many thousands of people, and you can't have that large a number of people involved without some sort of professional grade management and public relations. It's a daily puzzle marrying the two completely separate worlds.

- Matt


Thanks for the update.

Well, of course there will always be frustrated people here. However, as I see it, this kind of science is a long-term project, and as far as I am concerned outages do not make me want to run away screaming, looking for another project. I will be here for a long time. You're all doing a hell of a good job with the small resources you have to work with.

If there aren't any WUs available, and/or if I can't download or upload for a prolonged time, I can always shut down the computers for a while, and then turn them on again when things work again. If I can't see my stats in real time, if the replica db falls behind or is offline, that surely will not make me lose any sleep.

Sten-Arne

Cameron
Joined: 27 Nov 02
Posts: 71
Credit: 1,055,668
RAC: 101
Australia
Message 911473 - Posted: 26 Jun 2009, 0:21:00 UTC

Thanks Matt for your regular updates.

It's difficult to balance the two worlds. SETI, like all the volunteer computing projects that we subscribe to, is science. But as one of the leaders within the volunteer computing world, SETI also needs to be professional when dealing with its volunteers and supporters, which it does very well, Matt :-D.

I wonder if SETI was congratulated by the guys over at Einstein@Home for the ten-year anniversary (as I remember, you mentioned SETI outages coincided with contact from the Einstein project).

I've consistently run SETI from the day I joined and I'm not considering stopping.

CryptokiD
Joined: 2 Dec 00
Posts: 134
Credit: 2,814,936
RAC: 0
United States
Message 911478 - Posted: 26 Jun 2009, 1:05:57 UTC

I appreciate the nearly daily updates on your progress, the servers, etc. It keeps bringing me back looking for more. A lot of admins wouldn't bother telling the users the tech details of why something is not working, or why I personally can't ul/dl for 2 days now. You guys are honest and do not sugar coat things. I like that.

Good luck getting things sorted out. If I had a way to help other than encouragement I surely would.

Doubt you guys would need an MCSE since you run some form of unix on the servers.

zpm
Volunteer tester
Joined: 25 Apr 08
Posts: 284
Credit: 1,616,654
RAC: 336
United States
Message 911507 - Posted: 26 Jun 2009, 4:12:17 UTC - in response to Message 911478.

Hey, chill, take a breather... and relax a little.... I've learned, being at a TV station, to relax even when you're to the point of shooting a piece of crap machine that's a P3 doing a dual-core job....
____________

I recommend Secunia PSI: http://secunia.com/vulnerability_scanning/personal/
Go Georgia Tech.

Wiggo
Joined: 24 Jan 00
Posts: 8603
Credit: 99,374,508
RAC: 54,832
Australia
Message 911530 - Posted: 26 Jun 2009, 6:02:50 UTC - in response to Message 911379.
Last modified: 26 Jun 2009, 6:04:04 UTC

Well it is frustratin' to a lot of ppl I s'pose when a person has say 7 PCs runnin' it but only 2 have ample work (1 of them, old P4C, is still keepin' pace while the other 1, older E6300, is bein' emptied out so as to try Win7 on it) but the other 5 are consistently runnin' out of work every day (yes, a 10 day cache is set). So if there was some way to keep these PCs fed & be able to catch up with the database later, that would be a blessing to a lot of people.

To me, I'm savin' on greenhouse gases & my powerbill. ;)
____________

KWSN Ekky Ekky Ekky
Joined: 25 May 99
Posts: 928
Credit: 12,587,945
RAC: 11,283
United Kingdom
Message 911537 - Posted: 26 Jun 2009, 6:32:35 UTC - in response to Message 911379.

Seems like there are a lot of frustrated people on these threads. There's no right or wrong way to feel about these outages. We're kind of a special case. At the core we're an academic project with no deadlines - normally nobody gets hurt if science is delayed a day or a month or a decade. On the other hand, we're forced to be "professional" since we're asking for various forms of support from many thousands of people, and you can't have that large a number of people involved without some sort of professional grade management and public relations. It's a daily puzzle marrying the two completely separate worlds.

- Matt

These wise words should be placed at the top of the NC board permanently. This is a serious proposal. It could save a large proportion of the unhappiness.
____________

PT
Joined: 19 May 99
Posts: 231
Credit: 902,910
RAC: 0
United Kingdom
Message 911551 - Posted: 26 Jun 2009, 7:31:55 UTC

Keep up the good work Matt.
Much appreciated.

PT
____________
Happy crunching

MarkJ
Project donor
Volunteer tester
Joined: 17 Feb 08
Posts: 944
Credit: 25,177,936
RAC: 793
Australia
Message 911562 - Posted: 26 Jun 2009, 7:45:33 UTC

Thanks for the updates Matt.

I'm one of those that subscribes to the view this is science and there is no point getting upset if something breaks. Boinc will sort itself out.

I had a CPDN wu that uploaded all except one file. It took almost 2 weeks to get it uploaded due to issues on the server (apparently ran out of disk space). No point stressing.

Einstein's file server has crashed. That might be why Seti is getting more traffic than normal, apart from the outage recovery.

Cheers,
MarkJ
____________
BOINC blog

Hammeh
Volunteer tester
Joined: 21 May 01
Posts: 135
Credit: 1,143,316
RAC: 0
United Kingdom
Message 911584 - Posted: 26 Jun 2009, 9:37:43 UTC

Thanks for letting us know about all the issues; it does help to prevent people getting annoyed when they know what is wrong. Computers break, simple as; no point in getting annoyed at all, in my opinion. We have been with SETI for long enough to know what the norm is: just set a larger cache than you would for other projects (I use 7 days) and then you are not affected by these small problems. My computers will be happily crunching into the middle of next week =)

Ignore those who complain, as long as you guys are doing your best, which we all know you are, then people will just have to accept these things happen.
____________

Will Malven
Joined: 2 Jun 99
Posts: 52
Credit: 1,879,157
RAC: 0
United States
Message 911618 - Posted: 26 Jun 2009, 11:58:10 UTC

C'mon Matt, buck up old bean.

I for one (and I suspect many) have spent my entire life working with scientific equipment and computers. Face it, a day without something breaking down is like a gift from above.

I'm sure there are many for whom this project is a source of ego-stroking, but for the rest of us...we are here as volunteers...we chose any pain we suffer :).

I have been here off and on since June of '99 because I thought it a great idea and an important experiment with a huge potential for the future of mankind...gee, makes me kind of misty-eyed.

There are a bunch of projects to which people can attach to occupy their computers. After all, self-inflicted misery is the easy misery to cure.

So "God bless America!" "God save the Queen!" "Don't shoot till you see the whites of their eyes!" "Damn the torpedoes!" and "Full speed ahead!"

Illegitimus non carborundum est!
____________
Man's future lies in the stars, not on Earth. It is each successive generation's responsibility to humanity to expand the knowledge and understanding of our Universe so that we may one day venture forth to meet our neighbors.

Houston, Texas

davor [SETI Team Croatia]
Joined: 20 Jan 03
Posts: 10
Credit: 71,165,746
RAC: 540
Croatia
Message 911619 - Posted: 26 Jun 2009, 11:59:10 UTC - in response to Message 911379.

I agree with all the other messages of support - keep up the good work Matt and we'll all be here waiting for the system to be repaired - and when it is back up again, eagerly downloading new workunits to be processed. All the best with your good work,

davor
____________

Fred J. Verster
Volunteer tester
Joined: 21 Apr 04
Posts: 3252
Credit: 31,903,520
RAC: 119
Netherlands
Message 911622 - Posted: 26 Jun 2009, 12:11:32 UTC - in response to Message 911537.

Seems like there are a lot of frustrated people on these threads. There's no right or wrong way to feel about these outages. We're kind of a special case. At the core we're an academic project with no deadlines - normally nobody gets hurt if science is delayed a day or a month or a decade. On the other hand, we're forced to be "professional" since we're asking for various forms of support from many thousands of people, and you can't have that large a number of people involved without some sort of professional grade management and public relations. It's a daily puzzle marrying the two completely separate worlds.

- Matt

These wise words should be placed at the top of the NC board permanently. This is a serious proposal. It could save a large proportion of the unhappiness.


I can only agree with the above statement!
Thanks for your information on this hot item!

____________

PhonAcq
Joined: 14 Apr 01
Posts: 1624
Credit: 22,614,387
RAC: 4,524
United States
Message 911634 - Posted: 26 Jun 2009, 12:53:26 UTC

I, too, fully agree with Matt's statement quoted below. Yet, I believe there is always something to learn from reasoned criticism. Hopefully, Matt and others bite the lemon, lick the salt, down the tequila, and read the comments from time to time.

elgar
Joined: 21 May 99
Posts: 69
Credit: 2,687,478
RAC: 0
United States
Message 911637 - Posted: 26 Jun 2009, 13:00:21 UTC

Shouldn't the weekly outage be changed to 'weeklong outage'? Tell us again how the project needs more people crunching, please. Oh, and be sure to ask for $$$.

Marius
Volunteer tester
Joined: 11 Mar 00
Posts: 12
Credit: 16,655,085
RAC: 0
Netherlands
Message 911646 - Posted: 26 Jun 2009, 13:35:13 UTC - in response to Message 911637.

Shouldn't the weekly outage be changed to 'weeklong outage'? Tell us again how the project needs more people crunching, please. Oh, and be sure to ask for $$$.


I don't know how you find the time to give us updates!

@Elgar lol, I think Matt will settle for decent database servers with proper synchronisation ;) Personally I never trusted MySQL for anything but light web stuff


____________

OzzFan
Volunteer tester
Joined: 9 Apr 02
Posts: 13706
Credit: 31,738,003
RAC: 13,029
United States
Message 911700 - Posted: 26 Jun 2009, 15:32:55 UTC - in response to Message 911634.

I, too, fully agree with Matt's statement quoted below. Yet, I believe there is always something to learn from reasoned criticism. Hopefully, Matt and others bite the lemon, lick the salt, down the tequila, and read the comments from time to time.


I'm not Matt, and I really can't speak for him, but I don't think Matt minds our comments, and I think he's hinted on more than one occasion that he reads plenty of them. He's even popped into threads that I didn't expect him to be reading! :)
____________

OzzFan
Volunteer tester
Joined: 9 Apr 02
Posts: 13706
Credit: 31,738,003
RAC: 13,029
United States
Message 911701 - Posted: 26 Jun 2009, 15:37:10 UTC - in response to Message 911637.

Looking at your past post history, I really don't think you're just trying to be cute, Elgar. In fact, it seems your comment was downright "trollish" because of everyone else offering their support for Matt which is the exact opposite of your agenda you've been on for quite some time now. And I see through browsing your past comments, you often like to "mask" your trollishness in the same way every other troll I've ever met does by trying to create an aura of legitimacy around your very forked-tongue questions.
____________


Copyright © 2014 University of California