Advance (Jun 25 2009)
Author | Message |
---|---|
Matt Lebofsky Joined: 1 Mar 99 Posts: 1444 Credit: 957,058 RAC: 0 |
Fallout continues from the outage on Tuesday. Turns out the minor corruption in various MyISAM tables is messing up replication. Every so often a duplicate entry appears in the replica queue, which is easy to remove but requires human intervention. This is causing the replica to fall further and further behind. I'm loath to give up on it, though, as that means being forced to point all queries, including non-essential ones, at the master. And that'll break everything. We also had to fall back to using two download servers, but we did so using simple DNS round-robin load balancing. Obviously this wasn't working out so well. DNS rollout/caching is never balanced (we saw this several times before, especially during the feeder mod polarity issues a year or two ago). So this morning we fell all the way back to using "pound" - which forces exactly 50% of all incoming connections to go to the first server, and the rest to the second one. This immediately broke the current download log jam, though of course we're still maxed out bandwidth-wise as I write this paragraph. Seems like there are a lot of frustrated people on these threads. There's no right or wrong way to feel about these outages. We're kind of a special case. At the core we're an academic project with no deadlines - normally nobody gets hurt if science is delayed a day or a month or a decade. On the other hand, we're forced to be "professional" since we're asking for various forms of support from many thousands of people, and you can't have that large a number of people involved without some sort of professional-grade management and public relations. It's a daily puzzle marrying the two completely separate worlds. - Matt -- BOINC/SETI@home network/web/science/development person -- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude |
aplayer Joined: 26 Apr 00 Posts: 13 Credit: 15,217,341 RAC: 0 |
Thanks for the news and good luck with marrying 2 worlds. lol. |
Bill Walker Joined: 4 Sep 99 Posts: 3868 Credit: 2,697,267 RAC: 0 |
It all depends on your point of view, Matt. I subscribe to the This Is Science, Really, school, so I'm still here, coming up on 10 years, and I don't complain about the occasional outage (well, not much, anyway). It is kind of Darwinian, in a way. Those who subscribe to the This Is A Service, And It's Not Very Reliable, school wind up drifting away, to the next fad or passion of the day. Goodbye and good luck to the trendies, I'll stay here for a bit longer. Can't wait to see what happens next. |
Cameron Joined: 27 Nov 02 Posts: 110 Credit: 5,082,471 RAC: 17 |
Thanks Matt for your regular updates. It's difficult to balance the two worlds, although SETI, like all the volunteer computing projects that we subscribe to, is Science. SETI, as one of the leaders within the volunteer computing world, also needs to be professional when dealing with its volunteers and supporters, which it does very well Matt :-D. I wonder if SETI was congratulated by the guys over at Einstein@Home for the ten year anniversary (as I remember you mentioned SETI outages coincided with contact from the Einstein Project). I've consistently run SETI from the day I joined and I'm not considering stopping. |
CryptokiD Joined: 2 Dec 00 Posts: 150 Credit: 3,216,632 RAC: 0 |
I appreciate the nearly daily updates on your progress, the servers, etc. It keeps bringing me back looking for more. A lot of admins wouldn't bother telling the users the tech details of why something is not working, or why I personally can't ul/dl for 2 days now. You guys are honest and do not sugar coat things. I like that. Good luck getting things sorted out. If I had a way to help other than encouragement I surely would. Doubt you guys would need an MCSE since you run some form of unix on the servers. |
zpm Joined: 25 Apr 08 Posts: 284 Credit: 1,659,024 RAC: 0 |
hey, chill, take a breather... and relax a little.... i've learned, being at a tv station, to relax even when you're at the point of shooting a piece of crap machine that's a p3 doing a dual-core job.... I recommend Secunia PSI: http://secunia.com/vulnerability_scanning/personal/ Go Georgia Tech. |
Wiggo Joined: 24 Jan 00 Posts: 36590 Credit: 261,360,520 RAC: 489 |
Well it is frustratin' to a lot of ppl i spose when a person has say 7 pc's runnin' it but only 2 have ample work (1 of them, old P4C, is still keepin' pace while the other 1, older E6300, is bein' emptied out as to try Win7 on) but the other 5 are consistently runnin' out of work every day (yes a 10 day cache is set), so if there was some way to keep these pc's fed & be able to catch up with the database later, it would to a lot be a blessing. To me, I'm savin' on greenhouse gases & my powerbill. ;) |
KWSN Ekky Ekky Ekky Joined: 25 May 99 Posts: 944 Credit: 52,956,491 RAC: 67 |
"Seems like there are a lot of frustrated people on these threads. There's no right or wrong way to feel about these outages. We're kind of a special case. At the core we're an academic project with no deadlines - normally nobody gets hurt if science is delayed a day or a month or a decade. On the other hand, we're forced to be 'professional' since we're asking for various forms of support from many thousands of people, and you can't have that large a number of people involved without some sort of professional grade management and public relations. It's a daily puzzle marrying the two completely separate worlds." These wise words should be placed at the top of the NC board permanently. This is a serious proposal. It could save a large proportion of the unhappiness. |
PT Joined: 19 May 99 Posts: 231 Credit: 902,910 RAC: 0 |
Keep up the good work Matt. Much appreciated. PT Happy crunching |
MarkJ Joined: 17 Feb 08 Posts: 1139 Credit: 80,854,192 RAC: 5 |
Thanks for the updates Matt. I'm one of those that subscribes to the view this is science and there is no point getting upset if something breaks. Boinc will sort itself out. I had a CPDN wu that uploaded all except one file. It took almost 2 weeks to get it uploaded due to issues on the server (apparently ran out of disk space). No point stressing. Einstein's file server has crashed. That might be why Seti is getting more traffic than normal, apart from the outage recovery. Cheers, MarkJ BOINC blog |
Hammeh Joined: 21 May 01 Posts: 135 Credit: 1,143,316 RAC: 0 |
Thanks for letting us know about all the issues, it does help to prevent people getting annoyed when they know what is wrong. Computers break, simple as, no point in getting annoyed at all in my opinion. We have been with seti for long enough to know what the norm is, just set a larger cache than you would for other projects (I use 7 days) and then you are not affected by these small problems. My computers will be happily crunching into the middle of next week =) Ignore those who complain, as long as you guys are doing your best, which we all know you are, then people will just have to accept these things happen. |
Will Malven Joined: 2 Jun 99 Posts: 52 Credit: 4,441,977 RAC: 0 |
C'mon Matt, buck up old bean. I for one (and I suspect many) have spent my entire life working with scientific equipment and computers. Face it, a day without something breaking down is like a gift from above. I'm sure there are many for whom this project is a source of ego-stroking, but for the rest of us...we are here as volunteers...we chose any pain we suffer :). I have been here off and on since June of '99 because I thought it a great idea and an important experiment with a huge potential for the future of mankind...gee, makes me kind of misty-eyed. There are a bunch of projects to which people can attach to occupy their computers. After all, self-inflicted misery is the easy misery to cure. So "God bless America!" "God save the Queen!" "Don't shoot till you see the whites of their eyes!" "Damn the torpedoes!" and "Full speed ahead!" Illegitimus non carborundum est! Man's future lies in the stars, not on Earth. It is each successive generation's responsibility to humanity to expand the knowledge and understanding of our Universe so that we may one day venture forth to meet our neighbors. Houston, Texas |
davor [SETI Team Croatia] Joined: 20 Jan 03 Posts: 10 Credit: 71,166,788 RAC: 0 |
I agree with all the other messages of support - keep up the good work Matt and we'll all be here waiting for the system to be repaired - and when it is back up again, eagerly downloading new workunits to be processed. All the best with your good work, davor |
Fred J. Verster Joined: 21 Apr 04 Posts: 3252 Credit: 31,903,643 RAC: 0 |
"Seems like there are a lot of frustrated people on these threads. There's no right or wrong way to feel about these outages. We're kind of a special case. At the core we're an academic project with no deadlines - normally nobody gets hurt if science is delayed a day or a month or a decade. On the other hand, we're forced to be 'professional' since we're asking for various forms of support from many thousands of people, and you can't have that large a number of people involved without some sort of professional grade management and public relations. It's a daily puzzle marrying the two completely separate worlds." I can only agree with the above statement! Thanks for your information on this hot item! |
PhonAcq Joined: 14 Apr 01 Posts: 1656 Credit: 30,658,217 RAC: 1 |
I, too, fully agree with Matt's statement quoted below. Yet, I believe there is always something to learn from reasoned criticism. Hopefully, Matt and others bite the lemon, lick the salt, down the tequila, and read the comments from time to time. |
elgar Joined: 21 May 99 Posts: 69 Credit: 2,687,478 RAC: 0 |
Shouldn't the weekly outage be changed to 'weeklong outage'? Tell us again how the project needs more people crunching, please. Oh, and be sure to ask for $$$. |
Marius Joined: 11 Mar 00 Posts: 12 Credit: 16,655,085 RAC: 0 |
"Shouldn't the weekly outage be changed to 'weeklong outage'? Tell us again how the project needs more people crunching, please. Oh, and be sure to ask for $$$." I don't know how you find the time to give us updates! @Elgar lol, i think Matt will settle for a decent database server with proper synchronisation ;) Personally i never trusted MySQL for anything but light web stuff |
OzzFan Joined: 9 Apr 02 Posts: 15691 Credit: 84,761,841 RAC: 28 |
"I, too, fully agree with Matt's statement quoted below. Yet, I believe there is always something to learn from reasoned criticism. Hopefully, Matt and others bite the lemon, lick the salt, down the tequila, and read the comments from time to time." I'm not Matt, and I really can't speak for him, but I don't think Matt minds our comments, and I think he's hinted on more than one occasion that he reads plenty of them. He's even popped into threads that I didn't expect him to be reading! :) |
OzzFan Joined: 9 Apr 02 Posts: 15691 Credit: 84,761,841 RAC: 28 |
Looking at your past post history, I really don't think you're just trying to be cute, Elgar. In fact, it seems your comment was downright "trollish", because everyone else is offering their support for Matt, which is the exact opposite of the agenda you've been on for quite some time now. And I see through browsing your past comments, you often like to "mask" your trollishness in the same way every other troll I've ever met does, by trying to create an aura of legitimacy around your very forked-tongue questions. |
Edywison Joined: 31 May 09 Posts: 13 Credit: 20,563 RAC: 0 |
Guess... computers are not as capable as what we imagine. Can't blame anyone though. |