Retreat! (Jun 24 2009)
Author | Message |
---|---|
Matt Lebofsky Joined: 1 Mar 99 Posts: 1444 Credit: 957,058 RAC: 0 |
Despite efforts to reduce the outage time yesterday, the database was bloated enough (for various reasons) to take all day compressing/backing up. The replica wasn't even close to being ready by the time I left the lab, and still wasn't done before I went to bed last night. That meant all queries had to be aimed at the master, including all the read-only stuff that usually hits the replica - stats collection scripts, result state count scripts, the daily credit multiplier calculation (which is rather expensive), and lots of annoying web scraping queries. All those excess things pretty much killed us throughout the evening. The replica was finally available in the morning, albeit fairly far behind the master. Nevertheless I was able to start cleaning up the mess. However, two other problems were revealed. First, going to one download server wasn't a good thing. It seems impossible to me that apache can't handle all the downloads on one system - especially given the abundance of free resources. It drops connections regardless of how much network/httpd.conf tweaking I do. So we fell back to using two download servers, and that immediately solved everything. Of course, we've been offline for 24 hours, so there's gonna be lots of traffic for a while making it hard to upload/download anything. Second, there was minor corruption in the MyISAM tables in the mysql database. Not sure what caused that but given the database was clogged all night all bets are off. The most notable effect of this was some weird behavior in the forums. Some simple "repair table" commands found the problems and claim to have fixed them. Anyway.. it's clear we still have much work to do cleaning up our current mysql situation. Sigh. In better news, looks like me and Jeff are going to OSCON 2009 in San Jose in July - the O'Reilly open source convention. Maybe we'll get some hot tips about improving the linux/apache/mysql/php performance around here. 
Tim O'Reilly himself helped hook us up with free passes (he's been nice to us over the years). - Matt -- BOINC/SETI@home network/web/science/development person -- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude |
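The post above describes read-only jobs (stats collection, result state counts, the credit multiplier calculation) falling back onto the master while the replica was unavailable. As a rough illustration only, here is a minimal Python sketch of one possible routing policy. It is not how the SETI@home servers are actually configured: `ReplicaStatus`, `choose_server`, the `deferrable` flag and the 300 second lag threshold are all hypothetical names and values chosen for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ReplicaStatus:
    """Hypothetical snapshot of replica health (illustrative names only)."""
    available: bool
    seconds_behind_master: Optional[int]  # None when the lag is unknown


def choose_server(read_only: bool,
                  deferrable: bool,
                  replica: ReplicaStatus,
                  max_lag_seconds: int = 300) -> str:
    """Return "master", "replica", or "defer" for a single query.

    Assumed policy (not taken from the original post):
    - writes always go to the master;
    - reads go to the replica when it is up and not too far behind;
    - otherwise non-essential reads are postponed instead of being
      aimed at the master, and essential reads fall back to the master.
    """
    if not read_only:
        return "master"

    replica_usable = (
        replica.available
        and replica.seconds_behind_master is not None
        and replica.seconds_behind_master <= max_lag_seconds
    )
    if replica_usable:
        return "replica"

    # Replica is down or lagging: keep the expensive extras off the master.
    return "defer" if deferrable else "master"


if __name__ == "__main__":
    down = ReplicaStatus(available=False, seconds_behind_master=None)
    # A stats script (read-only, deferrable) while the replica is rebuilding.
    print(choose_server(read_only=True, deferrable=True, replica=down))
    # A forum page load (read-only, not deferrable) in the same situation.
    print(choose_server(read_only=True, deferrable=False, replica=down))
```

Under this assumed policy, the expensive read-only extras would wait for the replica to come back rather than piling onto the master, at the cost of stale stats during an outage.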
Infomage Joined: 28 Dec 00 Posts: 1 Credit: 1,216,140 RAC: 0 |
Glad to know that things will be back up and running soon. :) |
aplayer Joined: 26 Apr 00 Posts: 13 Credit: 15,217,341 RAC: 0 |
thank you.... things may start improving soon. good to hear.... |
David Joined: 2 Jun 08 Posts: 3 Credit: 268,609 RAC: 0 |
Ok, the down times are getting old. I preferred to have only one project running and it appears this is not the one to maximize my computers when they are not in use. Guess it is time to find a different and more productive project, it has been a fun run. |
Gary Charpentier Joined: 25 Dec 00 Posts: 30927 Credit: 53,134,872 RAC: 32 |
"Ok, the down times are getting old. I preferred to have only one project running and it appears this is not the one to maximize my computers when they are not in use. Guess it is time to find a different and more productive project, it has been a fun run." No SINGLE project can do that. I've got 10 attached and all of them have had extended down times. It is the nature of the beast. |
Jack Zhang Joined: 2 Jul 06 Posts: 206 Credit: 6,142,449 RAC: 0 |
Downloads have been on the fritz for quite a while, even after the re-introduction of 2 download servers. Its symptoms are similar to the problem when uploads are maxed out. What if Fiction was Fact and Fact was Fiction and vice versa? |
ivan Joined: 5 Mar 01 Posts: 783 Credit: 348,560,338 RAC: 223 |
"Downloads have been on the fritz for quite a while, even after the re-introduction of 2 download servers. Its symptoms are similar to the problem when uploads are maxed out." Have you tried stopping BOINC and flushing your DNS cache? I seemed to have a stale address in several of my machines this morning and enormous download backlogs; flushing the cache cured the problem. |
Administrator Joined: 2 Oct 06 Posts: 2 Credit: 203,504 RAC: 0 |
Is this downtime a common theme with Seti@home now? After being inactive for a few years I decided to join back up after I received an email from the Seti team. My client has been running for 48 hours now, half of which it has been down. Then I come here and see from the threads listed that there have been numerous downtimes due to glitches. Have I made a mistake by rejoining? |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13835 Credit: 208,696,464 RAC: 304 |
"Is this downtime a common theme with Seti@home now?" No more than before. Like most things you tend to get it in groups. "Have I made a mistake by rejoining?" Can't see how. Grant Darwin NT |
Jord Joined: 9 Jun 99 Posts: 15184 Credit: 4,362,181 RAC: 3 |
Before someone says it was all better in Classic times (I'm just waiting for it... and am heading you off. ;-)), remember that they had these down times in those days as well, there was just less communication about it. All the work out there was then just recycled over and over and over and over - up to 50 times over - at least if you were connected to one of the servers doing all that recycling. Otherwise your Seti program wasn't doing anything either. |
tullio Joined: 9 Apr 04 Posts: 8797 Credit: 2,930,782 RAC: 1 |
Yes, and many associate SETI@home with the SETI Institute. I had to explain to an Italian newspaper that it is not so, but I saw that Scientific American made the same mistake in an article promoting Docking@home. Tullio |
David @ TPS Joined: 30 Sep 04 Posts: 70 Credit: 11,323,275 RAC: 0 |
It's all about timing. Rejoining in the middle of an outage would be unnerving to say the least. Give it some time to heal, and all will be well until the next hiccup. Seti is my primary project, but I have others in reserve if it drops off for a while. As was said above S*** Happens, AND USUALLY AT THE MOST INAPPROPRIATE TIME. My 10 day caches have weathered every outage I have encountered so far, and letting other projects run helps too. Sure my RAC has dropped a few thousand, but it will be back! (lost a quad as well!) (no, I am not the poster above with the same username) Give Matt and the guys some time to sort it all out. |
Administrator Joined: 2 Oct 06 Posts: 2 Credit: 203,504 RAC: 0 |
BMgoau: Um yes I did donate, the day I rejoined actually, 48 hours ago. The sum I donated is none of your concern. Secondly, only returning 48 hours ago, it's a bit hard to go through all the technical news posts. Thirdly, you are right, I have no idea of the "perfect storm" right now. Would you care to elaborate? No I thought so, it would take too long one would assume. I acknowledge Seti@home is run on grants and donations, that's why I donated when I rejoined, like I used to periodically when I used to run it years ago before this BOINC! I'm not putting down, or having a go at the team at Seti, they do a terrific job. I was only asking or really inquiring about these downtimes, as I don't recall there being as many. Now please step off of your soapbox. |
PhonAcq Joined: 14 Apr 01 Posts: 1656 Credit: 30,658,217 RAC: 1 |
Perhaps the question to ask is why the glitch rate remains (apparently) high after all this time (years). It appears to those of us who know nothing that things are tweaked frequently and almost as frequently the first tweak begets a second tweak or an untweak, and so on. Yet after I don't know how many years the data we've processed remains un-analyzed. It is very frustrating. My remedy has been to connect only in the wee hours of each Berkeley evening and keep a 10 day cache, so as to minimize the server chaos impacting my hosts' productivity. I also try to down shift my attitude before reading these boards so that I don't emotionally red line. After all, it is a 'hobby'. And, I think I'm going to turn off some of my old beasts for a while or forever; unlike the original premise of s@h, they are probably just adding more thermodynamic entropy to the universe than they can justify with seti "science". |
lostrego Joined: 18 Mar 03 Posts: 1 Credit: 4,450,432 RAC: 12 |
Just a simple hint: Restarting the BOINC client seems to temporarily solve the problem, at least for me. I'm crunching now again with my old piece o junk ;-) |
Stick Joined: 26 Feb 00 Posts: 100 Credit: 5,283,449 RAC: 5 |
"Just a simple hint:" Me too. All my stuck transfers cleared immediately. Thanks! |
Claggy Joined: 5 Jul 99 Posts: 4654 Credit: 47,537,079 RAC: 4 |
Restarted Boinc last night and all my downloads started then. Claggy |
PeterD Joined: 2 Jan 01 Posts: 3 Credit: 18,283 RAC: 0 |
Cleared my browser cache and restarted BOINC and everything is downloading properly now. Thx for the info |
Stephen Motuel Joined: 18 May 99 Posts: 1 Credit: 632,206 RAC: 0 |
Well I'll be a monkey's uncle! I've been waiting for BOINC to download 3 new jobs for the past 3 days with some silly messages like wrong size, server down and so on... now I know! Just exit BOINC, restart BOINC and HELLO!!!! everything is alright! There seems to be nothing really wrong with seti (breathe easier guys), it seems the problem lies with BOINC! I may be wrong here but what the hell... |
Bounce Joined: 3 Apr 99 Posts: 66 Credit: 5,604,569 RAC: 0 |
>O'Reilly open source convention WOW! Seti/OpenSource are sponsoring an O'Reily race car? Kewl! Does it run really, really fast but require a complete engine replacement every lap? |