Power Outage mayhem: Feb 24/05
Author | Message |
---|---|
Cochise Send message Joined: 3 Apr 99 Posts: 62 Credit: 3,079 RAC: 0 |
Put yer $$$ where yer mouth is ;-) |
1mp0£173 Send message Joined: 3 Apr 99 Posts: 8423 Credit: 356,897 RAC: 0 |
> > The BOINC system is designed so that 100% uptime at the servers is not a > > requirement. > > Down time is not the issue. Lost data is. No actual "data" was lost, not as far as the Search for Extra-Terrestrial Intelligence is concerned. The lost results will be re-crunched. It's lost time, it's lost work, but it isn't really lost data. |
EclipseHA Send message Joined: 28 Jul 99 Posts: 1018 Credit: 530,719 RAC: 0 |
I just gotta ask... As the "main DB server" wasn't on the UPS and got hosed, was it close enough that a 100' extension cord could have provided power from the UPS for an outage like this (to shut it down gracefully)? It just seems that for a box as important as the main DB, running without a UPS could have been avoided by someone just bringing in an extension cord for a few days or until the server cabinet was reorged. (Surely someone in the lab has a 100' extension cord that wasn't needed for a couple of weeks. Hey, worst case, $10 that someone needed anyway!) The DB lost "30 minutes" of its data, but in that 30 minutes, it could have lost crunching done on 5000 WU at 4 hours each! Yes, the science will still work, but the reality is that 2 years of CPU time could result in nothing more than "providing heat to the room", as the lost work will just need to be re-crunched... |
1mp0£173 Send message Joined: 3 Apr 99 Posts: 8423 Credit: 356,897 RAC: 0 |
> Yes, the science will still work, but the reality is that 2 years of CPU time > could result in nothing more than "providing heat to the room", as the lost > work will just need to be re-crunched... When you take those 5000 work units at four hours each, and divide them by the amount of processing available (63,992 hosts, according to BOINC Synergy), it's still just a half-hour of clock time. Put another way, it is 0.005% wasted over the course of a year. |
Matt Lebofsky Send message Joined: 1 Mar 99 Posts: 1444 Credit: 957,058 RAC: 0 |
> As the "main DB server" wasn't on the UPS and got hosed, was it close enough > that a 100' extension cord could have provided power from the UPS for an outage > like this? (to shut it down gracefully) Yes and no. Close enough? Yes. Room on the UPSes? No. So to get the main DB server on UPS would require buying another UPS strictly for the few weeks the main DB server would be in another lab. Should we have put the main DB server on UPS? Yes and no. In hindsight, well duh. File that under "coulda, woulda, shoulda." But given the situation before the outage? What a waste of time/money. We have a replica database that is on UPS. It doesn't keep up all the time (hence the 30 minute offset), but it would catch up most of the time and was a good "hot backup" in the interim. On top of that we back everything up to tape every week. Anyway, this was all a situation brought about by immediate necessity (the replica machine couldn't keep up by itself, so we had to prematurely force the new machine to be the master), so carefully laid out plans had to be revised on the spot. Of course, more funds are always welcome. Having an extra few hundred dollars a week ago wouldn't have turned into a UPS on the master db, though. See above. - Matt -- BOINC/SETI@home network/web/science/development person -- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude |
PT Send message Joined: 19 May 99 Posts: 231 Credit: 902,910 RAC: 0 |
> Well, looks like the guys worked their magic and got us back up and running. > > Thank you !!!! > > > :) > Yup, it looks like they've done it again. Well done! Happy crunching |
EclipseHA Send message Joined: 28 Jul 99 Posts: 1018 Credit: 530,719 RAC: 0 |
> Put another way, it is 0.005% wasted over the course of a year. > or ~2% wasted in a day (with 100k systems, that's 2000 computers just "heating the room" for a day). It's not "nothing" for the people that spent 20,000 hours of computer time that got tossed for a $10 extension cord! Jeeze... It seems that to you the project can do no wrong! It's not only "the science", but also the time that people volunteer, because without that, the main DB server would just be heating the room..... To run the main DB server without a UPS (one of the most important systems), when the solution would be for someone to lend an extension cord, was downright foolish! Especially given the problems over the last week! Heck, someone has now brought in a UPS from home, and if this had been done a few days back it would have saved 20,000 hours of "reprocessing"! While the Seti folks are not professional IT, you'd think a UPS on the most important box would have made sense to them..... |
EclipseHA Send message Joined: 28 Jul 99 Posts: 1018 Credit: 530,719 RAC: 0 |
> > Well, looks like the guys worked their magic and got us back up and > running. > > > > Thank you !!!! > > > > > > :) > > > Yup, it looks like they've done it again. Well done! > They shoot themselves in the foot and you folks say "nice bandage! Good Work! Well Done!"? |
EclipseHA Send message Joined: 28 Jul 99 Posts: 1018 Credit: 530,719 RAC: 0 |
> > As the "main DB server" wasn't on the UPS and got hosed, was it close > enough > > that a 100' extension cord could have provided power from the UPS for an > outage > > like this? (to shut it down gracefully) > > Yes and no. Close enough yes. Room on the UPSes? No. Ace HW.. a multi-outlet adapter ($1.99) (single plug so it wouldn't impact other sockets) or just unplug something that didn't need to be on UPS! > So to get the main DB > server on UPS would require buying another UPS strictly for the few weeks the > main DB server would be in another lab. Or getting the guy to bring in a UPS from home, as he's now done.. A six-pack of beer might have made this happen before the problem! > > Should we have put the main DB server on UPS? Yes and no. In hindsight, well > duh. File that under "coulda, woulda, shoulda." It's a no-brainer... Wasn't there another box that 1) wasn't as critical, 2) if it got trashed could be rebuilt quickly, 3) could handle a power fail a bit easier (Windows)? Worst case, a $10 extension cord and a $2 multi-head adapter (probably no cost at all if someone asked "got one we can borrow for a couple weeks?") > But given the situation before > the outage? What a waste of time/money. We have a replica database that is on > UPS. It doesn't keep up all the time (hence the 30 minute offset), but it > would catch up most of the time and was a good "hot backup" in the interim. It would be in sync if all was ok... Well, when all's not ok, the 30-min-old replica is, well, 30 mins old! (20 THOUSAND hours of crunching lost!) Like I said, for $20, the primary DB could have been on the UPS, and that's if no one had an extension cord they could loan to the lab for a week or two.. > On > top of that we back everything up to tape every week. Anyway, this was all a > situation brought out of immediate necessity (the replica machine couldn't > keep up by itself, so we had to prematurely force the new machine to be the > master), so carefully laid out plans had to be revised on the spot. And you expect to handle the Seti Classic load this year? > > Of course, more funds are always welcome. Having an extra few hundred dollars > a week ago wouldn't have turned into a UPS on the master db, though. See > above. > Post a mailing address, and I will send you a 100' extension cord with an adapter so you can plug in two devices to a single UPS outlet..... You can write "azwoody" on the cord every few feet, and people can feel free to step on it when the mood fits! > - Matt |
PT Send message Joined: 19 May 99 Posts: 231 Credit: 902,910 RAC: 0 |
Wow, you really came out hard on the guys. As a professional I do agree with some of the things you're writing. A lack of UPSes is always a bad thing - and should be forbidden by law. ;-) But if there are no UPSes available? Since this is funded purely on contribution I can somewhat have an understanding for lacking resources. That's why I don't agree with banning the guys! I don't think we have to teach them what to do. They played a high game and they lost. I think that’s punishing enough. Yes, this punishment spills over onto all volunteers all around the world but do remember that we are just volunteers and know very well what can happen during a project like SETI@Home. This is a calculated risk we all take! So I stick to my earlier post on this message board. Guys - well done! You brought it back online....again! ;-D Happy crunching! Happy crunching |
LEX LETHAL Send message Joined: 3 Apr 99 Posts: 22 Credit: 423 RAC: 0 |
I'm glad SETI's back up and healthy. While things were down, I took a gander over to SETI Classic. After my visit, I thought about how we are all here for a reason, and that reason is to work together, to do volunteer work that makes us happy. It's weird, because I was thinking about how one day this will be the only SETI. Not the most important, but the most recent. At some future date when SETI changes again and I am destined to migrate, so be it. To be a part of something bigger than myself is a privilege. As SETI grows and matures, we also grow and mature as a community. LEX |
EclipseHA Send message Joined: 28 Jul 99 Posts: 1018 Credit: 530,719 RAC: 0 |
> But if there are no UPSes available? Since this is funded purely on > contribution I can somewhat have an understanding for lacking resources. They have funding from NSF - the "National Science Foundation" in the US (like $500,000). It's just a question of how the funding is spent! I can post a link, if you are interested! Also, all it took was an extension cord that could have been loaned, or for someone to bring in a UPS from home, as is the current solution! > That's why I don't agree with banning the guys! I don't think we have to teach > them what to do. They played a high game and they lost. I think that’s > punishing enough. They've been losing the game for the better part of a year! Were you around during the "Snap Appliance" days? It's one move like this after another! > > Yes, this punishment pores over on all volunteers all around the world but do > remember that we are just volunteers and know very well what can happen during > such project like SETI@Home. This is a calculated risk we all take! But it's been like this since LAST JUNE! There used to be a joke that the servers were down every weekend, and it was true! This is NOT the first time that work got tossed, by any means! > |
KWSN - MajorKong Send message Joined: 5 Jan 00 Posts: 2892 Credit: 1,499,890 RAC: 0 |
Jeez, woody... not this s##t again... Every time there is an outage at Berkeley, you just HAVE to crow over it. Please stop. |
Captain Avatar Send message Joined: 17 May 99 Posts: 15133 Credit: 529,088 RAC: 0 |
> > As the "main DB server" wasn't on the UPS and got hosed, was it close > enough > > that a 100' extension cord could have provided power from the UPS for an > outage > > like this? (to shut it down gracefully) > > Yes and no. Close enough yes. Room on the UPSes? No. So to get the main DB > server on UPS would require buying another UPS strictly for the few weeks the > main DB server would be in another lab. > > Should we have put the main DB server on UPS? Yes and no. In hindsight, well > duh. File that under "coulda, woulda, shoulda." But given the situation before > the outage? What a waste of time/money. We have a replica database that is on > UPS. It doesn't keep up all the time (hence the 30 minute offset), but it > would catch up most of the time and was a good "hot backup" in the interim. On > top of that we back everything up to tape every week. Anyway, this was all a > situation brought out of immediate necessity (the replica machine couldn't > keep up by itself, so we had to prematurely force the new machine to be the > master), so carefully laid out plans had to be revised on the spot. > > Of course, more funds are always welcome. Having an extra few hundred dollars > a week ago wouldn't have turned into a UPS on the master db, though. See > above. > > - Matt > > > Maybe we could get a fund going to send azwoody out there to straighten you all out... After all he knows everything... Oh wait, scratch that... Woody does know everything so he must be rich, and can fly out there on his private jet! His design of course... Couldn't help it... Smart ass Timmy |
Toby Send message Joined: 26 Oct 00 Posts: 1005 Credit: 6,366,949 RAC: 0 |
You have got to be kidding me! you want to run a power hungry, vital system off of a 100' extension cord?? That is just asking for trouble. How many hallways does it have to go through? How many doorways? How many opportunities for people to trip over it, causing not only a power outage but possibly physical damage to the server and/or UPS? What if facilities decides to clean the carpet? The machines could damage the cord causing a short circuit which, once again, could do tremendous damage to the system, beyond a trashed database. Running a large system on a 100' extension cord is completely out of the question. And for crying out loud, quit your whining and moaning over 30 minutes of data. You are worse than my 1 year old niece who is teething! Just be glad we are only missing 30 minutes instead of a week (if they had had to restore from tapes). Do you ALWAYS have to look at the worst side of things? A 2% loss of data on a single day is pretty minimal really. Get over yourself. A member of The Knights Who Say NI! For rankings, history graphs and more, check out: My BOINC stats site |
1mp0£173 Send message Joined: 3 Apr 99 Posts: 8423 Credit: 356,897 RAC: 0 |
> Jeeze... It seems that to you the project can do no wrong! ... and it seems that to you the project can do no right. Nothing justifies the kind of abuse you're dishing out, Woody. |
1mp0£173 Send message Joined: 3 Apr 99 Posts: 8423 Credit: 356,897 RAC: 0 |
> You have got to be kidding me! you want to run a power hungry, vital > system off of a 100' extension cord?? The new database server is a Sun Fire V40z. According to Sun, a fully loaded V40z draws 760 W. Specifications here. This one isn't fully loaded, but it's still going to draw some power. |
PT Send message Joined: 19 May 99 Posts: 231 Credit: 902,910 RAC: 0 |
> But it's been like this since LAST JUNE! There used to be a joke that the > servers were down every weekend, and it was true! This is NOT the first time > that work got tossed, by any means! > I honestly think your aggressiveness is out of line. I’ve been around for a long time and yes, I do get fed up as well at times, and I have also lost WUs many times. I can still trace my pending credits back to June. But even if you or I wail, it won't change the situation, will it? Screaming and asking people "why didn’t you do this and that" is not very constructive, especially when you’re not on the project team. If you were, you wouldn’t be screaming in this forum! I will not get into a discussion about their funding since I do not have any details and am therefore not able to make a judgment. And I honestly don’t think it's my business to judge. I am here doing this crunching as a volunteer, and if I don’t like the situation I have the option to crunch somewhere else - and so do you! So, whatever caused the last outage, they were able to bring it back online again and I am fine with that, and I do think they deserve some gratitude from all of us. As far as I can understand from reading the board, there were some losses (half an hour or so). That is absolutely not good, but tough shit, bad things happen every day! It could have been 30 days of lost work. So cheer up and do some crunching as long as it works! ;-D Happy crunching |
The Psychotic One Send message Joined: 22 May 00 Posts: 50 Credit: 4,099,029 RAC: 0 |
This post is O/T Timmy, OH MY GAWD! Hysterical avatar. I dropped my pop while LMAO, when I saw your avatar. You owe me a new Dew. j/k I had to try to lighten the mood in here... :) PS. What am I doing wrong with my sig? I'm trying to show both... William D. Gagliardi |
Mike Send message Joined: 17 Feb 01 Posts: 34271 Credit: 79,922,639 RAC: 80 |
Hi @Peder I totally agree. BTW: I say they've done a great job. greetz from Germany Mike With each crime and every kindness we birth our future. |
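
The percentages argued over in the thread can be checked with a little arithmetic. The sketch below is only a rough check, not anything the project published: it takes the figures quoted by the posters (5000 work units at 4 hours each, 63,992 hosts, a 30-minute window of lost database updates) at face value, and the constant and function names are made up for illustration.

```python
# Back-of-the-envelope check of the numbers quoted in the thread.
# All inputs are the posters' own estimates, not measured project data.

LOST_WORK_UNITS = 5000        # work units assumed lost (EclipseHA's estimate)
HOURS_PER_WORK_UNIT = 4       # crunch time per work unit (EclipseHA's estimate)
ACTIVE_HOSTS = 63_992         # host count cited from BOINC Synergy
LOST_WINDOW_HOURS = 0.5       # the "30 minutes" of database updates that were lost

HOURS_PER_DAY = 24
HOURS_PER_YEAR = 365 * HOURS_PER_DAY


def lost_host_hours(work_units: int, hours_each: float) -> float:
    """Total computer time that has to be redone."""
    return work_units * hours_each


def percent_of_period(window_hours: float, period_hours: float) -> float:
    """Share of a period (as a percentage) covered by the lost window."""
    return 100.0 * window_hours / period_hours


total_hours = lost_host_hours(LOST_WORK_UNITS, HOURS_PER_WORK_UNIT)
cpu_years = total_hours / HOURS_PER_YEAR          # the "2 years of CPU time" claim
hours_per_host = total_hours / ACTIVE_HOSTS       # average loss spread over all hosts
pct_of_year = percent_of_period(LOST_WINDOW_HOURS, HOURS_PER_YEAR)
pct_of_day = percent_of_period(LOST_WINDOW_HOURS, HOURS_PER_DAY)

print(f"lost work: {total_hours:.0f} host-hours (about {cpu_years:.1f} CPU-years)")
print(f"average per host: {hours_per_host:.2f} hours")
print(f"30 minutes as a share of a year: {pct_of_year:.4f}%")
print(f"30 minutes as a share of a day: {pct_of_day:.2f}%")
```

Under these assumptions both sides are describing the same half hour: it is roughly 0.006% of a year and roughly 2% of a single day, and the 20,000 host-hours works out to a bit over two CPU-years in total but well under an hour per host when averaged across every machine.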
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.