Power Outage mayhem: Feb 24/05

Message boards : Number crunching : Power Outage mayhem: Feb 24/05
Grant (SSSF)
Volunteer tester

Send message
Joined: 19 Aug 99
Posts: 13753
Credit: 208,696,464
RAC: 304
Australia
Message 82533 - Posted: 26 Feb 2005, 10:37:36 UTC - in response to Message 82483.  

> Jeez, woody... not this s##t again...
>
> Every time there is an outage at berkeley, you just HAVE to crow over it.
> Please stop.

Some people are dickheads.
Such is life, don't sweat it.
Grant
Darwin NT
ID: 82533 · Report as offensive
Grant (SSSF)
Volunteer tester

Send message
Joined: 19 Aug 99
Posts: 13753
Credit: 208,696,464
RAC: 304
Australia
Message 82535 - Posted: 26 Feb 2005, 10:41:45 UTC - in response to Message 82491.  

> The new database server is a Sun Fire V40z. According to Sun, a fully loaded
> V40z draws 760w. Specifications [url=http://www.sun.com/servers/entry/v40z/specs.jsp#Environment]here[/url].
>
> This one isn't fully loaded, but it's still going to draw some power.

Those are really, really, really nice pieces of equipment.
If you want to read a review on one (with lots of nice pictures), have a look here.
Grant
Darwin NT
ID: 82535 · Report as offensive
1mp0£173
Volunteer tester

Send message
Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 82635 - Posted: 26 Feb 2005, 16:48:47 UTC - in response to Message 82535.  


> Those are really, really, really nice pieces of equipment.
> If you want to read a review on one (with lots of nice pictures), have a look
> here.

They do look very nice. I particularly like the whole idea of redundant power supplies (although personally, I'd just buy two cheaper servers and do redundant servers).

... and I think part of what Matt was saying WRT room on the UPS is that you can't just add another 750w of draw to a UPS if it doesn't have another 750w to give.

My APC 2200va UPSes are each good for about an hour with a 550w load (one of my projects for this weekend is to do run time tests -- there but for the grace of God go I).
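The headroom point can be sketched in a few lines of Python. This is only an illustration: the UPS capacity and existing load in the usage line are made-up placeholders, not measurements from anyone's server room, and the function names are hypothetical. Only the 760w figure for a fully loaded V40z comes from the thread.

```python
def ups_headroom_watts(capacity_watts, current_load_watts):
    """Return how many watts of load the UPS can still accept (never negative)."""
    return max(capacity_watts - current_load_watts, 0)


def can_add_load(capacity_watts, current_load_watts, new_load_watts):
    """True only if the new load fits within the remaining headroom."""
    return new_load_watts <= ups_headroom_watts(capacity_watts, current_load_watts)


# Hypothetical UPS rated for 1600 W that already carries 1200 W.
# A 760 W server does not fit, no matter how many free outlets there are.
print(can_add_load(1600, 1200, 760))
```

A free plug says nothing about whether the UPS can carry the extra draw; the check has to be done in watts.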
ID: 82635 · Report as offensive
Profile LarryB56
Volunteer tester
Avatar

Send message
Joined: 22 Apr 02
Posts: 73
Credit: 4,402,310
RAC: 0
United States
Message 82645 - Posted: 26 Feb 2005, 17:15:30 UTC - in response to Message 82465.  

Well Said LEX... Amen to all of it!!!

> I'm glad SETI's back up and healthy.
>
> While things were down, I took a gander over to SETI Classic. After my visit,
> I thought about how we are all here for a reason, and that reason is to work
> together, to do volunteer work that makes us happy. It's weird, because I was
> thinking about how one day this will be the only SETI. Not the most
> important, but the most recent. At some future date when SETI changes again
> and I am destined to migrate, so be it. To be a part of something bigger than
> myself is a privilege.
>
> As SETI grows and matures, we also grow and mature as a community.
>
> LEX
>
LarryB56
ID: 82645 · Report as offensive
Profile MattDavis
Volunteer tester
Avatar

Send message
Joined: 11 Nov 99
Posts: 919
Credit: 934,161
RAC: 0
United States
Message 82646 - Posted: 26 Feb 2005, 17:18:58 UTC

Just ignore Woody. His main purpose in life is to make fun of the Seti crew when something goes sour. We've established months ago that he's a lonely middle-aged man who has nothing better to do. He's a real slimeball.
-----
ID: 82646 · Report as offensive
Bill & Patsy
Avatar

Send message
Joined: 6 Apr 01
Posts: 141
Credit: 508,875
RAC: 0
United States
Message 82648 - Posted: 26 Feb 2005, 17:21:18 UTC
Last modified: 26 Feb 2005, 17:22:18 UTC

You know, I think Woody is on to something here. I don't agree with the specific points he is complaining about. But "between the lines" I think he may be saying that Berkeley doesn't really care anything at all about losing some of our crunch time. Now, before you flame me, please just consider this:

They have recently and needlessly added 33% additional crunch time to each work unit ("WU"), by initially sending out 4 when only 3 are needed. Why did they do this? Because a very small group of very loud people were upset because some of their credits didn't arrive by overnight express.

If Berkeley felt that they needed to conserve all the available crunch power, they would never have made the decision to add a 33% useless overhead just to please a few impatient participants.

Bottom line, they know they have plenty of crunch time to waste, and they really aren't worried in the least about losing a half hour of it.

In fact, this surplus crunch time is one of the primary reasons BOINC was born - to find a better way to utilize the surplus.

--Bill

ID: 82648 · Report as offensive
Profile MattDavis
Volunteer tester
Avatar

Send message
Joined: 11 Nov 99
Posts: 919
Credit: 934,161
RAC: 0
United States
Message 82651 - Posted: 26 Feb 2005, 17:25:56 UTC

See, Bill, you're a good example of how it is possible to criticize Berkeley but do it constructively and respectfully. Nobody likes Woody because he sits at his computer and gets an erection out of excitement when something goes bad at Seti HQ so he has something to type about. Perhaps this is why his name is "Woody"... but I digress.

So, I see what you're saying but it does seem a bit of a conspiracy. Even if Seti did send out 4 instead of 3 work units for this reason I can't imagine it'd be because they didn't care about wasting our crunch time. Maybe it was because they realized that "things go wrong in science" and were just protecting themselves from the wrath of Murphy.
-----
ID: 82651 · Report as offensive
Bill & Patsy
Avatar

Send message
Joined: 6 Apr 01
Posts: 141
Credit: 508,875
RAC: 0
United States
Message 82656 - Posted: 26 Feb 2005, 17:52:59 UTC - in response to Message 82651.  
Last modified: 26 Feb 2005, 17:57:01 UTC

> So, I see what you're saying but it does seem a bit of a conspiracy. Even if
> Seti did send out 4 instead of 3 work units for this reason I can't imagine
> it'd be because they didn't care about wasting our crunch time. Maybe it was
> because they realized that "things go wrong in science" and were just
> protecting themselves from the wrath of Murphy.
>
Thanks, Matt.

If they were concerned about efficiency, they would deal with Murphy only when he rang the doorbell, which is what they used to do. I.e., they would resend a WU a 4th time (and a 5th, 6th, etc.) only when the first (2nd, 3rd, etc.) attempt failed. My general impression and observation was that the majority of WU's sent to just 3 crunchers completed successfully without needing any followup. So, why waste time on a fourth except after Murphy has appeared?
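To put rough numbers on the two policies, here is a minimal Python sketch. It rests on assumptions that are mine, not Berkeley's: every copy of a WU fails independently with the same probability `p_fail`, 3 good results are needed, and after the initial batch copies are resent one at a time. The function names are hypothetical.

```python
def expected_results_resend_on_failure(p_fail, needed=3):
    """Expected copies crunched when `needed` copies go out and one more is sent per failure."""
    # Getting `needed` successes takes needed / (1 - p_fail) attempts on average.
    return needed / (1.0 - p_fail)


def expected_results_one_extra_upfront(p_fail, needed=3):
    """Expected copies crunched when needed + 1 copies go out at once, with resends after that."""
    # The extra copy is wasted only when the first `needed` copies all succeed,
    # which happens with probability (1 - p_fail) ** needed.
    return expected_results_resend_on_failure(p_fail, needed) + (1.0 - p_fail) ** needed


for p in (0.0, 0.05, 0.25):
    lazy = expected_results_resend_on_failure(p)
    eager = expected_results_one_extra_upfront(p)
    print(f"p_fail={p:.2f}  resend-on-failure={lazy:.3f}  four-upfront={eager:.3f}")
```

With no failures at all this gives 3 versus 4 copies, which is the 33% figure. Under these assumptions the gap shrinks as the failure rate rises, but the up-front policy never costs less crunch time; what it buys is a shorter wait for results.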

One reason might be because Berkeley wants to get results back quickly. But I don't believe that's the case. To the contrary, my understanding is that they are way, way, way, way, way, way, way, way, way behind in processing and analyzing completed WU's. So, they are not in any hurry at all to hear back from us. (An exception might be when they have a special project going, such as when they went back to look at promising results. But that's not the normal modus operandi.)

And I'm not suggesting conspiracy. I think it's just the case that Berkeley actually does listen to us in the "cruncher community" (which is good!), and that they decided to expend (I wanted to say "squander") some crunch power to make a few impatient participants a little happier - because Berkeley has crunch power to spare (may I say "waste"?). (Personally, as you can tell, I disagree with what they are doing in that regard.)

So, that is why I'm suggesting that they aren't really concerned about the risk that they'll lose a little crunch time and have to send it out again. It seems pretty clear that their priorities are elsewhere - namely, getting things stabilized, and getting the cross-platform GUI out, so that they can get everyone on BOINC and shut down classic. And I agree with that prioritization.

--Bill

ID: 82656 · Report as offensive
Profile Matt Lebofsky
Volunteer moderator
Project administrator
Project developer
Project scientist
Avatar

Send message
Joined: 1 Mar 99
Posts: 1444
Credit: 957,058
RAC: 0
United States
Message 82672 - Posted: 26 Feb 2005, 18:14:25 UTC

Here's another random clarification post:

1. When I said there was no room on the UPSes to just plug an extension cord in, I meant there wasn't enough power headroom. I didn't mean there wasn't a free plug. Any "experienced IT manager" would have found what I said obvious. Plus the two rooms are far enough apart the building director would not allow any cord crossing through the hallway like that. Also obvious.

2. Re: the idea that $500000 is a lot of funding. People see a lot of zeros and think it's a big enough number. Let's say for the sake of example this is what was currently in our budget (I believe it's actually less). First off, the UC Berkeley overhead on such grants is 36.4%. So we're already down to $318000. We have about 7 people on staff. Overhead on salaries is about 23% (for vacation/benefits/etc.). So let's assume this is all salary money for a moment: We're down to about $245000 for just salaries. That's about $35000/year per person. I won't divulge what I make around here, but here's a fun fact: I make more money working only 4 days a week at the lab (and being a professional musician on the side) than I do working full time. This saves the project a few precious dollars. We have long discussions to decide whether or not we should splurge and get a 24 port unmanaged switch rather than a slightly cheaper 16 port. So, in short, if there's any high priority, it's not buying a UPS, it's filling out some new grant proposals so I can still have a job!
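For anyone who wants to check the arithmetic in point 2, here is a small Python sketch. It follows the same simplification as the post: each overhead rate is taken straight off the top, and everything left is treated as salary. The function name is hypothetical, and the $500000 is the example figure from the post, not the actual budget.

```python
def budget_breakdown(grant, campus_overhead, salary_overhead, staff):
    """Return (after campus overhead, after salary overhead, per-person salary)."""
    after_campus = grant * (1 - campus_overhead)            # campus takes its share first
    after_benefits = after_campus * (1 - salary_overhead)   # then vacation/benefits overhead
    per_person = after_benefits / staff                     # split evenly across the staff
    return after_campus, after_benefits, per_person


after_campus, after_benefits, per_person = budget_breakdown(
    grant=500_000, campus_overhead=0.364, salary_overhead=0.23, staff=7
)
print(round(after_campus), round(after_benefits), round(per_person))
```

The three values come out at roughly $318000, $245000 and $35000, and that is before any hardware is bought.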

- Matt
-- BOINC/SETI@home network/web/science/development person
-- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude
ID: 82672 · Report as offensive
Profile MattDavis
Volunteer tester
Avatar

Send message
Joined: 11 Nov 99
Posts: 919
Credit: 934,161
RAC: 0
United States
Message 82690 - Posted: 26 Feb 2005, 18:59:48 UTC

I'll donate $10 to Seti@home if you delete AZ Woody's account 8)
-----
ID: 82690 · Report as offensive
Bill & Patsy
Avatar

Send message
Joined: 6 Apr 01
Posts: 141
Credit: 508,875
RAC: 0
United States
Message 82723 - Posted: 26 Feb 2005, 19:58:12 UTC - in response to Message 82672.  

> ...We have long discussions to decide whether or
> not we should splurge and get a 24 port unmanaged switch rather than a
> slightly cheaper 16 port.
>

So, Matt, what you just said disturbs me. Does anyone on your team have management training? Perhaps Berkeley has a management school that could help you guys.

To wit: how much did you save in those discussions on the cost of the switch vs. how much it cost you in terms of wages (number of people in the discussion times the time of the "long discussions" times the aggregate fully loaded hourly wages)? I'll bet your process operated to a considerable net loss.
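The comparison can be written out as a short Python sketch. Every number in the usage line is invented for illustration; I have no idea what the switches cost, how many people sat in the discussion, or what anyone is paid. The function names are hypothetical too.

```python
def discussion_cost(people, hours, loaded_hourly_wage):
    """Wage cost of a discussion: headcount times duration times fully loaded hourly wage."""
    return people * hours * loaded_hourly_wage


def net_savings(price_difference, people, hours, loaded_hourly_wage):
    """Money saved by picking the cheaper item, minus what the discussion itself cost."""
    return price_difference - discussion_cost(people, hours, loaded_hourly_wage)


# Hypothetical: a $40 price gap, debated by 4 people for 1 hour at $25/hour each.
result = net_savings(price_difference=40, people=4, hours=1.0, loaded_hourly_wage=25)
print(result)  # a negative value means the discussion cost more than it saved
```

If the result is negative, it would have been cheaper to buy the bigger switch without discussing it.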

I'm not asking this just to "tweak" you. This is a common failure in organizations - not taking into account the high cost of a decision making process that at best yields only a minor savings.

If you folks are victims of this on any regular basis, you need to stop it.

;-)

--Bill

ID: 82723 · Report as offensive
1mp0£173
Volunteer tester

Send message
Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 82736 - Posted: 26 Feb 2005, 20:15:47 UTC - in response to Message 82648.  

> Bottom line, they know they have plenty of crunch time to waste, and they
> really aren't worried in the least about losing a half hour of it.
>
> In fact, this surplus crunch time is one of the primary reasons BOINC was born
> - to find a better way to utilize the surplus.

If anything, it's even more than this: they have created this incredibly ravenous monster that is growling constantly to be fed.

... and while losing some work isn't something that anyone really wants, because of this snarling beast that must be constantly fed, well, how much does a lost half-hour really mean?

That said, these are people, and a few folks here are incredibly tough on them.
ID: 82736 · Report as offensive
1mp0£173
Volunteer tester

Send message
Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 82741 - Posted: 26 Feb 2005, 20:18:50 UTC - in response to Message 82690.  

> I'll donate $10 to Seti@home if you delete AZ Woody's account 8)

Maybe we can pass the plate -- I'm in for at least $10.
ID: 82741 · Report as offensive
Profile Dominique
Volunteer tester
Avatar

Send message
Joined: 3 Mar 05
Posts: 1628
Credit: 74,745
RAC: 0
United States
Message 82763 - Posted: 26 Feb 2005, 21:06:35 UTC

To the SETI team,

Good work getting things back together. If people weren't saddled with OCD it wasn't that bad out here.

BOTY 2005,
Dominique

ID: 82763 · Report as offensive
Profile Byron Leigh Hatch @ team Carl Sagan
Volunteer tester
Avatar

Send message
Joined: 5 Jul 99
Posts: 4548
Credit: 35,667,570
RAC: 4
Canada
Message 82775 - Posted: 26 Feb 2005, 21:33:19 UTC
Last modified: 26 Feb 2005, 22:28:00 UTC

yes indeed _ I would like to add my _ thanks and congratulations _ to _ Matt _ David _ Jeff _ Eric _ Dan _ and all of the Berkeley team ...

who have worked so hard __ for the last 6 years to keep SETI@home alive _ and _ with _ limited __ money __ equipment _ computers _ Personnel _ and _ time ....




friendly and respectful
byron ... _Earth_Flag

[b]S@h_ Berkeley's Staff Friends Club member m2 ©[/b]
ID: 82775 · Report as offensive
Profile Matt Lebofsky
Volunteer moderator
Project administrator
Project developer
Project scientist
Avatar

Send message
Joined: 1 Mar 99
Posts: 1444
Credit: 957,058
RAC: 0
United States
Message 82779 - Posted: 26 Feb 2005, 21:52:14 UTC

Okay this is my last post on the subject. I appreciate the kind words of support and always welcome constructive criticism/post-trauma analysis, all while rolling my eyes at the know-it-alls who add nothing to the discussion, really.

I gotta tell ya it's been really difficult biting my tongue. But my frank opinions don't really have a place on this forum.

- Matt
-- BOINC/SETI@home network/web/science/development person
-- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude
ID: 82779 · Report as offensive
Profile Byron Leigh Hatch @ team Carl Sagan
Volunteer tester
Avatar

Send message
Joined: 5 Jul 99
Posts: 4548
Credit: 35,667,570
RAC: 4
Canada
Message 82781 - Posted: 26 Feb 2005, 21:57:12 UTC - in response to Message 82779.  
Last modified: 26 Feb 2005, 22:39:06 UTC

> Okay this is my last post on the subject. I appreciate the kind words of
> support and always welcome constructive criticism/post-trauma analysis, all
> while rolling my eyes at the know-it-alls who add nothing to the discussion,
> really.
>
> I gotta tell ya it's been really difficult biting my tongue. But my frank
> opinions don't really have a place on this forum.
>
> - Matt
>
============================



thanks Matt ..... for all your long hours ... and hard work ... on SETI@home ... for almost 7 years now ...


My Very Best Wishes ... and a pat on the back .. for your professionalism and dedication to a job well done !


from ...

friendly and respectful
byron ... ---- <B>Greetings --- from --- the Pacific West Coast --- Canada</B>


ID: 82781 · Report as offensive
Grant (SSSF)
Volunteer tester

Send message
Joined: 19 Aug 99
Posts: 13753
Credit: 208,696,464
RAC: 304
Australia
Message 82801 - Posted: 26 Feb 2005, 22:51:47 UTC - in response to Message 82648.  
Last modified: 26 Feb 2005, 22:54:53 UTC

> If Berkeley felt that they needed to conserve all the available crunch power,
> they would never have made the decision to add a 33% useless overhead just to
> please a few impatient participants.

Sorry, but given all the other processing that goes on in the background i can't see how having a Work Unit processed 4 times would result in a 33% increase in overheads on the system as a whole.


> Bottom line, they know they have plenty of crunch time to waste, and they
> really aren't worried in the least about losing a half hour of it.

While the loss of 30 min of data has no effect whatsoever on the project itself, any loss of work shows a failure in systems; a problem that wasn't prevented or resolved correctly.
But that's what pre production work is all about. Finding the problems, resolving them & then preventing them from occurring again in the future.
So when the system finally becomes fully operational & goes online for all & sundry, most bases that can be covered are, and those that can't be covered have secondary measures to help should problems occur.

The priority of the project at present is to get it working. The work done by the testers (us) is important, but the most important thing is getting the project itself fully functional, within the constraints imposed by the resources available.

If they didn't give a stuff about losing some of the work done, i doubt they would have spent nearly as much effort as they did in trying to recover it, hell they just would have written it off & got the backup off & running in a few hours instead of making everyone wait for a couple of days while they tried to get the full dataset back...
Grant
Darwin NT
ID: 82801 · Report as offensive
Dean

Send message
Joined: 19 Aug 99
Posts: 6
Credit: 3,364,725
RAC: 0
United States
Message 82841 - Posted: 27 Feb 2005, 0:11:28 UTC

Wow, I missed something when I signed up, some of you sound like you are getting paid for each work unit you complete, and I missed that part. Some of you sound like "loosing" some work units is a world wide catastrophe. So what! No one died, the world didn't stop turning, and, most importantly, it didn't cost you any money.

Speaking as an IT manager, if my team recovered from a catastrophic DB failure with a loss of only 30 minutes work, I'd pat them on the back, and buy the beer. It could have been so much worse, as you DB admins know.

These guys are breaking new ground, and, as NASA has learned, there are sometimes painful lessons to learn out there where no one has gone before. Not all IT departments, even in corporate environments, are funded as well as some of you seem to think. I never seem to have enough time, money, or personnel to do what everyone wants the way they want it, or the way it should be done, so we do the best we can with what we have at the time. That's the way it is in the real world, get used to it. This may not be the situation in the fantasy world some seem to live in, but here on Earth.......

Now, I'll save most of you the time and trouble of posting back to this, and go somewhere and f*%@ myself.
ID: 82841 · Report as offensive
1mp0£173
Volunteer tester

Send message
Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 82867 - Posted: 27 Feb 2005, 0:52:15 UTC - in response to Message 82841.  

> Wow, I missed something when I signed up, some of you sound like you are
> getting paid for each work unit you complete, and I missed that part.

You mean you haven't been getting your checks??? <ducking>
ID: 82867 · Report as offensive
 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.