Out of the Frying Pan (Feb 17 2010)

1mp0£173
Volunteer tester

Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 971096 - Posted: 18 Feb 2010, 5:06:31 UTC - in response to Message 971093.  

Nice to hear everything is almost back to normal. Unfortunate that a lot of work units were aborted while trying to upload them as their deadline had passed during the downtime. I have a feeling more will be aborted as they are still unable to be uploaded.

Kinda disappointed but what can ya do aye? You win some, you lose some - gotta keep on truckin' ! :)

You should always let those ride -- you likely would still get credit.
ID: 971096
Profile Jord
Volunteer tester
Joined: 9 Jun 99
Posts: 15184
Credit: 4,362,181
RAC: 3
Netherlands
Message 971098 - Posted: 18 Feb 2010, 5:09:25 UTC - in response to Message 971095.  

The A/C died and it's too hot? It's winter, it's 25 degrees and snowing...open the windows. That'll cool you off.

Assuming the server room is near an outside wall and has openable windows, of course.

And assuming you want moist sea-air to enter your server room, wreaking havoc with all the electrics in there. ;-)
ID: 971098
Dena Wiltsie
Volunteer tester

Joined: 19 Apr 01
Posts: 1628
Credit: 24,230,968
RAC: 26
United States
Message 971099 - Posted: 18 Feb 2010, 5:09:27 UTC - in response to Message 971079.  

For a while my job depended on a system cooled by an air conditioner that I could not depend on. My solution was to get one of these and wire it into an extension cord so I could connect all the non-replaceable equipment to it. I then set it to about 80 F and had no worries about failed hardware. The catch is you must make sure your backups are up to date, as the power-down will be very hard, and in my case the RAID often lost a drive when it was powered down (very old drives).

Most anything semi-modern supports some sort of "dumb" signaling from a UPS.

It uses a normal serial port, and only the handshake lines. A line goes "low" to signal "low battery" and the UPS waits for the system to drop a handshake line back when it is safe for the UPS to turn off.

One could build a "UPS" whose only job was to signal low battery when the temperature rose above a set threshold, and kill power when the system said "okay."

Power would be restored when it got cold enough. Or not.
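
The thermal "UPS" described above can be sketched as a small state machine. This is only a minimal illustration, not anyone's actual device: the class name, the state names, and the 70 F restore point are assumptions made up for the example, and the 80 F trip point is borrowed from the thermostat setting mentioned earlier in the thread. The serial handshake lines are reduced to a single boolean, host_ready.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    NORMAL = "normal"        # power on, no alarm raised
    SIGNALING = "signaling"  # "low battery" line asserted, waiting for the host
    POWER_OFF = "power_off"  # output relay open, host is dark


@dataclass
class ThermalCutout:
    """Hypothetical thermal 'UPS': trips on temperature instead of battery level."""

    trip_f: float = 80.0     # assumed trip point, in Fahrenheit
    restore_f: float = 70.0  # assumed restore point, lower to avoid rapid cycling
    state: State = State.NORMAL

    def step(self, temp_f: float, host_ready: bool) -> State:
        """Advance one polling interval.

        temp_f: current room temperature in Fahrenheit.
        host_ready: True once the host has dropped its handshake line,
                    meaning it has shut down and power may be cut.
        """
        if self.state is State.NORMAL and temp_f >= self.trip_f:
            # Too hot: pretend the battery is low so the host starts shutting down.
            self.state = State.SIGNALING
        elif self.state is State.SIGNALING and host_ready:
            # The host said "okay": safe to kill power.
            self.state = State.POWER_OFF
        elif self.state is State.POWER_OFF and temp_f <= self.restore_f:
            # Cold enough again: restore power.
            self.state = State.NORMAL
        return self.state
```

Note that once the alarm is raised, this sketch keeps signaling until the host acknowledges, even if the room cools off in the meantime. That is a design choice for the example, not something the post specifies.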

The P390 came with a latching power switch. The software was unable to cut the power and the only way the power could be turned off was to push the button or pull the power cord. I don't think Warp has power support in it and even if it did, VM/ESA didn't have that type of support in a P390. Running on real hardware VM/ESA might but the P390 was a strange animal for IBM. My job wasn't to spend a few months getting what you suggest to work, I needed something quick and dirty to protect the hardware because we couldn't afford to replace it.
I have what you suggest all set up and functional on my Mac but the P390 is about 15-year-old hardware pressed into service long after IBM considered it obsolete.
ID: 971099
Nate Itkin

Joined: 29 Jun 99
Posts: 4
Credit: 1,804,607
RAC: 3
United States
Message 971100 - Posted: 18 Feb 2010, 5:22:38 UTC

I concur with Mr. Haselgrove. Something was wrong with the scheduler before the Tuesday shutdown. My crunchers (located in Texas, California, and Hawaii) all had entries like this in their logs:
15-Feb-2010 22:07:14 [SETI@home] Scheduler request failed: Timeout was reached
This particular entry was GMT -10.
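
To line an entry like this up with the UTC timestamps on the forum posts, the local log time can be shifted by its offset. A minimal sketch, assuming the log line format shown above and an English locale for the month abbreviation; the function name is made up for the example.

```python
from datetime import datetime, timedelta, timezone

line = "15-Feb-2010 22:07:14 [SETI@home] Scheduler request failed: Timeout was reached"


def log_time_to_utc(log_line: str, utc_offset_hours: int) -> datetime:
    """Parse the leading 'DD-Mon-YYYY HH:MM:SS' stamp and convert it to UTC."""
    stamp = " ".join(log_line.split()[:2])                   # date and time fields
    local = datetime.strptime(stamp, "%d-%b-%Y %H:%M:%S")    # naive local time
    local = local.replace(tzinfo=timezone(timedelta(hours=utc_offset_hours)))
    return local.astimezone(timezone.utc)


print(log_time_to_utc(line, -10))  # 2010-02-16 08:07:14+00:00
```

So the GMT -10 entry corresponds to 16 Feb 2010, 08:07:14 UTC.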


ID: 971100
Profile KWSN THE Holy Hand Grenade!
Volunteer tester
Joined: 20 Dec 05
Posts: 3187
Credit: 57,163,290
RAC: 0
United States
Message 971105 - Posted: 18 Feb 2010, 6:08:47 UTC - in response to Message 971080.  
Last modified: 18 Feb 2010, 6:13:42 UTC

The smart-ass in me made me write this.....

The A/C died and it's too hot? It's winter, it's 25 degrees and snowing...open the windows. That'll cool you off.


...You forget that the project is in Berkeley... where (during the day at this time of year...) it is about 60-65°F and only goes down to 50-55°F at night... Not to mention that yesterday morning, and this morning, there was a heavy fog (at least in my location, 2½ miles away...)

Besides, Matt always refers to it as the "server closet", which implies that it doesn't have a window... (I think I've read that it is a re-purposed janitor closet...)
.

Hello, from Albany, CA!...
ID: 971105
Profile Gary Charpentier
Volunteer tester
Joined: 25 Dec 00
Posts: 30591
Credit: 53,134,872
RAC: 32
United States
Message 971112 - Posted: 18 Feb 2010, 6:55:10 UTC - in response to Message 970983.  

So how much is a second A/C unit installed?

Perhaps time to add up the thermal load and retire some hot equipment for some cooler equipment.

Yes, you need to get thermal cut out switches. As you have UPSes, that makes it much easier for a controlled shutdown.

Now if you could automate the door opening and a couple of big fans coming on ...
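
Adding up the thermal load is simple arithmetic, since nearly all the electrical power drawn by the equipment ends up as heat. A rough sketch follows; the conversion factors are standard (1 W is about 3.412 BTU/h, and 1 ton of cooling is 12,000 BTU/h), but the equipment names and wattages are invented for illustration and are not the project's actual hardware.

```python
WATTS_TO_BTU_PER_HOUR = 3.412   # 1 W is about 3.412 BTU/h
BTU_PER_HOUR_PER_TON = 12_000   # 1 ton of refrigeration = 12,000 BTU/h


def cooling_required(loads_watts):
    """Return (total watts, BTU/h, tons of cooling) for a dict of device loads."""
    total_watts = sum(loads_watts.values())
    btu_per_hour = total_watts * WATTS_TO_BTU_PER_HOUR
    tons = btu_per_hour / BTU_PER_HOUR_PER_TON
    return total_watts, btu_per_hour, tons


# Hypothetical closet contents, in watts.
example = {"server_a": 450, "server_b": 450, "disk_array": 600, "switch": 100}

watts, btu, tons = cooling_required(example)
print(f"{watts} W = {btu:.0f} BTU/h = {tons:.2f} tons of cooling")
```

With these made-up numbers, 1600 W of equipment needs roughly 5,459 BTU/h, or a bit under half a ton of cooling, before any safety margin.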


ID: 971112
Cosmic_Ocean
Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 971135 - Posted: 18 Feb 2010, 8:55:21 UTC - in response to Message 971112.  

So how much is a second A/C unit installed?

Perhaps time to add up the thermal load and retire some hot equipment for some cooler equipment.

Yes, you need to get thermal cut out switches. As you have UPSes, that makes it much easier for a controlled shutdown.

Now if you could automate the door opening and a couple of big fans coming on ...


I don't think a second A/C system would be ideal. From what I remember hearing, power distribution/availability is already pretty much at maximum capacity as it is. Every time a new server is installed in the closet, it means one or two old ones being re-purposed elsewhere. Last I knew, there was still plenty of rack space, but it's a problem of power availability.
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving-up)
ID: 971135
kittyman
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 971136 - Posted: 18 Feb 2010, 8:59:59 UTC - in response to Message 971135.  

So how much is a second A/C unit installed?

Perhaps time to add up the thermal load and retire some hot equipment for some cooler equipment.

Yes, you need to get thermal cut out switches. As you have UPSes, that makes it much easier for a controlled shutdown.

Now if you could automate the door opening and a couple of big fans coming on ...


I don't think a second A/C system would be ideal. From what I remember hearing, power distribution/availability is already pretty much at maximum capacity as it is. Every time a new server is installed in the closet, it means one or two old ones being re-purposed elsewhere. Last I knew, there was still plenty of rack space, but it's a problem of power availability.


If I remember correctly, the AC and electricity are part of what Berkeley supplies out of the 'cut' they take from donations......
So I don't think this cuts into the puny SETI budget.

I don't think power availability ever came into the equation.
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 971136
Profile Ronal E. Zepeda Trujillo
Joined: 14 Jul 05
Posts: 9
Credit: 3,167,018
RAC: 0
Chile
Message 971150 - Posted: 18 Feb 2010, 10:49:43 UTC

At least it was only a small disaster... it could have been worse...
Only a boy with responsibilities of an old man
ID: 971150
Bounce

Joined: 3 Apr 99
Posts: 66
Credit: 5,604,569
RAC: 0
United States
Message 971191 - Posted: 18 Feb 2010, 14:24:16 UTC - in response to Message 971082.  
Last modified: 18 Feb 2010, 14:29:43 UTC

>I started a chkdsk over 10 hours ago and it's less than halfway through!

Try SpinRite (http://www.grc.com - just a satisfied customer). Much better at recovering data and grooming a hdd than what M$ includes for free.

>So how much is a second A/C unit installed?

At my last agency, a sub-unit (which was required due to how the building did its HVAC) was $10,000.00. These folks are begging for second hand servers to do their projects. I suspect that a real budget item like that is considered a little spendy. Even if UCB is taking a cut for basic facility management, extras like this are often done on the customer's dime.
ID: 971191
Profile RottenMutt
Joined: 15 Mar 01
Posts: 1011
Credit: 230,314,058
RAC: 0
United States
Message 971197 - Posted: 18 Feb 2010, 14:53:43 UTC

SETI STAFF, WE HAVE BEEN DOWN SINCE SUNDAY!

There has been no acknowledgment in the postings, other than BBQ servers. Please fix the problem.

Thank you.


ID: 971197
Profile FrostKing9
Joined: 20 Oct 01
Posts: 39
Credit: 23,815,960
RAC: 0
United States
Message 971206 - Posted: 18 Feb 2010, 15:11:25 UTC

Yep... the upload and report process is still malfunctioning. Can barely upload completed WU's... only by repeatedly hitting the RETRY NOW on the TRANSFERS window. Then it only UPLOADS from 1 to 3 WU's at a time. And reporting all of those WU's isn't working at all. Not even after over 100 clicks over 8 hours on the UPDATE button on the PROJECTS window. <sigh>




I DONATE money to SETI@home.... DO YOU?

I'm just slowly BOINC'ing along.

Hey... ET... you have a sister who likes earthlings?
ID: 971206
Dave

Joined: 29 Mar 02
Posts: 778
Credit: 25,001,396
RAC: 0
United Kingdom
Message 971208 - Posted: 18 Feb 2010, 15:17:27 UTC

Patience people...
ID: 971208
DJStarfox

Joined: 23 May 01
Posts: 1066
Credit: 1,226,053
RAC: 2
United States
Message 971222 - Posted: 18 Feb 2010, 15:56:58 UTC - in response to Message 970983.  

Matt,

That is insane. You urgently need some kind of automated thermal shutdown or emergency ventilation for that closet. The Linux kernel will shut down the system when the CPU overheats but not hard drives or other components. If there were to be some kind of fire or failure of most drives, the next failure could mean the end of SETI@Home.

My brother configured a monitoring program called Nagios to sense his data center's temperature and email his cell phone when it rises above a certain temperature. If you're interested, I could get more implementation details.
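
For anyone curious what such a check looks like: Nagios plugins report status through their exit code (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN) plus one line of output, optionally followed by performance data after a "|". Below is a minimal sketch of the threshold logic only. It is not the setup described above; the function name and the thresholds are assumptions, and reading the actual sensor is left out because it depends on the hardware.

```python
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3  # standard Nagios plugin exit codes


def check_temperature(temp_c, warn_c, crit_c):
    """Return (exit_code, status_line) for a temperature reading in Celsius.

    temp_c may be None when the (hypothetical) sensor could not be read.
    """
    if temp_c is None:
        return UNKNOWN, "TEMP UNKNOWN - no reading available"
    # Performance data: label=value;warn;crit
    perfdata = f"temp={temp_c:.1f};{warn_c};{crit_c}"
    if temp_c >= crit_c:
        return CRITICAL, f"TEMP CRITICAL - {temp_c:.1f} C | {perfdata}"
    if temp_c >= warn_c:
        return WARNING, f"TEMP WARNING - {temp_c:.1f} C | {perfdata}"
    return OK, f"TEMP OK - {temp_c:.1f} C | {perfdata}"
```

A real plugin would print the status line and exit with the returned code. Nagios then handles the notification side (email, pager) according to its own contact configuration.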
ID: 971222
Profile Marc F.
Volunteer tester
Joined: 7 Apr 05
Posts: 4
Credit: 3,613,183
RAC: 0
United States
Message 971230 - Posted: 18 Feb 2010, 16:07:57 UTC - in response to Message 971208.  
Last modified: 18 Feb 2010, 16:15:54 UTC

Patience people...


I agree -- when looking back at Matt's original update ("Off the Beach") after returning from vacation, I was reminded that he does acknowledge that there were some problems even before the A/C failure (e.g. the uploading issues we've all been facing). So there's no need to get riled up about that right now. The way I see it, I'm going to give SETI@home a full week to get back to normal before any of us is really justified in panicking.

Actually, come to think of it, we might all do well to heed the wisdom of a certain "Guide" that proclaims in large, friendly letters: DON'T PANIC!

By the way, SETI@home staff: I really like the plan to have SETI@home and Astropulse on separate servers.


"That's no moon. It's a space station." -Obi-Wan Kenobi
...If there's a Galactic Empire out there with a Death Star that's about to destroy us all, SETI will find it.
ID: 971230
1mp0£173
Volunteer tester

Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 971263 - Posted: 18 Feb 2010, 17:47:37 UTC - in response to Message 971135.  

So how much is a second A/C unit installed?

Perhaps time to add up the thermal load and retire some hot equipment for some cooler equipment.

Yes, you need to get thermal cut out switches. As you have UPSes, that makes it much easier for a controlled shutdown.

Now if you could automate the door opening and a couple of big fans coming on ...


I don't think a second A/C system would be ideal. From what I remember hearing, power distribution/availability is already pretty much at maximum capacity as it is. Every time a new server is installed in the closet, it means one or two old ones being re-purposed elsewhere. Last I knew, there was still plenty of rack space, but it's a problem of power availability.

.... and there is the issue of where do you dump the heat?

A/C doesn't make cold, it absorbs heat on the cold side and dumps it into a heatsink someplace else.

The easiest type of installation would be a "ductless split" but you still have to route some refrigerant tubing between the two units, and there is a distance limit.

Campus provides the A/C, so they probably either take what Campus provides, or pay for the installation, and like the gigabit fiber up the hill, SETI@Home is perennially short on cash.

Load shedding (automatically powering down the servers) based on temperature is probably more practical.
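
A rough sketch of what temperature-based load shedding could look like, under stated assumptions: the server names, the two thresholds, and the "one machine per polling interval" policy are all invented for the example. The gap between the shed and restore thresholds is there so machines do not flap on and off around a single temperature.

```python
def plan_shedding(temp_c, servers, powered_on,
                  shed_above_c=32.0, restore_below_c=27.0):
    """Decide which servers should be on after this polling interval.

    servers: names ordered from least important to most important.
    powered_on: set of names that are currently running.
    Returns a new set; at most one server changes state per call.
    """
    on = set(powered_on)
    if temp_c >= shed_above_c:
        # Too hot: power down the least important server still running.
        for name in servers:
            if name in on:
                on.remove(name)
                break
    elif temp_c <= restore_below_c:
        # Cool again: bring back the most important server that is off.
        for name in reversed(servers):
            if name not in on:
                on.add(name)
                break
    # Between the two thresholds, nothing changes.
    return on


# Hypothetical priority list, least important first.
servers = ["scratch", "web", "db"]
running = plan_shedding(35.0, servers, {"scratch", "web", "db"})
print(sorted(running))  # ['db', 'web']
```

Shedding one machine at a time gives the room a chance to cool before the next, more important, server is touched.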
ID: 971263
1mp0£173
Volunteer tester

Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 971264 - Posted: 18 Feb 2010, 17:50:58 UTC - in response to Message 971136.  

I don't think power availability ever came into the equation.

Matt has said that there is a finite amount of power delivered to the closet. I don't know if the issue is the cost of a new branch circuit, or if there is some rule saying these closets come with a certain sized branch circuit.....

... but obviously, if they could pump more energy into the closet, at some point it'd be a fire hazard.

ID: 971264
1mp0£173
Volunteer tester

Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 971280 - Posted: 18 Feb 2010, 18:36:22 UTC - in response to Message 971098.  

The A/C died and it's too hot? It's winter, it's 25 degrees and snowing...open the windows. That'll cool you off.

Assuming the server room is near an outside wall and has openable windows, of course.

And assuming you want moist sea-air to enter your server room, wreaking havoc with all the electrics in there. ;-)

Campus is much farther from the ocean than my server room, which is kept cool by keeping the windows open.

This is much less expensive (and much greener) than A/C.
ID: 971280
Profile Peter Moss
Joined: 15 Nov 99
Posts: 14
Credit: 3,434,017
RAC: 12
United Kingdom
Message 971303 - Posted: 18 Feb 2010, 19:35:24 UTC - in response to Message 971280.  

I have almost 50 stuck items with - Upload Pending.

18/02/2010 18:19:20 SETI@home Reporting 1 completed tasks, not requesting new tasks
18/02/2010 18:19:42 Project communication failed: attempting access to reference site
18/02/2010 18:19:43 Internet access OK - project servers may be temporarily down.


These are UK times... Will they clear soon??
ID: 971303
Rick
Joined: 3 Dec 99
Posts: 79
Credit: 11,486,227
RAC: 0
United States
Message 971309 - Posted: 18 Feb 2010, 19:39:13 UTC - in response to Message 971303.  

I have almost 50 stuck items with - Upload Pending.

18/02/2010 18:19:20 SETI@home Reporting 1 completed tasks, not requesting new tasks
18/02/2010 18:19:42 Project communication failed: attempting access to reference site
18/02/2010 18:19:43 Internet access OK - project servers may be temporarily down.


These are UK times... Will they clear soon??


Hard to say. It could be soon or it could be a day or so. One of my systems got lucky about 30 minutes ago and was able to download a few tasks. My other system is still waiting for tasks. Best thing to do is just leave it alone and eventually things will get back to normal.
ID: 971309


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.