Out of the Frying Pan (Feb 17 2010) |
Message boards : Technical News : Out of the Frying Pan (Feb 17 2010)
| Author | Message |
|---|---|
|
Well, shoot. Right at the end of the work day yesterday the air conditioning unit failed. What's worse is that the cause is still a complete mystery. When the campus A/C techs came up in the early evening they just pressed the reset button and it came back to life. | |
| ID: 970983 · | |
|
Thanks for the update Matt. | |
| ID: 970987 · | |
|
Thanks Matt and all the rest of the crew too. | |
| ID: 970989 · | |
|
Thanks Matt. | |
| ID: 970996 · | |
"smelled burned plastic, heard broken fans"
How hot was it in there? Don't the systems shut down automatically when they overheat? | |
| ID: 970998 · | |
|
Matt,
17/02/2010 23:09:28|SETI@home|[file_xfer] Started upload of file 25fe07ac.28421.12751.16.10.119_1_0
17/02/2010 23:09:29||[http_debug] [ID#14] info: About to connect() to setiboincdata.ssl.berkeley.edu port 80 (#0)
17/02/2010 23:09:29||[http_debug] [ID#14] info: Trying 208.68.240.16...
17/02/2010 23:09:29||[http_debug] [ID#14] info: Connected to setiboincdata.ssl.berkeley.edu (208.68.240.16) port 80 (#0)
17/02/2010 23:09:29||[http_debug] [ID#14] Sent header to server: POST /sah_cgi/file_upload_handler HTTP/1.1
User-Agent: BOINC client (windows_intelx86 5.10.13)
Host: setiboincdata.ssl.berkeley.edu
Accept: */*
Accept-Encoding: deflate, gzip
Content-Type: application/x-www-form-urlencoded
Content-Length: 288
17/02/2010 23:09:29||[http_debug] [ID#14] Received header from server: HTTP/1.0 503 Service Unavailable
17/02/2010 23:09:29||[http_debug] [ID#14] Received header from server: Content-Type: text/html
17/02/2010 23:09:29||[http_debug] [ID#14] Received header from server: Content-Length: 53
17/02/2010 23:09:29||[http_xfer_debug] HTTP: wrote 53 bytes
17/02/2010 23:09:29||[http_debug] [ID#14] info: Expire cleared
17/02/2010 23:09:29||[http_debug] [ID#14] info: Closing connection #0
17/02/2010 23:09:30|SETI@home|[file_xfer] Temporarily failed upload of 25fe07ac.28421.12751.16.10.119_1_0: http error
That HTTP/1.0 503 Service Unavailable suggests something might still need kicking. | |
| ID: 971000 · | |
|
For a while my job depended on a system cooled by an air conditioner that I could not depend on. My solution was to get one of these and wire it into an extension cord so I could connect all the non-replaceable equipment to it. I then set it to about 80 F and had no worries about failed hardware. The catch is you must make sure your backups are up to date, as the power down will be very hard, and in my case the RAID often lost a drive when it was powered down (very old drives). | |
| ID: 971002 · | |
"For a while my job depended on a system cooled by an air conditioner that I could not depend on. My solution was to get one of these and wire it into an extension cord so I could connect all the non-replaceable equipment to it. I then set it to about 80 F and had no worries about failed hardware. The catch is you must make sure your backups are up to date, as the power down will be very hard, and in my case the RAID often lost a drive when it was powered down (very old drives)."
Plug a UPS into it that has the ability to trigger a graceful shutdown of the systems when the power fails. So long as the UPS has the capacity to keep power to the systems during the shutdown you should be in good shape. | |
| ID: 971007 · | |
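As a rough illustration of the capacity point in the post above (nothing here comes from the thread itself), the check boils down to comparing estimated battery runtime against the time a clean shutdown takes. The function names, the 85% inverter efficiency, and the 1.5x safety margin are all assumptions picked for this sketch; real runtime depends on battery age and the vendor's discharge curve.

```python
def estimated_runtime_s(battery_wh: float, load_w: float, efficiency: float = 0.85) -> float:
    """Very rough runtime estimate in seconds.

    battery_wh: usable battery energy in watt-hours (assumed figure).
    load_w:     power drawn by the attached systems in watts.
    efficiency: assumed inverter efficiency; 0.85 is a guess, not a spec.
    """
    if load_w <= 0:
        raise ValueError("load_w must be positive")
    return battery_wh * efficiency / load_w * 3600.0


def ups_covers_shutdown(battery_wh: float, load_w: float, shutdown_s: float,
                        efficiency: float = 0.85, margin: float = 1.5) -> bool:
    """True if the estimated runtime exceeds the shutdown time by the safety margin."""
    runtime = estimated_runtime_s(battery_wh, load_w, efficiency)
    return runtime >= shutdown_s * margin
```

For example, `ups_covers_shutdown(100, 600, 300)` asks whether a hypothetical 100 Wh battery can carry a 600 W load through a 5 minute shutdown with margin to spare.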
"... they just pressed the reset button..."
It's the Microsoft way... and heck, it works more often than one would think ;) | |
| ID: 971008 · | |
"For a while my job depended on a system cooled by an air conditioner that I could not depend on. My solution was to get one of these (...)"
I think there are more than enough software-based solutions that will cleanly power down the system if something is overheating. Alternatively, if software is not an option, one could try to simulate pressing the power button. That will also shut the system down gracefully. | |
| ID: 971011 · | |
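A minimal sketch of the software-based approach suggested above, assuming a temperature reading is available through some callable (sensor access is hardware specific and is not shown). The names `watch` and `linux_shutdown`, the 27 C limit (roughly the 80 F mentioned earlier in the thread), and the three-readings-in-a-row rule are all illustrative assumptions, not anything the project is known to run.

```python
import subprocess
import time
from typing import Callable


def watch(read_temp_c: Callable[[], float],
          shut_down: Callable[[], None],
          limit_c: float = 27.0,
          needed: int = 3,
          interval_s: float = 30.0,
          sleep: Callable[[float], None] = time.sleep) -> None:
    """Poll the temperature and call shut_down() once it has been above
    limit_c for `needed` consecutive readings, then return.

    Requiring several readings in a row avoids tripping on a single
    sensor glitch. `sleep` is injectable so the loop can be tested.
    """
    streak = 0
    while True:
        streak = streak + 1 if read_temp_c() > limit_c else 0
        if streak >= needed:
            shut_down()
            return
        sleep(interval_s)


def linux_shutdown() -> None:
    """Graceful halt. Assumes a Linux host and sufficient privileges."""
    subprocess.run(["shutdown", "-h", "now"], check=True)
```

In use this would be something like `watch(read_sensor, linux_shutdown)`, where `read_sensor` is a hypothetical function returning degrees Celsius from whatever sensor the machine exposes.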
"For a while my job depended on a system cooled by an air conditioner that I could not depend on. My solution was to get one of these and wire it into an extension cord so I could connect all the non-replaceable equipment to it. I then set it to about 80 F and had no worries about failed hardware. The catch is you must make sure your backups are up to date, as the power down will be very hard, and in my case the RAID often lost a drive when it was powered down (very old drives)."
It was a P390 running OS/2 Warp and VM/ESA. It was so old it didn't have any idea what a smart UPS was. The hard drive failure would happen just because it stopped turning. On the other hand, I would have to do a cold start on VM/ESA, but we never lost a byte of data with that setup. I am not sure other operating systems would be as forgiving, so I provided a warning. We did have a UPS, but its main function was to filter power glitches. One danger of putting the switch on the UPS is that additional heat will be generated while the UPS reaches its shutdown point. My room was not much larger than a closet, so when things overheated they needed to be shut down fast. The system was up 24 hours a day and would often be unattended, so the failure would most likely happen when no one was around to lay hands on the system. | |
| ID: 971012 · | |
"When the campus A/C techs came up in the early evening they just pressed the reset button and it came back to life."
I sure hope you took note of where the reset switch is... Cricket still shows little activity, so you must still be down. Some of my rigs are out of work for the GPUs and others will be out soon (hours)... | |
| ID: 971015 · | |
I remember that box!! <g> In my 'previous life' we were running one of those and we had a 'UPS on steroids' that would power the machine for, I think, 2 hours. It might even have powered our 'server farm', but that was 6.5 years ago and my memory is iffy. Doug | |
| ID: 971017 · | |
|
Thanks Matt. I was sure worried about why it's been off so long. Thank you for your work there. | |
| ID: 971053 · | |
"For a while my job depended on a system cooled by an air conditioner that I could not depend on. My solution was to get one of these and wire it into an extension cord so I could connect all the non-replaceable equipment to it. I then set it to about 80 F and had no worries about failed hardware. The catch is you must make sure your backups are up to date, as the power down will be very hard, and in my case the RAID often lost a drive when it was powered down (very old drives)."
Most anything semi-modern supports some sort of "dumb" signaling from a UPS. It uses a normal serial port, and only the handshake lines. A line goes "low" to signal "low battery", and the UPS waits for the system to drop a handshake line back when it is safe for the UPS to turn off. One could build a "UPS" whose only job was to signal low battery when the temperature rose above a set threshold, and kill power when the system said "okay." Power would be restored when it got cold enough. Or not. | |
| ID: 971079 · | |
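The thermal "UPS" idea above can be sketched as a small three-state machine. This is only a model of the logic: it does not touch a real serial port, the line polarity is not represented, and the class name, state names, and the 27 C / 22 C thresholds are assumptions made up for the illustration.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    ON = "on"                  # power relay closed, no alarm
    SIGNALLING = "signalling"  # "low battery" asserted, waiting for the host
    OFF = "off"                # power relay open


@dataclass
class ThermalUps:
    trip_c: float = 27.0     # assumed trip temperature
    restore_c: float = 22.0  # assumed restore temperature (hysteresis gap)
    state: State = State.ON

    def step(self, temp_c: float, host_ok: bool) -> State:
        """Advance one tick. host_ok means the host has dropped its
        handshake line to say it is safe to cut power."""
        if self.state is State.ON and temp_c > self.trip_c:
            self.state = State.SIGNALLING
        elif self.state is State.SIGNALLING and host_ok:
            # Once signalling starts we wait for the host even if the room
            # cools, since the host is already shutting down.
            self.state = State.OFF
        elif self.state is State.OFF and temp_c < self.restore_c:
            self.state = State.ON
        return self.state

    @property
    def low_battery_asserted(self) -> bool:
        return self.state is State.SIGNALLING

    @property
    def power_on(self) -> bool:
        return self.state is not State.OFF
```

Keeping the restore temperature below the trip temperature stops the relay from chattering when the room hovers near the limit. Dropping the `OFF` to `ON` transition gives the "Or not" variant, where someone has to restore power by hand.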
|
The smart-ass in me made me write this..... | |
| ID: 971080 · | |
|
Blasted A/C! Now we're out of the frying pan, can we just avoid the fire this time? ;-) | |
| ID: 971082 · | |
"The smart-ass in me made me write this....."
Not from that part of the world, but I don't think it snows too often at Berkeley. And most server rooms don't have windows, let alone ones that open. ____________ Grant Darwin NT. | |
| ID: 971089 · | |
|
Nice to hear everything is almost back to normal. It's unfortunate that a lot of work units were aborted while trying to upload because their deadline had passed during the downtime. I have a feeling more will be aborted, as they still can't be uploaded. | |
| ID: 971093 · | |
"The A/C died and it's too hot? It's winter, it's 25 degrees and snowing...open the windows. That'll cool you off."
Assuming the server room is near an outside wall and has openable windows, of course. | |
| ID: 971095 · | |
| Copyright © 2013 University of California |