server / AC status

Jeff Cobb Project Donor
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Mar 99
Posts: 122
Credit: 40,367
RAC: 0
United States
Message 1033125 - Posted: 15 Sep 2010, 20:21:10 UTC

This has been a very difficult several days and we are still far from out of the woods.

This past Saturday morning the air conditioning in our server closet started acting up, apparently cycling on and off. Around noon that day, we deemed it bad enough to come to the lab. It's a good thing, because the AC was completely down when we got here. We shut most machines down and restarted the AC. It seemed to hold. But later that day our monitors showed the temperature increasing again, even with a small number of machines running. We came back to the lab and shut down everything except the web servers. That small load is OK even with no AC.

That's the way it has been, off and on, since. The physical plant people have been here several times. They have been doing a good job, even though low staffing levels have cut into the time that they can give us. The current diagnosis is that the AC has a bad condenser fan. Now it is a matter of getting the part, which is not trivial, unfortunately. In the meantime, they rigged up a piggyback fan, which did help some. Just not enough to run the project.

We're hoping that the new fan gets here soon.
ID: 1033125
Dirk Sadowski
Volunteer tester
Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 1033137 - Posted: 15 Sep 2010, 21:10:51 UTC - in response to Message 1033125.  

Jeff, thanks for the news.

A pity..

..I also hope the fan will be delivered and installed soon. ;-)

ID: 1033137
Fred J. Verster
Volunteer tester
Joined: 21 Apr 04
Posts: 3252
Credit: 31,903,643
RAC: 0
Netherlands
Message 1033140 - Posted: 15 Sep 2010, 21:18:16 UTC - in response to Message 1033137.  

Hello Jeff, thanks for the update, glad no real damage has occurred!
Hats off for coming in to the lab at the weekend to check that everything is OK :)

Hope that the upload servers will be online soon.
Regards
Fred.

ID: 1033140
Gary Charpentier Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor
Volunteer tester
Joined: 25 Dec 00
Posts: 30985
Credit: 53,134,872
RAC: 32
United States
Message 1033145 - Posted: 15 Sep 2010, 21:29:30 UTC

Thanks for the update, Jeff. Maybe you need to add an ambient temp to the server stats page! :)


ID: 1033145
alan
Joined: 18 Feb 00
Posts: 131
Credit: 401,606
RAC: 0
United Kingdom
Message 1033147 - Posted: 15 Sep 2010, 21:30:55 UTC

Fingers crossed for you here.

Back when I ran VAXes in an air-conditioned computer room (I know, I'm showing my age), we had temperature sensing attached to the UPS, which could be set to signal the dependent servers to shut down if a temperature emergency arose (too high for too long).

I'd be surprised if there was not something available relatively cheaply nowadays to give a warning and signal the servers to shut down gracefully if a temperature emergency arises. It would save you having to dash into work and would be a safeguard for your data integrity too, removing the need for you guys to jump when the rude mechanics say "frog".
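
The "too high for too long" rule can be sketched in a few lines of Python. This is only an illustration, not the logic of any particular UPS or monitoring product: the class name `OverTempTrigger`, the 35 °C limit, and the 10-minute hold time are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OverTempTrigger:
    """Fires once the temperature has stayed above a limit for long enough."""

    limit_c: float                  # hypothetical threshold in degrees Celsius
    hold_seconds: float             # how long the reading must stay above it
    _since: Optional[float] = None  # time at which the current excursion began

    def update(self, temp_c: float, now: float) -> bool:
        """Feed one reading; return True when a shutdown should be signalled."""
        if temp_c <= self.limit_c:
            # Back in range: forget any excursion in progress.
            self._since = None
            return False
        if self._since is None:
            # First reading above the limit: start the clock.
            self._since = now
        return now - self._since >= self.hold_seconds


# Example: 35 C limit, must persist for 10 minutes (both values are assumptions).
trigger = OverTempTrigger(limit_c=35.0, hold_seconds=600.0)
```

Calling `trigger.update(reading, time.monotonic())` on each poll would give a single yes/no answer that a script could use to start a graceful shutdown. A brief dip back below the limit resets the timer, so a short spike does not take the servers down.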
ID: 1033147
Berserker
Volunteer tester
Joined: 2 Jun 99
Posts: 105
Credit: 5,440,087
RAC: 0
United Kingdom
Message 1033161 - Posted: 15 Sep 2010, 22:10:51 UTC - in response to Message 1033147.  
Last modified: 15 Sep 2010, 22:11:36 UTC

I'd be surprised if there was not something available relatively cheaply nowadays to give a warning and signal the servers to shut down gracefully if a temperature emergency arises. It would save you having to dash into work and would be a safeguard for your data integrity too, removing the need for you guys to jump when the rude mechanics say "frog".

How about free?

Most systems have temperature monitoring built in these days, and many Linux distributions have the tools either available or built in.

The problem of course is that it's an inexact science. Monitoring a CPU doesn't help you if a hard disk gets too hot (and they tend to be the first to 'die' now as modern CPUs usually have built-in thermal protection).

Then there's the lack of standards. There's a communication standard (I2C), but no standard as to what sensors should be implemented or what values represent a particular temperature. I got caught out by this - implemented thermal monitoring on a system which promptly shut itself down because the seemingly reasonable high temperature limit I'd set was misinterpreted as a value that was lower than the ambient temperature in the server room.
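
One way to avoid that kind of self-inflicted shutdown is to sanity-check the limit against a live reading before arming anything. The sketch below assumes a Linux hwmon sensor file, which reports millidegrees Celsius; the function names, the example path, and the 5-degree margin are hypothetical, and other sensor interfaces may use different scales.

```python
from pathlib import Path


def read_hwmon_celsius(path: str) -> float:
    """Read a Linux hwmon temp*_input file (millidegrees C) as degrees C."""
    return int(Path(path).read_text().strip()) / 1000.0


def limit_is_plausible(limit_c: float, current_c: float,
                       margin_c: float = 5.0) -> bool:
    """Accept a shutdown limit only if it sits clearly above the current reading.

    A limit at or below what the sensor reports right now would trigger an
    immediate shutdown, which usually means the units or scale were misread.
    """
    return limit_c >= current_c + margin_c


# Example use; the sensor path varies from machine to machine (assumption):
# current = read_hwmon_celsius("/sys/class/hwmon/hwmon0/temp1_input")
# if not limit_is_plausible(70.0, current):
#     raise SystemExit("refusing to arm: limit is too close to current reading")
```

The check does not make the sensor values any more standard, but it turns a misinterpreted limit into a refusal to start monitoring instead of an unplanned shutdown.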

Anyway, glad to see the project survived the AC woes - that was too close for comfort. Thanks to the team for being on top of the problem even at the weekend.
Stats site - http://www.teamocuk.co.uk - still alive and (just about) kicking.
ID: 1033161
zoom3+1=4
Volunteer tester
Joined: 30 Nov 03
Posts: 66298
Credit: 55,293,173
RAC: 49
United States
Message 1033198 - Posted: 16 Sep 2010, 2:49:14 UTC

Thanks for the news, Jeff.
Savoir-Faire is everywhere!
The T1 Trust, T1 Class 4-4-4-4 #5550, America's First HST

ID: 1033198
Jeff Cobb Project Donor
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Mar 99
Posts: 122
Credit: 40,367
RAC: 0
United States
Message 1033202 - Posted: 16 Sep 2010, 3:30:56 UTC

We're taking the project off line for the night, partly (mostly) for temperature concerns, but also to let the back end queues drain and give more I/O to the thumper root mirror re-sync that became necessary.

ID: 1033202
Gary Charpentier Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor
Volunteer tester
Joined: 25 Dec 00
Posts: 30985
Credit: 53,134,872
RAC: 32
United States
Message 1033203 - Posted: 16 Sep 2010, 3:35:02 UTC - in response to Message 1033202.  

We're taking the project off line for the night, partly (mostly) for temperature concerns, but also to let the back end queues drain and give more I/O to the thumper root mirror re-sync that became necessary.

Thanks for the heads up.

ID: 1033203
kittyman Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51477
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1033270 - Posted: 16 Sep 2010, 14:53:28 UTC
Last modified: 16 Sep 2010, 14:59:50 UTC

Thanks for the updated information Jeff.

I do find it hard to believe that the condenser fan is hard to get.....
Most AC equipment uses fairly standardized motors, unless this unit is either very old or a one-off odd duck.
Grainger has about every motor used in the refrigeration field in their online catalog. HVAC motors...
"Time is simply the mechanism that keeps everything from happening all at once."

ID: 1033270
Jeff Cobb Project Donor
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Mar 99
Posts: 122
Credit: 40,367
RAC: 0
United States
Message 1033325 - Posted: 16 Sep 2010, 21:41:18 UTC

OK then, I think we may be up. The condenser fan motor was replaced today and we brought the projects up. It's good that we can test under load while we're still at the lab.

Assuming that there are no problems, we'll just call this the start of the server run for this week.
ID: 1033325
Gary Charpentier Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor
Volunteer tester
Joined: 25 Dec 00
Posts: 30985
Credit: 53,134,872
RAC: 32
United States
Message 1033343 - Posted: 16 Sep 2010, 22:23:21 UTC

Thanks for the update. Hope the baling wire and duct tape hold! :)


ID: 1033343
TheFreshPrince a.k.a. BlueTooth76
Joined: 4 Jun 99
Posts: 210
Credit: 10,315,944
RAC: 0
Netherlands
Message 1033345 - Posted: 16 Sep 2010, 22:27:42 UTC

My queues were empty and I have about 3500 finished WUs to upload...
I hope I don't crash the servers :P
Rig name: "x6Crunchy"
OS: Win 7 x64
MB: Asus M4N98TD EVO
CPU: AMD X6 1055T 2.8(1,2v)
GPU: 2x Asus GTX560ti
Member of: Dutch Power Cows
ID: 1033345
Dirk Sadowski
Volunteer tester
Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 1033398 - Posted: 17 Sep 2010, 0:31:58 UTC - in response to Message 1033325.  
Last modified: 17 Sep 2010, 0:32:48 UTC

Jeff, thanks for the news!

It's nice to see everything green again on the 'server status page'.. ;-)
ID: 1033398
B-Man
Volunteer tester
Joined: 11 Feb 01
Posts: 253
Credit: 147,366
RAC: 0
United States
Message 1033413 - Posted: 17 Sep 2010, 1:13:27 UTC

Any news on when the Beta project will come online? I can't get any of it to go.
ID: 1033413
Claggy
Volunteer tester
Joined: 5 Jul 99
Posts: 4654
Credit: 47,537,079
RAC: 4
United Kingdom
Message 1033429 - Posted: 17 Sep 2010, 2:25:32 UTC - in response to Message 1033413.  

Any news on when the Beta project will come online? I can't get any of it to go.


I can't get any Seti Beta uploads to go through. I can connect to the scheduler, but that doesn't help when uploads won't go through.

Claggy
ID: 1033429
B-Man
Volunteer tester
Joined: 11 Feb 01
Posts: 253
Credit: 147,366
RAC: 0
United States
Message 1033449 - Posted: 17 Sep 2010, 3:46:44 UTC - in response to Message 1033429.  

Any news on when the Beta project will come online? I can't get any of it to go.


I can't get any Seti Beta uploads to go through. I can connect to the scheduler, but that doesn't help when uploads won't go through.

Claggy

I managed to pull down one WU. I have one still stuck in upload mode.
ID: 1033449
Dirk Sadowski
Volunteer tester
Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 1033583 - Posted: 17 Sep 2010, 15:51:34 UTC

What is the current 'WU limit in progress'?

It looks like much less than was mentioned here: 'server run, September 3-6 2010'.

OTOH, there are nearly 'no new WUs' available for DL.

ID: 1033583
Donald L. Johnson
Joined: 5 Aug 02
Posts: 8240
Credit: 14,654,533
RAC: 20
United States
Message 1033587 - Posted: 17 Sep 2010, 16:00:17 UTC - in response to Message 1033583.  

What is the current 'WU limit in progress'?

It looks like much less than was mentioned here: 'server run, September 3-6 2010'.

OTOH, there are nearly 'no new WUs' available for DL.


I think, OTOH, you answered your own question.


Donald
Infernal Optimist / Submariner, retired
ID: 1033587
Dirk Sadowski
Volunteer tester
Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 1033588 - Posted: 17 Sep 2010, 16:05:50 UTC - in response to Message 1033587.  

What is the current 'WU limit in progress'?

It looks like much less than was mentioned here: 'server run, September 3-6 2010'.

OTOH, there are nearly 'no new WUs' available for DL.

I think, OTOH, you answered your own question.


No.. these are 'two different pairs of shoes'..


Message from server: No work sent
Message from server: This computer has reached a limit on tasks in progress


Scheduler request completed: got 0 new tasks
Message from server: Project has no tasks available

ID: 1033588


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.