We are now mostly recovered from a campus-wide power outage.
Author | Message |
---|---|
Jeff Cobb Send message Joined: 1 Mar 99 Posts: 122 Credit: 40,367 RAC: 0 |
Last night there was a power outage that affected the entire Berkeley campus. The data center, where our servers are located, does have a facility-wide UPS, so all servers stayed up. Unfortunately, the data center air conditioning did not stay up. Machines were getting hot, so the data center staff had to bring them down. |
Jeff Cobb Send message Joined: 1 Mar 99 Posts: 122 Credit: 40,367 RAC: 0 |
All of our servers seem to be OK with exception of marvin, the AstroPulse database server. Marvin appears to have been rendered nonfunctional, although we have reason to believe that the disks are OK. In any case, we have a backup of the database. We will be bringing marvin back to the lab tomorrow for a postmortem. |
QSilver Send message Joined: 26 May 99 Posts: 232 Credit: 6,452,764 RAC: 0 |
Thanks for the updates, Jeff. |
Jord Send message Joined: 9 Jun 99 Posts: 15184 Credit: 4,362,181 RAC: 3 |
Welcome back! Ah, and you do know it'll only be temporary? That on Thursday California will rattle to a 9.7 Richter earthquake? (moves hands in a conjuring manner) ;-) |
David S Send message Joined: 4 Oct 99 Posts: 18352 Credit: 27,761,924 RAC: 12 |
Thanks for the info, Jeff. Is the effort to revive marvin the reason the outage was longer than it has usually been lately? [edit] Well, okay, I know you had to bring all the servers up, but since they were shut down properly (right?) that shouldn't have been too bad. David Sitting on my butt while others boldly go, Waiting for a message from a small furry creature from Alpha Centauri. |
Donald L. Johnson Send message Joined: 5 Aug 02 Posts: 8240 Credit: 14,654,533 RAC: 20 |
Welcome back! My Wiccan friends on Mt. Tam say "Nay, nay, moosebreath". But I have both my cars gassed up and ready to head up the hill if need be. Donald Infernal Optimist / Submariner, retired |
Thomas Send message Joined: 9 Dec 11 Posts: 1499 Credit: 1,345,576 RAC: 0 |
Thanks for the heads-up Jeff ! :) Glad to see everyone ! From France about this out(r)age >> http://www.meltycampus.fr/etats-unis-explosion-sur-le-campus-de-berkeley-video-a215177.html :( |
rob smith Send message Joined: 7 Mar 03 Posts: 22506 Credit: 416,307,556 RAC: 380 |
"All of our servers seem to be OK with exception of marvin, the AstroPulse database server. Marvin appears to have been rendered nonfunctional, although we have reason to believe that the disks are OK. In any case, we have a backup of the database. We will be bringing marvin back to the lab tomorrow for a postmortem." Thanks for the news, and your efforts. Your description of performing a post-mortem brings all sorts of strange images to mind involving yourself, Eric, Matt and surgical garb.... I hope it goes well, and doesn't turn out to be a post-mortem, but a successful surgical intervention leading to a complete recovery by the patient. Bob Smith Member of Seti PIPPS (Pluto is a Planet Protest Society) Somewhere in the (un)known Universe? |
Mr. Kevvy Send message Joined: 15 May 99 Posts: 3804 Credit: 1,114,826,392 RAC: 3,319 |
"Last night there was a power outage that affected the entire Berkeley campus." Considering that both NASA and the NSF are down right now due to the Fed shutdown, this outage was more than a little disturbing at the time, and I'm glad it was only a coincidence! Hope the NSF downtime doesn't impact this project. "All of our servers seem to be OK with exception of marvin, the AstroPulse database server." And to add to the coincidence, there's the NSF-funded part of the project, too. |
S@NL Etienne Dokkum Send message Joined: 11 Jun 99 Posts: 212 Credit: 43,822,095 RAC: 0 |
Thanks for keeping us up to date, Jeff! Best of luck bringing Marvin back from the afterlife... Hope to hear good news soon! |
kittyman Send message Joined: 9 Jul 00 Posts: 51477 Credit: 1,018,363,574 RAC: 1,004 |
Good luck with Marvin, but........ I gotta ask. Isn't this the kind of thing that the colo was supposed to prevent? I thought along with better power conditioning, better cooling, and better bandwidth, the other part of the puzzle was 24/7 babysitting. Seems they failed in that last regard, and let temps get too high before they started to shut servers down. They surely knew that although the UPS systems would keep all the servers going, the AC was down. In all other regards, the move to the colo has indeed been a very, very good thing for the project. I am just wondering why things went astray in this particular crisis situation. It's what they are paid to protect, isn't it? "Time is simply the mechanism that keeps everything from happening all at once." |
KWSN THE Holy Hand Grenade! Send message Joined: 20 Dec 05 Posts: 3187 Credit: 57,163,290 RAC: 0 |
Are the problems with Marvin preventing Beta from coming up? If not, could someone please start it??? . Hello, from Albany, CA!... |
KWSN THE Holy Hand Grenade! Send message Joined: 20 Dec 05 Posts: 3187 Credit: 57,163,290 RAC: 0 |
"Are the problems with Marvin preventing Beta from coming up? If not, could someone please start it???" Not if you look at Beta's "Server Status" page... . Hello, from Albany, CA!... |
Claggy Send message Joined: 5 Jul 99 Posts: 4654 Credit: 47,537,079 RAC: 4 |
"Are the problems with Marvin preventing Beta from coming up? If not, could someone please start it???" Seti Beta's Server Status page has looked like that for years. The only entries that ever show as green are: data-driven web pages, vote_monitor, and splitter_throttle. Sometimes a splitter will show as green, but since Seti Beta is small scale, not a lot of WUs need to be split, so the splitters mostly stay orange or red. The rest don't point to the actual process being run; even with beta_validate_v7 showing red, v7 WUs still validate there: server status. The last conversation I had with Eric about the Server Status page: Beta is running on Bruno. The scripts that control the daemons for the various services were conflicting with Seti Main. By disabling the reporting portion of the scripts they could run the service daemons, but the Beta status page would not reflect what is or is not running. Claggy |
Cornhusker Send message Joined: 20 Apr 09 Posts: 41 Credit: 45,415,265 RAC: 37 |
I still think you guys need to run an extension cord out here to Nebraska! :) |
David S Send message Joined: 4 Oct 99 Posts: 18352 Credit: 27,761,924 RAC: 12 |
I would have expected that by now, they would have the AP science database up and running on some other machine, if marvin is beyond repair. David Sitting on my butt while others boldly go, Waiting for a message from a small furry creature from Alpha Centauri. |
Wiggo Send message Joined: 24 Jan 00 Posts: 36619 Credit: 261,360,520 RAC: 489 |
Well it's showing as running again and likely performing a catch up before turning the other associated functions on. Cheers. |
Wiggo Send message Joined: 24 Jan 00 Posts: 36619 Credit: 261,360,520 RAC: 489 |
AP's are now validating. Cheers. |
David S Send message Joined: 4 Oct 99 Posts: 18352 Credit: 27,761,924 RAC: 12 |
Can we get a report on what was found to be wrong with marvin and what was done about it? David Sitting on my butt while others boldly go, Waiting for a message from a small furry creature from Alpha Centauri. |
rspolo Send message Joined: 11 Oct 99 Posts: 3 Credit: 7,697,331 RAC: 0 |
I have been supporting SETI for more than 15 years and have seen this type of issue more than once. Berkeley is supposed to have smart people running the show. What engineer in their right mind would specify a UPS and/or gen set that could not handle environmental support along with the servers? Lights and temperature are just as important as the servers, unless the system is only designed to supply power while the servers shut down automatically. |
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.