MundayWeb temporarily down

Message boards : Number crunching : MundayWeb temporarily down
Neil Munday

Joined: 10 Apr 01
Posts: 102
Credit: 244,709
RAC: 0
United Kingdom
Message 326329 - Posted: 4 Jun 2006, 18:02:10 UTC

Hi all,

Looks like the RAID controller in the new server has bombed. The array is currently being rebuilt and I hope the site will be back up in an hour or two.

Apologies on behalf of my host,

Neil.
ID: 326329 · Report as offensive
Profile Logan 5@SETI.USA
Joined: 7 May 01
Posts: 54
Credit: 1,275,043
RAC: 0
United States
Message 326333 - Posted: 4 Jun 2006, 18:12:54 UTC

Thanks for the heads up.... it's appreciated.


ID: 326333 · Report as offensive
KWSN Sir Clark
Volunteer tester

Joined: 17 Aug 02
Posts: 139
Credit: 1,002,493
RAC: 8
United Kingdom
Message 326466 - Posted: 4 Jun 2006, 20:48:43 UTC

I knew a smooth transition was too good to be true.
ID: 326466 · Report as offensive
Profile zoom3+1=4
Volunteer tester
Joined: 30 Nov 03
Posts: 65745
Credit: 55,293,173
RAC: 49
United States
Message 326470 - Posted: 4 Jun 2006, 20:51:07 UTC

Old man Murphy raises his head again. Oh well, life is nothing if it's not interesting. And it certainly keeps my interest. ;)
The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's
ID: 326470 · Report as offensive
Neil Munday

Joined: 10 Apr 01
Posts: 102
Credit: 244,709
RAC: 0
United Kingdom
Message 326488 - Posted: 4 Jun 2006, 21:07:50 UTC
Last modified: 4 Jun 2006, 21:22:56 UTC

Update:

Two of the drives in the RAID array bombed and as a result, the RAID array is still building.

It looks like the server will not be back until tomorrow. This also means that I can't access my e-mail - I will answer any e-mails as soon as I can.

Apologies again on behalf of my web host - hopefully, it's just a one-off!

Neil.
ID: 326488 · Report as offensive
Hans Dorn
Volunteer developer
Volunteer tester
Joined: 3 Apr 99
Posts: 2262
Credit: 26,448,570
RAC: 0
Germany
Message 326494 - Posted: 4 Jun 2006, 21:18:10 UTC - in response to Message 326488.  

> Update:
>
> Two of the drives in the RAID array bombed and as a result, the RAID array is still building.
>
> It looks like the server will not be back until tomorrow.
>
> Apologies again on behalf of my web host - hopefully, it's just a one-off!
>
> Neil.


Hi Neil, did you ever notice that single drives _might_ fail, but RAID arrays _will_ fail :o)

Regards Hans
ID: 326494 · Report as offensive
Neil Munday

Joined: 10 Apr 01
Posts: 102
Credit: 244,709
RAC: 0
United Kingdom
Message 326499 - Posted: 4 Jun 2006, 21:24:08 UTC - in response to Message 326494.  

> Hi Neil, did you ever notice that single drives _might_ fail, but RAID arrays _will_ fail :o)
>
> Regards Hans


Yep! Damn things!! ;-)

Neil.
ID: 326499 · Report as offensive
Hans Dorn
Volunteer developer
Volunteer tester
Joined: 3 Apr 99
Posts: 2262
Credit: 26,448,570
RAC: 0
Germany
Message 326504 - Posted: 4 Jun 2006, 21:26:40 UTC - in response to Message 326499.  

>> Hi Neil, did you ever notice that single drives _might_ fail, but RAID arrays _will_ fail :o)
>>
>> Regards Hans
>
> Yep! Damn things!! ;-)
>
> Neil.


I once killed one by setting the wrong SCSI ID on a replacement drive. D'Oh :o)

Regards Hans
ID: 326504 · Report as offensive
Neil Munday

Joined: 10 Apr 01
Posts: 102
Credit: 244,709
RAC: 0
United Kingdom
Message 327633 - Posted: 5 Jun 2006, 17:11:09 UTC

Another update...

The RAID array has been rebuilt and fsck is now being run. I have been told that the server will be back on t'Internet by midnight.

At the moment, MundayWeb.com is pointing to a temporary location. As a result, once the server is back up, it will take a few hours for the DNS changes to make their way around the world.

Apologies once again,

Neil.
ID: 327633 · Report as offensive
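
For anyone wanting to see which location the name currently resolves to from their own machine, here is a minimal sketch in Python. It is not something Neil or the host posted; the helper names `resolved_addresses` and `points_at` are made up for illustration, and the address in the usage note is a documentation placeholder, since the thread never gives the server's real address.

```python
import socket


def resolved_addresses(hostname, resolver=socket.getaddrinfo):
    """Return the set of IP addresses the local resolver gives for hostname.

    The resolver argument is injectable (an assumption made for this sketch)
    so the lookup can be replaced with a fake when testing.
    """
    infos = resolver(hostname, None)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address string for both IPv4 and IPv6.
    return {info[4][0] for info in infos}


def points_at(hostname, expected_ip, resolver=socket.getaddrinfo):
    """True if hostname currently resolves to expected_ip from this machine."""
    try:
        return expected_ip in resolved_addresses(hostname, resolver)
    except socket.gaierror:
        # The name did not resolve at all (e.g. a missing record).
        return False
```

Calling `points_at("mundayweb.com", "192.0.2.10")` would report whether the local resolver already returns the given address (192.0.2.10 is only a placeholder). Different resolvers cache the old answer for different lengths of time, which is why the change takes a few hours to be seen everywhere.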
Profile Lord_Vader
Joined: 7 May 05
Posts: 217
Credit: 10,386,105
RAC: 12
United States
Message 327636 - Posted: 5 Jun 2006, 17:12:52 UTC - in response to Message 327633.  

> Another update...
>
> The RAID array has been rebuilt and fsck is now being run. I have been told that the server will be back on t'Internet by midnight.
>
> At the moment, MundayWeb.com is pointing to a temporary location. As a result, once the server is back up, it will take a few hours for the DNS changes to make their way around the world.
>
> Apologies once again,
>
> Neil.


No apologies needed. We appreciate all that you do.

Thanks,
Vader



Fear will keep the local systems in line. Fear of this battle station. - Grand Moff Tarkin
ID: 327636 · Report as offensive
Profile Logan 5@SETI.USA
Joined: 7 May 01
Posts: 54
Credit: 1,275,043
RAC: 0
United States
Message 327640 - Posted: 5 Jun 2006, 17:16:20 UTC - in response to Message 327633.  

> The RAID array has been rebuilt and fsck is now being run. I have been told that the server will be back on t'Internet by midnight.
>
> At the moment, MundayWeb.com is pointing to a temporary location. As a result, once the server is back up, it will take a few hours for the DNS changes to make their way around the world.
>
> Apologies once again,
>
> Neil.

No need to keep apologizing mate....things that are out of anybody's control DO happen sometimes....you just had the unfortunate luck of them happening on a weekend when your host's datacenter had fewer staff to be able to react as quickly as they otherwise seem to be doing...

It's all good, besides, not having a BOINC sig for a while is actually kinda refreshing after all the 'back and forth' this weekend....
ID: 327640 · Report as offensive
Profile Nightbird
Volunteer tester

Joined: 2 Feb 03
Posts: 73
Credit: 53,523
RAC: 0
France
Message 327730 - Posted: 5 Jun 2006, 18:37:25 UTC - in response to Message 327633.  
Last modified: 5 Jun 2006, 18:39:30 UTC

> Another update...
>
> The RAID array has been rebuilt and fsck is now being run. I have been told that the server will be back on t'Internet by midnight.
>
> At the moment, MundayWeb.com is pointing to a temporary location. As a result, once the server is back up, it will take a few hours for the DNS changes to make their way around the world.
>
> Apologies once again,
>
> Neil.

Don't worry Neil, we know that you're doing your best. Your users will be patient and apologies are not needed. :)
ID: 327730 · Report as offensive
Profile Fuzzy Hollynoodles
Volunteer tester
Joined: 3 Apr 99
Posts: 9659
Credit: 251,998
RAC: 0
Message 327809 - Posted: 5 Jun 2006, 19:48:34 UTC - in response to Message 327633.  

> Another update...
>
> The RAID array has been rebuilt and fsck is now being run. I have been told that the server will be back on t'Internet by midnight.
>
> At the moment, MundayWeb.com is pointing to a temporary location. As a result, once the server is back up, it will take a few hours for the DNS changes to make their way around the world.
>
> Apologies once again,
>
> Neil.


No, it's ok. Things will be ok again. :-)



"I'm trying to maintain a shred of dignity in this world." - Me

ID: 327809 · Report as offensive
Profile D.J. Schweitz
Volunteer tester
Joined: 29 Oct 02
Posts: 157
Credit: 871,078
RAC: 0
United States
Message 327830 - Posted: 5 Jun 2006, 20:05:57 UTC

Stuff happens Neil, keeps life from getting dull and mundane (pun intended)
ID: 327830 · Report as offensive
Neil Munday

Joined: 10 Apr 01
Posts: 102
Credit: 244,709
RAC: 0
United Kingdom
Message 327988 - Posted: 5 Jun 2006, 23:06:11 UTC

Update from web host:

==================================

Update by Dominic (director of Thermal Degree, the hosting company): The array was rebuilt and is now running a consistency check. The RAID controller has been replaced, as well as one drive from the RAID10 array. Normally we would let the server rebuild its array in the background; however, this was a brand new server, so we're investigating fully before we begin to load it with more clients. Additionally, there is some corruption present, but this should be resolved with the consistency check and a final fsck.

All I can do is apologise because this is definitely not our normal service and we have never experienced downtime like this before. We're doing the best we can and have pointed Mundayweb to this temporary location for now, as we do with all websites if their server experiences any problems (which for what it's worth has happened just once in our time!)

==================================

ETA is now tomorrow morning (BST).

Neil.
ID: 327988 · Report as offensive
Berserker
Volunteer tester

Joined: 2 Jun 99
Posts: 105
Credit: 5,440,087
RAC: 0
United Kingdom
Message 328068 - Posted: 6 Jun 2006, 0:11:20 UTC

Yup, it's very rare for a hardware RAID array to bomb like this. The whole point of RAID is to protect against data loss. I know Dominic was excited about getting this server up and running, so this is an unwanted headache. Here's hoping the recovery goes OK.

We discovered that the DNS entry for boinc.mundayweb.com had gone missing, so that's been put back and pointed at the holding page.
Stats site - http://www.teamocuk.co.uk - still alive and (just about) kicking.
ID: 328068 · Report as offensive
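
To illustrate why a RAID 10 array is normally expected to ride out drive failures, here is a small sketch in Python. It assumes a hypothetical four-drive layout of two mirrored pairs striped together; the thread never says how many drives the real array had, and the drive names and function names are invented for this example. It models drive failures only, so it says nothing about a failing controller, which is what seems to have happened here.

```python
from itertools import combinations

# Hypothetical layout: two mirrored pairs, striped together (RAID 1+0).
MIRROR_PAIRS = [("d0", "d1"), ("d2", "d3")]


def raid10_survives(failed, mirror_pairs=MIRROR_PAIRS):
    """True if every mirror pair still has at least one working drive."""
    failed = set(failed)
    # The array is lost only when both members of some pair have failed.
    return all(not set(pair) <= failed for pair in mirror_pairs)


def two_drive_outcomes(mirror_pairs=MIRROR_PAIRS):
    """Map every possible two-drive failure to whether the array survives."""
    drives = [drive for pair in mirror_pairs for drive in pair]
    return {
        combo: raid10_survives(combo, mirror_pairs)
        for combo in combinations(drives, 2)
    }


if __name__ == "__main__":
    for combo, ok in two_drive_outcomes().items():
        print(combo, "survives" if ok else "array lost")
```

Under these assumptions, any single-drive failure is survivable, and a two-drive failure is fatal only when both drives belong to the same mirror pair.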
beansprouts

Joined: 10 Nov 01
Posts: 4
Credit: 479,455
RAC: 0
United Kingdom
Message 328170 - Posted: 6 Jun 2006, 1:01:26 UTC - in response to Message 327640.  

Didn't see this thread - I would've replied sooner but I've been working on...this :)

Anyone want a RAID controller? I wonder if it survived being thrown around. And while you're at it you can have a Maxtor hard disk, too. Both bits got swapped out earlier (but not really thrown around.)

Maybe dodgy servers are a curse of being near Seti! Either way, we're doing the best we can...I would be running Mundayweb from another server except, of course, this being a new server I hadn't set up the offsite backups - silly me thought the spanking new array would hold out for a week doing on-server backups while I set up more immediate things. How wrong I was...

> No need to keep apologizing mate....things that are out of anybody's control DO happen sometimes....you just had the unfortunate luck of them happening on a weekend when your host's datacenter had fewer staff to be able to react as quickly as they otherwise seem to be doing...

Nah, weekend didn't make any difference really. There's still folk around :)

Well, I personally was off t'internet relaxing but hey, part of the job.

> Hi Neil, did you ever notice that single drives _might_ fail, but RAID arrays _will_ fail :o)
>
> Regards Hans

That's a good one! But it's when expensive controllers randomly break the redundant RAID10 array that's really annoying :/

Sorry about the missing DNS records - I flipped mundayweb's DNS to a different server but forgot that I hadn't re-created the subdomains. We did notice the graph spikes but I thought I'd done the DNS so put it down to another site. I'm also racking my brains for the others and think I have 'em all - they should all go to the temporary page now, and the ones I've missed will go to the server's default page.

When I wake up I hope it'll be fixed....also, I think Neil's adjusting the site to use static images which will make it much easier to juggle around in future.

Dominic / Dom / Beansprout
ID: 328170 · Report as offensive
Berserker
Volunteer tester

Joined: 2 Jun 99
Posts: 105
Credit: 5,440,087
RAC: 0
United Kingdom
Message 330159 - Posted: 7 Jun 2006, 22:16:29 UTC

It proved impossible to restore the server to working condition with the existing software, so it was wiped clean and the operating system reinstalled. Hopefully it won't take more than a day or two from here, but that depends on how much work Neil and Dominic need to do, and whether the server continues to behave.
Stats site - http://www.teamocuk.co.uk - still alive and (just about) kicking.
ID: 330159 · Report as offensive
Neil Munday

Joined: 10 Apr 01
Posts: 102
Credit: 244,709
RAC: 0
United Kingdom
Message 330169 - Posted: 7 Jun 2006, 22:24:39 UTC

Hi all,

The server is back online. It would seem that most of the data can be recovered, apart from the Seti database. Luckily, I can recreate most of it from the back-ups from MundayWeb's previous hosting company.

As I type, Beansprout is restoring the data. The PHP config on the server needs sorting too.

It could be a long night - I hope the back-up works, as it took me 6 hours to get the site running last time due to all the configuration I had to do (MySQL dbs mainly, subdomains etc).

Correction: just hit refresh in Firefox and the site has come to life! Right... better get back to restoring the Seti database...

The DNS has been set to the new server again, so hopefully the site will be available to all.

Neil.
ID: 330169 · Report as offensive
Neil Munday

Joined: 10 Apr 01
Posts: 102
Credit: 244,709
RAC: 0
United Kingdom
Message 330198 - Posted: 7 Jun 2006, 22:52:02 UTC

Correction:

Pretty much all of the databases are corrupt in some way. Most have been partially restored and I'm patching them with back-ups I have.

There could be the odd bug as a result.

Neil.
ID: 330198 · Report as offensive


 