MundayWeb temporarily down
Author | Message |
---|---|
Neil Munday Joined: 10 Apr 01 Posts: 102 Credit: 244,709 RAC: 0 |
Hi all, Looks like the RAID controller in the new server has bombed. The array is currently being rebuilt and I hope the site will be back up in an hour or two. Apologies on behalf of my host, Neil. |
Logan 5@SETI.USA Joined: 7 May 01 Posts: 54 Credit: 1,275,043 RAC: 0 |
Thanks for the heads up.... it's appreciated. |
KWSN Sir Clark Joined: 17 Aug 02 Posts: 139 Credit: 1,002,493 RAC: 8 |
I knew a smooth transition was too good to be true. |
zoom3+1=4 Joined: 30 Nov 03 Posts: 65745 Credit: 55,293,173 RAC: 49 |
Old man Murphy raises his head again. Oh well, life is nothing if it's not interesting. And it certainly keeps my interest. ;) The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's |
Neil Munday Joined: 10 Apr 01 Posts: 102 Credit: 244,709 RAC: 0 |
Update: Two of the drives in the RAID array bombed and, as a result, the array is still rebuilding. It looks like the server will not be back until tomorrow. This also means that I can't access my e-mail - I will answer any e-mails as soon as I can. Apologies again on behalf of my web host - hopefully, it's just a one-off! Neil. |
Hans Dorn Joined: 3 Apr 99 Posts: 2262 Credit: 26,448,570 RAC: 0 |
Update: Hi Neil, did you ever notice that single drives _might_ fail, but RAID arrays _will_ fail :o) Regards Hans |
Neil Munday Joined: 10 Apr 01 Posts: 102 Credit: 244,709 RAC: 0 |
"Hi Neil, did you ever notice that single drives _might_ fail, but RAID arrays _will_ fail :o)" Yep! Damn things!! ;-) Neil. |
Hans Dorn Joined: 3 Apr 99 Posts: 2262 Credit: 26,448,570 RAC: 0 |
"Hi Neil, did you ever notice that single drives _might_ fail, but RAID arrays _will_ fail :o)" I once killed one by setting the wrong SCSI ID on a replacement drive. D'Oh :o) Regards Hans |
Neil Munday Joined: 10 Apr 01 Posts: 102 Credit: 244,709 RAC: 0 |
Another update... The RAID array has been rebuilt and fsck is now being run. I have been told that the server will be back on t'Internet by midnight. At the moment, MundayWeb.com is pointing to a temporary location. As a result, once the server is back up, it will take a few hours for the DNS changes to make their way around the world. Apologies once again, Neil. |
Lord_Vader Joined: 7 May 05 Posts: 217 Credit: 10,386,105 RAC: 12 |
"Another update..." No apologies needed. We appreciate all that you do. Thanks, Vader Fear will keep the local systems in line. Fear of this battle station. - Grand Moff Tarkin |
Logan 5@SETI.USA Joined: 7 May 01 Posts: 54 Credit: 1,275,043 RAC: 0 |
"The RAID array has been rebuilt and fsck is now being run. I have been told that the server will be back on t'Internet by midnight." No need to keep apologizing, mate.... things that are out of anybody's control DO happen sometimes.... you just had the bad luck of them happening on a weekend, when your host's datacenter had fewer staff to react as quickly as they otherwise seem to be doing... It's all good; besides, not having a BOINC sig for a while is actually kinda refreshing after all the 'back and forth' this weekend.... |
Nightbird Joined: 2 Feb 03 Posts: 73 Credit: 53,523 RAC: 0 |
"Another update..." Don't worry Neil, we know that you're doing your best. Your users will be patient and apologies are not needed. :) |
Fuzzy Hollynoodles Joined: 3 Apr 99 Posts: 9659 Credit: 251,998 RAC: 0 |
"Another update..." No, it's ok. Things will be ok again. :-) "I'm trying to maintain a shred of dignity in this world." - Me |
D.J. Schweitz Joined: 29 Oct 02 Posts: 157 Credit: 871,078 RAC: 0 |
|
Neil Munday Joined: 10 Apr 01 Posts: 102 Credit: 244,709 RAC: 0 |
Update from web host: ================================== Update by Dominic (director of Thermal Degree, the hosting company): The array was rebuilt and is now running a consistency check. The RAID controller has been replaced, as well as one drive from the RAID10 array. Normally we would let the server rebuild its array in the background; however, this was a brand new server, so we're investigating fully before we begin to load it with more clients. Additionally, there is some corruption present, but this should be resolved with the consistency check and a final fsck. All I can do is apologise, because this is definitely not our normal service and we have never experienced downtime like this before. We're doing the best we can and have pointed Mundayweb to this temporary location for now, as we do with all websites if their server experiences any problems (which, for what it's worth, has happened just once in our time!) ================================== ETA is now tomorrow morning (BST). Neil. |
Berserker Joined: 2 Jun 99 Posts: 105 Credit: 5,440,087 RAC: 0 |
Yup, it's very rare for a hardware RAID array to bomb like this. The whole point of RAID is to protect against data loss. I know Dominic was excited about getting this server up and running, so this is an unwanted headache. Here's hoping the recovery goes OK. We discovered that the DNS entry for boinc.mundayweb.com had gone missing, so that's been put back and pointed at the holding page. Stats site - http://www.teamocuk.co.uk - still alive and (just about) kicking. |
beansprouts Joined: 10 Nov 01 Posts: 4 Credit: 479,455 RAC: 0 |
Didn't see this thread - I would've replied sooner but I've been working on...this :) Anyone want a RAID controller? I wonder if it survived being thrown around. And while you're at it you can have a Maxtor hard disk, too. Both bits got swapped out earlier (but not really thrown around.) Maybe dodgy servers are a curse of being near Seti! Either way, we're doing the best we can...I would be running Mundayweb from another server except, of course, this being a new server, I hadn't set up the offsite backups - silly me thought the spanking new array would hold out for a week doing on-server backups while I set up more immediate things. How wrong I was... "No need to keep apologizing mate....things that are out of anybodies control DO happen sometimes....you just had the unfortunate luck of them happening on a weekend when your hosts datacenter had less staff to be able to react as quickly as they otherwise seem to be doing..." Nah, the weekend didn't make any difference really. There's still folk around :) Well, I personally was off t'internet relaxing but hey, part of the job. "Hi Neil, did you ever notice that single drives _might_ fail, but RAID arrays _will_ fail :o)" That's a good one! But it's when expensive controllers randomly break the redundant RAID10 array that's really annoying :/ Sorry about the missing DNS records - I flipped mundayweb's DNS to a different server but forgot that I hadn't re-created the subdomains. We did notice the graph spikes, but I thought I'd done the DNS so put it down to another site. I'm also racking my brains for the others and think I have 'em all - they should all go to the temporary page now, and the ones I've missed will go to the server's default page. When I wake up I hope it'll be fixed....also, I think Neil's adjusting the site to use static images, which will make it much easier to juggle around in future. Dominic / Dom / Beansprout |
Berserker Joined: 2 Jun 99 Posts: 105 Credit: 5,440,087 RAC: 0 |
It proved impossible to restore the server to working condition with the existing software, so it was wiped clean and the operating system reinstalled. Hopefully it won't take more than a day or two from here, but that depends on how much work Neil and Dominic need to do, and whether the server continues to behave. Stats site - http://www.teamocuk.co.uk - still alive and (just about) kicking. |
Neil Munday Joined: 10 Apr 01 Posts: 102 Credit: 244,709 RAC: 0 |
Hi all, The server is back online. It would seem that most of the data can be recovered, apart from the Seti database. Luckily, I can recreate most of it from the back-ups from MundayWeb's previous hosting company. As I type, Beansprout is restoring the data. The PHP config on the server needs sorting too. It could be a long night - I hope the back-up works, as it took me 6 hours to get the site running last time due to all the configuration I had to do (MySQL dbs mainly, subdomains etc). Correction: just hit refresh in Firefox and the site has come to life! Right... better get back to restoring the Seti database... The DNS has been set to the new server again, so hopefully the site will be available to all. Neil. |
Neil Munday Joined: 10 Apr 01 Posts: 102 Credit: 244,709 RAC: 0 |
Correction: Pretty much all of the databases are corrupt in some way. Most have been partially restored and I'm patching them with back-ups I have. There could be the odd bug as a result. Neil. |
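
A footnote on the RAID10 talk in this thread (two failed drives, and "the whole point of RAID is to protect against data loss"): here is a rough Python sketch of how often a RAID10 array survives exactly two simultaneous drive failures. Everything in it is an assumption for illustration. The thread never says how many drives the array had, the mirror-pair layout (drives 0+1, 2+3, ...) is hypothetical, and the function names are made up. It also only models failed drives, not a failed controller like the one that caused the trouble here.

```python
from itertools import combinations


def raid10_survives(failed, pairs):
    """Return True if no mirror pair has lost both of its drives."""
    return not any(a in failed and b in failed for a, b in pairs)


def two_failure_survival_fraction(num_pairs):
    """Fraction of all two-drive failure combinations that RAID10 survives.

    Assumes a hypothetical layout where drives (0,1), (2,3), ... are mirrors.
    """
    pairs = [(2 * i, 2 * i + 1) for i in range(num_pairs)]
    drives = range(2 * num_pairs)
    outcomes = [
        raid10_survives(set(combo), pairs) for combo in combinations(drives, 2)
    ]
    return sum(outcomes) / len(outcomes)


if __name__ == "__main__":
    # Print the survival fraction for arrays of 4, 6 and 8 drives.
    for n in (2, 3, 4):
        print(f"{2 * n} drives: {two_failure_survival_fraction(n):.3f}")
```

With four drives, 4 of the 6 possible two-drive failures hit different mirror pairs, so the array survives two thirds of the time; the other third takes out a whole mirror. More pairs improve the odds, but nothing here helps when the controller itself corrupts the array.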