Marvin crashed

Cameron
Joined: 27 Nov 02
Posts: 110
Credit: 5,082,471
RAC: 17
Australia
Message 1617124 - Posted: 22 Dec 2014, 0:19:47 UTC

"Delays, Delays"
Hopefully it will not require another database rebuild.

Thanks for the Update.
ID: 1617124 · Report as offensive
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1617160 - Posted: 22 Dec 2014, 2:17:03 UTC - in response to Message 1617071.  

Marvin isn't finding a boot device on reboot. I'm not even sure it's seeing the RAID card at all (these remote interfaces to the boot screen aren't good at capturing things that happen quickly). Matt and I are meeting at the co-lo first thing tomorrow morning. I'm bringing a boot CD and what I think is a matching RAID card.

Any news on Lando? The SSP still shows Lando splitting APs and only 3 MB splitters working.

I take it it's not OK to leave it as it is until Monday in its crashed state then? If none of the remote stuff works?

It's mainly generation of astropulse work that will suffer. No permanent damage should result. But I'd like to have everything in working order before folks do leave for the holidays.

Could you ask them to take a look at Lando as well tomorrow, please? Lando's four MB splitters don't seem to have been pulling their weight since Marvin went down.
ID: 1617160 · Report as offensive
Eric Korpela Project Donor
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 3 Apr 99
Posts: 1382
Credit: 54,506,847
RAC: 60
United States
Message 1617418 - Posted: 22 Dec 2014, 16:11:59 UTC - in response to Message 1617160.  

The splitters on lando are hung waiting for the database to come back, so they are running, but not doing anything.
@SETIEric@qoto.org (Mastodon)

ID: 1617418 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1617423 - Posted: 22 Dec 2014, 16:26:59 UTC - in response to Message 1617418.  

The splitters on lando are hung waiting for the database to come back, so they are running, but not doing anything.

I'd expect that for the AP splitters, but I'm surprised that needs to affect the MB splitters as well? But thanks for looking.
ID: 1617423 · Report as offensive
W-K 666 Project Donor
Volunteer tester

Joined: 18 May 99
Posts: 19045
Credit: 40,757,560
RAC: 67
United Kingdom
Message 1617426 - Posted: 22 Dec 2014, 16:33:01 UTC
Last modified: 22 Dec 2014, 16:36:43 UTC

Chicken and egg question.

When the AP database is "working", is the database problem the cause of the poor AP splitting, or are the AP splitters the cause of the database problem?

P.S. If the problem is not fixed by tomorrow, please leave it and go home, or you will be facing a cold shoulder and a hot tongue over the festive season.
ID: 1617426 · Report as offensive
Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1617434 - Posted: 22 Dec 2014, 17:07:42 UTC

Hard end of year, indeed...
Next year, more decoupled MultiBeam and AstroPulse workflows would be appreciated.
ID: 1617434 · Report as offensive
WezH
Volunteer tester

Joined: 19 Aug 99
Posts: 576
Credit: 67,033,957
RAC: 95
Finland
Message 1617456 - Posted: 22 Dec 2014, 18:18:12 UTC - in response to Message 1617423.  
Last modified: 22 Dec 2014, 18:19:06 UTC

The splitters on lando are hung waiting for the database to come back, so they are running, but not doing anything.

I'd expect that for the AP splitters, but I'm surprised that needs to affect the MB splitters as well? But thanks for looking.


And Eric The Man has done his job; now 6 splitters are doing work on MB.

I wish Marvin good luck getting back up.
"Please keep Your signature under four lines so Internet traffic doesn't go up too much"

- In 1992 when I had my first e-mail address -
ID: 1617456 · Report as offensive
Eric Korpela Project Donor
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 3 Apr 99
Posts: 1382
Credit: 54,506,847
RAC: 60
United States
Message 1617460 - Posted: 22 Dec 2014, 18:22:53 UTC - in response to Message 1617418.  

Marvin is back up. For some reason the 3-ware BIOS was not being detected on reboot. After several attempts to make it visible we flashed a new BIOS. On the next reboot it came right back up.

The database is in transaction cleanup mode (which can take a while). Once that's done I'll survey for damage and if everything looks OK I'll restart the daemons.
@SETIEric@qoto.org (Mastodon)

ID: 1617460 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1617475 - Posted: 22 Dec 2014, 18:52:52 UTC - in response to Message 1617466.  

Marvin is back up. For some reason the 3-ware BIOS was not being detected on reboot. After several attempts to make it visible we flashed a new BIOS. On the next reboot it came right back up.

The database is in transaction cleanup mode (which can take a while). Once that's done I'll survey for damage and if everything looks OK I'll restart the daemons.

Hmm, and what about the "root partition filled on marvin (rapidly)" that you initially wrote when Marvin crashed? Is that still the reason it crashed, and why would that happen?

"Filled up" might have been remote access shorthand for "has no visible free space" - which would be the case if the controller couldn't see the disk. Or something like that.
ID: 1617475 · Report as offensive
Eric Korpela Project Donor
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 3 Apr 99
Posts: 1382
Credit: 54,506,847
RAC: 60
United States
Message 1617486 - Posted: 22 Dec 2014, 19:01:20 UTC - in response to Message 1617466.  

That's a good question. Apart from some email warnings we don't have any stored records of what happened at the time of the crash. It's likely that all the disk volumes disappeared simultaneously, but that doesn't really explain why we got any warning.
@SETIEric@qoto.org (Mastodon)

ID: 1617486 · Report as offensive
Gary Charpentier (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 25 Dec 00
Posts: 30636
Credit: 53,134,872
RAC: 32
United States
Message 1617547 - Posted: 22 Dec 2014, 20:53:47 UTC - in response to Message 1617486.  

That's a good question. Apart from some email warnings we don't have any stored records of what happened at the time of the crash. It's likely that all the disk volumes disappeared simultaneously, but that doesn't really explain why we got any warning.

That might depend on how syslog is written. The warning message might have been passed in RAM, so it was able to e-mail it but unable to find a disk to write it to, which would have generated another warning message.
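
As a rough illustration of that ordering, here is a minimal Python sketch. It is only a guess at the general pattern, not how syslog or Marvin's monitoring is actually written; `handle_warning`, `missing_disk`, and the message strings are hypothetical names made up for this example. The idea is that a message held in memory can be mailed without touching the disk, and the failed write then produces a second warning of its own.

```python
import io


def handle_warning(message, send_email, open_log):
    """Email a warning held in memory, then try to persist it to disk.

    Returns the list of messages that were emailed, in order.
    """
    emailed = []
    # The message already lives in RAM, so mailing it needs no disk access.
    send_email(message)
    emailed.append(message)
    try:
        with open_log() as log:
            log.write(message + "\n")
    except OSError as exc:
        # The write failed; that failure becomes a second warning.
        follow_up = f"could not write log entry: {exc}"
        send_email(follow_up)
        emailed.append(follow_up)
    return emailed


def missing_disk():
    # Hypothetical stand-in for a log volume that has disappeared.
    raise OSError("no such device")


# Case 1: the disk is gone, so two emails go out.
outbox = []
result = handle_warning("root partition full", outbox.append, missing_disk)

# Case 2: the log target works (an in-memory buffer here), so one email goes out.
healthy = handle_warning("disk nearly full", [].append, io.StringIO)
```

In the first case the outbox ends up holding both the original warning and the follow-up about the failed write, which would match getting email alerts while having no stored records on disk.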
ID: 1617547 · Report as offensive
Diamond
Joined: 29 Jul 99
Posts: 12
Credit: 13,474,762
RAC: 0
Canada
Message 1617639 - Posted: 23 Dec 2014, 1:44:03 UTC - in response to Message 1617547.  

I see marvin is full of xmas joy now thanks to eric and co. Still no units yet, so i figure, if i'm going to get some...i'm going to have to distract all these hungry super cruncher guys somehow so my weenie cards can get fed first. :) hmmm, now what would douglas adams do in a situation like this......
ID: 1617639 · Report as offensive
Paris
Joined: 20 May 99
Posts: 110
Credit: 1,012,250
RAC: 0
United States
Message 1617842 - Posted: 23 Dec 2014, 14:35:47 UTC - in response to Message 1617639.  

I see marvin is full of xmas joy now thanks to eric and co. Still no units yet, so i figure, if i'm going to get some...i'm going to have to distract all these hungry super cruncher guys somehow so my weenie cards can get fed first. :) hmmm, now what would douglas adams do in a situation like this......



First of all, DON'T PANIC!

Plus SETI Classic = 21,082 WUs
ID: 1617842 · Report as offensive
W-K 666 Project Donor
Volunteer tester

Joined: 18 May 99
Posts: 19045
Credit: 40,757,560
RAC: 67
United Kingdom
Message 1617882 - Posted: 23 Dec 2014, 16:01:44 UTC - in response to Message 1617842.  

I see marvin is full of xmas joy now thanks to eric and co. Still no units yet, so i figure, if i'm going to get some...i'm going to have to distract all these hungry super cruncher guys somehow so my weenie cards can get fed first. :) hmmm, now what would douglas adams do in a situation like this......



First of all, DON'T PANIC!

Second, grab towel
ID: 1617882 · Report as offensive
Brent Norman (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester

Joined: 1 Dec 99
Posts: 2786
Credit: 685,657,289
RAC: 835
Canada
Message 1617894 - Posted: 23 Dec 2014, 16:31:23 UTC - in response to Message 1617882.  

I see the transitioner is down now; data is being received but nothing is going out from SETI. All graphs have been frozen in time for 9 hours.
ID: 1617894 · Report as offensive
David S
Volunteer tester
Joined: 4 Oct 99
Posts: 18352
Credit: 27,761,924
RAC: 12
United States
Message 1618018 - Posted: 24 Dec 2014, 0:49:51 UTC - in response to Message 1617882.  

I see marvin is full of xmas joy now thanks to eric and co. Still no units yet, so i figure, if i'm going to get some...i'm going to have to distract all these hungry super cruncher guys somehow so my weenie cards can get fed first. :) hmmm, now what would douglas adams do in a situation like this......

First of all, DON'T PANIC!

Second, grab towel

Third, ignore the problem. Before you know it, you'll be flying.
David
Sitting on my butt while others boldly go,
Waiting for a message from a small furry creature from Alpha Centauri.

ID: 1618018 · Report as offensive
Diamond
Joined: 29 Jul 99
Posts: 12
Credit: 13,474,762
RAC: 0
Canada
Message 1618097 - Posted: 24 Dec 2014, 4:23:12 UTC - in response to Message 1618018.  


First of all, DON'T PANIC!

Second, grab towel

Third, ignore the problem. Before you know it, you'll be flying.


I've given this an awful lot of thought, and now I have such a terrible headache.
ID: 1618097 · Report as offensive
BONNSaR

Joined: 9 Nov 04
Posts: 38
Credit: 21,538,589
RAC: 9
Australia
Message 1618106 - Posted: 24 Dec 2014, 4:58:22 UTC - in response to Message 1618097.  

It's easy

If it doesn't move and it should, you need a hammer

If it does move and it shouldn't, use duct tape

If none of the above, then it's an electrical problem and you need an electrician

Merry Christmas to ALL
ID: 1618106 · Report as offensive
ivan
Volunteer tester
Joined: 5 Mar 01
Posts: 783
Credit: 348,560,338
RAC: 223
United Kingdom
Message 1618271 - Posted: 24 Dec 2014, 12:32:55 UTC - in response to Message 1618106.  

It's easy

If it doesn't move and it should, you need a hammer

If it does move and it shouldn't, use duct tape

If none of the above, then it's an electrical problem and you need an electrician

Merry Christmas to ALL

Ronald Reagan: If it ain't broke, don't fix it!
US Marine Corps Engineers: If we can't fix it, it ain't broke...
ID: 1618271 · Report as offensive
Phil Saugar

Joined: 18 Mar 00
Posts: 13
Credit: 3,407,351
RAC: 1
United States
Message 1618357 - Posted: 24 Dec 2014, 17:38:26 UTC
Last modified: 24 Dec 2014, 17:40:16 UTC

OK, thanks for the info. I was wondering why there were no graphics on the Astropulse task that's running now on my computer. Phil. P.S. Merry Christmas to all at SETI.
ID: 1618357 · Report as offensive


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.