Lost in Boston (Feb 08 2011)


Message boards : Technical News : Lost in Boston (Feb 08 2011)

Profile Matt Lebofsky
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Mar 99
Posts: 1391
Credit: 74,079
RAC: 10
United States
Message 1075427 - Posted: 9 Feb 2011, 0:03:30 UTC

My touring band arrived in Boston, MA, and we began hunting for the club we were scheduled to play at. Once in the heart of town we pulled over and looked at a map (i.e. 1998 technology). Lo and behold we were only two blocks away from the venue! However, it still took us 90 minutes (no exaggeration) to get there, due to an impossible sequence of non-right angle one-way streets. One false move, and you were forced to drive around the whole park and try again. On two occasions the club was in our actual line of sight but we still couldn't legally reach it from our current approach. Eventually we stumbled upon the correct permutation of traverses and suddenly landed in front of the building, much to our surprise.

I mention this story as it is an exact analogy to me getting a bootable OS on thumper this whole past week. One false move, and I had to reinstall the OS from scratch. However, we seem to be done with that, and of course in hindsight the ultimate solution is simple enough. The main obfuscations were (a) funky linux drive enumeration, (b) weird unpredictable linux raid behavior, but mostly (c) grub installation via the fedora installer is a bit, well, confused. I'm thinking the installer isn't ever expecting a 48 drive system where the only available boot drives are #0 and #1 according to the BIOS, but #24 and #28 according to linux. I basically had to install an OS without a raided boot, then reinstall grub by hand on both drives (using the grub shell as grub-install wouldn't work), then replace the flat boot partition with a raided boot partition, etc. etc. ETC.!

Of course I'm taking the day off tomorrow so I won't complete the configuration until Thursday.

Meanwhile, the project is just now coming up from the regular weekly outage - sorry about the delay. Usual stuff, though we added some spike-time-index fragmentation performance tests (which we wanted to do during a quiet time). They weren't all that positive - unclear what the current bottleneck is, though it doesn't seem like either the database or disk/network i/o. Maybe some code somewhere needs optimizing.

- Matt

____________
-- BOINC/SETI@home network/web/science/development person
-- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude

Profile Gary Charpentier
Project donor
Volunteer tester
Joined: 25 Dec 00
Posts: 13180
Credit: 7,934,803
RAC: 15,124
United States
Message 1075434 - Posted: 9 Feb 2011, 5:27:09 UTC - in response to Message 1075427.

Thanks for the update Matt.

____________

Profile eaglescouter
Joined: 28 Dec 02
Posts: 162
Credit: 42,012,553
RAC: 0
United States
Message 1075435 - Posted: 9 Feb 2011, 5:27:54 UTC

Funny, he reports that the project is back up, just as the project reports that it is shut down for maintenance......
____________
It's not too many computers, it's a lack of circuit breakers for this room. But we can fix it :)

Big Bang
Joined: 7 Jan 10
Posts: 670
Credit: 28,481
RAC: 0
Message 1075438 - Posted: 9 Feb 2011, 5:55:13 UTC

Just chill-out and enjoy your gig Matt. All the best.

Profile soft^spirit
Joined: 18 May 99
Posts: 6374
Credit: 28,647,395
RAC: 516
United States
Message 1075441 - Posted: 9 Feb 2011, 6:17:42 UTC

Two wrongs do not make a right, but 3 rights make a left.
____________

Janice

PhaserLogic
Joined: 21 May 99
Posts: 1
Credit: 24,884,542
RAC: 866
United States
Message 1075442 - Posted: 9 Feb 2011, 6:18:21 UTC - in response to Message 1075427.

That installation on thumper is epic, and the Boston analogy was perfect.

Swibby Bear
Joined: 1 Aug 01
Posts: 236
Credit: 7,276,504
RAC: 1
United States
Message 1075445 - Posted: 9 Feb 2011, 6:35:07 UTC - in response to Message 1075435.

Funny, he reports that the project is back up, just as the project reports that it is shut down for maintenance......


I guess something still needs "optimizing"

Profile RottenMutt
Joined: 15 Mar 01
Posts: 999
Credit: 209,338,195
RAC: 53,413
United States
Message 1075453 - Posted: 9 Feb 2011, 8:04:46 UTC

Somebody forgot to flip the switch, and not for the first time! It seems SETI staff are always in a hurry to not let the door hit them in the hind end!
____________

Profile Fred J. Verster
Volunteer tester
Joined: 21 Apr 04
Posts: 3252
Credit: 31,903,520
RAC: 119
Netherlands
Message 1075455 - Posted: 9 Feb 2011, 8:59:45 UTC - in response to Message 1075453.


--[Snipped]--
The main obfuscations were (a) funky linux drive enumeration, (b) weird unpredictable linux raid behavior, but mostly (c) grub installation via the fedora installer is a bit, well, confused. I'm thinking the installer isn't ever expecting a 48 drive system where the only available boot drives are #0 and #1 according to the BIOS, but #24 and #28 according to linux. I basically had to install an OS without a raided boot, then reinstall grub by hand on both drives (using the grub shell as grub-install wouldn't work), then replace the flat boot partition with a raided boot partition, etc. etc. ETC.!

Of course I'm taking the day off tomorrow so I won't complete the configuration until Thursday.

--[End of snippage]--


A time consuming and difficult way to BOOT........
Thanks Matt, for your explanation on your adventure :)



____________

OzzFan
Volunteer tester
Joined: 9 Apr 02
Posts: 13704
Credit: 31,717,838
RAC: 12,665
United States
Message 1075486 - Posted: 9 Feb 2011, 13:17:38 UTC - in response to Message 1075441.

Two wrongs do not make a right, but 3 rights make a left.


... and two (W)rights make an airplane! :-D

Twisted
Joined: 27 May 99
Posts: 81
Credit: 1,878,607
RAC: 5
United States
Message 1075489 - Posted: 9 Feb 2011, 13:40:34 UTC

Hey Matt,

Just as an afterthought...you could always pull 46 of the drives and install, then put them back in...it would be a pain because the cover must remain on the x4500 or it powers down after about 60 seconds. Which is kind of funny, since my best time start to finish hot-swapping a drive was 58 seconds.

Feel free to shoot me a line anytime if I can be of any help with this thing. Worked on them for 3 years on the hardware side.

Kevin

ps...thanks for the updates and good luck in Boston!
____________
"Two things are infinite: The universe and human stupidity; and I'm not sure about the universe." - Albert Einstein

Profile Black Squirrel Prime
Joined: 29 Jul 07
Posts: 8
Credit: 11,650,338
RAC: 2,826
United States
Message 1075492 - Posted: 9 Feb 2011, 13:58:00 UTC - in response to Message 1075486.
Last modified: 9 Feb 2011, 14:00:20 UTC

Bahaha! Good one - I needed a joke to begin my day...I'm going to start my lecture today with that....


Two wrongs do not make a right, but 3 rights make a left.


... and two (W)rights make an airplane! :-D

Profile aaronh
Volunteer tester
Joined: 27 Oct 99
Posts: 169
Credit: 1,412,472
RAC: 0
United States
Message 1075501 - Posted: 9 Feb 2011, 14:21:54 UTC - in response to Message 1075441.

Two wrongs do not make a right, but 3 rights make a left.


...except in Boston!

Profile [seti.international] Dirk Sadowski
Volunteer tester
Joined: 6 Apr 07
Posts: 7122
Credit: 61,612,145
RAC: 16,092
Germany
Message 1075516 - Posted: 9 Feb 2011, 14:57:20 UTC - in response to Message 1075427.

Matt, thanks for the news!


What does this mean? A few project servers are down until Thursday?

The server status page shows UL/DL online, scheduler offline. But my BOINC clients can't download (there are still a few WUs in the transfers overview).

____________
BR

SETI@home Needs your Help ... $10 & U get a Star!

Team seti.international

Das Deutsche Cafe. The German Cafe.

zii
Joined: 24 May 03
Posts: 7
Credit: 828,565
RAC: 0
Sweden
Message 1075517 - Posted: 9 Feb 2011, 14:58:37 UTC - in response to Message 1075489.

...it would be a pain due to the fact that the cover must remain on the x4500 or it powers down after about 60 seconds.


Hm.. The server world is unknown territory for me, but it sounds like it won't shut down if the cover is put back in time?

Then perhaps a little piece of TAPE might be a cheap and well working SPD*?
Just apply it to the intrusion switch in a suitable manner... :P

*Shutdown Prevention Device



It was just a thought.
____________

JohnDK
Project donor
Volunteer tester
Joined: 28 May 00
Posts: 895
Credit: 49,135,736
RAC: 44,126
Denmark
Message 1075520 - Posted: 9 Feb 2011, 15:02:25 UTC - in response to Message 1075516.

Matt, thanks for the news!


What does this mean? A few project servers are down until Thursday?

The server status page shows UL/DL online, scheduler offline. But my BOINC clients can't download (there are still a few WUs in the transfers overview).

I'm not sure what Matt means. He's saying "Meanwhile, the project is just now coming up from the regular weekly outage - sorry about the delay", which I would think means things should work now, but maybe something hasn't been turned on.

Profile Andy Lee Robinson
Joined: 8 Dec 05
Posts: 617
Credit: 43,831,953
RAC: 25,750
Hungary
Message 1075592 - Posted: 9 Feb 2011, 19:32:22 UTC - in response to Message 1075427.

I've been down this route a couple of times, though with not so many drives.
I copped out and used a pendrive. There was only so much pain I could
take before seeing the light!

/boot is only needed once in a blue moon, and any source pendrive, usb,
cdrom, hd, net can be used to get the system up.
Having a separate interchangeable boot source removes dependency
on a raid drive failure.
Pendrives just don't break, at least for the purpose of booting, and they
can be mirrored with mdadm too for the really paranoid!

Profile [seti.international] Dirk Sadowski
Volunteer tester
Joined: 6 Apr 07
Posts: 7122
Credit: 61,612,145
RAC: 16,092
Germany
Message 1075611 - Posted: 9 Feb 2011, 20:38:27 UTC
Last modified: 9 Feb 2011, 20:47:02 UTC

Oops, the forum style is broken. Or is this now the new style?

The vertical lines which separated the columns (thread titles, messages, authors, views, last message) are gone.
Threads which I have already read (no new messages for me) are now marked as 'read'.


EDIT: BTW, this is in Mozilla Firefox.
____________
BR

SETI@home Needs your Help ... $10 & U get a Star!

Team seti.international

Das Deutsche Cafe. The German Cafe.

Claggy
Project donor
Volunteer tester
Joined: 5 Jul 99
Posts: 4243
Credit: 34,956,922
RAC: 22,561
United Kingdom
Message 1075625 - Posted: 9 Feb 2011, 21:12:25 UTC - in response to Message 1075611.

Oops, the forum style is broken. Or is this now the new style?

The vertical lines which separated the columns (thread titles, messages, authors, views, last message) are gone.
Threads which I have already read (no new messages for me) are now marked as 'read'.


EDIT: BTW, this is in Mozilla Firefox.


I think DJStarfox has worked out why:

Panic Mode On (43) Server problems

Claggy

Profile Corvid
Joined: 31 Oct 05
Posts: 12
Credit: 4,911,652
RAC: 1,094
United States
Message 1075965 - Posted: 10 Feb 2011, 23:41:54 UTC - in response to Message 1075501.

Two wrongs do not make a right, but 3 rights make a left.


...except in Boston!


Or pretty much anywhere in Germany.
____________


Copyright © 2014 University of California