Get Out of My House (Jan 18 2011)


Message boards : Technical News : Get Out of My House (Jan 18 2011)

morpheus
Joined: 5 Jun 99
Posts: 59
Credit: 22,443,745
RAC: 34,775
Germany
Message 1068205 - Posted: 19 Jan 2011, 12:15:11 UTC

Thanks for the update, Matt.
And good luck with Bruno.
____________
.:morpheus:.

Chris S (Project donor)
Volunteer tester
Joined: 19 Nov 00
Posts: 31763
Credit: 13,181,744
RAC: 37,576
United Kingdom
Message 1068210 - Posted: 19 Jan 2011, 12:39:27 UTC

Sounds like Synergy got there not a moment too soon! Good luck in sorting things out, we'll still be here looking forward to business as normal. :-)
____________
Damsel Rescuer, Uli Devotee, Julie Supporter, Kitty sad,
ES99 Admirer, Raccoon Friend, Anniet fan, Hon Triumphvir


DJStarfox
Joined: 23 May 01
Posts: 1043
Credit: 547,913
RAC: 117
United States
Message 1068240 - Posted: 19 Jan 2011, 14:47:38 UTC - in response to Message 1068210.

...we'll still be here looking forward to business as normal. :-)


This is business as usual/normal. LOL

Westsail and *Pyxey*
Volunteer tester
Joined: 26 Jul 99
Posts: 338
Credit: 20,538,216
RAC: 0
United States
Message 1068342 - Posted: 19 Jan 2011, 21:34:34 UTC
Last modified: 19 Jan 2011, 21:35:02 UTC

Good luck!!! Whether with Synergy or another machine, hope things go relatively smoothly.

I have a question: is it possible to separate uploads and downloads?

If the uploads could also be routed through the new lab Gbit line, that could leave the Hurricane Electric line purely for downloads...

??
____________
"The most exciting phrase to hear in science, the one that heralds new discoveries, is not Eureka! (I found it!) but rather, 'hmm... that's funny...'" -- Isaac Asimov

Donald L. Johnson (Project donor)
Joined: 5 Aug 02
Posts: 6212
Credit: 710,890
RAC: 1,183
United States
Message 1068388 - Posted: 19 Jan 2011, 23:50:23 UTC - in response to Message 1068210.

Sounds like Synergy got there not a moment too soon!

Or bruno, knowing synergy was on the way, held out as long as he could before crashing. Either way, bruno has been a good soldier, hope you can get him back in business again.
____________
Donald
Infernal Optimist / Submariner, retired

Saaby900T
Joined: 24 Dec 10
Posts: 76
Credit: 4,971,171
RAC: 0
United States
Message 1068443 - Posted: 20 Jan 2011, 5:18:21 UTC

When is it looking like this is going to get back online?

Geoff Gong
Joined: 11 Dec 99
Posts: 53
Credit: 1,217,975
RAC: 972
Australia
Message 1068447 - Posted: 20 Jan 2011, 5:36:53 UTC

Hi,
Server status shows ONLY AP splitters.
Lando and Vader are not running; both are doing other tasks.
Is the Server Status page affected?
____________

edwartr
Joined: 2 May 00
Posts: 22
Credit: 48,133,242
RAC: 7,646
United States
Message 1068454 - Posted: 20 Jan 2011, 6:11:17 UTC - in response to Message 1068447.
Last modified: 20 Jan 2011, 6:11:46 UTC

Make sure and check the date/time on the Server Status page:

[As of 18 Jan 2011 17:10:05 UTC]

It is 20 Jan 2011 06:11 UTC as I am posting this.
____________
I gotta fever and the only prescription is more cowbell.

Adam Weichel
Joined: 30 Jul 02
Posts: 22
Credit: 11,289,608
RAC: 597
Canada
Message 1068533 - Posted: 20 Jan 2011, 14:34:57 UTC

It's good to hear that everything's correctable, Matt. Is there an updated hardware requirement list that's available? Looking to donate some more parts in the spring. :)
____________
Computer nut, Distributed Computing freak, Jeeper and Dodge Ram driver.

Life is worth living... and worth discovering.

I run VMWare ESXi Free - why don't you?

Jaye Ellen
Joined: 29 Nov 08
Posts: 24
Credit: 5,522,917
RAC: 2,500
United States
Message 1068565 - Posted: 20 Jan 2011, 16:21:10 UTC - in response to Message 1068032.

Keep up the excellent work, Matt, and let's try to revive Bruno before his untimely demise! All in fun, though. I was just wondering why my uploads were just sitting here; now I know, and can stop worrying...

Jaye Ellen

Todd Hebert
Volunteer tester
Joined: 16 Jun 00
Posts: 647
Credit: 217,127,962
RAC: 0
United States
Message 1068606 - Posted: 20 Jan 2011, 17:36:12 UTC

At least when Synergy (New Bruno) comes up, it will be a true test of whether it can handle the load of the project with everyone uploading their completed WUs. I alone have over 4k to report.

Todd
____________

Ralph
Volunteer tester
Joined: 7 May 99
Posts: 40
Credit: 1,922,314
RAC: 0
United States
Message 1068691 - Posted: 20 Jan 2011, 20:00:15 UTC - in response to Message 1068032.

Bruno hardware failure - this suggests bruno (the upload server) is a single point of failure. Is it feasible to have two systems performing the same function here?

Perhaps bruno should have a companion, borat (I'm probably thinking of the wrong bruno here.)

Gary Charpentier (Project donor)
Volunteer tester
Joined: 25 Dec 00
Posts: 12577
Credit: 6,882,843
RAC: 6,618
United States
Message 1068725 - Posted: 20 Jan 2011, 20:59:15 UTC - in response to Message 1068691.

Bruno hardware failure - this suggests bruno (the upload server) is a single point of failure. Is it feasible to have two systems performing the same function here?

Perhaps bruno should have a companion, borat (I'm probably thinking of the wrong bruno here.)

Everything on the BOINC side of the house is a single point of failure. The only system that has a hot backup is the science database.

____________

1mp0£173
Volunteer tester
Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 1068781 - Posted: 20 Jan 2011, 22:58:08 UTC - in response to Message 1068691.

Bruno hardware failure - this suggests bruno (the upload server) is a single point of failure. Is it feasible to have two systems performing the same function here?

Perhaps bruno should have a companion, borat (I'm probably thinking of the wrong bruno here.)

BOINC is supposed to make it possible to do big science on a vanishingly small budget. That means redundant servers are often out of the question.

So instead of redundant servers so that something can always take work, we have a client that handles outages gracefully.

It has to be that way because the standard solution (throw money at the problem) is not available to BOINC projects.
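
As an illustration of what "handles outages gracefully" can look like on the client side, here is a minimal sketch of retrying an upload with randomized exponential backoff. It is not the BOINC client's actual code: the function names, the 60-second base delay, and the 4-hour cap are all assumptions chosen for the example.

```python
import random
import time


def backoff_delays(max_retries, base=60.0, cap=4 * 3600.0, rng=None):
    """Yield the wait in seconds before each retry.

    The ceiling doubles on every attempt up to `cap`; the actual wait is
    drawn at random from the upper half of that range so that many clients
    do not all retry at the same instant. All defaults are hypothetical.
    """
    rng = rng or random.Random()
    for attempt in range(max_retries):
        ceiling = min(cap, base * 2 ** attempt)
        yield rng.uniform(ceiling / 2, ceiling)


def upload_with_retry(upload, max_retries=10, sleep=None, rng=None):
    """Call upload() until it succeeds, waiting longer after each failure.

    `upload` is a hypothetical callable that raises OSError (for example
    ConnectionError) while the server is unreachable.
    """
    sleep = sleep or time.sleep
    for delay in backoff_delays(max_retries, rng=rng):
        try:
            return upload()
        except OSError:
            sleep(delay)
    # One last attempt after the final wait; a failure here propagates.
    return upload()
```

With this pattern, finished results simply wait on the volunteer's machine during an outage, and the randomized delays spread the retries out once the upload server returns.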

Todd Hebert
Volunteer tester
Joined: 16 Jun 00
Posts: 647
Credit: 217,127,962
RAC: 0
United States
Message 1069118 - Posted: 21 Jan 2011, 21:44:49 UTC - in response to Message 1068781.

I would disagree with the statement that we can make do without redundant servers for this project. Considering that the number of hosts is now over 2 million for just this project, that is an incredible amount of computing power, and really was the spirit of the architectural design. The work still needs to go somewhere.

I don't think some people here realize the scope of the data returned and the impact of the loss of storage devices. To serve the project as the users have demanded (and some are very vocal), many factors need to be considered.

It isn't about throwing money at a problem in hopes of a resolution. Things break and need to be replaced over time - requiring money or donations to achieve the goal. Not much different than expecting to drive your new car with the same tires for 150k miles or not getting an oil change.

When your dataset increases, your load and time between failures also increase. When trying to make do with piecemeal equipment it can be very challenging to make a go of it.

Todd

____________

John McLeod VII
Volunteer developer
Volunteer tester
Joined: 15 Jul 99
Posts: 24544
Credit: 522,233
RAC: 86
United States
Message 1069235 - Posted: 22 Jan 2011, 1:25:49 UTC - in response to Message 1069118.

They do have some redundancy. The DB is mirrored in real time. They have RAID for their drive arrays. But having completely redundant servers is too expensive.
____________


BOINC WIKI

KWSN THE Holy Hand Grenade!
Volunteer tester
Joined: 20 Dec 05
Posts: 1948
Credit: 10,281,180
RAC: 17,840
United States
Message 1069546 - Posted: 22 Jan 2011, 18:08:51 UTC - in response to Message 1069118.
Last modified: 22 Jan 2011, 18:11:31 UTC

[snip]

When your dataset increases, your load and time between failures also increase. When trying to make do with piecemeal equipment it can be very challenging to make a go of it.

Todd


It's been my experience that when the dataset increases, MTBF (Mean Time Between Failures) decreases... and your load increases by a factor of two or more (if the dataset doubles, your load quadruples... not saying that the increase is always exponential, though...)
____________
.
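
The MTBF point can be made concrete with a rough sketch. It assumes identical components that fail independently at a constant rate, so the failure rates add and the expected time to the first failure in a pool of n components is the single-component MTBF divided by n. The function name and the drive figures below are hypothetical, not numbers from the project.

```python
def system_mtbf(component_mtbf_hours, n_components):
    """Expected hours until the first failure among n identical components.

    Assumes independent failures at a constant rate, so failure rates add:
    rate_system = n * rate_component, hence MTBF_system = MTBF_component / n.
    """
    if n_components < 1:
        raise ValueError("n_components must be at least 1")
    return component_mtbf_hours / n_components


# Hypothetical drives rated at 1,000,000 hours MTBF each.
ten_drives = system_mtbf(1_000_000, 10)      # 100000.0 hours
twenty_drives = system_mtbf(1_000_000, 20)   # 50000.0 hours
```

Under these assumptions, doubling the hardware needed to hold a doubled dataset halves the expected time between failures somewhere in the system. The load figure in the post (doubling the data quadruples the load) would be quadratic growth rather than exponential.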


Copyright © 2014 University of California