Get Out of My House (Jan 18 2011)

Message boards : Technical News : Get Out of My House (Jan 18 2011)
morpheus
Joined: 5 Jun 99
Posts: 71
Credit: 52,480,762
RAC: 33
Germany
Message 1068205 - Posted: 19 Jan 2011, 12:15:11 UTC

Thanks for the update, Matt.
And good luck with Bruno.
.:morpheus:.
ID: 1068205
DJStarfox
Joined: 23 May 01
Posts: 1066
Credit: 1,226,053
RAC: 2
United States
Message 1068240 - Posted: 19 Jan 2011, 14:47:38 UTC - in response to Message 1068210.  

...we'll still be here looking forward to business as normal. :-)


This is business as usual/normal. LOL
ID: 1068240
Westsail and *Pyxey*
Volunteer tester
Joined: 26 Jul 99
Posts: 338
Credit: 20,544,999
RAC: 0
United States
Message 1068342 - Posted: 19 Jan 2011, 21:34:34 UTC
Last modified: 19 Jan 2011, 21:35:02 UTC

Good luck!!! Whether with Synergy or another machine, hope things go relatively smoothly.

I have a question... Is it possible to separate uploads/downloads?

Could the uploads also be routed through the new lab Gbit line? That could leave the Hurricane Electric line purely for downloads...

??
"The most exciting phrase to hear in science, the one that heralds new discoveries, is not Eureka! (I found it!) but rather, 'hmm... that's funny...'" -- Isaac Asimov
ID: 1068342
Donald L. Johnson
Joined: 5 Aug 02
Posts: 8240
Credit: 14,654,533
RAC: 20
United States
Message 1068388 - Posted: 19 Jan 2011, 23:50:23 UTC - in response to Message 1068210.  

Sounds like Synergy got there not a moment too soon!

Or bruno, knowing synergy was on the way, held out as long as he could before crashing. Either way, bruno has been a good soldier, hope you can get him back in business again.
Donald
Infernal Optimist / Submariner, retired
ID: 1068388
Saaby900T
Joined: 24 Dec 10
Posts: 76
Credit: 4,971,171
RAC: 0
United States
Message 1068443 - Posted: 20 Jan 2011, 5:18:21 UTC

When is it looking like this is going to get back online?
ID: 1068443
Geoff Gong
Joined: 11 Dec 99
Posts: 53
Credit: 1,543,379
RAC: 0
Australia
Message 1068447 - Posted: 20 Jan 2011, 5:36:53 UTC

Hi,
Server Status shows ONLY the AP splitters.
Lando and Vader are not running; both are doing other tasks.
Is the Server Status page affected?
ID: 1068447
edwartr
Joined: 2 May 00
Posts: 31
Credit: 79,402,615
RAC: 14
United States
Message 1068454 - Posted: 20 Jan 2011, 6:11:17 UTC - in response to Message 1068447.  
Last modified: 20 Jan 2011, 6:11:46 UTC

Make sure to check the date/time on the Server Status page:

[As of 18 Jan 2011 17:10:05 UTC]

It is 20 Jan 2011 06:11 UTC as I am posting this.
I gotta fever and the only prescription is more cowbell.
ID: 1068454
Adam Weichel
Joined: 30 Jul 02
Posts: 22
Credit: 25,877,509
RAC: 46
Canada
Message 1068533 - Posted: 20 Jan 2011, 14:34:57 UTC

It's good to hear that everything's correctable, Matt. Is there an updated hardware requirement list that's available? Looking to donate some more parts in the spring. :)
Computer nut, Distributed Computing freak, Jeeper and Dodge Ram driver.

Life is worth living... and worth discovering.

I run VMWare ESXi Free - why don't you?
ID: 1068533
Jaye Ellen
Joined: 29 Nov 08
Posts: 26
Credit: 20,945,032
RAC: 45
United States
Message 1068565 - Posted: 20 Jan 2011, 16:21:10 UTC - in response to Message 1068032.  

Keep up the excellent work, Matt, and let's try to revive Bruno before his untimely demise! All in fun, though. I was just wondering why my uploads were just sitting here; now I know, and can stop worrying ...

Jaye Ellen
ID: 1068565
Todd Hebert
Volunteer tester
Joined: 16 Jun 00
Posts: 648
Credit: 228,292,957
RAC: 0
United States
Message 1068606 - Posted: 20 Jan 2011, 17:36:12 UTC

At least when Synergy (New Bruno) comes up it will be a true test to see if it can handle the load of the project with everyone uploading their completed WUs. I have over 4k to report alone.

Todd
ID: 1068606
ralphw
Volunteer tester
Joined: 7 May 99
Posts: 78
Credit: 18,032,718
RAC: 38
United States
Message 1068691 - Posted: 20 Jan 2011, 20:00:15 UTC - in response to Message 1068032.  

Bruno hardware failure - this suggests bruno (the upload server) is a single point of failure. Is it feasible to have two systems performing the same function here?

Perhaps bruno should have a companion, borat (I'm probably thinking of the wrong bruno here.)
ID: 1068691
Gary Charpentier (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 25 Dec 00
Posts: 30980
Credit: 53,134,872
RAC: 32
United States
Message 1068725 - Posted: 20 Jan 2011, 20:59:15 UTC - in response to Message 1068691.  

Bruno hardware failure - this suggests bruno (the upload server) is a single point of failure. Is it feasible to have two systems performing the same function here?

Perhaps bruno should have a companion, borat (I'm probably thinking of the wrong bruno here.)

Everything on the BOINC side of the house is a single point of failure. The only system that has a hot backup is the science database.

ID: 1068725
1mp0£173
Volunteer tester
Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 1068781 - Posted: 20 Jan 2011, 22:58:08 UTC - in response to Message 1068691.  

Bruno hardware failure - this suggests bruno (the upload server) is a single point of failure. Is it feasible to have two systems performing the same function here?

Perhaps bruno should have a companion, borat (I'm probably thinking of the wrong bruno here.)

BOINC is supposed to make it possible to do big science on a vanishingly small budget. That means redundant servers are often out of the question.

So instead of redundant servers so that something can always take work, we have a client that handles outages gracefully.

It has to be that way because the standard solution (throw money at the problem) is not available to BOINC projects.
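
As a rough illustration of what "a client that handles outages gracefully" can mean, here is a minimal Python sketch of retrying an upload with capped exponential backoff and jitter. This is not BOINC's actual client code; `upload_with_retry`, `try_upload`, and the delay values are all hypothetical names and numbers chosen for the example.

```python
import random
import time


def backoff_delays(max_retries, base=60.0, cap=4 * 3600.0, rng=None):
    """Yield one wait time (in seconds) per retry.

    The upper bound doubles on each retry, starting at `base` and never
    exceeding `cap`. Each delay is drawn at random from the upper half of
    that bound so many clients do not all retry at the same instant.
    """
    if rng is None:
        rng = random.Random()
    for attempt in range(max_retries):
        ceiling = min(cap, base * 2 ** attempt)
        yield rng.uniform(ceiling / 2, ceiling)


def upload_with_retry(try_upload, max_retries=8, sleep=time.sleep, rng=None):
    """Call `try_upload` until it returns True or the retries run out.

    `try_upload` is a hypothetical callable returning True on success.
    Returns True if any attempt succeeded, False otherwise.
    """
    if try_upload():
        return True
    for delay in backoff_delays(max_retries, rng=rng):
        sleep(delay)  # wait out the outage instead of hammering the server
        if try_upload():
            return True
    return False
```

The point of the design is that the finished work simply stays queued on the volunteer's machine while the server is down, and the growing, randomized delays keep a returning server from being flooded all at once.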
ID: 1068781
Todd Hebert
Volunteer tester
Joined: 16 Jun 00
Posts: 648
Credit: 228,292,957
RAC: 0
United States
Message 1069118 - Posted: 21 Jan 2011, 21:44:49 UTC - in response to Message 1068781.  

I would disagree with the statement that we can make do without redundant servers for this project. The number of hosts is now over 2 million for just this project; that is an incredible amount of computing power, and that really was the spirit of the architectural design. The work still needs to go somewhere.

I don't think some people here realize the scope of the data returned and the impact of the loss of storage devices. To serve the project the way the users have demanded (and some are very vocal), many factors need to be considered.

It isn't about throwing money at a problem in hopes of a resolution. Things break and need to be replaced over time, which requires money or donations. It's not much different than expecting to drive your new car on the same tires for 150k miles, or never getting an oil change.

When your dataset increases, your load and time between failures also increases. When trying to make do with piecemeal equipment it can be very challenging to make a go of it.

Todd
ID: 1069118
kittyman (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 9 Jul 00
Posts: 51477
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1069139 - Posted: 21 Jan 2011, 22:47:28 UTC - in response to Message 1069118.  

Aye, Cap'n Todd....

I face challenges just keeping 8 crunching rigs online some days.

Power supplies age, motherboard components age, things change.
Adjustments to settings are required.

Of course, the kitties push the rigs pretty hard, so any change in tolerances can sometimes throw things outta whack.

I suspect that Eric, Matt, and crew sleep much better recently with the new servers at work.....

I know that even with the latest short outage, we are enjoying the longest streak of uptime in Seti history for many moons.

You and all other contributors have done well.
"Time is simply the mechanism that keeps everything from happening all at once."

ID: 1069139
John McLeod VII
Volunteer developer
Volunteer tester
Joined: 15 Jul 99
Posts: 24806
Credit: 790,712
RAC: 0
United States
Message 1069235 - Posted: 22 Jan 2011, 1:25:49 UTC - in response to Message 1069118.  

They do have some redundancy. The DB is mirrored in real time. They have RAID for their drive arrays. But having completely redundant servers is too expensive.


BOINC WIKI
ID: 1069235
KWSN THE Holy Hand Grenade!
Volunteer tester
Joined: 20 Dec 05
Posts: 3187
Credit: 57,163,290
RAC: 0
United States
Message 1069546 - Posted: 22 Jan 2011, 18:08:51 UTC - in response to Message 1069118.  
Last modified: 22 Jan 2011, 18:11:31 UTC

[snip]

When your dataset increases, your load and time between failures also increases. When trying to make do with piecemeal equipment it can be very challenging to make a go of it.

Todd


It's been my experience that when the dataset increases, MTBF (Mean Time Between Failures) decreases... and your load increases by a factor of two or more (for a doubled dataset, your load quadruples... not saying that the increase is always exponential, though...)
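
To put a rough number on the MTBF part, here is a small Python sketch assuming the simplest reliability model: every component (say, every drive) is needed, components fail independently, and each has the same constant failure rate. Under those assumptions the combined MTBF is the per-component MTBF divided by the component count, so doubling the hardware halves it. The function name and the figures are hypothetical, for illustration only.

```python
def series_mtbf(component_mtbf_hours: float, n_components: int) -> float:
    """Combined MTBF of n identical, independent components that are all required.

    Assumes a constant failure rate per component, so the failure rates
    simply add up and the combined MTBF is the single-unit MTBF over n.
    """
    if n_components < 1:
        raise ValueError("n_components must be at least 1")
    return component_mtbf_hours / n_components


# Hypothetical figures: drives rated at 500,000 hours each.
ten_drives = series_mtbf(500_000, 10)     # some failure every 50,000 h on average
twenty_drives = series_mtbf(500_000, 20)  # twice the drives, half the MTBF
```

RAID and mirroring change what a single failure costs, but not how often some component somewhere fails.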

Hello, from Albany, CA!...
ID: 1069546


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.