Preventive maintenance - how about that?

Message boards : Number crunching : Preventive maintenance - how about that?

bill
Joined: 16 Jun 99
Posts: 861
Credit: 29,352,955
RAC: 0
United States
Message 1076467 - Posted: 12 Feb 2011, 17:16:24 UTC

All it takes is money. Lots and lots of money.
Until then you get what you got.
Unhappy with that? Just start sending in those six figure checks.
Checks sent to your power company do not count as donations to Seti@Home.

Seti only asked for spare processor cycles, not for anyone to build dedicated
hardware.

ID: 1076467
RottenMutt
Joined: 15 Mar 01
Posts: 1011
Credit: 230,314,058
RAC: 0
United States
Message 1076471 - Posted: 12 Feb 2011, 17:27:16 UTC - in response to Message 1076467.  

Turning spare CPU cycles into useful computations consumes more power than being idle. The whole premise is no longer valid!!!
ID: 1076471
soft^spirit
Joined: 18 May 99
Posts: 6497
Credit: 34,134,168
RAC: 0
United States
Message 1076478 - Posted: 12 Feb 2011, 17:46:29 UTC - in response to Message 1076471.  

Mutt, they do not have nearly enough money to install fault-tolerant systems and full redundancy.

I have worked in 24/7 environments as well, and this, simply put, is not one. It never has been one, was not designed to be one, and, unless their funds go up by a fantastic amount, will most likely never be one.




Janice
ID: 1076478
Frizz
Volunteer tester
Joined: 17 May 99
Posts: 271
Credit: 5,852,934
RAC: 0
New Zealand
Message 1076487 - Posted: 12 Feb 2011, 18:05:10 UTC - in response to Message 1076478.  
Last modified: 12 Feb 2011, 18:30:36 UTC

Mutt, they do not have nearly enough money to install fault-tolerant systems and full redundancy.


Crunching has become a serious business/hobby for some of us.

To maintain a RAC of 100,000 points you have to pay at least 1,750 USD/year for electricity.

(Assumption: rather cheap power and a pretty efficient system: 10c/kWh, and 200 watts *) to get a RAC of 10,000 points.)

This is the absolute minimum imho! In reality you have to pay (a lot) more (higher power costs, less efficient systems).

Plus, in other parts of the world power is not nearly as cheap as in the US. I guess in Europe you easily have to pay 3x the price for power. So it's ~5,300 USD/year for a RAC of 100,000. Minimum!
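
A minimal sketch of the arithmetic behind these figures, assuming the numbers stated above (200 W per 10,000 RAC, 10c/kWh) and a machine that runs all 8,760 hours of the year. The function and parameter names are illustrative only and are not part of any SETI@home or BOINC tool:

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours, assuming the cruncher never stops


def annual_power_cost(target_rac, watts_per_10k_rac=200.0, usd_per_kwh=0.10):
    """Estimate the yearly electricity bill (USD) for sustaining target_rac.

    Assumes power draw scales linearly with RAC, as the post does.
    """
    watts = target_rac / 10_000 * watts_per_10k_rac
    kwh_per_year = watts / 1000 * HOURS_PER_YEAR
    return kwh_per_year * usd_per_kwh


us_cost = annual_power_cost(100_000)                     # 10c/kWh
eu_cost = annual_power_cost(100_000, usd_per_kwh=0.30)   # assumed 3x the US price

print(f"US: {us_cost:,.0f} USD/year")      # US: 1,752 USD/year
print(f"Europe: {eu_cost:,.0f} USD/year")  # Europe: 5,256 USD/year
```

The results land on the rounded 1,750 and 5,300 USD/year quoted above. A less efficient system or a higher tariff only changes the two default arguments.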

This is what I meant when I said: "Quite a lot of people here put considerable resources ... into the project. Or in other words: money".

---
*) Things to consider for power consumption:

- a system is not just a CPU and/or GPU. It's also mainboard, hard drives, ... They too need power.

- overclocked CPUs/GPUs need disproportionately more power as clock speeds rise.

- computer power supplies are typically 75% efficient (good ones reach 85%+). Example: a GTX480 with a TDP of 250 watts draws more like 300 watts at the wall, and more like 400 when heavily overclocked.

- cost for your own little infrastructure (modem/router, air conditioning in Summer, etc.).

- cost when your cruncher is not productive (e.g. Seti is down) and is not working towards your RAC at all.
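
The power-supply point in the list above can be sketched as a one-line calculation: the draw at the wall is the component draw divided by the PSU efficiency. This treats TDP as if it were actual power draw, which is only a rough approximation, and the function name is made up for illustration:

```python
def wall_power(component_watts, psu_efficiency):
    """Return the power drawn at the wall socket for a given component load.

    psu_efficiency is a fraction between 0 and 1 (e.g. 0.75 for 75%).
    """
    if not 0 < psu_efficiency <= 1:
        raise ValueError("psu_efficiency must be in the range (0, 1]")
    return component_watts / psu_efficiency


# A 250 W card behind a typical and a good power supply
print(f"{wall_power(250, 0.75):.0f} W")  # 333 W
print(f"{wall_power(250, 0.85):.0f} W")  # 294 W
```

The "more like 300 watts" figure corresponds to the 85% case; with a 75% supply the same card would pull closer to 333 watts.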
ID: 1076487
James Sotherden
Joined: 16 May 99
Posts: 10436
Credit: 110,373,059
RAC: 54
United States
Message 1076490 - Posted: 12 Feb 2011, 18:14:13 UTC

As for working weekends, I hate it myself. I give my all during the week, so let me unwind on the weekend. Now, when my company needs an order filled, yes, I will step up to the plate. But my running out of work units is no reason for anyone at Seti to work a weekend.



Old James
ID: 1076490
soft^spirit
Joined: 18 May 99
Posts: 6497
Credit: 34,134,168
RAC: 0
United States
Message 1076509 - Posted: 12 Feb 2011, 19:18:19 UTC - in response to Message 1076487.  

Well, when the servers are down, you are saving electricity.
If you want them up more, feel free to help by donating some of the power savings towards maintaining them.

This is a hobby, and a serious hobby to many. This is not a vital service.
And the fact remains that they do not have nearly the financial resources it would take to run it as if it WAS a vital service (full redundancy, backup power, 24/7 staffing, etc.).

I should correct myself.. this is a serious hobby.. TO US. To Berkeley this is a very serious although underfunded scientific experiment.

These guys do work weekends, often with little/no compensation. No commercial IT application has a right to ask for anything approaching this. Having them work ALL weekend, when many need 2nd/3rd jobs to support themselves is asking a bit much in the current situation.


Janice
ID: 1076509
kittyman (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1076540 - Posted: 12 Feb 2011, 21:14:48 UTC
Last modified: 12 Feb 2011, 21:15:11 UTC

This thread title irked me from the get-go.

Who TF thinks they can predict the next failure point?

Or wants to pony up the resources to try to whack them all?

Geezus. Lord.

How simply stupid can you post?

Like to see what he drives.......change the motor every 50k miles, eh?

Bet the hard drive has less than 1000 hours on it......
Better change out that PSU....it's nearly a year old.
That CPU has almost a couple of Tflops on it.......better get a new one.

You get my drift.......

Thinking that the Seti staff could possibly anticipate the next failure point is, well.......pointless.

I have 8 rigs running here, and if I could guess the next one that would fail....well, I might as well have tried out for the lead in Hereafter.
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 1076540
Frizz
Volunteer tester
Joined: 17 May 99
Posts: 271
Credit: 5,852,934
RAC: 0
New Zealand
Message 1076587 - Posted: 12 Feb 2011, 22:47:22 UTC - in response to Message 1076540.  

How simply stupid can you post?


Hmmm ... when I looked at your posting I was asking myself the same question ;)


Who TF thinks they can predict the next failure point?


Believe it or not, Mark - there are things in this world that you don't get taught in a fire truck factory. For example, professional system administration.

For the record: I didn't think up Preventive Maintenance myself! It's a well-proven concept.

I simply don't get why so many people here think that it's some kind of witchcraft. Crystal balls etc.

---------------
"Blogging: Never before have so many with so little to say said so much to so few"
ID: 1076587
Frizz
Volunteer tester
Joined: 17 May 99
Posts: 271
Credit: 5,852,934
RAC: 0
New Zealand
Message 1076589 - Posted: 12 Feb 2011, 22:55:24 UTC - in response to Message 1076509.  

Having them work ALL weekend, when many need 2nd/3rd jobs to support themselves is asking a bit much in the current situation.


I am not criticizing people like Matt, for example. I think he does a fantastic job with what he has.

The point of my initial posting was:

1) It seems many of us are pretty serious about crunching. We spend lots of money (time, hardware, electricity) on our own little infrastructure. Why not spend a part of this money where it benefits most: on S@H infrastructure?

2) If the S@H team could give us a ballpark figure, I am sure we could raise the money that's needed.
ID: 1076589
Wiggo
Joined: 24 Jan 00
Posts: 34746
Credit: 261,360,520
RAC: 489
Australia
Message 1076600 - Posted: 12 Feb 2011, 23:31:16 UTC - in response to Message 1076589.  

2) If the S@H team could give us a ballpark figure, I am sure we could raise the money that's needed.

You obviously either missed several recent posts on this topic by Matt, Jeff and Pappa, or totally forgot about them. It has been stated several times that we will be told what is needed once the team has finally got the three new servers fine-tuned and the rest of the servers rearranged around them (which is still ongoing, but taking a little longer due to RAID failures on other servers).

Until that happens this thread is totally pointless and counterproductive IMO.

If you don't have backup projects then just turn your PCs off and save on power.

Or in other words show some patience and remember that Rome wasn't built in a day.

Cheers.
ID: 1076600
soft^spirit
Joined: 18 May 99
Posts: 6497
Credit: 34,134,168
RAC: 0
United States
Message 1076604 - Posted: 12 Feb 2011, 23:39:16 UTC

Reminder: Civility PLEASE.

Name-calling and personal insults are neither needed nor helpful.


Janice
ID: 1076604
Frizz
Volunteer tester
Joined: 17 May 99
Posts: 271
Credit: 5,852,934
RAC: 0
New Zealand
Message 1076605 - Posted: 12 Feb 2011, 23:43:39 UTC - in response to Message 1076600.  

Or in other words show some patience and remember that Rome wasn't built in a day.


If Rome were S@H, the Capitoline Wolf would still be suckling the infant twins Romulus and Remus :)
ID: 1076605
Pappa
Volunteer tester
Joined: 9 Jan 00
Posts: 2562
Credit: 12,301,681
RAC: 0
United States
Message 1077044 - Posted: 14 Feb 2011, 2:45:49 UTC
Last modified: 14 Feb 2011, 3:47:38 UTC

A few messages were removed prior to restoration of this thread due to the Server Side issues.

Thank You for your patience.

As was noted, Gowron is a single point of failure and is, as I recall, about 4 years old. The majority of the servers connect via a back-end network (a second 1-gigabit NIC) for server-to-server communications. The backup of the degraded RAID makes traffic slow to a crawl. Matt stated that once the backup is complete, a new install of the OS should return functionality to the SAN. Recovery would be at gigabit speed.

For myself, I have had to move data off a partially recovered crashed RAID. The tools I had reported the expected (running average) transfer speed, so with a known amount of total data, the amount of time could be calculated. This was complicated in that there were areas where ECC and other data showed there was corruption. The time taken by RAID retries before a data block was marked as bad and logged slowed the process to its knees. The boss told me to go home. I did go home, and checked back periodically in case a screen had come up asking for permission to continue.
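
The estimate described above (known total data, running average speed) might be sketched as follows. The sizes and speeds are made-up numbers for illustration, not figures from the SETI@home servers, and the function name is hypothetical:

```python
def remaining_hours(total_gb, copied_gb, avg_mb_per_s):
    """Estimate hours left in a copy from the running average transfer speed."""
    if avg_mb_per_s <= 0:
        raise ValueError("average speed must be positive")
    remaining_mb = (total_gb - copied_gb) * 1024  # 1 GB taken as 1024 MB
    return remaining_mb / avg_mb_per_s / 3600


# Hypothetical 2,000 GB array with 500 GB already copied
healthy = remaining_hours(2000, 500, 100)  # running average on clean areas
degraded = remaining_hours(2000, 500, 5)   # running average while retries pile up

print(f"{healthy:.1f} h")   # 4.3 h
print(f"{degraded:.1f} h")  # 85.3 h
```

Because the estimate depends entirely on the running average, a stretch of corrupted blocks with long retries can push the projected finish time out by an order of magnitude, which is the effect described above.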

At this point we are waiting for the completion of the transfer to start the OS reinstall.

Regards

Pappa
Please consider a Donation to the Seti Project.

ID: 1077044
Frizz
Volunteer tester
Joined: 17 May 99
Posts: 271
Credit: 5,852,934
RAC: 0
New Zealand
Message 1077082 - Posted: 14 Feb 2011, 8:22:12 UTC - in response to Message 1077044.  

Thanks for restoring some of the threads/postings, Pappa. And thanks for resolving the "technical" forum moderation problems ;)
ID: 1077082
bill
Joined: 16 Jun 99
Posts: 861
Credit: 29,352,955
RAC: 0
United States
Message 1077093 - Posted: 14 Feb 2011, 9:33:01 UTC - in response to Message 1076471.  

Then turn off your power drains and pocket the money. Building dedicated crunchers is your own responsibility, not S@H's. The work will still get done.
ID: 1077093
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13736
Credit: 208,696,464
RAC: 304
Australia
Message 1077094 - Posted: 14 Feb 2011, 9:38:35 UTC - in response to Message 1076450.  

I don't agree with not working to fix the system over the weekend...

So they should work for no pay just so people can keep their RAC up? Doesn't sound reasonable to me. Maybe they could work over the weekend, then have time off during the week in lieu? Of course something would go wrong mid-week & people would get upset they weren't there sorting it out then.
Maybe they should just move in there till death do they part?
Grant
Darwin NT
ID: 1077094
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13736
Credit: 208,696,464
RAC: 304
Australia
Message 1077096 - Posted: 14 Feb 2011, 9:47:17 UTC - in response to Message 1076487.  

Crunching has become a serious business/hobby for some of us.

To maintain a RAC of 100,000 points you have to pay at least 1,750 USD/year for electricity.

(Assumption: rather cheap power and a pretty efficient system: 10c/kWh, and 200 watts *) to get a RAC of 10,000 points.)

This is the absolute minimum imho! In reality you have to pay (a lot) more (higher power costs, less efficient systems).

Plus, in other parts of the world power is not nearly as cheap as in the US. I guess in Europe you easily have to pay 3x the price for power. So it's ~5,300 USD/year for a RAC of 100,000. Minimum!

This is what I meant when I said: "Quite a lot of people here put considerable resources ... into the project. Or in other words: money".

If you choose to go over the top & get carried away with the whole deal, they still owe you no more than someone that installs it on a computer & forgets all about it.
They ask for your unused processing time. That is all. Should you choose to build systems specifically dedicated to the project- it was your choice. You don't have to keep doing it- you owe them nothing. They don't have to keep it running 24/7- they never promised to.

Seriously- if you consider that your decision to build & operate dedicated systems means that you are owed more than anyone else that chooses to crunch for Seti, you really need to take a reality check.
Grant
Darwin NT
ID: 1077096
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13736
Credit: 208,696,464
RAC: 304
Australia
Message 1077097 - Posted: 14 Feb 2011, 9:51:18 UTC - in response to Message 1076587.  

For the record: I didn't think up Preventive Maintenance myself! It's a well proven concept.

I simply don't get why so many people here think that it's some kind of witchcraft. Crystal balls etc.

What sort of preventive maintenance would you suggest? Would that maintenance have prevented any of the outages that have occurred? Would it have been possible within the budget that existed at the time it would have been implemented?
Grant
Darwin NT
ID: 1077097
Wiggo
Joined: 24 Jan 00
Posts: 34746
Credit: 261,360,520
RAC: 489
Australia
Message 1077098 - Posted: 14 Feb 2011, 10:04:26 UTC - in response to Message 1077097.  

Damn. :(

Couldn't this stupid thread have stayed gone?

Cheers.
ID: 1077098
Frizz
Volunteer tester
Joined: 17 May 99
Posts: 271
Credit: 5,852,934
RAC: 0
New Zealand
Message 1077112 - Posted: 14 Feb 2011, 12:35:16 UTC - in response to Message 1077098.  

Go the All Blacks! :)

ID: 1077112


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.