Deluge (Jan 27 2010)

Message boards : Technical News : Deluge (Jan 27 2010)

Matt Lebofsky
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Mar 99
Posts: 1444
Credit: 957,058
RAC: 0
United States
Message 966228 - Posted: 27 Jan 2010, 23:23:25 UTC

As predicted, the science secondary did indeed catch up to the primary again, so all's well on that front (for now). And in case anybody noticed, we quickly turned the splitters/assimilators off for a bit to replace the failing drive on thumper - something we planned to do during the outage yesterday but couldn't. Easy squeasy - I'm glad we pay for Sun service on that system as drives are going fast. I can safely say the rumors that SATA drives fail frequently are true.

What you may notice is our servers being clogged for a spell, as Eric just turned the Astropulse splitters back on (hooray!). We'll see if all goes well on that front - it's been a while and certain parts of the engine may need oil.

- Matt

-- BOINC/SETI@home network/web/science/development person
-- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude
ID: 966228
BilBg
Volunteer tester
Joined: 27 May 07
Posts: 3720
Credit: 9,385,827
RAC: 0
Bulgaria
Message 966379 - Posted: 28 Jan 2010, 12:39:22 UTC - in response to Message 966228.  


..... I can safely say the rumors that SATA drives fail frequently are true. ...
- Matt


You mean all brands of SATA drives?!

Or are they just Western Digital?
What brand/model/capacity?

I don't think Seagate will "fail frequently".

 


- ALF - "Find out what you don't do well ..... then don't do it!" :)
 
ID: 966379
Cosmic_Ocean
Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 966410 - Posted: 28 Jan 2010, 15:28:44 UTC - in response to Message 966379.  


..... I can safely say the rumors that SATA drives fail frequently are true. ...
- Matt


You mean all brands of SATA drives?!

Or are they just Western Digital?
What brand/model/capacity?

I don't think Seagate will "fail frequently".

I don't have any problems with WD. True, I've seen a lot of comments on Newegg about some batches that are prone to premature failure, but out of the 50+ that I've bought in the past 7 years, I've only had one failure, and it was in a RAID 5, so I didn't lose anything.
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving-up)
ID: 966410
OzzFan (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 9 Apr 02
Posts: 15691
Credit: 84,761,841
RAC: 28
United States
Message 966428 - Posted: 28 Jan 2010, 17:05:10 UTC - in response to Message 966410.  


..... I can safely say the rumors that SATA drives fail frequently are true. ...
- Matt


You mean all brands of SATA drives?!

Or are they just Western Digital?
What brand/model/capacity?

I don't think Seagate will "fail frequently".

I don't have any problems with WD. True, I've seen a lot of comments on Newegg about some batches that are prone to premature failure, but out of the 50+ that I've bought in the past 7 years, I've only had one failure, and it was in a RAID 5, so I didn't lose anything.


I've had nothing but problems with WD. Out of the two dozen I've owned over the years (from giving them another try, or because they were given to me), every single one of them has failed. On every system I've worked on, it has always been a WD when it's a hard drive failure (once it was a Fujitsu, but that is the only other HDD to have died). This is all normal use, non-RAID systems.

By comparison, I still have some Conner 80MB and Maxtor 400MB HDDs in use that work flawlessly.
ID: 966428
Matt Lebofsky
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Mar 99
Posts: 1444
Credit: 957,058
RAC: 0
United States
Message 966439 - Posted: 28 Jan 2010, 18:15:46 UTC
Last modified: 28 Jan 2010, 18:15:58 UTC

The drives in thumper are Hitachis. That's what Sun chose to use. I'll never touch WDs or Maxtors for heavy production use.

Also, years ago Eric discovered a bug in WD drives used by Linux systems, where memory maps to disk greater than 4GB introduced random corruption. He even wrote a little C program to prove this and recreate the problem continuously. For all I know this hasn't been fixed yet. Seems like a major problem, but then again, who is using WD drives on a system that has processes taking up more than 4GB of actual memory (and memory maps to disk)? If you're running a big instance of mysqld... beware!

- Matt
-- BOINC/SETI@home network/web/science/development person
-- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude
ID: 966439 · Report as offensive
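
For anyone curious what a memory-map test like the one Matt describes above might look like, here is a minimal sketch in Python. It is not Eric's C program (which isn't shown in this thread); the function names, the 1 MiB block size, and the offset-stamp pattern are all assumptions made for illustration. It fills a file through a memory map, flushes it, then re-reads the file with ordinary reads and reports any blocks that differ.

```python
import mmap
import os
import tempfile

CHUNK = 1 << 20  # 1 MiB blocks (arbitrary choice for this sketch)


def pattern(offset, length):
    """Deterministic bytes that encode the block's own file offset."""
    stamp = offset.to_bytes(8, "big")
    return (stamp * (length // 8 + 1))[:length]


def write_via_mmap(path, size):
    """Fill a file of `size` bytes through a memory map, then flush it."""
    with open(path, "wb") as f:
        f.truncate(size)
    with open(path, "r+b") as f, mmap.mmap(f.fileno(), size) as m:
        for off in range(0, size, CHUNK):
            n = min(CHUNK, size - off)
            m[off:off + n] = pattern(off, n)
        m.flush()


def verify_by_read(path, size):
    """Re-read with ordinary file reads; return offsets of mismatching blocks."""
    bad = []
    with open(path, "rb") as f:
        for off in range(0, size, CHUNK):
            n = min(CHUNK, size - off)
            if f.read(n) != pattern(off, n):
                bad.append(off)
    return bad


if __name__ == "__main__":
    # Kept small here; the problem described above involved maps over 4GB.
    size = 16 * CHUNK
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "mmap_test.bin")
        write_via_mmap(path, size)
        print("mismatching block offsets:", verify_by_read(path, size))
```

Two caveats: an immediate re-read like this is likely served from the operating system's page cache rather than the disk, so a real test would need to get the data evicted from cache (or remount/reboot) before verifying, and it would need a 64-bit system to map more than 4GB.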
skildude
Joined: 4 Oct 00
Posts: 9541
Credit: 50,759,529
RAC: 60
Yemen
Message 966440 - Posted: 28 Jan 2010, 18:20:01 UTC - in response to Message 966439.  

I've had one Hitachi and it didn't last a year. So that does fit with the high failure rate, I guess.


In a rich man's house there is no place to spit but his face.
Diogenes Of Sinope
ID: 966440
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13847
Credit: 208,696,464
RAC: 304
Australia
Message 966442 - Posted: 28 Jan 2010, 18:30:05 UTC


Once upon a time, there were the Deathstars.
The problem with the drives was that IBM underestimated what was expected of a drive that was built for home/office use. If used as they expected, failure rates were very low. Unfortunately for IBM, by that time such use often meant the drives were on 24/5 or 24/7 instead of the (at most) 10/6 they were designed for, hence the high failure rates.

Most SATA drives are built for home/office use, and while designed to run 24/7, the actual amount of time the drives are active in such cases is really quite small. Running in systems such as SETI, the periods of inactivity would be very, very small. Hence drives such as Seagate's ES series, which are designed for 24/7 use in environments such as SETI.
Grant
Darwin NT
ID: 966442
HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 966443 - Posted: 28 Jan 2010, 18:44:35 UTC

Google published the results of a study they did with some 100,000 of their hard drives. The report vaguely indicates that the manufacturer doesn't seem to have any influence on the failures. I'd guess they didn't name any manufacturers so they wouldn't have to show any preference towards a specific brand.
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the BP6/VP6 User Group: http://tinyurl.com/8y46zvu
ID: 966443
kittyman (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 9 Jul 00
Posts: 51477
Credit: 1,018,363,574
RAC: 1,004
United States
Message 966444 - Posted: 28 Jan 2010, 18:44:54 UTC - in response to Message 966442.  


Hence drives such as Seagate's ES series, which are designed for 24/7 use in environments such as SETI.


Agreed. I learned this the hard way. I have always sworn by Seagate. Had a bit of a problem with a batch that I bought a couple of years ago. But I think that was later confirmed to be a firmware problem rather than the hardware.

At that time, somebody pointed out to me that there was a difference between everyday hard drives and ones specifically engineered for 24/7 server use.

I got some ES's, and have not had a failure since (stroking the kitties for luck).

I don't think the SATA interface itself has much to do with the failure rate.
"Time is simply the mechanism that keeps everything from happening all at once."

ID: 966444
1mp0£173
Volunteer tester
Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 966529 - Posted: 29 Jan 2010, 0:36:29 UTC - in response to Message 966442.  

Once upon a time, there were the Deathstars.

The problem with the drives was that IBM underestimated what was expected of a drive that was built for home/office use.

IBM didn't talk much about the reasoning. I personally suspect that it had more to do with how they coated the magnetic media on the (glass) platters than the intended use: http://www.astro.ufl.edu/~ken/crash/index.html

The real problem was trying to make the warranty problem go away by "redefining" the warranty -- and sales suffered because it was easier to avoid all IBM drives than it was to avoid the specific troublesome Deathstar models.

The short version of the story is that as a result, IBM chose to leave the hard disk market, and sold their drive business and manufacturing plants -- to Hitachi.
ID: 966529
Widouxmaker
Joined: 7 May 02
Posts: 12
Credit: 457,920
RAC: 0
United States
Message 966746 - Posted: 29 Jan 2010, 21:48:22 UTC - in response to Message 966529.  

Yep. The SATA drive in my Acer croaked and I was offline for a while. Didn't make it 2 years. Level 55 on Diablo 2 lost. Back it up. What kind of beer did y'all get?
You talk'n to me?
ID: 966746
archae86
Joined: 31 Aug 99
Posts: 909
Credit: 1,582,816
RAC: 0
United States
Message 966780 - Posted: 30 Jan 2010, 1:52:58 UTC - in response to Message 966443.  

The report vaguely indicates that the manufacturer doesn't seem to have any influence on the failures.
I think over the years every major manufacturer has had a bad patch or three. If you have the bad luck to get several drives out of a problem distribution, you'll have dreadful results, and may latch onto a permanent aversion to the manufacturer concerned.

Other folks, who happen to have sampled other bad portions, will naturally have grossly different opinions, held with equal fervor (and equal lack of adequate data).

I think the same Google study found drives marketed as "enterprise grade" not to do better either--but that won't stop people with a favorable experience on a tiny number of drives from fervently maintaining their vast superiority.

Unless you like to buy obsolete drives, and somehow take the trouble to accumulate vast field experience with the specific model and batch in advance, it is pretty hard to claim data-based decision making on hard drive selection for reliability.

Disclosure: For four years of my varied electronics career I was a reliability professional--but it was at a semiconductor company, not a hard drive place. I did, on the other hand, play in a small recorder ensemble with a QA guy for Quantum, but that was so long ago that they not only built their own drives, but actually built them in California, so it really does not count.

ID: 966780
kittyman (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 9 Jul 00
Posts: 51477
Credit: 1,018,363,574
RAC: 1,004
United States
Message 966788 - Posted: 30 Jan 2010, 2:30:22 UTC - in response to Message 966746.  

Yep. The SATA drive in my Acer croaked and I was offline for a while. Didn't make it 2 years. Level 55 on Diablo 2 lost. Back it up. What kind of beer did y'all get?

Back in my day.......Stroh's was THE beer.......
And Wild Turkey was the whiskey.

Seagate were still the best on the block.......and still are as far as I'm concerned.......

It's ES's all the way down....
That's if any of my old Seagates ever fail.
"Time is simply the mechanism that keeps everything from happening all at once."

ID: 966788
KWSN THE Holy Hand Grenade!
Volunteer tester
Joined: 20 Dec 05
Posts: 3187
Credit: 57,163,290
RAC: 0
United States
Message 967309 - Posted: 31 Jan 2010, 16:17:50 UTC

Alert! The "Workunits waiting for db purging" figure is over 0.5 million and the "Results waiting for db purging" figure is approaching 1.2 million. Both figures are unusually high, in my experience...
.

Hello, from Albany, CA!...
ID: 967309
Galadriel
Joined: 24 Jan 09
Posts: 42
Credit: 8,422,996
RAC: 0
Romania
Message 967394 - Posted: 1 Feb 2010, 3:07:54 UTC

Regarding the HDD discussion: I am always a fan of mixing, so I use hard drives from every major manufacturer. Best solution IMHO, combined with random backups and offsite data & drive storage.

During the last 5 years I lost 1x Fujitsu, 1x Seagate, and 1x Western Digital drive.
All of them were 2.5" notebook drives. Sensitive little bastards.
ID: 967394
bloodrain
Volunteer tester
Joined: 8 Dec 08
Posts: 231
Credit: 28,112,547
RAC: 1
Antarctica
Message 967667 - Posted: 2 Feb 2010, 8:43:19 UTC - in response to Message 967394.  

Yeah, I have had 2 WD drives (1 died of a bt bug) and 1 Seagate die on me. It's the batches, just like with CPUs, RAM, and blank media.
ID: 967667
John McLeod VII
Volunteer developer
Volunteer tester
Joined: 15 Jul 99
Posts: 24806
Credit: 790,712
RAC: 0
United States
Message 967828 - Posted: 3 Feb 2010, 12:03:05 UTC

I have had 4 drives fail over the last 25 years. In two of the cases I discovered that the manufacturer was no longer in business. One was apparently done in by a hurricane in Florida, and one was done in by quality control issues. They sent out drives that lasted about 4 to 8 hours before failure. I got two of those (not counted in the total as I had not gotten the OS installed from the stack of floppies when they failed). When the drive from them that finally did work failed 3 years later, the company was no longer able to supply a replacement under their 5 year warranty...


BOINC WIKI
ID: 967828
C
Joined: 3 Apr 99
Posts: 240
Credit: 7,716,977
RAC: 0
United States
Message 967837 - Posted: 3 Feb 2010, 13:43:59 UTC

Something appears to be wrong with the GigabitEthernet site:

http://fragment1.berkeley.edu:80/newcricket/grapher.cgi?target=/router-interfaces/inr-250/gigabitethernet2_3&ranges=d%3Aw&view=Octets

It usually only flat-lines when SETI is down, such as on Tuesdays...

C

Join Team MacNN
ID: 967837
HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 967843 - Posted: 3 Feb 2010, 14:53:01 UTC - in response to Message 967837.  

Something appears to be wrong with the GigabitEthernet site:

http://fragment1.berkeley.edu:80/newcricket/grapher.cgi?target=/router-interfaces/inr-250/gigabitethernet2_3&ranges=d%3Aw&view=Octets

It usually only flat-lines when SETI is down, such as on Tuesdays...

C

It looks more like the data collector isn't getting data from the router. Since that site is run by Berkeley's IT department for their own use, I'm sure they will work on it if they feel the need to.
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the BP6/VP6 User Group: http://tinyurl.com/8y46zvu
ID: 967843
Keith T.
Volunteer tester
Joined: 23 Aug 99
Posts: 962
Credit: 537,293
RAC: 9
United Kingdom
Message 967923 - Posted: 3 Feb 2010, 20:27:34 UTC - in response to Message 967843.  

Something appears to be wrong with the GigabitEthernet site:

http://fragment1.berkeley.edu:80/newcricket/grapher.cgi?target=/router-interfaces/inr-250/gigabitethernet2_3&ranges=d%3Aw&view=Octets

It usually only flat-lines when SETI is down, such as on Tuesdays...

C

It looks more like the data collector isn't getting data from the router. Since that site is run by Berkeley's IT department for their own use, I'm sure they will work on it if they feel the need to.


I just tried looking at a few other routers randomly from that site. It looks like they have data problems with all of them for the same period, so it is not just SETI or SSL that is affected.
ID: 967923


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.