Panic Mode On (20) Server problems

Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13736
Credit: 208,696,464
RAC: 304
Australia
Message 918896 - Posted: 17 Jul 2009, 23:23:45 UTC - in response to Message 918885.  

I myself still find that a decent controller and RAID 5 is my favorite for DB access/speed in most applications.


Might be of interest- SSD is the way to go.
Price is no object?


And buggered if i can find a review i read recently- RAID 10, 5 & 6 compared.
Grant
Darwin NT
ID: 918896
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 918897 - Posted: 17 Jul 2009, 23:24:11 UTC - in response to Message 918894.  

Software raid doesn't have that limitation. It's better when budgets don't allow you to stock enough spares.

And when every scrounged/handmedown/engineering test server comes with a different card! I said 'historical': they may have started accepting that hardware is the way to go - especially when it's pre-built into a complete server. But I think they're still a bit nervous about failure/recovery. Did you read Matt's saga about how he has to reboot Thumper?
ID: 918897
John McLeod VII
Volunteer developer
Volunteer tester
Joined: 15 Jul 99
Posts: 24806
Credit: 790,712
RAC: 0
United States
Message 918899 - Posted: 17 Jul 2009, 23:27:22 UTC - in response to Message 918889.  

Reporting does not happen on a schedule, rather it happens at the first of:

<list>

Add that to the wiki already! :)

OK, which WIKI is it missing from? I am pretty certain it is in the unofficial WIKI.


BOINC WIKI
ID: 918899
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 918900 - Posted: 17 Jul 2009, 23:29:43 UTC - in response to Message 918896.  

And buggered if i can find a review i read recently- RAID 10, 5 & 6 compared.

A quick Google brought up http://blogs.zdnet.com/storage/?p=162&tag=nl.e539.
ID: 918900
TCP JESUS
Joined: 19 Jan 03
Posts: 205
Credit: 1,248,845
RAC: 0
Canada
Message 918901 - Posted: 17 Jul 2009, 23:31:05 UTC - in response to Message 918896.  

I myself still find that a decent controller and RAID 5 is my favorite for DB access/speed in most applications.


Might be of interest- SSD is the way to go.
Price is no object?


And buggered if i can find a review i read recently- RAID 10, 5 & 6 compared.


Once SSD's can multitask a little better and the price comes down a bit, I will jump on that ship.

I currently try to limit myself to $200/drive x 3.....otherwise, one can get VERY carried away. $200 is the magic number, or $600 for 3 drives anyways, so that I can run a RAID 5 on my desktop.

The VelociRaptors are a decent performance increase over the last batch of Raptors and I am quite happy with their performance 99.9% of the time.

SSD is probably another year or more down the road for me.

I build a new semi-monster once a year or year-and-a-half, so perhaps 1 to 2 systems down the road for me, as my current box is less than a month old (Seti ID: 5000725).

Allan.
I am TCP JESUS...The Carpenter Phenom Jesus....and HAMMERING is what I do best!
formerly known as...MC Hammer.
ID: 918901
1mp0£173
Volunteer tester

Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 918903 - Posted: 17 Jul 2009, 23:31:20 UTC - in response to Message 918897.  

Software raid doesn't have that limitation. It's better when budgets don't allow you to stock enough spares.

And when every scrounged/handmedown/engineering test server comes with a different card! I said 'historical': they may have started accepting that hardware is the way to go - especially when it's pre-built into a complete server. But I think they're still a bit nervous about failure/recovery. Did you read Matt's saga about how he has to reboot Thumper?

Yes, I did.

When you stop and think about it, it's pretty amazing how well things work.

The Sun V40z SETI had was a prototype, and apparently, one-of-a-kind as the manufactured product was different when it came time to get parts.

Most everything else is either a hand-me-down or an engineering model.

In an ideal world, all of the servers would be similar (maybe just two motherboards) and as many parts as possible would be 100% interchangeable.

That's what bothers me most when someone says "mismanagement" -- they're doing an amazing job holding things together as-is.

ID: 918903
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 918906 - Posted: 17 Jul 2009, 23:42:11 UTC - in response to Message 918903.  

And Thumper is a Sun X4500: anyone not familiar with those specs, read and drool. "Up to forty-eight hot-swappable disk drives for a total of 48 TB of raw storage."

The original Thumper was a prototype: the current replacement may be a bit more maintainable, but I wouldn't want to be the one doing the maintaining.
ID: 918906
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13736
Credit: 208,696,464
RAC: 304
Australia
Message 918907 - Posted: 17 Jul 2009, 23:46:22 UTC - in response to Message 918900.  

And buggered if i can find a review i read recently- RAID 10, 5 & 6 compared.

A quick Google brought up http://blogs.zdnet.com/storage/?p=162&tag=nl.e539.

Nah, that's not it. This one was very recent.
Although i do remember that bit of doom & gloom from a while back. Dan had a nice article debunking the whole thing.
Grant
Darwin NT
ID: 918907
Cosmic_Ocean
Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 918908 - Posted: 17 Jul 2009, 23:47:37 UTC

Wow, missed a lot of discussion here. It's only been half a day since I last read the posts.

anyway, aggregating the uploads into one transmission is not a bad idea, but one thing that could be done to reduce the CPU load on the server is instead of gzip or something that compresses, why not just a tarball, or some other form of concatenation? Wouldn't require much CPU load at all to break it back apart at the other end.
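
For what it's worth, a rough sketch of the tarball idea, assuming Python's standard `tarfile` module. The function names and result file names are made up for illustration; nothing here reflects how the BOINC client or the SETI upload server actually handles files.

```python
import io
import tarfile


def bundle_results(results: dict[str, bytes]) -> bytes:
    """Pack result files into one uncompressed tar archive (no gzip)."""
    buf = io.BytesIO()
    # Mode "w" writes a plain tar: headers plus raw file data, no compression.
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in results.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def unbundle_results(blob: bytes) -> dict[str, bytes]:
    """Split an uncompressed tar archive back into its member files."""
    out: dict[str, bytes] = {}
    # Mode "r:" reads a tar explicitly without any compression layer.
    with tarfile.open(fileobj=io.BytesIO(blob), mode="r:") as tar:
        for member in tar.getmembers():
            handle = tar.extractfile(member)
            if handle is not None:  # skip anything that is not a regular file
                out[member.name] = handle.read()
    return out
```

Splitting the archive is just reading headers and copying bytes, so the server does no decompression work. The trade-off is that tar pads each member to 512-byte blocks, which adds overhead when the result files are small.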
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving-up)
ID: 918908
1mp0£173
Volunteer tester

Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 918909 - Posted: 17 Jul 2009, 23:51:23 UTC - in response to Message 918908.  

anyway, aggregating the uploads into one transmission is not a bad idea, but one thing that could be done to reduce the CPU load on the server is instead of gzip or something that compresses, why not just a tarball, or some other form of concatenation? Wouldn't require much CPU load at all to break it back apart at the other end.

This is XML(ish): add the result file name to the header, and just concatenate them.
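
A minimal sketch of what "name in the header, then concatenate" could look like. The `<file name="..." size="...">` header line is a hypothetical format invented for this example, not anything BOINC or SETI defines, and it assumes file names contain no double quotes.

```python
import re

# Hypothetical header: one line giving the file name and payload size in bytes.
HEADER = re.compile(rb'<file name="([^"]+)" size="(\d+)">\n')


def concat_results(results: dict[str, bytes]) -> bytes:
    """Concatenate result files, each preceded by a small header line."""
    parts = []
    for name, data in results.items():
        parts.append(b'<file name="%s" size="%d">\n' % (name.encode("utf-8"), len(data)))
        parts.append(data)
    return b"".join(parts)


def split_results(blob: bytes) -> dict[str, bytes]:
    """Walk the blob header by header and slice out each payload by size."""
    out: dict[str, bytes] = {}
    pos = 0
    while pos < len(blob):
        match = HEADER.match(blob, pos)
        if match is None:
            raise ValueError(f"bad header at offset {pos}")
        size = int(match.group(2))
        start = match.end()
        if start + size > len(blob):
            raise ValueError(f"truncated payload at offset {start}")
        out[match.group(1).decode("utf-8")] = blob[start:start + size]
        pos = start + size
    return out
```

Because each payload is sliced by its declared size, the receiving end never has to parse or scan the result contents themselves.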
ID: 918909
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13736
Credit: 208,696,464
RAC: 304
Australia
Message 918911 - Posted: 17 Jul 2009, 23:52:10 UTC - in response to Message 918901.  

Once SSD's can multitask a little better

I suggest you check those IO benchmark figures. Mechanical HDDs aren't even in the race. Not even 15,000 RPM SATA enterprise drives.


and the price comes down a bit

That's the biggest hurdle at the moment.
But if you need the performance...


The VelociRaptors are a decent performance increase over the last batch of Raptors and I am quite happy with their performance 99.9% of the time.

Yep, big boost on the previous drives.
Will probably be my drive of choice when i get a new system later this year. Maybe, hopefully, with luck.
Grant
Darwin NT
ID: 918911
Nicolas
Joined: 30 Mar 05
Posts: 161
Credit: 12,985
RAC: 0
Argentina
Message 918912 - Posted: 17 Jul 2009, 23:56:04 UTC - in response to Message 918909.  

anyway, aggregating the uploads into one transmission is not a bad idea, but one thing that could be done to reduce the CPU load on the server is instead of gzip or something that compresses, why not just a tarball, or some other form of concatenation? Wouldn't require much CPU load at all to break it back apart at the other end.

This is XML(ish): add the result file name to the header, and just concatenate them.

That would be a project-specific solution...

SETI by itself cannot cause multiple result output files to get concatenated before uploading, that would need BOINC client changes. And if it's in the BOINC client, it should work for any project, including those without XML-ish output files.


Contribute to the Wiki!
ID: 918912
1mp0£173
Volunteer tester

Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 918913 - Posted: 17 Jul 2009, 23:56:43 UTC - in response to Message 918906.  

And Thumper is a Sun X4500: anyone not familiar with those specs, read and drool. "Up to forty-eight hot-swappable disk drives for a total of 48 TB of raw storage."

The original Thumper was a prototype: the current replacement may be a bit more maintainable, but I wouldn't want to be the one doing the maintaining.

The power consumption is truly impressive. 240 V at 10 A.
ID: 918913
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 918914 - Posted: 18 Jul 2009, 0:00:18 UTC - in response to Message 918908.  

Wow, missed a lot of discussion here. It's only been half a day since I last read the posts.

Sorry, we've been going off-topic: this is supposed to be the 'Panic' thread, and it's been at least an hour since we had a good panic.

How about: no new "tapes" were fetched out of storage today, and the lab is now on overtime after the Friday shift ended (as if they care).

Will 111 channels last until Monday?
ID: 918914
1mp0£173
Volunteer tester

Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 918915 - Posted: 18 Jul 2009, 0:00:46 UTC - in response to Message 918912.  

anyway, aggregating the uploads into one transmission is not a bad idea, but one thing that could be done to reduce the CPU load on the server is instead of gzip or something that compresses, why not just a tarball, or some other form of concatenation? Wouldn't require much CPU load at all to break it back apart at the other end.

This is XML(ish): add the result file name to the header, and just concatenate them.

That would be a project-specific solution...

SETI by itself cannot cause multiple result output files to get concatenated before uploading, that would need BOINC client changes. And if it's in the BOINC client, it should work for any project, including those without XML-ish output files.

If we're talking about a truly BOINC-centric solution, then it has to be generic.

That means we don't get it until enough BOINC clients are upgraded, which has been raised before as an objection.

In the short term, we're stuck.

A SETI-only solution could be to bundle work units, and change the science application to do "bundles" -- crunch more than one WU, add 'em all to one result file, and let BOINC assign "bundles" and think they're single work units.

Bundle MB, don't bundle AP, and put a half-dozen work units in each MB bundle.
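
To make the grouping concrete, here is a small sketch of the bundling rule only: MB work units go out in groups of six, AP work units go out alone. The `(name, kind)` tuples and the `make_bundles` function are assumptions for illustration; the real splitter, scheduler, and science application are not modelled.

```python
from typing import Iterable, Iterator


def make_bundles(
    workunits: Iterable[tuple[str, str]],
    mb_bundle_size: int = 6,
) -> Iterator[list[str]]:
    """Yield bundles of work unit names.

    Each work unit is a (name, kind) pair, where kind is "MB" or "AP".
    AP units are yielded one at a time; MB units are grouped.
    """
    pending: list[str] = []
    for name, kind in workunits:
        if kind == "AP":
            yield [name]  # AstroPulse is never bundled
            continue
        pending.append(name)
        if len(pending) == mb_bundle_size:
            yield pending
            pending = []
    if pending:
        yield pending  # leftover MB units form a final, smaller bundle
```

Each yielded list would be what BOINC sees as a single work unit, so the number of uploads and reports for MB drops by roughly the bundle size.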
ID: 918915
1mp0£173
Volunteer tester

Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 918917 - Posted: 18 Jul 2009, 0:02:51 UTC - in response to Message 918914.  

How about: no new "tapes" were fetched out of storage today, and the lab is now on overtime after the Friday shift ended (as if they care).

I wouldn't be surprised to find out that the BOINC staff is all "salaried-exempt" and therefore don't get overtime, or have precise set hours.

ID: 918917
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 918920 - Posted: 18 Jul 2009, 0:09:14 UTC - in response to Message 918917.  

How about: no new "tapes" were fetched out of storage today, and the lab is now on overtime after the Friday shift ended (as if they care).

I wouldn't be surprised to find out that the BOINC staff is all "salaried-exempt" and therefore don't get overtime, or have precise set hours.

Well, I was speaking figuratively: I agree, any 'lab hours' are probably entirely notional, and even if they exist, SETI staff ignore them by working from home (pssst: don't tell Angela).

But I think Matt, at least, tries to get basic chores like fetching data done in daylight: and as things stand, I predict a data outage before the weekend is out.
ID: 918920
Nicolas
Joined: 30 Mar 05
Posts: 161
Credit: 12,985
RAC: 0
Argentina
Message 918922 - Posted: 18 Jul 2009, 0:11:04 UTC - in response to Message 918920.  

But I think Matt, at least, tries to get basic chores like fetching data done in daylight: and as things stand, I predict a data outage before the weekend is out.

Remember Matt is even on vacation now...

Contribute to the Wiki!
ID: 918922
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 918925 - Posted: 18 Jul 2009, 0:16:02 UTC - in response to Message 918922.  

But I think Matt, at least, tries to get basic chores like fetching data done in daylight: and as things stand, I predict a data outage before the weekend is out.

Remember Matt is even on vacation now...

And so am I - for a few hours, at least: bedtime, UK time.

Thanks for a stimulating evening's conversation, everyone.
ID: 918925
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.