So you only back up once a week?


Prototype
Volunteer tester

Joined: 3 Apr 99
Posts: 67
Credit: 497,118
RAC: 0
United Kingdom
Message 15576 - Posted: 19 Aug 2004, 21:58:34 UTC

I know the amount of data involved is huge, but surely when you are entrusted with the hard work of others (and potentially valuable scientific data), then the data should be backed up more than once a week!?!

Shoestring budgets or whatever, it is simply not right that users' work relies on one backup done once a week.

--
Join SIT@HOME, you too can sit there watching that expensive PC sit there doing nothing. Marvel at the fact the room is 5 degrees cooler than when your PC was running SETI@HOME.

SIT@HOME -- The new project!
ID: 15576 · Report as offensive
Sir Ulli
Volunteer tester
Joined: 21 Oct 99
Posts: 2246
Credit: 6,136,250
RAC: 0
Germany
Message 15579 - Posted: 19 Aug 2004, 22:02:47 UTC

Some info on a good backup system:

HP Ultrium Backup systems





Greetings from Germany NRW
Ulli
ID: 15579 · Report as offensive
Angstrom

Joined: 20 Sep 99
Posts: 205
Credit: 10,131
RAC: 0
United Kingdom
Message 15605 - Posted: 19 Aug 2004, 22:53:54 UTC

I would be surprised if these clowns even back up once a week. Of course they come "back up" much more frequently.

Sick animals are normally put out of their misery. I wonder how long it will be before this excuse for a project is finally put to rest along with those responsible.

Neil
ID: 15605 · Report as offensive
Paul D. Buck
Volunteer tester
Joined: 19 Jul 00
Posts: 3898
Credit: 1,158,042
RAC: 0
United States
Message 15829 - Posted: 20 Aug 2004, 14:03:16 UTC - in response to Message 15579.  

> Some info on a good backup system:
>
> HP Ultrium Backup systems


So why don't you buy it for them?

Guys, you are assuming that there is an unlimited budget for the project and they can purchase what they want, when they want.

I am sure that they also back up more frequently than once a week. However, you also assume that the back-up is good. Again, unfortunately, there is only one way to test whether the back-up is good: restore it to another machine and compare the contents of the two machines to see whether they are identical. Of course, you have to "freeze" changes on the source machine until the test has been completed.

And, by the way, you would have had to stop changes on the source machine so that you could make the test. In my working life, I never saw a back-up of an operational, on-line relational database that was able to restore the machine to a known state, contrary to the assertions of the database and back-up vendors ...
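For anyone who wants to try the restore-and-compare test described above on ordinary files, here is a minimal sketch. It assumes the source tree is frozen while it runs, and the names `tree_digest` and `compare_trees` are made up for illustration; it is not anything the project itself uses.

```python
import hashlib
from pathlib import Path


def tree_digest(root):
    """Map each file's path (relative to root) to the SHA-256 of its contents."""
    root = Path(root)
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            key = path.relative_to(root).as_posix()
            digests[key] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def compare_trees(source_root, restored_root):
    """Return (missing, extra, changed) relative paths.

    missing: present in the source but absent from the restore
    extra:   present in the restore but absent from the source
    changed: present in both but with different contents
    Three empty lists mean the restore matches the (frozen) source.
    """
    source = tree_digest(source_root)
    restored = tree_digest(restored_root)
    missing = sorted(set(source) - set(restored))
    extra = sorted(set(restored) - set(source))
    changed = sorted(
        name for name in set(source) & set(restored) if source[name] != restored[name]
    )
    return missing, extra, changed
```

Note that this only compares file contents. It says nothing about whether a database sitting in those files is internally consistent, which is the harder problem.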



ID: 15829 · Report as offensive
STE\/E
Volunteer tester

Joined: 29 Mar 03
Posts: 1137
Credit: 5,334,063
RAC: 0
United States
Message 15839 - Posted: 20 Aug 2004, 14:53:15 UTC

SIT@HOME -- The new project!
===========

What is the URL for that Project...YourHome.com...???

ID: 15839 · Report as offensive
mbn

Joined: 13 Oct 99
Posts: 24
Credit: 31,546
RAC: 0
Denmark
Message 15840 - Posted: 20 Aug 2004, 14:56:53 UTC - in response to Message 15829.  

That's right. It's very likely that they made backups after the 11th, but those backups were useless, since they contained the errors from the RAID failure.
--------------
Ost, nej tak!
--------------
ID: 15840 · Report as offensive
Troy_ND
Joined: 23 Aug 02
Posts: 8
Credit: 1,879,844
RAC: 0
United States
Message 15843 - Posted: 20 Aug 2004, 15:09:31 UTC - in response to Message 15605.  
Last modified: 20 Aug 2004, 15:10:20 UTC

I'm coming in new to the Seti BOINC testing after using Seti@Home Classic for a few years, but when I finally stopped by these forums today I was thoroughly saddened by the maturity level of most of the posts here, especially the one I quoted below.

First off, most of the time these types of projects don't have a large budget, so they won't be able to buy completely top-of-the-line hardware.

Second, I'm sure that they do back up more than once a week, but unless you've gone through a true server hardware crash, please don't even try to sound like you know what you're talking about. RAID crashes can be the worst, since it is possible for the array to corrupt your data for a few days before the crash, thus making a restore from the night before impossible, especially with databases.

Third, if anyone thinks they have all the answers, then by all means donate some money to the project so that they have more resources. Plus, instead of just stating the obvious, provide constructive suggestions instead of outright flames and attacks.

Fourth, since the Seti BOINC project still seems to basically be in a beta phase, if you don't like possible downtime and lost data, then by all means run the Seti@Home Classic until they force people to switch over to the BOINC version once they have all the bugs worked out. It's the same as people beta testing online games: they moan and complain when there's a problem, when a beta WILL HAVE PROBLEMS, so deal with it or leave until it's out of beta.

Troy


> I would be surprised if these clowns even back up once a week. Of course they
> come "back up" much more frequently.
>
> Sick animals are normally put out of their misery. I wonder how long it will
> be before this excuse for a project is finally put to rest along with those
> responsible.
>
> Neil
>
>
ID: 15843 · Report as offensive
JMirman

Joined: 15 Nov 99
Posts: 7
Credit: 2,146,685
RAC: 0
United States
Message 15847 - Posted: 20 Aug 2004, 15:35:14 UTC

While I agree and appreciate most of Troy's points, I have to comment on one.

I sent an inquiry 3 months ago to SETI about making a donation. The inquiry was an e-mail to the ID identified for such inquiries and clearly stated I wanted to make a donation. I never received a response. I didn't follow up, and I expect the reasons would be somewhat obvious.

As frustrated as I've been, I've never complained. I have tried to donate.

.
ID: 15847 · Report as offensive
Ingleside
Volunteer developer

Joined: 4 Feb 03
Posts: 1546
Credit: 15,832,022
RAC: 13
Norway
Message 15849 - Posted: 20 Aug 2004, 15:40:19 UTC - in response to Message 15576.  

> I know the amount of data involved is huge but surely when your are entrusted
> with the hard work of others (and potentially valuable scientific data)then
> data should be backed up more than once a week!?!
>

There's little point doing a backup while they're copying from one system to another. ;) Remember, except for the forums, the system was down from 11 to 16 August, and a corruption was found on the 17th.
ID: 15849 · Report as offensive
Troy_ND
Joined: 23 Aug 02
Posts: 8
Credit: 1,879,844
RAC: 0
United States
Message 15854 - Posted: 20 Aug 2004, 16:04:13 UTC - in response to Message 15847.  

JMirman: There has been a donate link on the original Seti@Home site for a while. Here's the link: :)

http://setiathome.berkeley.edu/donor.html

Troy


> While I agree and appreciate most of Troy's points, I have to comment on one.
>
> I sent an inquiry 3 months ago to SETI about making a donation. The inquiry
> was an e-mail to the ID identified for such inquiries and clearly stated I
> wanted to make a donation. I never received a response. I didn't follow-up
> and I expect the reasons would be somewhat obvious.
>
> As frustrated as I've been, I've never complained. I have tried to donate.
>
> .
>
>
ID: 15854 · Report as offensive
Purdy

Joined: 3 Apr 99
Posts: 76
Credit: 42
RAC: 0
Bolivia
Message 15855 - Posted: 20 Aug 2004, 16:14:02 UTC - in response to Message 15829.  
Last modified: 6 Jan 2005, 18:03:51 UTC

We need better backup procedures.
ID: 15855 · Report as offensive
Troy_ND
Joined: 23 Aug 02
Posts: 8
Credit: 1,879,844
RAC: 0
United States
Message 15865 - Posted: 20 Aug 2004, 17:16:38 UTC - in response to Message 15855.  

Good call, Purdy! I'd just read your reply to Paul D. Buck.

At work, I make a daily backup of a couple of live DBs that are operational and online, and if I want I can restore down to the second before a crash as long as I have the most current transaction file backup. Worst case, I'd have to restore to the night before, and when the restore is finished the DB is fully operational and usable. Because we don't run 24 hours, I don't need to back up more than that, but if I needed to I could back up the transaction logs every minute so that the worst you'd lose is 1 minute of data.

Of course, this is with enterprise-level DB servers, like MS SQL, Oracle, etc. They are fully capable of an online backup that can then be restored to yield a fully operational DB, although I've done the same with "sub" enterprise-level databases like Pervasive.
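The idea behind restoring "down to the second" can be shown with a toy model: start from the last full backup and replay logged changes up to a chosen moment. This is only a sketch of the concept. The function name, the dict-as-database, and the `(timestamp, key, value)` log format are all invented for the example and do not reflect how MS SQL, Oracle, or Pervasive store their logs.

```python
import copy


def restore_to_point_in_time(full_backup, log_entries, stop_at):
    """Rebuild state from a full backup plus logged changes up to stop_at.

    full_backup: dict snapshot taken at some earlier time
    log_entries: iterable of (timestamp, key, value) in commit order;
                 a value of None means the key was deleted
    stop_at:     replay entries with timestamp <= stop_at, ignore the rest
    """
    state = copy.deepcopy(full_backup)  # never modify the backup itself
    for timestamp, key, value in log_entries:
        if timestamp > stop_at:
            break  # everything after the chosen moment is discarded
        if value is None:
            state.pop(key, None)
        else:
            state[key] = value
    return state


# Hypothetical data: a nightly snapshot and four logged changes.
nightly = {"alice": 100, "bob": 50}
log = [(1, "alice", 90), (2, "carol", 10), (3, "bob", None), (4, "alice", 0)]

# Restore to just before the change at timestamp 4.
recovered = restore_to_point_in_time(nightly, log, stop_at=3)
```

The more often the log is backed up, the later `stop_at` can be, which is why per-minute log backups bound the loss to about a minute.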
ID: 15865 · Report as offensive
Paul D. Buck
Volunteer tester
Joined: 19 Jul 00
Posts: 3898
Credit: 1,158,042
RAC: 0
United States
Message 15875 - Posted: 20 Aug 2004, 18:55:00 UTC - in response to Message 15855.  
Last modified: 20 Aug 2004, 19:12:57 UTC

> Paul D Buck you are a technical writer not a DBA. Please stop spreading
> disinformation. DBAs in the 'real world' are taking 'hot backups' and
> restoring RDBMS databases all the time. At this very moment we have all
> around the world thousands of thousands of 24/7 critical systems that are
> being backed up and restored very quickly without any problems. Every single
> DBMS from Oracle, Sybase, Informix . . . to DB2 is capable of providing this
> basic functionality. Your reasoning was acceptable at the beginning of the
> 80s, not now, 24 years later!

Actually, I am not a technical writer either. Before becoming disabled I was a systems engineer, my specialty being the logical design of relational database schemas, primarily on Oracle databases but also Microsoft's SQL Server. Though not a DBA, that is true, I sat right next to several ... And I will stick with my statement. Taking a drive snapshot for back-ups doesn't work because of changes in the transaction logs. Taking "live" shots through tools which were supposed to make restorable images failed almost every time we tried to do a restore.

I have seen the back-ups being made, just the way they were supposed to be done. Rarely did they create an image that could be restored correctly.

I was, however, not clear in one regard. The only reliable back-up mechanism that *I* saw was the export of the database after committing all transactions and taking the database off-line. All other techniques in my experience usually failed to make a reliable back-up. Check under "Your Mileage May Vary" for details.

> Anyway your arguments are irrelevant because BOINC has not been a 24/7 system
> (2h/1d? Max). The BOINC's ""DBA?"" could have taken several 'cold backups' per
> day. With more than 1 tape copy. Users have been off the system most of the
> time. There are no excuses whatsoever.

The difficulty is that many of those time periods that you have assumed could have been used to make a back-up were being used to move the database from one machine to another. While that is being done it is hard to make a copy of the database to back it up. Though you could make an argument that the database should have been taken off-line and a back-up made, with the restore going to the new hardware configuration.

An interesting thought. However, it also ignores the issue where you make a back-up but have no place but the original system to test the restore. And if the back-up is bad, well, you are still sunk. No client that I worked for could ever get the point that each and every back-up has to be tested for integrity and its ability to be restored. Which means you must have a "mirror" system that is used for those tests (it can also be used as a disaster recovery site, testing, and training system).

Though it is common sense, and the cost is minimal, I never saw a client implement this elementary safeguard. Oh, and while we are at it, should we address the life-cycle costs of tape vs. mirror images vs. DVD, etc.?

> I am assuming you do not work for Berkeley and you do not want to be included
> in the list of the "7 Magnificent Cheerleaders".

Include me if you wish. I have been in worse company ... And if you had been paying attention you might recall that I have at times taken UCB to task for issues with regard to design and implementation. As a non-UCB participant I have the luxury, as do you, of making criticism without the headache of actually having to make something work.

[edit]
For those interested, see this FAQ and note that they recommend using more than just hot back-ups.

Also see: this, and lastly, this.



ID: 15875 · Report as offensive
Purdy

Joined: 3 Apr 99
Posts: 76
Credit: 42
RAC: 0
Bolivia
Message 15909 - Posted: 20 Aug 2004, 21:52:24 UTC
Last modified: 6 Jan 2005, 18:02:57 UTC

That is a very good point!
ID: 15909 · Report as offensive
Paul D. Buck
Volunteer tester
Joined: 19 Jul 00
Posts: 3898
Credit: 1,158,042
RAC: 0
United States
Message 16004 - Posted: 21 Aug 2004, 4:21:02 UTC - in response to Message 15909.  

Purdy & Poppy,

> I repeat . . . At this very moment we have all around the world thousands of
> thousands of 24/7 critical systems that are on-line at all times. These
> databases are very rarely taken off line for the very reason that they are critical
> and need to be always available. These databases are backed up while they are
> up and running. If a disk failure occurs the database can be restored very
> quickly up to the very last minute. The ‘hot backup’ copy is restored to disk
> and then database redo logs are used to roll forward.

I agree that there are lots of 24/7 systems, and that they do back-ups, and the back-up systems report that the save set is good. But rarely are the back up save sets actually tested.

And how can you assert that the restore is, in fact, successful? All you have is the report of the back-up and restore program... Is the database really restored? Or is it in an apparent state of correct operation with the correct data even though it is (or may be) not, in fact, correct? From your assertion there are no problems and the database is successfully restored and rolled forward. But how do you know? And the answer is that you do not know for sure. You cannot. To ensure correctness you would have to have an identical system, with an uncorrupted image that you could compare against to guarantee correctness.
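To make the "compare against an identical system" idea concrete, here is a small sketch using SQLite as a stand-in, since it ships with Python and has an online backup call (`sqlite3.Connection.backup`). It is an assumption-laden illustration, not a claim about Oracle or about the project's setup: the `results` table is hypothetical, the helper names are invented, the source must not change between the backup and the comparison, and ordering by `rowid` only works for ordinary rowid tables.

```python
import hashlib
import sqlite3


def table_fingerprints(conn):
    """Return {table_name: SHA-256 over all rows, read in rowid order}."""
    names = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    prints = {}
    for name in names:
        digest = hashlib.sha256()
        # Table names come from sqlite_master, not from user input.
        for row in conn.execute(f'SELECT * FROM "{name}" ORDER BY rowid'):
            digest.update(repr(row).encode("utf-8"))
        prints[name] = digest.hexdigest()
    return prints


def backup_and_verify(source):
    """Copy a live connection into a second database and compare the two."""
    restored = sqlite3.connect(":memory:")
    source.backup(restored)  # SQLite's online backup API
    same = table_fingerprints(source) == table_fingerprints(restored)
    return restored, same


# Hypothetical source database with one small table.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE results (id INTEGER PRIMARY KEY, credit REAL)")
source.executemany("INSERT INTO results (credit) VALUES (?)", [(1.5,), (2.25,)])
source.commit()

restored, identical = backup_and_verify(source)
```

A matching fingerprint only shows the copy equals the source at that moment. If the source was already corrupted, the copy faithfully reproduces the corruption, which is the situation described earlier in the thread.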

Just as an aside, data quality and correctness are a significant problem that many of the database magazines, and books too, are now finally addressing. And the problem is that we are doubling the quantities of data, yet there is little ability to validate that the data is, in fact, correct.

> I do understand that you are not a DBA, but please think for a minute about
> what you are saying. You are claiming that the ‘hot backups’ of these
> thousands of systems are not reliable. I can assure you that the world would
> be in complete chaos if that would be true. Can you imagine banks saying
> “Sorry we lost the last 2 hours of data we had a RAID disk failure and we
> don’t really know how much money you have”. Airline systems etc.

Yeah, you got the point right away. These systems are not reliable. If they were, we would not see overbooking of airline seats (actually this problem is also the result of the predictive booking algorithm failing, sort of like the predictive dialing used by telemarketers). There would be no payroll errors and no tax problems, all of the mail that you get would be addressed correctly without errors in names and addresses, and we would not be able to vote for someone's dog.

I don't know about you, but I also read the papers and see many instances where the systems in place are not tested, validated, and continuously monitored. If we think back to the last minor interruption in New York City, there were significant issues reported because networks were not truly independent, with routes installed with enough spacing between them to eliminate single points of failure.

For those that can recall even further back, NYC also had a small incident in the World Trade Center that shut down many companies for a period that exceeded one week. As an oh-by-the-way, something like 90% of the companies that were down for that week were out of business by the end of the following year.

And there is at least one magazine now that is concerned with only disaster planning and recovery of IT systems.

> There is no need for long verbose answers or pointing to technical DBA
> documentation that is actually explaining how wrong you are (you should read
> the links you provide). On-line backups are very easy to set-up and are
> extremely reliable. Just today in our organisation we restored 18 databases
> using hot backups not because we had hardware failures, but because these are
> also used to create new database instances. In fact in 25 years as a DBA I
> cannot remember the last time we had a bad ‘hot backup’ that could not be used
> for recovering a database.

There is a need. You are making an unsupportable assertion and cannot prove your case. Mostly because right now I have you in the spot where you have to prove a negative. And I only pointed to those places for others that might have been interested in the background. You did not have to go there if you did not wish to. And I will point out that your answers are just as verbose ... :)

And how do you know I did not read them? The answer is that you do not. I would not have referred to them if I had not. Fundamentally you are seeing in them what you believe supports your case. As am I. However, I do have the stronger case because in none of them is there an assertion that reliable database backups can be made using any one plan or methodology.

The point is that Oracle Corp. is very specific in their recommendation that you do not rely on "hot" back-up of the database. They always have. Probably always will... Getting a reliable image is very difficult because you have to have the entire database in a known state and then capture that complete image on your back-up media. If any portion of that image is out of step with the rest of the components, well, you now have an inconsistent image of the database state.

On pages 11 and 12 of the PDF file, Oracle quite clearly states that to make a consistent back-up the database has to be off-line. No quibble, flatly stated fact. A 24/7 database can be backed up to an inconsistent back-up, and they do not come out and say that this is a good thing or even that this process will make a restorable image.

> I understand you have to keep a reputation in these forums, but there is
> nothing wrong in saying “Sorry I don’t really know. I am not a DBA and I have
> never been responsible for the integrity and recovery of critical database
> systems”.

This is the least of my concerns ... Though not responsible for the activity, I was present when the restores were attempted - and failed. And this is all that I said to start. I pointed to my experience. Yours obviously is different. Or at least you have a lot more faith in your systems than I have ever been able to muster about anyone's system (to include mine). If the systems were perfect we would not need service pack #1, much less #2.

And I am never ashamed to admit that I do not have knowledge about many things. C++ is a perfect case in point. When we were discussing the schema I had to wade through the C code (and I may never feel clean again for the rest of my life) and I am not sure I truly understood what they were trying to do.

> Finally DBAs are responsible for the physical design of the database.
> Database schemas are always physical there is no such thing as a “logical
> database schema”. System designers and analysts are responsible for the
> logical design (ER diagrams etc).

Thank you for agreeing with me. DBAs do the physical design, I did the logical design and also indicated that I was one of those systems engineers that did the logical design. For your other points:

In the logical design the emphasis is on the analysis of the business rules and how they are best implemented in a relational database design. Though the efficiency of the schema is one of the least concerns, it does play a part in this phase. As a logical designer my emphasis was to develop the database in such a way that the business rules were completely enforced through the structures of the database. These structures include tables, views, primary keys, foreign keys, 3rd normal forms (or higher), unique indexes, triggers, stored procedures, functions, transactions, security features (including roles and data access mechanisms), etc.
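As a rough illustration of business rules being enforced by the database structures themselves, here is a sketch in SQLite. The `volunteer` and `result` tables, their columns, and the "credit may not decrease" rule are all hypothetical, chosen only to show a primary key, a foreign key, a unique constraint, a check constraint, and a trigger working together; this is not the BOINC schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript(
    """
    CREATE TABLE volunteer (
        id    INTEGER PRIMARY KEY,
        email TEXT NOT NULL UNIQUE              -- rule: one account per address
    );
    CREATE TABLE result (
        id           INTEGER PRIMARY KEY,
        volunteer_id INTEGER NOT NULL REFERENCES volunteer(id),  -- rule: no orphans
        credit       REAL NOT NULL CHECK (credit >= 0)           -- rule: no negative credit
    );
    CREATE TRIGGER result_credit_never_drops
    BEFORE UPDATE OF credit ON result
    WHEN NEW.credit < OLD.credit
    BEGIN
        SELECT RAISE(ABORT, 'credit may not decrease');
    END;
    """
)


def rejected(sql, params=()):
    """Return True if the database itself refuses the statement."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql, params)
    except sqlite3.DatabaseError:
        return True
    return False


# Valid starting data.
conn.execute("INSERT INTO volunteer (id, email) VALUES (1, 'a@example.org')")
conn.execute("INSERT INTO result (id, volunteer_id, credit) VALUES (1, 1, 10.0)")
conn.commit()
```

With this in place, a call such as `rejected("UPDATE result SET credit = 5.0 WHERE id = 1")` is refused by the trigger no matter which application issued it, which is the point of putting the rules in the schema rather than in application code.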

If you look at virtually all of the relational database depiction schemes, they are all about the logical design. None of them have symbology to represent any indexes beyond the primary key. All other structures, including triggers, tablespaces, clusters, rollback segments, etc., are not part of the symbology. The "crows-foot" representation does not even give you a symbology to indicate cardinality beyond "one" (single line) or "many" (a 3 leg connector, hence, a "crows foot" with three "toes").

Structures normally not included in a logical design (using Oracle as a frame of reference) are indexes for query speed, tablespaces, roll-back segments, block sizes, clustering, index types (though this may be part of the logical schema; coordination between the logical designer and the physical DBA is important and the implementation may be done in the prototypes developed by the logical designer), RAID to include mirrors, stripes, physical design of the servers, etc.

In Designer (unless Oracle has changed its name again) there is a data modeling tool whose output is then converted into a physical design (though I never did get a physical design from that transition that would allow the construction of a database).



ID: 16004 · Report as offensive



 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.