8.3. Databases

There is one cardinal rule when designing backup solutions for databases: database administrators (DBAs) always want local copies of their database backups, so it is important to understand why. DBAs, as custodians of the databases they support, need to have total control of many aspects of the systems on which they run, backups included. Part of that responsibility is to ensure that the databases are restored as quickly as possible or as frequently as requested; the data needs to be restored "instantly". When DBAs talk about restoring data, they are speaking about having the data back in the database structure and available to end users, including all the steps required to reintegrate backups provided by the backup service back into the database in question. Frequently, DBAs only need to restore small portions of the database, not entire sections. They are restoring data as a result of mistaken entries in the data, not typically corruption of the database as a whole.

DBAs, especially ones who have more career experience, have had poor experiences with external backup software, whether in terms of the ability to restore small pieces of the database, performance impacts on the databases during backup, or the availability of backups from which to restore data. Given all these factors, many DBAs prefer to have native database tools perform local backups of the database(s) onto local disk and then have the backup tools back up the resulting file set. From the DBA perspective, if there are problems with the backup or resulting restore, the DBA only needs to engage the database vendor, not a group of people consisting of the backup administrator, backup vendor, and database vendor, to solve any problems.

While this is one strategy, it is important to understand that while it may seem efficient from the DBA perspective and solves the tactical issues at some level, as the backup architect, your view must be more strategic. Modern backup software does not impact the database any more than native tools do, and in many cases it simply uses the same API as the native tools for backup. Additionally, while backup software does not provide record-level recoverability of databases, both NetBackup and CommVault provide the ability to perform incremental and differential copies of databases, removing the need to restore entire databases.

Local backups to disk also consume large amounts of resources. The disk space alone will require at least as much space as is required for the database as a whole, and may actually require more, depending on the rotation schedule needed to keep active copies available during backup operations. Native tool backups also create compressed copies of the data in order to consume as few resources as possible. The compression techniques used are no different from those used by standard folder compression, and come with all the same CPU, memory, and I/O overhead. So, by running the native tools in a compressed mode, it is possible to actually introduce more load on the server than by using the backup software. In addition, the compression hinders the ability of the backup software to make copies of the native tools' backup. Both NetBackup and CommVault use the hardware compression on tape drives. However, having compressed source data defeats the hardware compression on the media, which actually slows down the media transfer rates and increases the amount of media consumed. If the target is a deduplicated target, then the ability to perform deduplication on compressed data is effectively turned off.

8.3.1. Database Log Backups

Local backups do have alternatives. Both Oracle and SQL Server maintain records of the transactions that have been committed to the database over a specific period of time. These records are called transaction logs. Oracle has two types of these logs, redo logs and archive logs, and SQL Server has transaction logs. These logs can be replayed against a database to bring the database back to the last transaction within the database. As will be shown later in the chapter, it is strongly recommended, and in some cases required, that these logs be backed up as part of a regular backup scheme.

If the database logs are backed up more frequently than the database, the logs can be used to restore the database to a point in time between normal backups. This effectively provides the equivalent of an incremental database backup, occurring between regular backup cycles. So instead of committing the disk space to a local backup, the database could be backed up daily by the backup software, with periodic backups of the transaction logs taken in between, which can then be replayed. This provides functionality roughly equivalent to the local backup plus the record-level restore capability enabled by a local backup methodology. Discuss the option with the DBA as an alternative to local backup.
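
To make this concrete, here is a minimal sketch using the native SQL Server commands driven from a shell; the database name (Sales), server name (DBSERVER), paths, and timestamp are all illustrative assumptions, and the log sweep would normally be scheduled rather than run by hand.

    # Periodic log sweep, run between the daily database backups:
    sqlcmd -S DBSERVER -Q "BACKUP LOG Sales TO DISK = N'D:\logdumps\Sales_log.trn'"

    # Point-in-time restore: recover the last full backup without recovery,
    # then replay the log up to just before the mistaken transaction.
    sqlcmd -S DBSERVER -Q "RESTORE DATABASE Sales FROM DISK = N'D:\dumps\Sales_full.bak' WITH NORECOVERY, REPLACE"
    sqlcmd -S DBSERVER -Q "RESTORE LOG Sales FROM DISK = N'D:\logdumps\Sales_log.trn' WITH STOPAT = '2011-06-01 12:30', RECOVERY"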

8.3.2. Locally Controlled Database Backups

If DBAs do not want or need local backup, many times they do desire a locally controlled backup. This type of backup is essentially a user-initiated backup and provides control of exactly when the backup occurs. While both NetBackup and CommVault provide scheduled backups, they are not deterministic schedulers, meaning that they only provide a window in which the backup may be executed, not an exact time of execution. This may not work for certain environments in which data processing occurs on the database during specific hours and backups cannot be run during them. Locally controlled backups may go to disk, followed by a backup software pass, or may directly execute the backup software. A locally controlled backup typically requires some level of scripting to integrate the native backup tools and the backup software into a single process. Since the process is controlled locally, the backup can inject a load into the backup system that may not be accounted for in the normal scheduling process. This load may actually delay the database backup further, because the backup job is waiting for resources that were not "scheduled" for use, or may impact the performance of other backup jobs by competing for resources that were originally allocated for use by scheduled backups.
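
A minimal sketch of such a two-step script, assuming a hypothetical database named Sales and a NetBackup policy/schedule pair (db_sales_policy/user_sched); all names and paths are illustrative, and the CommVault equivalent would use a user-initiated subclient backup instead of bpbackup.

    # Step 1: native dump to local disk at a time the DBA controls (e.g., cron).
    sqlcmd -S DBSERVER -Q "BACKUP DATABASE Sales TO DISK = N'D:\dumps\Sales_full.bak' WITH INIT"

    # Step 2: hand the resulting file set to the backup software as a
    # user-initiated backup (-w waits for the job to complete).
    bpbackup -p db_sales_policy -s user_sched -w "D:/dumps/Sales_full.bak"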

8.3.3. Pre- and Post-Scripts

The alternatives to locally scheduled database backups do not completely solve the problem. Both CommVault and NetBackup provide a means to execute pre- and post-backup scripts. Pre- and post-backup scripts are optional components that can perform tasks before (pre-) or after (post-) the execution of the backup. This ability of the backup software to start a script before the backup actually runs gives DBAs the ability to control the exact time that a backup will run through the use of a trigger. A trigger script is one that is executed and waits for a specific event, known as a trigger, before stopping. DBAs will use their own scripts to place the database in the appropriate state and issue the trigger. The pre-backup script will then stop and the backup will begin. The post-backup script then puts the database back into normal operation at the conclusion of the backup.
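
A trigger script can be as simple as a loop that waits for a flag file; a minimal sketch, with the flag path and timeout purely illustrative:

    #!/bin/sh
    # Pre-backup script: wait (up to 2 hours) for the DBA-side process to
    # quiesce the database and "pull the trigger" by creating a flag file.
    TRIGGER=/var/tmp/db_ready_for_backup

    i=0
    while [ ! -f "$TRIGGER" ]; do
        sleep 60
        i=$((i + 1))
        if [ "$i" -ge 120 ]; then
            echo "trigger never fired; aborting backup" >&2
            exit 1
        fi
    done

    rm -f "$TRIGGER"
    exit 0    # pre-script succeeds; the backup proper now begins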

In general, a trigger could be designed to fire in response to a number of items: time of day, completion of processing, and so on. The trigger script would be included in a pre- and post-backup process that would be executed well before the window in which the application is required to be backed up. While this solves the problem of a nondeterministic scheduler not executing a backup at a specific time, these types of scripts commit backup resources and idle them until the trigger is pulled. If the script fails, the resources can stay committed until the error is detected, many times well after the backup was supposed to have been completed. Additionally, these scripts have to be maintained: if there are changes to the backup software, database software, or both, the scripts have to be revalidated to ensure that the process and functionality remain after the software change, a process that can be time-consuming and fault-prone. When looking at local control of database backups, balance the ability to forecast resources for the backup against the need to have scheduling around critical business processes. If the business process requires non-backup time to complete its tasks, local control may be the answer. Simply plan around the resource constraint as best as possible and monitor the backups of both the database and other jobs around that time to ensure that there are no issues introduced by using this method.

8.3.4. Snapshot Backups

But what if you could have the best of both worlds: the ability to provide a local copy of the database, but also the ability to control the backup environment? Split-mirror and snapshot backups provide this type of functionality. A split-mirror backup is one where a mirror copy of the disks is created, most often at the disk array level, and made available for use independently of the original source disks. The split mirror is a full clone of every block, on every physical disk, that is used for the database and participated in the mirror split.

8.3.4.1. Snapshots

A snapshot backup is similar. Snapshots can be created at the operating system level, using products such as VSS or FlashSnap, or on the disk array with TimeFinder or NetApp Snapshot. Instead of creating a full clone copy of every block, the snapshot starts at a point in time, tracks the blocks that have changed by making a copy of the changed blocks onto a separate set of disks, and then references the unchanged plus the changed blocks to present a usable copy of the disks. The snapshot process, in most instances, uses a method called copy on write (CoW). A CoW snapshot, unless carefully managed and used for only a short period of time, can have a significant impact on performance. CoW intercepts all writes to a set of disks and makes copies of blocks that have changed since the start of the snapshot session. These blocks are then copied to a separate set of disks, while the original blocks are committed to their intended location on disk. The write is not acknowledged back to the host until both copies of the block have been fully written, introducing a second write and thus additional time to write. When data is read from the snapshot, the blocks that have changed during the snapshot session are read from the separate location to which they were written, but the unchanged blocks are read from the original location. If another process is attempting to read the same block at the same time as the process reading the snapshot, a deadlock occurs, and one process must wait, introducing latency to the other. While this may seem like an unlikely occurrence, in a relatively small number of blocks, with only a moderately active database, such deadlocks can occur frequently enough to slow the overall performance of the database.

8.3.4.2. Split Mirror

The split mirror/snapshot backup works by using a combination of operating system and/or array functionality that creates this point-in-time copy of all the files on the disks occupied by the database. This point-in-time copy can then be backed up independently of the database, at a time of the backup software's choosing, and without impacting the database as a whole. This copy could even be backed up by an alternative system that would completely offload the whole backup process from the database server, effectively making the backup look to the database like it was almost instantaneous, depending on the requirements of the database. The process looks something like this (a scripted sketch follows the list):

  1. The database is quiesced and/or placed into a backup mode. This mode may allow the continuation of processing during the backup.

  2. The snapshot is created from the quiesced database. This snapshot represents a full point-in-time copy of the database at the point at which the database was placed into the backup mode. The snapshot must include all portions of the database that allow it to run, such as transaction logs, trace logs, and the like.

  3. The database is then taken out of backup mode. This effectively ends the backup cycle for the database.

  4. The snapshot is then available for backup, offline processing (as another database), or restore purposes (record-level restores—just like a local copy). All this functionality can be local to the source server, or operated on a secondary server, set aside for such purposes (such as a Media Server or MediaAgent).
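
In script form, the four steps look something like the following sketch, in which db_ctl, snap_create, and backup_snapshot are hypothetical stand-ins for whatever database, snapshot, and backup tooling is actually in place:

    #!/bin/sh
    db_ctl begin-backup-mode                  # 1. quiesce / enter backup mode
    snap_create --volume /db_volume \
                --name nightly_snap           # 2. point-in-time snapshot, including logs
    db_ctl end-backup-mode                    # 3. resume normal operation
    backup_snapshot /snapshots/nightly_snap   # 4. back up, mount, or hold the snapshot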

However, as with all things backup, there are trade-offs. In using this type of backup method, there still can be a significant amount of disk required. The storage requirement can range from a full copy of the disk space, in the case of a split mirror, to the amount of change that occurs between backups, which is the requirement for a snapshot backup. The process described previously is not automatic; in order to function in a scheduled backup environment, it must be automated. This automation can be in the form of a set of scripts that implement the functionality or in additional agents that need to be installed to extend the backup software capabilities to include integration with the split mirror/snapshot provider. Additionally, if it is desirable to back up the mirror/snapshot on a server separate from the source, shared storage is required, and additional servers of the same OS type and level need to be in place to host the mirror/snapshot. This secondary server is known as a proxy server. Finally, if any database operations are desired to be performed on the proxy server, such as record-level restores, a copy of the database software must also be installed and maintained on the proxy server. Depending on the database vendor, this proxy server may need to be isolated from the production network, as the database may try to advertise itself as the primary database. This is most often the case when working with SQL Server, as SQL Server registers itself with Active Directory at start time.

The previous discussion talked about the need to put the database into a "backup mode" prior to making copies of the database. Prior to backing up, all databases need to make sure that pending transactions are committed to the database, which creates what is known as a consistent database. This is necessary to ensure that all data in the database is correctly stored and referenced by the internal database structures. If this process is not performed, the resulting database copy can be unusable. There are generally three methods of preparing a database for backup: shutting down the database completely ("cold"); placing the database into a log tracking-only state where new transactions are placed only into the transaction log and not committed to the database until the backup mode is exited ("hot"); or using a data arbitrator or API specifically designed around database backup ("programmatic"). Mirror/snapshot backups operate using either the cold or hot database backup methods; native backup tools use the programmatic method (sometimes in combination with either the hot or cold), and the backup software can use a combination of all three.

So what are the differences? A cold backup requires that access to the database be suspended and that all transactions that reside within the transaction logs be fully committed to the database. This usually requires the complete shutdown of the database software. Obviously, this represents downtime for the database, during which end users cannot perform any work. However, a cold backup is useful as a baseline copy of the database from which to work, and it is the easiest to back up: simply make a copy of the data files that are part of the database. Once the copy is made (which can be done via a mirror/snap, as well as a simple direct backup of the data), the database can be restarted and brought back online. This method is great for databases that do not have any off-hours activity or that only require periodic backups representing archive copies of structured data.

A hot backup, on the other hand, allows for access to the database during backup. During a hot backup, the database software is commanded to place the database into a specialized mode, commit any pending transactions, and capture all further transactions exclusively in the transaction logs until the database is removed from hot backup mode. This mode provides the ability for end users to continue to query, insert, and update records in the database, while maintaining consistency of the database files for backup. Table 8-1 shows the differences between cold and hot backups.

Table 8-1. Comparison of Cold and Hot Database Backups

Cold backup

  Advantages:

  • Provides a complete, guaranteed consistent copy of the database.
  • Easiest to execute.
  • Does not require any special configuration of the database to run.

  Disadvantages:

  • Database is offline during backup.
  • Largest backup; the backup is a full backup every time.
  • Only full database restores are possible without proxy servers.

Hot backup

  Advantages:

  • Database continues to function during backup.

  Disadvantages:

  • Performance of the database can degrade during backup.
  • Records are not committed during backup, so the database is not truly updated.
  • Transaction logs are of fixed size and can quickly fill during backups, causing database outages.

A hot backup sounds like the perfect solution; however, it comes at a price. Since all new transactions are placed into the transaction logs, and not committed to the database, during hot backup mode the database is in a state where two records of the same data can exist at the same time: one in the transaction logs and one in the database. This is reconciled after the backup mode is released, as the transaction logs are replayed against the database, making the records consistent again.

Additionally, the transaction logs are of fixed size and occupy only a small amount of fast disk space. (See Figure 8-4.) If the backup mode is maintained for too long, the transaction logs can run out of space and the database can crash, requiring DBAs to perform database recovery and possibly restoration. Therefore, it is important to keep a database in backup mode for only short periods of time, just long enough to get a backup copy. Enter mirror/snap backups. By combining the hot backup mode with a mirror/snap, backups of very large databases can be accomplished, at least from the database perspective, in a matter of minutes.

Figure 8-4. Size comparison between data and log files

The programmatic method of creating backups is used by software vendors and scripting authors to create customized methods of backup. Database vendor developers use these programmatic methods to selectively ensure that parts of the database are consistent during backup and that the individual records, or in some cases database blocks of data, are retrieved and backed up in a manner that will allow for selective or full restores. These methods can take the form of software APIs that allow for standard methods of low-level database access, or vendor-provided tools that provide access and backup outside of the traditional cold and hot backup methods. Third-party software products such as SQL LightSpeed use these methods to provide optimized backups outside of both the "official" SQL Server backup tools and NetBackup and CommVault, as will be discussed later in the chapter.

Of course, the reason why backup software providers have database agents is to provide an automated, controlled, and tracked method of backing up database data. Both NetBackup and CommVault have several database agents that cover a number of different databases, such as Oracle, SQL Server, and DB2, and even some smaller database types such as MySQL and Postgres. The two most commonly found databases will be reviewed here in detail: Oracle and SQL Server.

8.3.5. SQL Server

SQL Server has the advantage of running only on Windows platforms and is able to take advantage of several features of the OS that make backup easier. SQL Server has the ability to perform both hot and cold database backups using native tools, and it has several ways to allow for backup application integration.

8.3.5.1. SQL Server Terminology

Before we get into the specifics of how to back up SQL Server, let's talk about terminology. Like all other software, database software does not necessarily use a common lexicon to describe the different pieces that make up a database. SQL Server is no different. SQL Server is installed on a physical server and creates an instance of SQL Server. The instance is the highest level of granularity on a server. Within each instance, there may be one or more databases contained and managed. Each database consists of multiple filegroups, which are exactly what they sound like: groups of files that contain the data structures that are created within the database to store the data: tables, triggers, indices, and so on. (See Figure 8-5.)

Figure 8-5. SQL Server hierarchy

The simplest method that SQL Server uses for backup is the native dump. This provides a local, point-in-time copy of the database, and can either be a simple copy of the required files within the filegroups of the database, or a data-level backup, essentially an export of the data contained in the database into a SQL Server-specific backup format. When backing up databases within a SQL Server instance, the method of performing backups is guided by a "recovery model" that is specified during database creation. This model can be changed after the database is created, but the model guides both the method of backup and the ability to recover data. The recovery model can be one of the following (a short example follows the list):

  • SIMPLE—Transaction logs are automatically managed (truncated) during normal database operations. Since the transaction logs are not guaranteed to contain all transactions since the last backup, no backups of the transaction logs are taken. This means that the database can only be recovered to a specific point in time: the time of the last backup. This model can be used for either file-/filegroup-level backups or database-level backups.

  • FULL—Transaction logs are only truncated after a completed backup, and as such maintain all transactions that occur in the database between backups. Because of this, transaction log backups are taken along with the database. Can be used for either file-/filegroup-level backups or database-level backups. Allows for point-in-time recoveries of data at any point between backups by replaying transaction logs. This is typically the most common and recommended recovery model for SQL Server.

  • BULK_LOGGED—Only minimal transaction logs are created during normal operations. This model is designed specifically for databases that have data loaded into them all at once, not over time. Such databases are called bulk- or batch-loaded databases. This recovery model also requires that transaction logs are backed up.
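
Inspecting or changing the recovery model is a one-line operation; a sketch, assuming a hypothetical database named Sales on a server named DBSERVER:

    # Report the current recovery model, then switch the database to FULL.
    sqlcmd -S DBSERVER -Q "SELECT name, recovery_model_desc FROM sys.databases WHERE name = 'Sales'"
    sqlcmd -S DBSERVER -Q "ALTER DATABASE Sales SET RECOVERY FULL"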

SQL Server also allows for the concept of full or differential backups, both at the database and the file-/filegroup level. The "full" backup discussed here should not be confused with the FULL recovery model described previously. When full is discussed here, we are referring to the level at which the backup is taken (taking all the data and backing it up), not the method of maintaining transaction logs relative to the backup (the recovery model). A full backup at the database level contains all data that resides within the database. In a differential database backup, only the data that has changed since the last backup is put into the backup container. A full backup is required to have been taken prior to a differential, to act as a baseline. While this may seem like a trivial point, it is important to understand that SQL Server does not take the baseline full backup automatically, and it has no way to determine differential data without a full having previously been taken.
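
For example, a weekly full backup can act as the baseline, with nightly differentials taken against it (names and paths are illustrative assumptions); attempting the differential with no baseline full in place fails with an error:

    # Weekly: the baseline full backup.
    sqlcmd -S DBSERVER -Q "BACKUP DATABASE Sales TO DISK = N'D:\dumps\Sales_full.bak' WITH INIT"

    # Nightly: changed data only, relative to the last full backup.
    sqlcmd -S DBSERVER -Q "BACKUP DATABASE Sales TO DISK = N'D:\dumps\Sales_diff.bak' WITH DIFFERENTIAL, INIT"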

SQL Server also allows for full and differential backups of files or filegroups within a database. This can be useful as it creates a simple method of protecting databases simply by copying files. However, do not be fooled: restoring the database can be complex, because multiple files from multiple backups may be required to bring the database back to a consistent state. File-level backups are also larger than their database counterparts, as the file will contain unallocated database area in addition to the live data. This unallocated space within the database files, depending on your DBA's level of paranoia, can be more than 50 percent of the total size of the data files, effectively doubling the amount of data required to be backed up.

SQL Server can also take advantage of a Windows-specific operating system feature for backups: the Volume Shadow Copy Service (VSS). VSS provides a native operating system-level ability to create snapshots of file systems or volumes on a particular server and make these snapshots available for use by backup or other operations. The VSS snapshot is effectively a full copy of the database at the point in time of the backup, but it only stores the blocks that have changed since the initiation of the VSS session. This method is good for full backups under the SIMPLE or FULL recovery models.

SQL native tools are good for providing local backup to disk, but both NetBackup and CommVault provide management of backups over time and can utilize different methods of backing up the data within the database to make recoveries more straightforward. NetBackup and CommVault both use another feature of SQL Server, called the Virtual Device Interface (VDI), to accomplish database-level backups. VDI is a command and data interface to the databases that allows NetBackup and CommVault to request specific streams of data to be generated and sent through a "device," which is read by the backup software and redirected to the backup media. While this backup method does not provide the ability to track specific rows/records, it does allow data to be played back to a particular point in time. CommVault and NetBackup also allow for file-based backups using the VDI interface: the interface places the database into a quiesced state, which allows for the backup of all files within the database, or just specific files that have changed since the last backup.

NetBackup and CommVault also have the ability to use VSS as a backup method. While this method is not as granular as the VDI method, it can be a faster method of backup from the perspective of the database. During VDI-initiated backups from both NetBackup and CommVault, load is placed on the database, as both pieces of backup software are essentially executing database-level operations while regular operations are occurring, thus competing for resources along with regular queries. VSS backups, however, work at the operating system level to create a snapshot copy of the database. The only time the database sees any interaction is when the VSS service quiesces the database long enough to create the snapshot, an operation that is measured in minutes or seconds, not hours. From there, the backup software can take a backup of the database directly from the VSS snapshot without any further impact, other than OS-level impact, on database operations. While VSS snapshots do not have the ability to play back data as VDI-based backups do, they can be relatively small and taken several times a day, thus providing an ability that is analogous to log replays. Simply apply the VSS snapshot to the database, and the database is restored to the point in time at which the snapshot was originally taken.

In addition to the SQL Server native and NetBackup/CommVault-based backups, there also exists backup software that interacts directly with SQL Server and creates backups that allow row/record-level recoveries from backups stored locally on disk. The most widely used software for this type of backup is SQL LightSpeed by Quest Software. LightSpeed provides SQL Server DBAs the ability to do what they want: have locally controlled "backups" that quickly restore rows/records to a specific database without requiring external assistance, obviating the need to have a second copy of the database software running to which to restore a native backup. LightSpeed is a powerful tool, but when it is used for creating backups that will be protected by either NetBackup or CommVault, there are some issues to be aware of.

8.3.5.2. SQL Lightspeed

Since LightSpeed does not interact directly with either backup software, NetBackup and CommVault will essentially treat the backup of a LightSpeed backup as a full backup, regardless of what LightSpeed has generated. This is because, to NetBackup or CommVault, the new backup generated by LightSpeed is simply a collection of files; there is nothing that, from a backup perspective, differentiates LightSpeed backups from any other type of file. This means that the amount of data actually backed up by the backup software can be much greater than is really required from the database.

Secondly, there is an inherent assumption that the LightSpeed backup actually backed up data that was different from the previous backup. Said another way, there is no way for either NetBackup or CommVault to track what data was actually protected within a LightSpeed backup. This has two implications:

  • Backup levels: Every backup of a LightSpeed database dump is by definition a full backup. To both NetBackup and CommVault, all files in a LightSpeed backup are new, so all files need to be protected.

  • Restore levels: There is no real way for either piece of backup software to determine which backup contains the necessary LightSpeed backup when restoring to a point in time, without intervention/documentation from the DBA(s) performing the LightSpeed backup.

This makes the data protection method as a whole dependent on the ability of the DBAs to manually maintain a record, either within LightSpeed or externally, of which backup corresponds to which point of recovery.

Finally, LightSpeed by default creates compressed copies of SQL Server backups. While this is very efficient for storing backups on local disk, it can play havoc with backups to media via either NetBackup or CommVault. Both pieces of backup software, when using tape media, use the compression that is native to the tape drive. When compressed data is backed up to a tape drive that is using compression, the drive does two things: it slows down, as it is attempting to recompress already compressed data, and it stores two to three times less data, since no further compression is possible on the data set. Under normal file system backups, this is not significant, because in most cases the amount of compressed data as a percentage of the total is not great. However, with a LightSpeed backup set, 100 percent of the data is compressed, making for media consumption that is greatly expanded from what would normally be required. This issue is even more pronounced when using deduplication: compression can cause 10-20 times more storage to be required, along with performance degradation, compared to uncompressed data.

When deciding which backup method(s) to use, discuss with the DBAs for the database their assessment of how often the data would need to be recovered, and to what level of granularity. Generally, a combination of approaches will cover the requirements of the backup administrator to provide detailed, long-term storage of data for protection, and those of the DBA to provide granular, short-term operational backups of data available for immediate restore. If the DBAs are using LightSpeed or a similar product, work with them to understand the reasons for its use. If the DBAs are simply using LightSpeed as a pure backup (they do not require the ability to quickly do record-level restores), work with them to implement a backup based on the backup software modules.

If a local backup is required due to operational or business concerns, then a separate backup, using either VDI or VSS, should also be made daily, independent of the LightSpeed backup. Why not just back up the LightSpeed output? There are several reasons. First, as was discussed previously, since the backup software does not interact with the local backup, there is no way to determine the point in time at which the backup was taken. While this may not seem like an issue at the point of backup, it can become critical at the point of recovery, particularly when recovering offsite.

Take the following scenario: backups are taken of the local backup by the backup software. An offsite recovery is required to allow for auditing of the database at a particular period of time in the past, which lands between backup cycles. There are two LightSpeed backups that have been tracked by the backup software, which bracket the point in time required for the audit. Which one is required for the restore? The only way to tell is to restore both, restore the databases from the resulting LightSpeed backups, and make the determination at that point. If a NetBackup or CommVault backup of the database had been taken in addition to the LightSpeed backup, determining the point of backup and restoring the database to satisfy the audit would be a single restore process, with the ability to determine exactly which backup version is required.

If the backup software modules from NetBackup or CommVault are acceptable to the DBAs, determine their true needs. If the needs require a granular level of restore throughout the day, perform VSS backups of the transaction logs with daily VDI-level backups of the database, with the FULL recovery model having been implemented for the database. This will enable the DBAs to recover the database to a particular point in time, while having the continual record of transactions available for recovery on backup media. This will require moving data to the backup media throughout the day; however, since only the transaction logs will be moving, the amount of data actually backed up will only represent the amount of change since the last log backup. The daily VDI backup will roll up all the changes during the business day into a single backup. The VDI backup would be implemented on a standard full/differential schedule, as determined by the retention requirements of the business.
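
Expressed as a schedule, the pattern might look like the following sketch, written in cron notation purely for illustration (in practice these would be backup-software schedules, and run_vdi_database_backup is a hypothetical wrapper around the agent's database-level backup):

    # Hourly: sweep the transaction log (a native log backup is shown here as
    # a stand-in for whichever log protection method is chosen).
    0 * * * *  sqlcmd -S DBSERVER -Q "BACKUP LOG Sales TO DISK = N'D:\logdumps\Sales_log.trn'"

    # Nightly: database-level (VDI) backup rolling up the day's changes.
    0 2 * * *  run_vdi_database_backup Sales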

If a lower level of granularity can be tolerated, either a VDI- or VSS-based backup can be taken of the database. Either the FULL or SIMPLE recovery model can be utilized, depending on how much granularity is required during restores. If the FULL recovery model is used, a VDI backup should be used. While this places more load on the database during backup, it allows for a smaller amount of data to be moved during a differential backup, because only the changes within the database, not the whole database file, are backed up. However, if it is acceptable to take only a daily snapshot of the database, or if the SIMPLE recovery model has been implemented, a VSS-based backup of the database at the file-/filegroup level should be considered. This will still allow for a full/differential backup scheme, but since the SIMPLE recovery model does not allow for the capture of transaction logs, there is no reason to gather database blocks directly; the file or filegroup has already captured all the data needed for a restore.

Native tool backups should be avoided unless there is a very specific operational need. The SQL native tools provide no benefit over the backup software tools, and the resulting backups can consume large amounts of disk space and system resources. The recovery method for native tool backups, while appearing to be simpler than the backup software-based approach, can actually require more steps than using the backup software client for SQL Server. In addition, native tool backups also suffer from the same nondeterministic identity problem described previously for LightSpeed backups. If there is an absolute requirement for local backups, a backup software-based backup should also be taken to mitigate the issue.

8.3.6. Oracle

Where SQL Server operates on a single platform, Windows, Oracle operates on Windows and multiple flavors of *NIX operating systems. While this may seem to present an issue in crafting a backup methodology for Oracle, there is a good set of common tools, on both the native Oracle backup and the NetBackup/CommVault sides, to make backups across platforms look very similar.

8.3.6.1. Oracle Terminology

Oracle, like SQL Server, has its own set of terms to describe the various pieces of the implementation. Oracle databases are implemented on physical servers, with instances of Oracle representing the global memory structures and in-memory elements required to run a database. The data within an Oracle database is placed in containers on disk called tablespaces. A tablespace can consist of one or more data files that hold the data content. Oracle also maintains transaction logs, but there are two types that can be generated, depending on the mode in which the database is currently running. In standard, or non-archivelog, mode, the Oracle database generates only REDO logs. These logs are used to replay transactions that have already occurred, hence the name. In function they are very similar to the transaction logs in SQL Server. Oracle provides a second mode in which the database can operate: ARCHIVELOG mode. In ARCHIVELOG mode, Oracle generates two types of transaction logs: REDO logs and ARCHIVE logs. While the function of the REDO logs remains the same, the ARCHIVE log is used to capture transactions before they are committed to the database. This allows not only a roll back of the database to a point in time but also a roll forward to a point in time after the application of the REDO logs. This is a very powerful tool and is required for hot backups, as you will see.
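
Switching a database into ARCHIVELOG mode requires a brief outage; a minimal sketch, run as a privileged user (a sketch only, worth verifying against the Oracle release in use):

    sqlplus -s / as sysdba <<'EOF'
    SHUTDOWN IMMEDIATE;
    STARTUP MOUNT;
    ALTER DATABASE ARCHIVELOG;
    ALTER DATABASE OPEN;
    -- confirm the new mode:
    ARCHIVE LOG LIST;
    EOF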

There are several other types of files that are extremely important to Oracle backups: control files and configuration files. Oracle control files are binary files that give the various software binaries critical information about how to start and maintain state within the database, and that store the current state of the database in a static area. There are usually two copies of the control file created: a primary and a backup copy. If the primary copy of the control file is corrupted or missing, the backup can be substituted. Without a control file, the database cannot be started, or mounted in Oracle terminology.

NOTE

It is exceedingly important to back up the control files along with the backup of any Oracle database.

Oracle also maintains configuration files that provide information to the binaries regarding options that can be utilized during binary startup, and that maintain information regarding licensing of the options for the database being maintained. The primary configuration file within a database is the init.ora file. This file contains information regarding the database name, the location of the primary and backup control files, and other configuration information that is critical to identifying the database to the instance. Additionally, a secondary file, sqlnet.ora, may be present. It defines connection parameters for the database. The sqlnet.ora file contains information regarding the network interface on which to listen for connections, the physical server name and DNS domain, the external name of the databases, and the security method to be used on the databases specified. While not as important as the control files, the init.ora and sqlnet.ora files should always be backed up as part of any backup created.
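
A sketch of capturing these files alongside a backup; the target directory and the init.ora/sqlnet.ora locations under $ORACLE_HOME are illustrative assumptions, since both vary by installation:

    #!/bin/sh
    sqlplus -s / as sysdba <<'EOF'
    ALTER DATABASE BACKUP CONTROLFILE TO '/backup/control.bkp' REUSE;
    -- also write a human-readable re-create script to the trace directory:
    ALTER DATABASE BACKUP CONTROLFILE TO TRACE;
    EOF
    cp "$ORACLE_HOME/dbs/init.ora" "$ORACLE_HOME/network/admin/sqlnet.ora" /backup/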

Oracle has two basic methods of preparing the database for backup and several methods for local backup. When using a manual or external method of backing up the Oracle database, with a backup of the data files as the way to back up the database, the database must be quiesced, just as in the SQL Server process. Oracle can be quiesced by shutting down the database, but this imposes the same issues on Oracle as it does on SQL Server—namely that the database is unavailable for use by end users.

8.3.6.2. Oracle Hot Backup

Oracle can also be placed into hot backup mode, or online backup mode, depending on the documentation set being referenced. In this mode, the ARCHIVE logs act as the primary target for transactions while the database is being backed up. In Oracle version 10g or greater, the entire database can be placed into hot backup mode with a single command; in any version, each individual tablespace, which consists of one or more data files, can discretely be placed in hot backup mode and backed up independently. Why is placing the entire database into hot backup mode an issue?

During the period of time that the database is in hot backup mode, the ARCHIVE logs are receiving the transactions, and the REDO logs are receiving the complete applied transaction, not just the pointers to the changes, as normally happens. Since the ARCHIVE and REDO logs are simply circular logs located on disk, they do not have the performance that the database as a whole has. So the overall performance of the database is greatly reduced during the period of time that the database is in hot backup mode. Oracle partially solved this problem by allowing individual tablespaces to be placed in hot backup mode. This means that only a select group of files, representing the tablespace being backed up, is impacted, thus allowing overall better performance during backup of the database, although possibly at the expense of the speed of the backup.
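
A sketch of a single-tablespace hot backup, assuming a tablespace named users and illustrative paths, so that only that tablespace's data files ride in backup mode:

    #!/bin/sh
    sqlplus -s / as sysdba <<'EOF'
    ALTER TABLESPACE users BEGIN BACKUP;
    EOF

    cp /oradata/users01.dbf /backup/     # copy only this tablespace's data files

    sqlplus -s / as sysdba <<'EOF'
    ALTER TABLESPACE users END BACKUP;
    -- flush the covering archive log:
    ALTER SYSTEM ARCHIVE LOG CURRENT;
    EOF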

The traditional Oracle hot and cold backups also provided a mechanism to perform quick snapshot backups. The ability to create a snapshot, after placing Oracle into either mode, greatly sped the backup process from the Oracle perspective, and also provided an instantaneous copy of the Oracle database that could be mounted, restarted, and used for other purposes. The snapshot could be VSS-based, in the case of Windows implementations of Oracle, or could simply be an array-based snapshot or clone, the functionality of which would apply across operating system types.

NOTE

Using array-based snapshots or clones for Oracle database backups requires scripting to integrate both Oracle and the backup application so that the snapshots can coordinate with the state of the database.

This also somewhat fixed the issue of the impact of placing all or part of the database into hot backup mode, by shortening the time the database must remain in hot backup mode to the time required to make the snapshot copy—typically measured in minutes, not hours. This method also had the limitation of requiring complex scripting to coordinate the change of database mode with the creation of the snapshot and then reverse the change of mode back to standard operation.

8.3.6.3. RMAN Backups

The hot and cold backup modes for Oracle have the limitation of only allowing data file-level backups of the database, especially when using snapshots for the database copies, which effectively look like a full backup to the backup software. While the traditional methods are still in use, Oracle introduced a native method that allows for a robust means to perform full and differential backups of an Oracle database: Recovery Manager (RMAN). RMAN is essentially an internal mechanism to track Oracle backups of the data within a particular database. RMAN has two methods of tracking this information: locally, maintaining backup metadata within the database control file, or globally, within a Recovery Catalog—essentially a separate Oracle database specifically set up for RMAN backup metadata. RMAN can perform online and offline (cold) backups of the database being monitored, and it is the primary method of integration for both the NetBackup and CommVault Oracle backup agents.

NOTE

RMAN is required for agent-based backups, such as those provided by NetBackup and CommVault. The Recovery Catalog is required for RMAN and must be set up by the DBA prior to executing backups using the backup agent. For more detailed information regarding implementing RMAN, consult RMAN Recipes for Oracle Database 11g, by Sam R. Alapati, Darl Kuhn, and Arup Nanda (Apress, 2007).

RMAN provides useful features to the backup software that are not found with the traditional hot/cold file-level backup. Since RMAN maintains state within the control file, differential backups of Oracle data, not data files, can be created, significantly reducing the amount of data required to be backed up and thus the time required for completion. This is similar in function to the VDI-based backups that can be used for SQL Server: a mechanism to track the individual database blocks and only back up those blocks that have changed. If the global Recovery Catalog is used, this metainformation can be maintained for longer periods of time and can cover multiple databases across multiple servers, providing the ability to manage an entire enterprise and maintain multiple levels of RMAN recovery points, all from a single location.

RMAN can be operated from several points. First is the Database Control graphical user interface (GUI). This provides an interface that is very similar in functionality to standard backup software, except that it is specific to Oracle. Second, RMAN provides a command-line interface (CLI) to allow for command-line execution of backups, along with all configuration features. Last, and most important for this discussion, is RMAN's ability to run as a scripted utility. RMAN scripts form the basis for local Oracle backups, as well as for backups controlled by both NetBackup and CommVault. The RMAN script contains all the configuration and command information necessary to execute the backup. By scripting RMAN, it is possible to consistently back up a database multiple times, which again provides a repeatable mechanism that can be used by backup software. It is by using these scripts that both NetBackup and CommVault can perform automated Oracle backups.
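
The shape of such a script is worth seeing; a minimal sketch of a scripted RMAN backup of the kind the agents invoke, where the SBT_TAPE channel assumes a vendor media-management library (such as the NetBackup or CommVault one) has been linked in:

    rman target / <<'EOF'
    RUN {
      ALLOCATE CHANNEL ch1 DEVICE TYPE 'SBT_TAPE';
      BACKUP DATABASE PLUS ARCHIVELOG;
      BACKUP CURRENT CONTROLFILE;
      RELEASE CHANNEL ch1;
    }
    EOF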

One of the more interesting features provided by RMAN is the ability to natively multiplex Oracle database backups through the use of RMAN channels. An RMAN channel is a data pipe through which the backup data flows to be written to the defined backup media.

NOTE

Just like the maximum number of streams allowed from either a NetBackup or CommVault file system client, the RMAN channel specification must be tuned to the capabilities of the database from which the backup is taken.

For instance, if you have a database with 10 data files, you may want to open 10 channels, one for each data file. However, there may be performance limitations: the system may have I/O limitations, or the number of media targets available for backup may be fewer than the number of channels created. In order to throttle the number of channels backing up at the same time, RMAN provides the PARALLELISM parameter within the RMAN script to artificially limit the number of channels that can be operated in parallel.
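
For example, rather than allocating ten channels by hand, a configured parallelism can cap how many run at once (the value 4 is purely illustrative, and the incremental backup assumes a prior level 0 baseline exists):

    rman target / <<'EOF'
    CONFIGURE DEVICE TYPE 'SBT_TAPE' PARALLELISM 4;
    # at most 4 channels will now run concurrently, however many are implied:
    RUN { BACKUP INCREMENTAL LEVEL 1 DATABASE; }
    EOF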

RMAN Interfaces to NetBackup and CommVault

RMAN provides the interface for both NetBackup and CommVault through the installation of special binaries, called libraries, into the Oracle software. These libraries provide the software "pipe" through which the output of RMAN backups flows to the backup software for tracking and storage onto the backup media. In default configurations, Oracle DBAs create RMAN scripts and run them using either the Oracle default library, to create a local, disk-based backup, or the provided software libraries, to seamlessly provide backups to the backup media. Here is where the differences between the NetBackup and CommVault implementations are seen. NetBackup requires the manual creation of RMAN scripts by the DBAs, which are then called by the Oracle agent to be executed at the time of backup. CommVault, on the other hand, generates RMAN scripts based on the parameters specified during subclient configuration. The RMAN script is generated at backup time and executed using the native RMAN tools.

There are advantages and disadvantages to each approach. The NetBackup approach allows for a great deal of customization and tuning, using the large number of configurable parameters within RMAN to optimize the backup stream or to provide a method of quickly altering how the RMAN backups are completed, simply by changing the RMAN script being used. The CommVault strategy takes the opposite approach. When a subclient is created within CommVault, the backup software already knows what common parameters are changed within the RMAN script and allows for an easy method of adjusting the parameters without physically logging into the client and manually adjusting the script. CommVault also allows for a script "preview," where the backup administrator or DBA can take the generated copy of the RMAN script and make changes.

This feature of CommVault brings up some interesting side notes:

  • If the RMAN script is essentially running the backup, with only minimal involvement from the backup software, what would prevent the script from being run at will from the database server?

  • If RMAN is so powerful, why use backup software at all to capture the backup output?

To the first point, there is absolutely nothing that inherently prevents the RMAN script used by either NetBackup or CommVault from being executed manually, except any exclusion windows for backup that have been created on the Master Server/CommServe that restrict the times in which backups can run. In fact, this feature of RMAN is what is used to run the RMAN scripts for CommVault that have been manually altered in the preview process. The power of RMAN is to provide a very flexible method of performing backups of Oracle databases, run either from the backup software or from the client as a user-initiated backup.

But why run backup software at all? RMAN has a scheduling utility and a method of tracking the backup status of the various databases under its control, and it is able to restore quite well. The answer goes back to a subject that was discussed in the SQL Server section above: the ability to track backups at an enterprise level, not just at a tactical level. While RMAN does have all these abilities, the backup administrator must be concerned with protection of data in the entirety of the enterprise, not just within the specific set of clients that represent the Oracle databases. The backup software provides a means by which backups can be tracked and recovered external to that provided strictly within the purview of the DBA. Using backup software to perform the Oracle backups also provides the ability to restore servers that may be related to the particular Oracle databases but that are not backed up by RMAN. Just as with SQL Server, a variety of approaches that depend on the needs of the DBA(s), including RMAN scheduled backups, is entirely appropriate.

Again, just as with SQL Server, it is important to assess the needs of the particular database and make some determinations regarding the criticality of the data, the requirement for granularity in restore, and the required RPO/RTO (recovery point objective/recovery time objective). Fully automated RMAN backups to local disk may be a good solution for DBAs who need to very quickly restore data to a point in time, without waiting for backup media. After this type of backup is taken, it is possible to perform a backup software backup of the resulting RMAN backup. However, if this second-pass method is used, the same issues remain regarding the ability to track the backup and its status as it relates to other issues in the environment.

RMAN provides a solution to the problem. Instead of having RMAN define the backup target and provide the interface, create a local backup disk target within the backup software (essentially making the database server its own media writer) and allow user-directed backups to the disk target. Then duplicate copies of a now backup-software-tracked backup can be made by using either Vault/bpduplicate (in the case of NetBackup) or an AUX copy process (in the case of CommVault). Very short-term retention periods can be set using this method, again working with the DBA(s) to determine how many copies they are required to keep, which allows the backup software to automatically maintain the backup space, offloading that task from the DBAs as well.

During a restore, the backup media is always available (it is the local disk). If not, RMAN will seamlessly request the backup from the backup software, which will provide the appropriate media stream if it is currently available or throw an alert to the backup administrator to retrieve the media and make it available for recovery.

However, this is an expensive process: there are additional licenses required for the media writer implementation, along with the disk storage required to physically store the backups that are generated by the process. So this method should only be used for critical systems that really require very tight restore timelines.

Sounds great! So, why can't this process be used for SQL Server? While it could in theory, the integration between SQL Server and either NetBackup or CommVault is not as seamless as with RMAN: a restore call from the native SQL utility only recognizes backups generated by SQL Server, and the restore is not seamlessly passed to the backup software for restoration, regardless of location. The RMAN interface allows for this seamless operation through the use of the software "library"-level integration—that pipe that moves the data from RMAN to the backup software for tracking, storage, and retention maintenance.

The alternative is to simply have RMAN back up to the standard backup media, as defined within the backup software. While the backup is not "local," it is still fully accessible via the RMAN recovery methods; it is simply located on media that is not on the same physical system. There are several issues that DBA(s) cite when presented with this method:

  • The performance of the restore will be slow. It may be slower than a local restore, depending on a number of factors on the Media Server/MediaAgent side and the network connection used to perform the backup/restore of the database. If the initial backup is stored on a Disk Storage Unit (DSU)/magnetic library (MagLib) and duplicate copies are made for longer-term storage, then the overall performance will only be gated by the network.

  • The backup will not be accessible because the tape will be inaccessible for some reason. One way to avoid this argument is to use disk storage as the backup target instead of tape and then make a copy of the backup image to external storage. Since the backup will remain on storage that is directly attached to the media writer, the backup will be accessible for use at any time. As a side benefit, potentially more RMAN backups can be accommodated by more effectively using the total storage allocated for RMAN backups and centralizing it on one or two Media Servers/MediaAgents.

  • Backups/recoveries cannot be done at a specific time due to batch processing, database maintenance, and so on. Since the backups run to disk and are then duplicated to longer-term media as a secondary step, the backups can essentially be scheduled at almost any time. However, this does present challenges for the backup administrator in terms of resource allocation and network congestion.

Using standard media on a "remote" Media Server/MediaAgent provides the ability to fully manage the complete backup process without sacrificing DBA direct access to the backups for operational purposes.

An alternative to using RMAN backups is to create a snapshot copy of the database. This requires that the database and all related files, including control files and configuration files, reside on disk that has the ability to have snapshots. These snapshots can be array-based snapshots, volume manager snapshots (through the use of software on the operating system), or (in the case of Windows) VSS snapshots. The procedure for taking backup copies based on snapshots is as follows (a scripted sketch follows the list):

  • Quiesce the database, either hot or cold.

  • Create the snapshot.

  • Restart the database or take it out of hot backup mode.

  • Mount the snapshot database.

  • Perform the backup.
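
In shell form, the sequence above for a hot snapshot copy might look like this sketch; snap_create and snap_mount are hypothetical stand-ins for the array, volume manager, or VSS tooling actually in use, and ALTER DATABASE BEGIN/END BACKUP assumes Oracle 10g or later (earlier releases would loop over tablespaces):

    #!/bin/sh
    sqlplus -s / as sysdba <<'EOF'
    ALTER DATABASE BEGIN BACKUP;
    EOF

    snap_create --volumes /oradata,/oralogs --name ora_snap   # snapshot all database volumes

    sqlplus -s / as sysdba <<'EOF'
    ALTER DATABASE END BACKUP;
    ALTER SYSTEM ARCHIVE LOG CURRENT;
    EOF

    snap_mount ora_snap /mnt/ora_snap    # present the copy for backup or reuse
    bpbackup -p ora_snap_policy -s user_sched -w /mnt/ora_snap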

The advantage of doing Oracle database backups in this way is that a complete copy of the database is made during the backup process. This copy can be used for reporting, testing, and development; extraction for data warehousing; and just about any other purpose that requires a replica database. The disadvantage is, of course, the requirement for a relatively large amount of storage to perform the snapshot: either the amount needed for the change when using snaps or the complete size of the database when using a clone.

Oracle provides a great deal of flexibility when it comes to protection of the data within the database. Through the use of RMAN, Oracle can meet both the needs of the DBA for operational protection and the needs of the backup administrator for strategic protection of the data over time.