The Oracle Recovery Catalog (henceforth, the catalog) is used by Oracle’s RMAN backup utility to store backup metadata, and Oracle recommends it as a best practice for backup and recovery.  In this article, I will discuss the common dilemma of maintaining the catalog and the case for virtualizing it.

First off, this is not to be confused with the Virtual Private Catalog feature, which uses a single catalog but gives each user the impression of having their own catalog.  That is the same idea as Virtual Private Database.

The Recovery Catalog vs. the Controlfiles

The Recovery Catalog is a vital piece of database recovery and should be included in every company’s toolset.  If you don’t use the catalog but choose to use controlfiles instead, here are a few things you can expect:

  1. RMAN retention becomes constrained by the control_file_record_keep_time parameter.
  2. Controlfiles are frequently accessed by Oracle.  They’re inherently not a good place to store anything else.
  3. Controlfiles only store the metadata for the current incarnation.  If you do an incomplete recovery and resetlogs, you cannot restore to the previous incarnation unless you restore the controlfile.
  4. You won’t be able to store RMAN scripts and share them across databases.
  5. Backup metadata ages out of the controlfile once it falls outside the period enforced by the control_file_record_keep_time parameter, so you cannot retain backups for the long term.

These are just a few of the limitations to be aware of if you choose to use the controlfiles instead of the catalog.
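To make the retention constraint in items 1 and 5 concrete, here is a minimal sketch in Python.  It assumes the documented default of 7 days and the documented maximum of 365 days for control_file_record_keep_time; the function name is my own, and the check is a simplification, since the parameter is a minimum age before records may be reused rather than a hard expiry.

```python
# Documented bounds for control_file_record_keep_time (in days).
DEFAULT_KEEP_TIME_DAYS = 7
MAX_KEEP_TIME_DAYS = 365


def controlfile_only_is_enough(retention_days: int,
                               keep_time_days: int = DEFAULT_KEEP_TIME_DAYS) -> bool:
    """Return True if the controlfile keeps records at least as long as
    the backup retention you want, so no catalog is strictly needed.

    Hypothetical helper for illustration; not part of any Oracle tool.
    """
    if not 0 <= keep_time_days <= MAX_KEEP_TIME_DAYS:
        raise ValueError("control_file_record_keep_time must be 0..365 days")
    return retention_days <= keep_time_days


if __name__ == "__main__":
    # A 30-day retention against the default keep time.
    print(controlfile_only_is_enough(30))
    # The same retention after raising the parameter to 45 days.
    print(controlfile_only_is_enough(30, keep_time_days=45))
```

Because the parameter tops out at 365 days, any retention longer than a year cannot be covered by controlfiles alone under this model.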




The Dilemma

In spite of being a best practice, centralizing backup metadata into a single recovery catalog is a common challenge, especially for multinational companies, because of network latency across regions.  RMAN does not store extensive data in the recovery catalog; only the metadata goes there.  The actual backup data goes from the database servers to the backup location, which is usually local to the data center.  However, if network latency between the database and the catalog is high, you can expect catalog resyncs to take a long time.  I’ve dealt with one case where the database was in Chennai and the catalog was in Los Angeles, and it was all fine.  In a different case, between Paris and Los Angeles, the resync took forever.  It really depends on your network latency.
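As a rough way to reason about this, a resync can be modeled as a series of small round trips between the database and the catalog, so the elapsed time scales with latency rather than bandwidth.  The sketch below is only a back-of-the-envelope model: the round-trip count is a made-up input, not something RMAN reports, and real resyncs depend on the RMAN version and how much metadata has changed.

```python
def estimate_resync_seconds(round_trips: int, rtt_ms: float) -> float:
    """Estimate resync time as (number of round trips) x (round-trip time).

    Both inputs are assumptions you would have to measure or guess;
    this ignores server-side processing time entirely.
    """
    if round_trips < 0 or rtt_ms < 0:
        raise ValueError("round_trips and rtt_ms must be non-negative")
    return round_trips * rtt_ms / 1000.0


if __name__ == "__main__":
    trips = 20000  # hypothetical number of round trips in one resync
    for label, rtt in (("same data center", 2), ("cross-continent", 150)):
        print(label, estimate_resync_seconds(trips, rtt), "seconds")
```

Under these assumed numbers the same resync goes from well under a minute locally to the better part of an hour across continents, which is the kind of gap that motivates keeping the catalog close to the databases.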

Other centralized technologies face similar challenges (e.g., OEM).

In addition, the small space footprint and workload of a recovery catalog often do not warrant a full-blown, bare metal HA configuration such as an Oracle RAC cluster or Data Guard.  What if you just put the catalog database onto an existing RAC cluster?  The issue is that you would be storing the backup metadata on the same hardware as the databases it describes.  Critical as the catalog is, a catastrophic failure of the catalog database does not cause a service outage of any kind, since the latest backup metadata is still stored in the controlfiles.  You just have to fix the catalog as soon as possible.

Virtualizing the RMAN Catalog

In summary, these are some of the factors supporting virtualization of the catalog:

  1. The catalog should be in close proximity to the databases being backed up.
  2. But you would not want it to share the same hardware as the databases being backed up.
  3. Small size and low workload do not warrant full-blown, bare metal HA.
  4. Catalog database failure does not immediately cause outage concerns.
  5. Current backup metadata is still stored in controlfiles.  Once the catalog is restored, catalog resyncs will get the catalog updated.

All these factors lead to the notion that we need a local catalog built on something that is easy to provision and, when the catalog itself fails, easy to recover.  Virtualization is an ideal solution for this problem, offering:

  1. Some level of HA, as the node is no longer tied to a single point of failure (SPOF) on standalone hardware
  2. Ease of node recovery through VM snapshots
  3. Ease of deployment to multiple regions through VM cloning

With virtualization, we can create a single VM guest in each data center or region with an Oracle database instance serving as the local recovery catalog.  It will just use the VM guest’s filesystem to store the catalog database.  Plain and simple.

On each database that needs to talk to the catalog, you’d set up a TNS entry named, say, RCAT, in which the hostname points to the correct, local catalog.  This makes the TNS entry location-agnostic, in the sense that all backups talk to the same TNS name regardless of which region’s catalog database sits behind it.  This also simplifies the recovery procedure, in that during recovery scenarios you don’t have to figure out which TNS entry you should use.
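Here is a small Python sketch of how the per-region RCAT entry could be generated for tnsnames.ora.  The hostnames, the region keys, the port, and the service name are all placeholders I made up for illustration; only the alias RCAT comes from the setup described above.

```python
# Hypothetical hostnames: one catalog VM guest per region.
CATALOG_HOSTS = {
    "amer": "rcat-amer.example.com",
    "emea": "rcat-emea.example.com",
    "apac": "rcat-apac.example.com",
}


def rcat_tns_entry(region: str, port: int = 1521,
                   service_name: str = "rcat") -> str:
    """Render a tnsnames.ora entry named RCAT for the given region.

    The alias is the same everywhere; only the HOST differs, which is
    what makes backup scripts location-agnostic.
    """
    host = CATALOG_HOSTS[region]
    return (
        "RCAT =\n"
        "  (DESCRIPTION =\n"
        f"    (ADDRESS = (PROTOCOL = TCP)(HOST = {host})(PORT = {port}))\n"
        f"    (CONNECT_DATA = (SERVICE_NAME = {service_name}))\n"
        "  )\n"
    )


if __name__ == "__main__":
    print(rcat_tns_entry("emea"))
```

A backup script in any region can then connect to the catalog through the RCAT alias without knowing which data center it is running in.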



Backing up the Local Catalog

Several methods can be used to back up the local catalog database on the VM guest:

  1. Perform a VM snapshot.  Since the catalog’s database files are in the VM guest’s filesystem, the VM snapshot also backs up the database.  Note that to follow best practice, we also need to issue an “alter database begin backup;” before the snapshot and an “alter database end backup;” after the snapshot.  This protects against fractured blocks, that is, blocks captured while only partially written during the snapshot.
  2. Expdp the catalog schema.  You can store the export file in a mounted filesystem in the data center, or scp it to a global location.  Follow a purge policy to keep a rolling period of such exports.
  3. Back up using RMAN with nocatalog.  This seems to be overkill, but you do have this option.
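Two details from the first two methods are easy to get wrong in a script: always ending backup mode even when the snapshot fails, and deciding which old exports fall outside the rolling window.  The Python sketch below illustrates both.  The run_sql and take_snapshot callables are hypothetical hooks you would wire to sqlplus and your hypervisor’s tooling, and the rcat_YYYYMMDD.dmp naming scheme and 14-day window are assumptions of mine.

```python
from datetime import date, datetime, timedelta
from typing import Callable, Iterable, List


def snapshot_with_hot_backup(run_sql: Callable[[str], None],
                             take_snapshot: Callable[[], None]) -> None:
    """Bracket a VM snapshot with begin/end backup (method 1).

    run_sql and take_snapshot are hypothetical callables supplied by
    the caller; this function only enforces the ordering.
    """
    run_sql("alter database begin backup")
    try:
        take_snapshot()
    finally:
        # Always leave backup mode, even if the snapshot fails.
        run_sql("alter database end backup")


def dumps_to_purge(existing: Iterable[str], today: date,
                   keep_days: int = 14) -> List[str]:
    """Return export files older than the rolling window (method 2).

    Assumes dump files are named rcat_YYYYMMDD.dmp; other names are
    ignored rather than purged.
    """
    cutoff = today - timedelta(days=keep_days)
    stale = []
    for name in existing:
        try:
            taken = datetime.strptime(name, "rcat_%Y%m%d.dmp").date()
        except ValueError:
            continue  # not one of our exports
        if taken < cutoff:
            stale.append(name)
    return sorted(stale)
```

Keeping the purge decision as a pure function of file names and a date makes it easy to test before pointing it at a real export directory.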

Recovering the Local Catalog

These are the options for recovering the catalog during a disaster scenario:

  1. Restore and resync
    1. Restore a base of the catalog using either of the following methods:
      1. Restore the latest VM snapshot.
      2. Import the latest expdp export into the catalog schema using impdp with TABLE_EXISTS_ACTION=REPLACE.
    2. Issue a “resync catalog;” from each target database.  Or just wait for the next backup to trigger a catalog resync.
  2. RMAN recovery (if the catalog database was backed up using RMAN with nocatalog)
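The import-then-resync path can be sketched as two command lines, built here in Python so they can be handed to something like subprocess.run on each host.  The schema name RCAT_OWNER, the directory object DATA_PUMP_DIR, the dump file name, and the cmdfile name are placeholders of mine; passwords are deliberately left out, so the tools would prompt for them.

```python
from typing import List

# Contents of the RMAN command file used for the resync step.
RESYNC_CMDFILE_BODY = "resync catalog;\n"


def impdp_args(dumpfile: str, schema: str = "RCAT_OWNER",
               directory: str = "DATA_PUMP_DIR",
               tns_alias: str = "RCAT") -> List[str]:
    """Build the impdp call that reloads the catalog schema,
    replacing any tables that already exist."""
    return [
        "impdp", f"{schema}@{tns_alias}",
        f"schemas={schema}",
        f"directory={directory}",
        f"dumpfile={dumpfile}",
        "table_exists_action=replace",
    ]


def resync_args(catalog_user: str = "RCAT_OWNER",
                tns_alias: str = "RCAT",
                cmdfile: str = "resync.rman") -> List[str]:
    """Build the rman call to run on each target database host.

    Connects to the local target with OS authentication and to the
    catalog through the RCAT alias, then runs the command file.
    """
    return [
        "rman", "target", "/",
        "catalog", f"{catalog_user}@{tns_alias}",
        f"cmdfile={cmdfile}",
    ]


if __name__ == "__main__":
    print(" ".join(impdp_args("rcat_20240315.dmp")))
    print(" ".join(resync_args()))
```

Because every region uses the same RCAT alias, the same two command lines work unchanged wherever the catalog is being rebuilt.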

Benefits

By virtualizing the recovery catalog, an organization, particularly a larger one, can achieve the following benefits:

  1. Faster response from the catalog.
  2. Higher scalability for the catalog as a whole, as network latency is no longer the bottleneck.
  3. Less hardware to maintain.  You no longer need dedicated hardware for the catalog database anywhere.

Lastly, thank you for lasting till the end.  Drop a comment or two below!  Good or bad, both welcomed!



References

https://docs.oracle.com/cd/B28359_01/backup.111/b28270/rcmcatdb.htm#BRADV8015

 
