
Wednesday, March 13, 2019

Top 10 Backup and Recovery Best Practices

Assuming that you are doing the Backup and Recovery basics
- Running in Archivelog mode
- Multiplexing the controlfile
- Taking regular backups
- Periodically doing a complete restore to test your procedures.
- Restore and recovery validate will not uncover nologging issues. Consider turning on force logging if you need all transactions to be recoverable and want to avoid nologging problems ( ALTER DATABASE FORCE LOGGING; )
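A quick way to check whether nologging operations have already touched any datafiles since the last backup is to query v$datafile; any rows returned show files with unrecoverable changes, whose timestamps should be compared with the time of the last backup of that file:

SQL> select file#, unrecoverable_change#, unrecoverable_time
     from v$datafile
     where unrecoverable_time is not null;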

1. Turn on block checking.
The aim is to detect, as early as possible, the presence of corrupt blocks in the database. This has a slight performance overhead, but allows Oracle to detect early corruption caused by underlying disk, storage system, or I/O system problems.

SQL> alter system set db_block_checking = true scope=both;

2. Turn on Block Change Tracking when using RMAN incremental backups (10g and higher)
The Change Tracking File contains information that allows the RMAN incremental backup process to avoid reading data that has not been modified since the last backup. When Block Change Tracking is not used, all blocks must be read to determine if they have been modified since the last backup.

SQL> alter database enable block change tracking using file '/u01/oradata/ora1/change_tracking.f';

3. Duplex redo log groups and members and have more than one archive log destination.
If an archivelog is corrupted or lost, by having multiple copies in multiple locations, the other logs will still be available and could be used.
If an online log is deleted or becomes corrupt, you will have another member that can be used to recover if required.

SQL> alter system set log_archive_dest_2='location=/new/location/archive2' scope=both;
SQL> alter database add logfile member '/new/location/redo21.log' to group 1;

4. When backing up the database with RMAN use the CHECK LOGICAL option.
This will cause RMAN to check for logical corruption within a block, in addition to the normal checksum verification. This is the best way to ensure that you will get a good backup.

RMAN> backup check logical database plus archivelog delete input;

5. Test your backups.
This will do everything except actually restore the database. This is the best method to determine if your backup is good and usable before being in a situation where it is critical and issues exist.
If using RMAN this can be done with:

RMAN> restore validate database;

6. When using RMAN have each datafile in a single backup piece
When doing a partial restore RMAN must read through the entire piece to get the datafile/archivelog requested. The smaller the backup piece, the quicker the restore can complete. This is especially relevant with tape backups of large databases, or where the restore is of only one or a few individual files.
However, very small values for filesperset will also cause larger numbers of backup pieces to be created, which can reduce backup performance and increase processing time for maintenance operations. So those factors must be weighed against the desired restore performance.

RMAN> backup database filesperset 1 plus archivelog delete input;

7. Maintain your RMAN catalog/controlfile
Choose your retention policy carefully. Make sure that it complements your tape subsystem retention policy and your backup and recovery requirements. If not using a catalog, ensure that the CONTROL_FILE_RECORD_KEEP_TIME parameter matches your retention policy.

SQL> alter system set control_file_record_keep_time=21 scope=both;

This will keep 21 days of backup records in the control file.
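The retention policy itself is configured in RMAN; for example, a recovery window matching the 21 days above (adjust the window to your own requirements):

RMAN> configure retention policy to recovery window of 21 days;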

Run regular catalog maintenance.
REASON: Delete obsolete will remove backups that are outside your retention policy.
If obsolete backups are not deleted, the catalog will continue to grow until performance
becomes an issue.

RMAN> delete obsolete;

REASON: Crosschecking verifies that the catalog/controlfile matches the physical backups.
If a backup piece is missing, it is marked 'EXPIRED' so that when a restore is started
it will not be eligible, and an earlier backup will be used instead. To remove the expired
backups from the catalog/controlfile, use the delete expired command.

RMAN> crosscheck backup;
RMAN> delete expired backup;

8. Prepare for loss of controlfiles.
This will ensure that you always have an up-to-date controlfile available that was taken at the end of the current backup, rather than during the backup itself.

RMAN> configure controlfile autobackup on;
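If backing up to disk outside the FRA, the autobackup location can also be set explicitly; the path below is only an example (the %F substitution variable is mandatory in autobackup formats):

RMAN> configure controlfile autobackup format for device type disk to '/backups/PROD/cf_%F';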

Keep your backup logs.
REASON: The backup log contains your tape access parameters and the locations of controlfile backups,
which can be utilised if a complete loss occurs.

9. Test your recovery
REASON: During a recovery situation this will let you know how the recovery will go without
actually doing it, and can avoid having to restore source datafiles again.

SQL> recover database test;

10. In RMAN backups do not specify 'delete all input' when backing up archivelogs
REASON: 'Delete all input' will back up from one destination and then delete both copies of the archivelog, whereas 'delete input' will back up from one location and then delete only what has been backed up. The next backup will back up the logs from location 2 as well as new logs from location 1, then delete all that have been backed up. This means that you will have the archivelogs since the last backup available on disk in location 2 (as well as backed up once), and two backed-up copies of those from before the previous backup.

Backup and Recovery Scenarios

a) Consistent backups

A consistent backup means that all datafiles and control files are consistent to a point in time, i.e. they have the same SCN. This is the only valid method of backup when the database is in Noarchivelog mode.

b) Inconsistent backups
An Inconsistent backup is possible only when the database is in Archivelog mode.  We must apply redo logs to the data files, in order to restore the database to a consistent state.  Inconsistent backups can be taken using RMAN when the database is open.
Inconsistent backups can also be taken using other OS tools, provided the tablespaces (or the whole database) are put into backup mode first.
ie: SQL> alter tablespace data begin backup;
    SQL> alter database begin backup; (version 10 and above only)


c) Database Archive mode
The database can run in either Archivelog or Noarchivelog mode. When we first create the database, we specify whether it is to be in Archivelog mode. (In releases prior to 10g, the init.ora parameter log_archive_start=true was also needed so that archiving would start automatically on startup; from 10g onwards the archiver starts automatically and this parameter is obsolete.)
If the database was not created in Archivelog mode, we can enable it while the database is mounted but not open:
SQL> shutdown immediate;
SQL> startup mount;
SQL> alter database archivelog;
SQL> alter database open;
SQL> archive log list
This command will show us the log mode and whether automatic archival is enabled.
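Typical output looks like the following (the destination and sequence numbers will of course vary):

SQL> archive log list
Database log mode              Archive Mode
Automatic archival             Enabled
Archive destination            /u01/oradata/arch
Oldest online log sequence     100
Next log sequence to archive   102
Current log sequence           102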
 

d) Backup Methods
Essentially, there are two backup methods, hot and cold, also known as online and offline, respectively. A cold backup is one taken when the database is shut down; the database must be shut down cleanly. A hot backup is one taken while the database is running. Commands for a hot backup:
For non RMAN backups:
1. Have the database in archivelog mode (see above)
2. SQL> archive log list
--This will show what the oldest online log sequence is. As a precaution, always keep all archived log files starting from the oldest online log sequence.
3. SQL> Alter tablespace tablespace_name BEGIN BACKUP;
or SQL> alter database begin backup (for v10 and above).
4. --Using an OS command, backup the datafile(s) of this tablespace.
5. SQL> Alter tablespace tablespace_name END BACKUP
--- repeat step 3, 4, 5 for each tablespace.
or SQL> alter database end backup; for version 10 and above
6. SQL> archive log list
---do this again to obtain the current log sequence. make sure that we have a copy of this redo log file.
7. So to force an archived log, issue
SQL> ALTER SYSTEM SWITCH LOGFILE
A better way to force this would be:
SQL> alter system archive log current;
8. SQL> archive log list
This is done again to check if the log file had been archived and to find the latest archived sequence number.
9. Backup all archived log files determined from steps 2 and 8.
10. Back up the control file:
SQL> Alter database backup controlfile to 'filename'
For RMAN backups:
see the RMAN - Sample Backup Scripts section below
or the appropriate RMAN documentation.


e) Incremental backups
These are backups of only the blocks that have been modified since the last backup. They are useful as they don't take up as much space and time. There are two kinds of incremental backups: cumulative and noncumulative (differential).
Cumulative incremental backups include all blocks that were changed since the  last backup at a lower level. This one reduces the work during restoration as  only one backup contains all the changed blocks.
Noncumulative only includes blocks that were changed since the previous backup  at the same or lower level.
Using RMAN, we issue the command "backup incremental level n".
Oracle 9i and earlier RMAN will back up empty blocks; from 10.2 onwards, RMAN will not back up never-used blocks.
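A common strategy, for example, is a weekly level 0 baseline plus daily cumulative level 1 backups; the second command picks up all changes since the most recent level 0 (levels and schedule should be adjusted to your backup window):

RMAN> backup incremental level 0 database;
RMAN> backup incremental level 1 cumulative database;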


f) Support scenarios
When the database crashes, we now have a backup. We restore the backup and
then recover the database. Also, don't forget to take a backup of the control
file whenever there is a structural change (e.g. datafiles or tablespaces are added or dropped).

RECOVERY SCENARIOS

Note: All online datafiles must be at the same point in time when completing recovery;
There are several kinds of recovery we can perform, depending on the type of  failure and the kind of backup we have. Essentially, if we are not running in archive log mode, then we can only recover the cold backup of the database and we will lose any new data and changes made since that backup was taken. If, however, the database is in Archivelog mode we will be able to restore the database up to the time of failure. There are three basic types of recovery:


1. Online Block Recovery.
This is performed automatically by Oracle (by PMON). It occurs when a process dies while changing a buffer. Oracle will reconstruct the buffer using the online redo logs and write it to disk.


2. Thread Recovery.
This is also performed automatically by Oracle. Occurs when an instance  crashes while having the database open. Oracle applies all the redo changes  in the thread that occurred since the last time the thread was checkpointed.


3. Media Recovery.
This is required when a data file is restored from backup: the checkpoint count in the datafile is no longer equal to the checkpoint count in the control file.
Now let's explain a little about Redo vs Undo.
Redo information is recorded so that all changes that took place can be repeated during recovery. Undo information is recorded so that we can undo changes made by transactions that were not committed. The redo logs are used to roll forward the changes made, both committed and uncommitted; then, using the undo segments, the uncommitted changes are
rolled back.
Media Failure and Recovery in Noarchivelog Mode
In this case, our only option is to restore a backup of Oracle files. The files we need are all datafiles, and control files.  We only need to restore the password file or parameter files if they are lost or are corrupted.
Media Failure and Recovery in Archivelog Mode
In this case, there are several kinds of recovery we can perform, depending on what has been lost.


The three basic kinds of recovery are:
1. Recover database - here we use the recover database command and the database must be closed and mounted. Oracle will recover all datafiles that are online.


2. Recover tablespace - use the recover tablespace command. The database can be open but the tablespace must be offline.


3. Recover datafile - use the recover datafile command. The database can be  open but the specified datafile must be offline.
Note: We must have all archived logs since the backup we restored from,  or else we will not have a complete recovery.


a) Point in Time recovery:
A typical scenario is that we dropped a table at say noon, and want to recover it. We will have to restore the appropriate datafiles and do a point-in-time  recovery to a time just before noon.
Note: We will lose any transactions that occurred after noon. After we have recovered until noon, we must open the database with resetlogs. This is necessary to reset the log sequence numbers, which protects the database from having redo that belongs to the old incarnation applied.
The four incomplete recovery scenarios all work the same:
Recover database until time '1999-12-01:12:00:00';
Recover database until cancel; (we type in cancel to stop)
Recover database until change n;
Recover database until cancel using backup controlfile;
Note: When performing an incomplete recovery, the datafiles must be online. Do a select * from v$recover_file to find out if there are any files  which are offline. If we were to perform a recovery on a database which has  tablespaces offline, and they had not been taken offline in a normal state, we  will lose them when we issue the open resetlogs command. This is because the data file needs recovery from a point before the resetlogs option was used.


b) Recovery without control file
If we have lost the current control file, or the current control file is inconsistent with the files we need to recover, we must either recover using a backup control file or create a new control file. We can also recreate the control file based on the current one using the command SQL> alter database backup controlfile to trace; which writes a script we can run to create a new one. The recover database using backup controlfile command must be used whenever the control file in use is not the current one, and the database must then be opened with the
resetlogs option.
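A minimal sketch of the sequence, assuming the backup control file has already been restored into place (until cancel makes this an incomplete recovery):

SQL> startup mount;
SQL> recover database using backup controlfile until cancel;
SQL> alter database open resetlogs;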


c) Recovery of missing datafile with rollback segments
The tricky part here is if we are performing online recovery; otherwise we can just use the recover datafile command. If we are performing an online recovery, we will need to create a new undo tablespace to be used in the meantime. The old undo tablespace can be dropped after it has been recovered and any uncommitted transactions have rolled back.
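A sketch of the switch, with example names only (undotbs1 is assumed to be the damaged undo tablespace):

SQL> create undo tablespace undotbs2 datafile '/u01/oradata/ora1/undotbs02.dbf' size 500m;
SQL> alter system set undo_tablespace=undotbs2 scope=both;
-- later, after recovery and rollback of any uncommitted transactions:
SQL> drop tablespace undotbs1 including contents and datafiles;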


d) Recovery of missing datafile without undo segments
There are three ways to recover in this scenario, as mentioned above.
1. recover database;
2. recover datafile 'c:\orant\database\usr1orcl.ora';
3. recover tablespace user_data;


e) Recovery with missing online redo logs
Missing online redo logs means that somehow we have lost our redo logs before they had a chance to be archived. This means that crash recovery cannot be performed, so media recovery is required instead. All datafiles will need to be restored and rolled forward until the last available archived log file is applied. This is thus an incomplete recovery, and as such the recover
database command is necessary.
As always, when an incomplete recovery is performed, we must open the database with resetlogs.
Note: the best way to avoid this kind of a loss, is to mirror online log files.
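After restoring all datafiles from backup, the sequence in this case is essentially:

SQL> startup mount;
SQL> recover database until cancel;
(apply archived logs as prompted; type CANCEL after the last available one)
SQL> alter database open resetlogs;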


f) Recovery with missing archived redo logs
If archives are missing, the only way to recover the database is to restore from the latest backup. We will lose any transactions, including committed ones, that were recorded only in the missing archived redo logs. Again, this is why Oracle strongly suggests mirroring online redo logs and keeping duplicate copies of the archives.


g) Recovery with resetlogs option
The resetlogs option should be a last resort; however, as we have seen above, it may be required by incomplete recoveries (recovery using a backup control file, or point-in-time recovery). It is imperative that we back up the database immediately after we have opened it with resetlogs. It is possible to recover through a resetlogs (and this was made easier in Oracle 10g), but it is easier still
to restore from the backup taken after the resetlogs.


h) Recovery with corrupted undo segments.
If an undo segment is corrupted and contains uncommitted system data, we may not be able to open the database.
The best alternative in this situation is to recover the corrupt block using the RMAN blockrecover command; the next best is to restore the datafile from backup and do a complete recovery.
If a backup does not exist, and the database is able to open (i.e. the corruption affects a non-system object), the first step is to find out what object is causing the rollback to appear corrupted. If we can determine that, we can drop that object.
So, how do we find out if it's actually a bad object?
1. Make sure that all tablespaces are online and all datafiles are online. This can be checked via the v$recover_file view.


2. Put the following in the init.ora:
event = "10015 trace name context forever, level 10"
This event will generate a trace file that will reveal information about the  transaction Oracle is trying to roll back and most importantly, what object  Oracle is trying to apply the undo to.
Note: In Oracle v9 and above this information can be found in the alert log.
Stop and start the database.


3. Check in the directory that is specified by the user_dump_dest parameter (in the init.ora or show parameter command) for a trace file that was  generated at startup time.


4. In the trace file, there should be a message similar to: error recovery tx(#,#) object #.
TX(#,#) refers to transaction information.
The object # is the same as the object_id in sys.dba_objects.


5. Use the following query to find out what object Oracle is trying to perform recovery on.
select owner, object_name, object_type, status
from dba_objects where object_id = <object #>;


6. Drop the offending object so the undo can be released. An export or relying on a backup may be necessary to restore the object after the corrupted undo segment is released.
 

i) Recovery with System Clock change.
We can end up with duplicate timestamps in the datafiles when the system clock changes. This usually occurs when daylight saving time starts or ends. In this case, rather than a point-in-time recovery, recover to a specific log sequence or SCN.
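For example, to recover to a known SCN (the SCN shown is illustrative only):

SQL> recover database until change 309121;
SQL> alter database open resetlogs;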
 

j) Recovery with missing System tablespace.
The only option is to restore from a backup.
 

k) Media Recovery of offline tablespace
When a tablespace is offline, we cannot recover its datafiles using the recover database command, because that command only recovers online datafiles. Since the tablespace is offline, its datafiles are considered offline as well, so even if we recover the database and roll forward, the datafiles in this tablespace will not be touched. Instead, we need to perform a recover tablespace command. Alternatively, we could restore the datafiles from a cold backup, mount the database, and select from the v$datafile view to see if any of the datafiles are offline. If they are, bring them online, and then we can perform a recover database command.
 

l) Recovery of Read-Only tablespaces
If we have a current control file, then recovery of read-only tablespaces is no different from recovering read-write files. The issues with read-only tablespaces arise if we have to use a backup control file. If the tablespace is in read-only mode and hasn't changed to read-write since the last backup, then we will be able to perform media recovery using a backup control file by taking the tablespace offline. The reason is that when we are using a backup control file, we must open the database with resetlogs, and Oracle won't let us read files from before a resetlogs was done. However, there is an exception for read-only tablespaces: we will be able to take their datafiles online after we have opened the database.
When we have tablespaces that switch modes and we don't have a current control file, we should use a backup control file that recognises the tablespace in read-write mode. If we don't have a backup control file, we can create a new one using the create controlfile command. Basically, the point is that we should take a backup of the control file every time we switch a tablespace's mode.


RMAN - Sample Backup Scripts




Making Whole Database Backups with RMAN
We can perform whole database backups with the database mounted or open. To perform a whole database backup from the RMAN prompt the BACKUP DATABASE command can be used. The simplest form of the command requires no parameters, as shown in this example:
RMAN> backup database;

In the example above, no backup location was specified, meaning that the backups will automatically be placed in the Flash Recovery Area (FRA). If the FRA has not been set up, then all backups default to $ORACLE_HOME/dbs.

How to check if the FRA has been set up:
SQL> show parameter recovery_file_dest

NAME                                 TYPE           VALUE
------------------------------------ -----------    -------------------------
db_recovery_file_dest                string         /recovery_area
db_recovery_file_dest_size           big integer    50G
If FRA is not setup (i.e. values are null) please refer to the following note for assistance in setting it up.
Check my Blog: What is a Flash Recovery Area and how to configure it?

If you wish to place backups outside the FRA, then the following RMAN syntax may be used.

RMAN> backup database format '/backups/PROD/df_t%t_s%s_p%p';

• Backing Up Individual Tablespaces with RMAN

RMAN allows individual tablespaces to be backed up with the database in open or mount stage.

RMAN> backup tablespace SYSTEM, UNDOTBS, USERS;

• Backing Up Individual Datafiles and Datafile Copies with RMAN
The flexibility of being able to back up a single datafile is also available. As seen below, the datafile can be referenced via its file# or its file name, and multiple datafiles can be backed up at a time.

RMAN> backup datafile 2;
RMAN> backup datafile 2 format '/backups/PROD/df_t%t_s%s_p%p';
RMAN> backup datafile 1,2,3,6,7,8;
RMAN> backup datafile '/oradata/system01.dbf';

• Backing Up the current controlfile & Spfile
The controlfile and spfile are backed up in similar ways. Whenever a full database backup is performed, the controlfile and spfile are backed up; in fact, whenever file#1 is backed up, these two files are backed up also.
It is also good practice to back up the controlfile, especially after tablespaces and datafiles have been added or deleted.
If not using an RMAN catalog, it is more important that we frequently backup our controlfile. We can also configure another method of controlfile backup which is referred to as 'autobackup of controlfile'.
Refer to the manual for more information regarding this feature.

RMAN> backup current controlfile;
RMAN> backup current controlfile format '/backups/PROD/df_t%t_s%s_p%p';
RMAN> backup spfile;

• Backing Up Archivelogs
It is important that archivelogs are backed up in a timely manner and correctly removed to ensure the file system does not fill up. Below are a few different examples. Option one backs up all archivelogs to the FRA or default location. Option two backs up all archivelogs generated between 30 and 7 days ago, and option three backs up archivelogs from log sequence number XXX until log sequence YYY and then deletes them, writing the backup to a specified location.

RMAN> backup archivelog all;
RMAN> backup archivelog from time 'sysdate-30' until time 'sysdate-7';
RMAN> backup archivelog from logseq=XXX until logseq=YYY delete input format '/backups/PROD/%d_archive_%T_%u_s%s_p%p';

• Backing up the Whole database including archivelogs
Below is an example of how the whole database can be backed up while at the same time backing up the archivelogs and purging them following a successful backup. The first example backs up to the FRA; if we wish to redirect the output, the second command shows how this is achieved.

RMAN> backup database plus archivelog delete input;
RMAN> backup database plus archivelog delete input format '/backups/PROD/df_t%t_s%s_p%p';

Reference
Oracle® Database Backup and Recovery Basics


