Oracle7 Server Administrator's Guide
Examples of Recovery Procedures
This section describes how to recover from common media failures, and includes the following topics:
Types of Media Failures
Media failures fall into two general categories: permanent and temporary. Permanent media failures are serious hardware problems that cause the permanent loss of data on the disk. Lost data cannot be recovered except by repairing or replacing the failed storage device and restoring backups of the files stored on the damaged storage device. Temporary media failures are hardware problems that make data temporarily inaccessible; they do not corrupt the data. Following are two examples of temporary media failures:
- A disk controller fails. Once the disk controller is replaced, the data on the disk can be accessed.
- Power to a storage device is cut off. Once the power is returned, the storage device and all associated data is accessible again.
Loss of Datafiles
If a media failure affects datafiles of a database, the appropriate recovery procedure depends on the archiving mode of the database, the type of media failure, and the exact files affected by the media failure. The following sections explain the appropriate recovery strategies in various situations.
Loss of Datafiles, NOARCHIVELOG Mode
If either a permanent or temporary media failure affects any datafiles of a database operating in NOARCHIVELOG mode, Oracle automatically shuts down the database. Depending on the type of media failure, you can use one of two recovery paths:
- If the media failure is temporary, correct the temporary hardware problem and restart the database. Usually, instance recovery is possible, and all committed transactions can be recovered using the online redo log.
- If the media failure is permanent, follow the steps to recover from the media failure.
Loss of Datafiles, ARCHIVELOG Mode
If either a permanent or temporary media failure affects the datafiles of a database operating in ARCHIVELOG mode, the following situations can exist:
- If a temporary or permanent media failure affects any datafiles of the SYSTEM tablespace or any datafiles that contain active rollback segments, the database becomes inoperable and should be immediately shut down if it has not already been shut down by Oracle.
If the hardware problem is temporary, correct the problem and restart the database. Usually, instance recovery is possible, and all committed transactions can be recovered using the online redo log.
If the hardware problem is permanent, follow the procedure in "Performing Closed Database Recovery" .
- If a temporary or permanent media failure affects only datafiles not mentioned in the previous item, the affected datafiles are unavailable and taken offline automatically by Oracle, but the database can continue to operate.
If the unaffected portions of the database must remain available, do not shut down the database. First take all tablespaces that contain problem datafiles offline using the temporary option. Then follow the procedure in "Performing Open Database-Offline Tablespace Recovery" .
Loss of Online Redo Log Files
If a media failure has affected the online redo log of a database, the appropriate recovery procedure depends on the configuration of the online redo log (mirrored or non-mirrored), the type of media failure (temporary or permanent), and the types of online redo log files affected by the media failure (current, active, not yet archived, or inactive online redo log files). The following sections describe the appropriate recovery strategies in various situations.
Loss of an Online Redo Log Member of Mirrored Online Redo Log
If the online redo log of a database is mirrored, and at least one member of each online redo log group is not affected by the media failure, Oracle allows the database to continue functioning as normal (error messages are written to the LGWR trace file and ALERT file of the database). However, you should handle the problem by taking one of the following actions:
- If the hardware problem is temporary, correct the problem. After it has been fixed, LGWR accesses the previously unavailable online redo log files as if the problem never existed.
- If the hardware problem is permanent, use the DROP command to drop the damaged member and use the ADD command to add a new member.
Note: The newly added member provides no redundancy until the log group is reused.
Loss of All Online Redo Log Members of an Online Redo Log Group
If all members of an online redo log group are damaged by a media failure, different situations can arise, depending on the type of online redo log group affected by the failure and the archiving mode of the database. You can locate the filename in V$LOGFILE, and then look for the group number corresponding to the one you lost to verify the lost file's status (verify that it was inactive).
SELECT *
FROM v$logfile
;
GROUP# STATUS MEMBER
------------------------------------------------
0001 log1
0002 log2
0003 log3
SELECT *
FROM v$log
;
GROUP# MEMBERS STATUS ARCHIVED
-------------------------------------------------
0001 1 INACTIVE YES
0002 1 ACTIVE YES
0003 1 CURRENT NO
Loss of an Inactive, Online Redo Log Group If all members of an inactive online redo log group are damaged, the following situations can arise:
- If a temporary media failure affects only an inactive online redo log group, correct the problem; LGWR can reuse the group when required.
- If a media failure permanently prevents access to only an inactive online redo log group, the damaged inactive online redo log group will eventually halt normal database operation.
If you notice the problem before the database shuts down, use the ALTER DATABASE CLEAR LOGFILE command.
If the database has already shut down, perform the following tasks:
To Recover From Loss of an Inactive, Online Redo Log Group
1. Abort the current instance immediately with the Server Manager Shutdown Database dialog box with the Shutdown Abort radio button selected, or the SHUTDOWN command with the ABORT option.
2. Start a new instance and mount the database, but do not open it. This operation can be performed with the Server Manager Startup Database dialog box with the Startup Mount radio button selected, or the STARTUP command with the MOUNT option.
3. If the lost log was archived, issue the ALTER DATABASE CLEAR LOGFILE command.
4. If the lost log was unarchived, issue the ALTER DATABASE CLEAR UNARCHIVED LOGFILE command, and immediately backup the database. Also backup the database's control file (using the ALTER DATABASE command with the BACKUP CONTROLFILE option).
Clearing a log that has not been archived allows it to be reused without archiving it. However, this will make backups unusable if they were started before the last change in the log (unless the file was taken offline prior to the first change in the log). Hence, if the cleared logfile is needed for recovery of a backup, it will not be possible to recover that backup.
If there is an offline datafile that requires the cleared unarchived log to bring it online, the keywords UNRECOVERABLE DATAFILE are required. The datafile and its entire tablespace will have to be dropped from the database because the redo necessary to bring it online is being cleared, and there is no copy of it.
Note: The ALTER DATABASE CLEAR LOGFILE command could fail (with an I/O error due to media failure) in two cases:
- When it is not possible to relocate the logfile onto alternative media by re-creating it under the currently configured logfile name.
- When it is not possible to reuse the currently configured logfile name to recreate the logfile because the name itself is invalid or unusable (for example, due to media failure).
In these two cases, the CLEAR LOGFILE command (before receiving the I/O error) would have successfully updated the control file to change the state of the logfile to "being cleared" and "not requiring archiving." The I/O error occurred at the step in which CLEAR LOGFILE attempts to create the new logfile and write zeros to it.
At this point, you can complete recovery by executing, in order, the following commands:
- ADD a logfile under a new name.
- DROP the logfile under the old name.
You can now open the database.
Loss of an Active Online Redo Log Group If your database is still running and the lost active log is not the current log, you can use the ALTER SYSTEM CHECKPOINT command. If successful, your active log is rendered inactive, and you can follow the steps .
If unsuccessful, or if your database has already halted, you cannot use the steps . Instead, perform the following tasks:
To Recover From Loss of an Active Online Redo Log Group
1. If the media failure is temporary, correct the problem so that Oracle can reuse the group when required.
2. If the database is in NOARCHIVELOG mode and a permanent media failure prevents access to an active online redo log group, restore the database from a full backup.
After restoring the database, redo the work and open the database using the RESETLOGS option. Updates done after the backup have been lost and must be re-executed. Shut down the database and take a full offline backup.
3. If the database was in ARCHIVELOG mode, incomplete media recovery must be performed. Use the procedure given in "Performing Cancel-Based, Time-Based, or Change-Based Recovery" , recovering up through the log before the damaged log. Ensure that the current name of the lost redo log can be used for a newly created file. If not, issue the RENAME command to rename the damaged online redo log group to a new location.
4. Open the database using the RESETLOGS option.
Note: All updates executed from the endpoint of the incomplete recovery to the present must be re-executed.
Loss of Multiple Redo Log Groups If you have lost multiple groups of the online redo log, use the recovery method for the most difficult log to recover. The order of difficulty, from most difficult to least, follows:
1. the current online redo log
2. the active online redo log
3. the unarchived redo log
4. the inactive online redo log
Loss of Archived Redo Log Files
If the database is operating so that filled online redo log groups are being archived, and the only copy of an archived redo log file is damaged, it does not affect the present operation of the database. However, the following situations can arise if media recovery is required in the future:
- If all datafiles have been backed up after the filled online redo log group (which is now archived) was written, the archived version of the filled online redo log group is not required for complete media recovery operation.
- Assume the most recent backup file of a datafile was taken before the filled online redo log group was written. The group now corresponds to the damaged archived redo log file. At some future point, the corresponding datafile is damaged by a permanent media failure. The most recent backup of the damaged datafile must be used, and incomplete media recovery can only recover the database up to the damaged archived redo log file.
- If time-based recovery is needed, the damaged archived redo log file may be required if you use old datafile backups that were taken before the original online redo log group was written. In this case, the incomplete media recovery can only recover the database up to the damaged archived redo log group.
Warning: If you know that an archived redo log group has been damaged, immediately backup all datafiles so that you will have a complete backup that does not require the damaged archived redo log.
Loss of Control Files
If a media failure has affected the control files of a database (whether control files are mirrored or not), the database continues to run until the first time that an Oracle background process needs to access the control files. At this point, the database and instance are automatically shut down.
If the media failure is temporary and the database has not yet shut down, immediately correcting the media failure can avoid the automatic shut down of the database. However, if the database shuts down before the temporary media failure is corrected, you can restart the database after fixing the problem (and restoring access to the control files).
The appropriate recovery procedure for media failures that permanently prevent access to control files of a database depends on whether you have mirrored the control files. The following sections describe the appropriate procedures.
Loss of a Member of a Mirrored Control File
Use the following steps to recover a database after one or more control files of a database have been damaged by a permanent media failure, and at least one control file has not been damaged by the media failure:
Note: If all control files of a mirrored control file configuration have been damaged, follow the instructions for recovering from the loss of non-mirrored control files.
To Recover a Database After Control Files Are Damaged
1. If the instance is still running, immediately abort the current instance with the Server Manager Shutdown Abort option of the Shutdown Database dialog box, or the SHUTDOWN command with the ABORT option.
2. Correct the hardware problem that caused the media failure. If the hardware problem cannot be repaired quickly, you can proceed with database recovery by restoring damaged control files to an alternative storage device.
3. Use an intact copy of the database's control file to copy over the damaged control files. If possible, copy the intact control file to the original locations of all damaged control files. If the hardware problem persists, copy the intact control file to alternative locations. If you restored all damaged control files to their original location, proceed to Step 5. If all damaged control files were not restored, or not restored to their original location, proceed to Step 4.
4. If all damaged control files were not restored, or not restored to their original location in Step 3, the parameter file of the database must be edited so that the CONTROL_FILES parameter reflects the current locations of all control files and excludes all control files that were not restored.
5. Start a new instance. Mount and open the database.
Loss of All Copies of the Current Control File
If all control files of a database have been lost or damaged by a permanent media failure, but all online redo logfiles remain intact, you can recover by creating a new control file (using the CREATE CONTROLFILE command with the NORESETLOGS option). Then execute RECOVER DATABASE followed by ALTER DATABASE OPEN.
Depending on the existence and currency of a control file backup, you have the following options for generating the text of the CREATE CONTROLFILE command:
- If you have executed ALTER DATABASE BACKUP CONTROLFILE TO TRACE NORESETLOGS since you made the last structural change to the database, and if you have saved the SQL command output, then you can use the CREATE CONTROLFILE command from the output as-is. If, however, your most recent execution of ALTER DATABASE BACKUP CONTROLFILE TO TRACE was performed before you made a structural change to the database, then you must edit the output of ALTER DATABASE BACKUP CONTROLFILE TO TRACE to reflect that structural change. For example, if you recently added a datafile to the database, then you should add that datafile to the DATAFILE clause of the CREATE CONTROLFILE command.
- If you have not backed up the control file using the TO TRACE option, but instead have used the TO filename option of the ALTER DATABASE BACKUP CONTROLFILE command, then you can use the control file copy to obtain SQL command output. You can do this by copying the backup control file and executing STARTUP MOUNT before executing ALTER DATABASE BACKUP CONTROLFILE TO TRACE NORESETLOGS. If your control file copy predated a recent structural change, you must edit the TO TRACE output to reflect that structural change.
- If you do not have a backup of the control file (in either TO TRACE format or TO filename format), then you must generate the CREATE CONTROLFILE command manually.
Recovery From User Errors
An accidental or erroneous operational or programmatic change to the database can cause loss or corruption of data. Recovery may require a return to a state prior to the error.
Note: If the database administrator has properly granted powerful privileges (such as DROP ANY TABLE) to only selected, appropriate users, user errors that require database recovery are minimized.
To Recover Data Lost or Corrupted by User Error
1. Back up the existing, intact database.
2. Leave the existing database intact, but reconstruct a temporary copy of the database up to the time of the user error using time-based recovery.
3. Export the lost or corrupted data from the reconstructed, temporary copy of the database.
4. Import the lost or corrupted data into the permanent database.
5. Delete the files associated with the temporary copy of the reconstructed database to conserve disk space.
The following scenario describes how to recover a table that has been accidentally dropped.
1. The database that experienced the user error can remain online and available for normal use. The database can remain open or be shut down. Back up all datafiles of the existing database in case an error is made during the remaining steps of this procedure.
2. Create a temporary copy of the database to a past point-in-time using time-based recovery. Be careful not to cause a conflict with the existing control file of the permanent database. Restore a single control file backup to an alternative location (step 4) and edit the parameter file, as necessary, or create a new control file at the alternative location. Also, restore all datafiles to alternative locations (step 5) so that you do not affect the permanent copy of the database.
3. Export the lost data using the Oracle utility Export from the temporary, restored version of the database. In this case, export the accidentally dropped table.
Note: System audit options are exported.
4. Import the exported data (step 3) into the permanent copy of the database using the Oracle Import utility.
5. Delete the files of the temporary, reconstructed copy of the database to conserve space.
See Also: For more information about the Import and Export utilities, see Oracle7 Server Utilities.