Recovery Procedures for Oracle Tablespace Failures: Recovery Flow Using RMAN

English

In Oracle Database operations, if a failure occurs in a specific tablespace, it is possible to recover only that tablespace without shutting down the entire database. This article explains the recovery procedures for when a failure occurs in a specific tablespace or data file using Oracle Recovery Manager (RMAN).

1. Identifying Failures and Checking Current Status

When a failure occurs, first identify which tablespace has the issue.

Checking Tablespace and Data File Status

Execute the following SQL to check the status of tablespaces and data files:

-- Check tablespace status
SELECT tablespace_name, status FROM dba_tablespaces;

-- Check data file status (e.g., for USERS tablespace)
SELECT name, status FROM v$datafile WHERE tablespace_name = 'USERS';

If the status is RECOVER or shows an abnormal value, recovery operations are necessary. Additionally, you can reliably identify the presence of corrupted blocks by performing a validation in RMAN as follows:

RMAN> VALIDATE DATABASE;

2. Tablespace Recovery Procedures (Online Recovery)

If a tablespace is corrupted, perform a restore and recovery after taking the target tablespace offline. This allows for recovery without stopping services that utilize other tablespaces.

Procedural Steps

  1. Take Tablespace Offline: ALTER TABLESPACE <tablespace_name> OFFLINE;
  2. Restore: RESTORE TABLESPACE <tablespace_name>;
  3. Recover: RECOVER TABLESPACE <tablespace_name>;
  4. Bring Online: ALTER TABLESPACE <tablespace_name> ONLINE;

Execution Example (Recovery of USERS Tablespace)

If an error occurs while running RMAN, take the tablespace offline via SQL first, then proceed with the restore.

# 1. Change USERS tablespace to OFFLINE
ALTER TABLESPACE USERS OFFLINE;

# 2. Execute restore
RMAN> RESTORE TABLESPACE USERS;

# 3. Execute recovery
RMAN> RECOVER TABLESPACE USERS;

# 4. Bring back online to complete recovery
ALTER TABLESPACE USERS ONLINE;

[Execution Trace Example]

RMAN> select status from v$instance;

using target database control file instead of recovery catalog
STATUS
------------
OPEN

RMAN> validate database;

Starting validate at 25-03-17
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=74 device type=DISK
channel ORA_DISK_1: starting validation of datafile
channel ORA_DISK_1: specifying datafile(s) for validation
input datafile file number=00001 name=/u01/app/oracle/oradata/V19/system01.dbf
input datafile file number=00003 name=/u01/app/oracle/oradata/V19/sysaux01.dbf
input datafile file number=00004 name=/u01/app/oracle/oradata/V19/undotbs01.dbf
input datafile file number=00007 name=/u01/app/oracle/oradata/V19/users01.dbf
channel ORA_DISK_1: validation complete, elapsed time: 00:00:07
List of Datafiles
=================
File Status Marked Corrupt Empty Blocks Blocks Examined High SCN
---- ------ -------------- ------------ --------------- ----------
1    OK     0              18147        149764          2274304
  File Name: /u01/app/oracle/oradata/V19/system01.dbf
  ...
File Status Marked Corrupt Empty Blocks Blocks Examined High SCN
---- ------ -------------- ------------ --------------- ----------
7    FAILED 0              725          9121            2274262
  File Name: /u01/app/oracle/oradata/V19/users01.dbf
  ...

One or more corrupted blocks were found during validation
Please refer to trace file /u01/app/oracle/diag/rdbms/v19/v19/trace/v19_ora_3136.trc for details
...
Finished validate at 25-03-17

RMAN> restore tablespace users;  ★Error occurred

Starting restore at 25-03-17
using channel ORA_DISK_1

channel ORA_DISK_1: starting datafile backup set restore
...
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of restore command at 03/17/2025 20:54:37
ORA-19870: error while restoring backup piece /u01/app/oracle/product/19.0.0/dbhome_1/dbs/013kjnp8_1_1_1
ORA-19573: cannot obtain exclusive enqueue for datafile 7.
ORA-19890: datafile is already in use

RMAN> alter tablespace users offline;  ★Change USERS tablespace to OFFLINE

Statement processed

RMAN> restore tablespace users;  ★Restore

...
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:00:01
Finished restore at 25-03-17

RMAN> recover tablespace users;  ★Recover

...
Starting media recovery
media recovery complete, elapsed time: 00:00:03
Finished recover at 25-03-17

RMAN> alter tablespace users online;  ★Change to ONLINE

Statement processed

3. Data File-Level Recovery

If a specific data file is corrupted, recovery at the data file level is also possible.

# Restore/Recover at data file level
RMAN> RESTORE DATAFILE '/u01/oradata/ORCL/users01.dbf';
RMAN> RECOVER DATAFILE '/u01/oradata/ORCL/users01.dbf';

# Bring back online
ALTER DATABASE DATAFILE '/u01/oradata/ORCL/users01.dbf' ONLINE;

4. Handling Failures in Special Areas (SYSTEM/UNDO)

Corruption of the SYSTEM Tablespace

Since this is directly linked to database startup, recovery in mount mode is required.

RMAN> STARTUP MOUNT;
RMAN> RESTORE TABLESPACE SYSTEM;
RMAN> RECOVER TABLESPACE SYSTEM;
RMAN> ALTER DATABASE OPEN;

Corruption of the UNDO Tablespace

The quickest recovery method is to create a new UNDO tablespace and switch to it.

-- Create new UNDO tablespace
CREATE UNDO TABLESPACE UNDO_TBS2 DATAFILE '/u01/oradata/ORCL/undotbs02.dbf' SIZE 500M AUTOEXTEND ON;

-- Switch to new tablespace
ALTER SYSTEM SET UNDO_TABLESPACE = UNDO_TBS2;

5. Preventive Measures and Best Practices

Operational points to avoid panic during failures:

  • Regular Backups: Periodically execute RMAN> BACKUP DATABASE PLUS ARCHIVELOG; and keep the latest backups.
  • Configure Fast Recovery Area (FRA): Automates and streamlines the management of archived logs and backups.SQLALTER SYSTEM SET DB_RECOVERY_FILE_DEST_SIZE = 10G; ALTER SYSTEM SET DB_RECOVERY_FILE_DEST = '/u02/fra';
  • Log Monitoring: Automate periodic checks of alert logs to detect failures at an early stage.

6. FAQ

Q1: Does the entire database stop during tablespace recovery? A1: No. By taking only the failed tablespace offline, you can continue operations with other tablespaces remaining online.

Q2: What should I do if an ORA-19890 error occurs? A2: This indicates the data file is in use. First, take the relevant tablespace offline using ALTER TABLESPACE <name> OFFLINE; and then retry the restore.

Q3: What is the recovery procedure if the SYSTEM tablespace is corrupted? A3: Since the SYSTEM tablespace is a required area, you must start the database in mount mode and execute the RESTORE and RECOVER commands using RMAN.

7. Summary

If a failure occurs in a specific tablespace, you can recover quickly by implementing appropriate recovery procedures.

  • Tablespace-level Recovery: Minimizes service impact by taking the tablespace offline.
  • Utilization of RMAN: Allows efficient execution of restore and recovery.
  • Daily Preparation: Regular backups and log monitoring are essential.

It is highly recommended that you master and verify these recovery procedures to be prepared for potential failures.

This article targets Oracle Database 19c (screens and default values may differ in other versions).

[reference]
Oracle Database Backup and Recovery Reference, 19c

How to Perform Complete Recovery with RMAN up to the Point of Failure (with Diagram & RESETLOGS Explained)
In Oracle Database operations, data file corruption, accidental deletion, or disk failure can occur at any time.When suc…

コメント

Copied title and URL