Monday, 19 September 2016

RECOVER ORACLE DATAFILES WITH NO DOWNTIME


What will happen if dbf file accidentally deleted when database is still open and how to recover it?

And Lots of time interviewer asking this question (So first method is here and 2nd method is restore RMAN backup). Anyways come to the point and here we go.....

On Unix/Linux, when a file is deleted, but a process still has the file open, the file is still there in the filesystem, and only the inode is removed.
But the process can continue to use its file handle, and the file can also be accessible under /proc/<pid>/fd .

In the following example, we use that behavior to recover a lost datafile after is has been dropped from the os (with rm) but the datafile is still open by the background processes.

First, we create a tablespace, and populate a table in it.

----------------------------
Create a tablespace: 
----------------------------

SQL> create tablespace TEST datafile '/u01/app/oracle/oradata/orcl/test01.dbf' size 10M;
Tablespace created.

-------------------------------------
Create a table in tablespace: 
-------------------------------------

SQL> create table ORATAB tablespace TEST as select * from dba_objects;
Table created.

----------------------------------------------
Check that table data is accessible: 
----------------------------------------------

SQL> select count(*) from ORATAB;

  COUNT(*)
----------
     71902

SQL> exit

Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production

-----------------------------------------------------------
Then, we remove the datafile from unix CLI.
-----------------------------------------------------------
here is the datafile 

ls -l /u01/app/oracle/oradata/orcl/test_rm.dbf
-rw-r—– 1 oracle dba 10493952 Sep 19 15:21 24 /u01/app/oracle/oradata/orcl/test01.dbf

we ‘accidently’ remove the datafile 

rm /u01/app/oracle/oradata/orcl/test01.dbf

ls -l /u01/app/oracle/oradata/orcl/test01.dbf

ls: /u01/app/oracle/oradata/orcl/test01.dbf: no such file or directory

--------------------------------
Here the datafile is lost.
Now we connect again.
-------------------------------

sqlplus / as sysdba

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production

-------------------------------------------------------------------------------------------
Check if table data is accessible & you getting below error some times: 
-------------------------------------------------------------------------------------------

SQL> select count(*) from ORATAB;

select * from ORATAB
*
ERROR at line 1:
ORA-01116: error in opening database file 5
ORA-01110: data file 5: '/u01/app/oracle/oradata/orcl/test01.dbf'
ORA-27041: unable to open file
Linux Error: 2: No such file or directory
Additional information: 3

--------------------------------------------------------------
The datafile is lost and data is not accessible.
--------------------------------------------------------------

--------------------------------------------------------------------------------------------------------------------------
However, the datafile should still have an open file descriptor by an oracle background process
--------------------------------------------------------------------------------------------------------------------------

Check the dbwriter pid: 

ps -edf | grep dbw

[oracle@host01 ~]$ ps -edf | grep dbw
oracle    6350     1  0 13:09 ?        00:00:01 ora_dbw0_orcl
oracle    8237  8212  0 15:20 pts/1    00:00:00 grep dbw


List the deleted file handles for that DBWRITER process.
ls -l /proc/_PID_OF_DBWRITER/fd | grep deleted

or

Check its opened file descriptors for our file: 
[oracle@host01 ~]$ ls -l /proc/6350/fd | grep test
lrwx------ 1 oracle oinstall 64 Sep 19 15:21 24 -> /u01/app/oracle/oradata/orcl/test01.dbf (deleted)

here it is: 
[oracle@host01 ~]$ ls -l /proc/6350/fd/24
lrwx------ 1 oracle oinstall 64 Sep 19 15:21 /proc/6350/fd/24 -> /u01/app/oracle/oradata/orcl/test01.dbf (deleted)

In some other unix, lsof may be needed to map the file descriptor with the deleted file name

first we set a symbolic link so that oracle can see it as it was before the delete:
    ln -s /proc/6350/fd/24 -> /u01/app/oracle/oradata/orcl/test01.dbf

here data is accessible, but that will be lost if dbwriter closes it file handle (i.e if the database is closed)

----------------------------------------------------------------------------------------------------------------------------
However we can now set the tablespace read only so that it is checkpointed, and no writes occurs on it.
----------------------------------------------------------------------------------------------------------------------------
SQL> alter tablespace TEST read only;
Tablespace altered.

Now copy the file safely.

then we drop the symbolic link: 
     rm /u01/app/oracle/oradata/orcl/test01.dbf
     ls -l /u01/app/oracle/oradata/orcl/test01.dbf

     ls: /u01/app/oracle/oradata/orcl/test.dbf: No such file or directory

and we can now copy the file 
    cp -p /proc/6350/fd/24 /u01/app/oracle/oradata/orcl/test01.dbf
    ls -l /u01/app/oracle/oradata/orcl/test01.dbf

-rw-r—– 1 oracle dba 10493952 Sep 19 14:54 /u01/app/oracle/oradata/orcl/test01.dbf

And datafile is now available again.

-----------------------------------------------------------------------
We have it back, lets put the tablespace back in read/write 
------------------------------------------------------------------------
SQL> alter tablespace test_rm read write;
Tablespace altered.

-------------------------------
Check data is still there: 
-------------------------------
SQL> select count(*) from ORATAB;

  COUNT(*)
----------
     71902

This is not to be used like that in production. This is unsupported and may behave differently on different unix/linux or oracle flavours.

No comments: