Data block recovering process using Normal Redundancy

I have heard many times from many DBAs that when a data block corruption happen in a disk which is part of a Diskgroup using normal redundancy, only the mirror block will take care of the coming database read operations but if another corruption happens in the mirror block the rows of the object saved in that block will not be accessible anymore. Then we have to restore the object via export/import or using another method.

Note: It is always possible to skip the corrupted data block using DBMS_REPAIR.SKIP_CORRUPT_BLOCKS.

When I hear that, I always have to correct them and tell them that that is only a myth because ASM doesn’t work that way and ASM is more intelligent that they think. I usually explain them how ASM recover the data block corrupted and how does it work after the recovering. All those things were the main aim that I had in order to write this article, explaining through examples and at the end give a conclusion about this recovering behavior.

The myth:

“When the primary data block get corrupted the secondary data block becomes in the primary data block but if that data block get corrupted the rows in it will not be accessible anymore”.

In order to explain the recovering behavior, let me tell you more about my environment:

I have a normal redundancy diskgroup named “DATA”, its block size is 4K and its AU is 1MB.

SQL> select name, allocation_unit_size, block_size, type from v$asm_diskgroup;
NAME                            ALLOCATION_UNIT_SIZE   BLOCK_SIZE  TYPE
------------------------------ ---------------------- ------------ ------
DATA                            1048576                4096         NORMAL

Let’s create a table with two rows:

SQL> create table dgomez(id number primary key, value varchar2(20)); 
Table created.
SQL> insert into dgomez values (1,'deiby');
1 row created.
SQL> insert into dgomez values (2,'gomez');
1 row created.
SQL> commit;

Now, we are going to identify the blocks where these rows are saved, once the blocks are identified we will damage a block.

The first thing is to identify the ASM file where the table ‘DGOMEZ’ is saved.

SQL> select b.file_id, b.file_name from dba_segments a, dba_data_files b where a.header_file=b.file_id and segment_name='DGOMEZ';

   FILE_ID FILE_NAME
---------- --------------------------------------
       4 +DATA/orcl/datafile/users.263.850528663

As you can see, I am using Oracle Managed File (OMF), so as per the OMF documentation, every ASM file name should be in the following format:

+diskgroup_name/database_name/database_file_type/tag_name.file_number.incarnation

Can you guess in which ASM file our table is saved? Yes, you’re right, it is the ASM file 263. Also see that the Database file number is 4.

We identified the ASM file but now we have to identify in which AU our rows are saved. Let’s do the following in order to get that information:

SQL> select rowid, value from dgomez.dgomez;
ROWID              VALUE
------------------ --------------------
AAADW0AAEAAAACGAAA deiby
AAADW0AAEAAAACGAAB gomez
 
SQL> select DBMS_ROWID.ROWID_BLOCK_NUMBER('AAADW0AAEAAAACGAAA') "Block number" from DUAL; 
Block number
------------
       134

Now we know that our row “Deiby” is saved in the database block number 134 in the ASM file number 263. But remember, we are talking about a Database block, don’t mix the concepts between Database block and ASM block. You can see the following image and understand how the database structures are mapped to ASM structures.

1-asm_instance.png-550x0

Since the file is saved into ASM we have to map the Database block to an ASM block, in the following steps I will show you how to do it.

Checking the database block size:

SQL> show parameters db_block_size
NAME                        TYPE   VALUE
----------------------- ----------- ------------------------
db_block_size               integer      8192

So far, everything is well. Let’s stop here for a while, please stand up and go for a drink because this may be a little difficult since we will treat with a lot of number. If you like hamburgers like me then you can also have one.

Database Block      =         8K
Allocation Unit (AU)            =         1024K

How many Database blocks are in an AU?

1024k/8k=128 database blocks.

Do you remember in which database block our row is saved? It is the database block number 134.

So if every AU has 128 database blocks, our database block number 134 should be in the second AU of same disk.

Exactly in the 8k-block number 6 of the second AU: 134-128=6.

We have identified that our row is in the second AU but of which disk? That is what we are going to get:

SQL> select DISK_KFFXP "disk #",
PXN_KFFXP "physical extent #", 
XNUM_KFFXP "virtual extent #",
AU_KFFXP "AU number",
decode(LXN_KFFXP,0,'Primary',1,'Secondary','header metadata') “AU type”
from X$KFFXP
where NUMBER_KFFXP=263
and pxn_kffxp!=2147483648
AND GROUP_KFFXP=1
order by 2;  2    3    4    5    6    7    8    9   10 
 
disk #      physical extent #     virtual extent #     AU number   AU type
----------- ---------------------- --------------------- ----------- --------
     0                 0                     0               581      Primary
     1                 0                2147483648           611      Primary
     1                 1                     0               581      Secondary
     0                 1                2147483648           611      Secondary
     1                 2                     1               582      Primary
   65534               2               2147483648        4294967294   header metadata
     0                 3                     1               582      Secondary
     0                 4                     2               583      Primary
    1                 5                     2               583      Secondary
     1                 6                     3               584      Primary
     0                 7                     3               584      Secondary

Further information:

A value of 2147483648 is for the file metadata.
DISK_KFFXP can have the value 65534 when AU not present on physical storage (applies to normal or high redundancy DG).
AU_KFFXP can have the value 4294967294 when AU not present on physical storage because of failure for example (applies to normal or high redundancy DG)

Let’s analyze the information. We don’t have all the information in only one disk, this is because of the rebalance process. I have highlighted in yellow the first two primary AUs and also the two first secondary AUs.

Why we need to know which AU is primary and which AU is secondary?

The right concept:

"ASM always use the primary AU for to read data. If the primary AU is corrupted then ASM will read the secondary AU. If the secondary AU is well then ASM tries to overwrite the corrupted primary AU using the secondary AU. If the corrupted primary AU is fixed then that AU will be the primary AU as always. If the corrupted primary AU can’t be overwritten then ASM tries to write the new AU to other location in the disk. If that write operation is successfully then that AU will be the new primary AU."

Let’s be sure about this concept. We should damage the second AU of the disk number 1 because that is the primary AU in which our row is saved. We can use the following method in order to find out which OS disk is the ASM disk number 1:

SQL> select disk_number, path from v$asm_disk;
DISK_NUMBER PATH
----------- ------------------------------
        1 /dev/oracleasm/disks/ASMDISK2
        0 /dev/oracleasm/disks/ASMDISK1

Where is my row? Where should I see inside the disk?

8k-blocks in an AU= (1024/8)

The second AU of the disk is the number=582

The row is in the 8k-block number 6 of the 2th AU=6

Then our row is in the 8k-block number: (1024/8)*582+6=74502

Reading the block:

Primary AU:

[grid@a1 ~]$ dd if=/dev/oracleasm/disks/ASMDISK2 bs=8k count=1 skip=74502 | od -a
0000000 ack   " nul nul ack nul nul soh dc3   9 etx nul nul nul stx ack
0000020   ;   j nul nul soh nul nul nul   4   5 nul nul dc1   9 etx nul
0000040 nul nul nul nul stx nul   2 nul nul nul nul soh stx nul  ff nul
0000060   A nul nul nul   4   Y   @ nul   ' nul etb nul nul nul nul nul
0000100  vt   9 etx nul soh nul stx nul   > nul nul nul   #   [   @ nul
0000120   # nul   ( nul soh  sp nul nul dc3   9 etx nul nul nul nul nul
0000140 nul nul nul nul nul soh stx nul del del syn nul nul  us   j  us
0000160   j  us nul nul stx nul  ff  us nul  us nul nul nul nul nul nul
0000200 nul nul nul nul nul nul nul nul nul nul nul nul nul nul nul nul
*
0017740 nul nul nul nul   , stx stx stx   A etx enq   g   o   m   e   z
0017760   , nul stx stx   A stx enq   d   e   i   b   y stx ack dc3   9
0020000
1+0 records in
1+0 records out
8192 bytes (8.2 kB) copied, 0.000213514 seconds, 38.4 MB/s
[grid@a1 ~]$

The secondary AU is in the same location but in the ASM disk number 0 (you can confirm this by yourself).

Secondary AU:

[grid@a1 ~]$ dd if=/dev/oracleasm/disks/ASMDISK1 bs=8k count=1 skip=74502 | od -a
1+0 records in
1+0 records out
8192 bytes (8.2 kB) copied, 0.000819341 seconds, 10.0 MB/s
0000000 ack   " nul nul ack nul nul soh dc3   9 etx nul nul nul stx ack
0000020   ;   j nul nul soh nul nul nul   4   5 nul nul dc1   9 etx nul
0000040 nul nul nul nul stx nul   2 nul nul nul nul soh stx nul  ff nul
0000060   A nul nul nul   4   Y   @ nul   ' nul etb nul nul nul nul nul
0000100  vt   9 etx nul soh nul stx nul   > nul nul nul   #   [   @ nul
0000120   # nul   ( nul soh  sp nul nul dc3   9 etx nul nul nul nul nul
0000140 nul nul nul nul nul soh stx nul del del syn nul nul  us   j  us
0000160   j  us nul nul stx nul  ff  us nul  us nul nul nul nul nul nul
0000200 nul nul nul nul nul nul nul nul nul nul nul nul nul nul nul nul
*
0017740 nul nul nul nul   , stx stx stx   A etx enq   g   o   m   e   z
0017760   , nul stx stx   A stx enq   d   e   i   b   y stx ack dc3   9
0020000
[grid@a1 ~]$

Damaging the primary AU:

We are going to damage the primary AU because ASM always read the primary AU. Once the primary AU is corrupted we are going the check our table DGOMEZ.GOMEZ and we will see if the data is accessible. It should be accessible and also the corrupted block shall be fixed because of what we said in the “right concept”.

[grid@a1 ~]$ dd if=/dev/zero bs=8k count=1 seek=74502 of=/dev/oracleasm/disks/ASMDISK2
1+0 records in
1+0 records out
8192 bytes (8.2 kB) copied, 2.9401e-05 seconds, 279 MB/s

We have to be sure that the block is corrupted:

[grid@a1 ~]$ dd if=/dev/oracleasm/disks/ASMDISK2 bs=8k count=1 skip=74502 | od -a
0000000 nul nul nul nul nul nul nul nul nul nul nul nul nul nul nul nul
*
0020000
1+0 records in
1+0 records out
8192 bytes (8.2 kB) copied, 0.000248158 seconds, 33.0 MB/s

Yes, we can be so proud of to have corrupted the block. Let’s see what happened with our data:

[oracle@a1 ~]$ sqlplus / as sysdba
SQL*Plus: Release 11.2.0.4.0 Production on Wed Jun 18 22:57:04 2014
Copyright (c) 1982, 2013, Oracle.  All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, Automatic Storage Management, OLAP, Data Mining
and Real Application Testing options
 
SQL> select * from dgomez.dgomez;
     ID VALUE
---------- --------------------
      1 deiby
      2 gomez

As you can see our data is there. And also the corrupted block shall be fixed, let’s be sure about it:

[grid@a1 ~]$ dd if=/dev/oracleasm/disks/ASMDISK2 bs=8k count=1 skip=74502 | od -a
1+0 records in
1+0 records out
8192 bytes (8.2 kB) copied, 2.7613e-05 seconds, 297 MB/s
0000000 nul nul nul nul nul nul nul nul nul nul nul nul nul nul nul nul
*
0020000

Awesome, our block has not been fixed. Did we forget something? Let’s think for a minute…

What about if the data blocks were in memory? Perhaps that is what we forget.

SQL> ALTER SYSTEM FLUSH BUFFER_CACHE; 
System altered.
 
Let’s try again:
 
SQL> select * from dgomez.dgomez;
     ID VALUE
---------- --------------------
      1 deiby
      2 gomez
SQL>

Let’s better see the log in order to be sure:

[oracle@a1 trace]$ tail -10 /u01/app/oracle/diag/rdbms/orcl/orcl/trace/alert_orcl.log
Wed Jun 18 22:57:57 2014
ALTER SYSTEM: Flushing buffer cache
Hex dump of (file 4, block 134) in trace file /u01/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_3712.trc
Corrupt block relative dba: 0x01000086 (file 4, block 134)
Completely zero block found during multiblock buffer read
Reading datafile '+DATA/orcl/datafile/users.263.850528663' for corruption at rdba: 0x01000086 (file 4, block 134)
Read datafile mirror 'DATA_0001' (file 4, block 134) found same corrupt data (no logical check)
Read datafile mirror 'DATA_0000' (file 4, block 134) found valid data
Hex dump of (file 4, block 134) in trace file /u01/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_3712.trc
Repaired corruption at (file 4, block 134)

It seems like the corrupted block was fixed. I can’t believe it! Is ASM as intelligent as we said? Let’s see:

[grid@a1 ~]$ dd if=/dev/oracleasm/disks/ASMDISK2 bs=8k count=1 skip=74502 | od -a
0000000 ack   " nul nul ack nul nul soh dc3   9 etx nul nul nul stx ack
0000020   ;   j nul nul soh nul nul nul   4   5 nul nul dc1   9 etx nul
0000040 nul nul nul nul stx nul   2 nul nul nul nul soh stx nul  ff nul
0000060   A nul nul nul   4   Y   @ nul   ' nul etb nul nul nul nul nul
0000100  vt   9 etx nul soh nul stx nul   > nul nul nul   #   [   @ nul
0000120   # nul   ( nul soh  sp nul nul dc3   9 etx nul nul nul nul nul
0000140 nul nul nul nul nul soh stx nul del del syn nul nul  us   j  us
0000160   j  us nul nul stx nul  ff  us nul  us nul nul nul nul nul nul
0000200 nul nul nul nul nul nul nul nul nul nul nul nul nul nul nul nul
*
0017740 nul nul nul nul   , stx stx stx   A etx enq   g   o   m   e   z
0017760   , nul stx stx   A stx enq   d   e   i   b   y stx ack dc3   9
0020000
1+0 records in
1+0 records out
8192 bytes (8.2 kB) copied, 0.000505269 seconds, 16.2 MB/s
[grid@a1 ~]$

So far everything is perfect. We have to confirm one thing more, does the primary AU is still the primary one?

SQL> select DISK_KFFXP "disk #",
PXN_KFFXP "physical extent #", 
XNUM_KFFXP "virtual extent #",
AU_KFFXP "AU number",
decode(LXN_KFFXP,0,'Primary',1,'Secondary','header metadata') “AU type”
from X$KFFXP
where NUMBER_KFFXP=263
and pxn_kffxp!=2147483648
AND GROUP_KFFXP=1
order by 2;  2    3    4    5    6    7    8    9   10 
 
disk # physical extent # virtual extent #  AU number AU type
----------- ---------------------- --------------------- ---------- -----
        0               0                 0    581 Primary
        1               0          2147483648  611 Primary
        1               1                 0    581 Secondary
        0               1          2147483648  611 Secondary
        1               2                 1    582 Primary
      65534            2          2147483648 4294967294 header metadata
        0               3                 1    582 Secondary
        0               4                 2    583 Primary
        1               5                 2    583 Secondary
        1               6                 3    584 Primary
        0               7                 3    584 Secondary

As you can see everything is at the same way. It seems like nothing happened. Yes, sometimes ASM is very cool.

Thanks for reading this article. Before finishing this article let’s give a conclusion:

ASM recovers automatically the primary AU if it is corrupted.
The secondary AU always will be secondary unless a disk fail occurs.
The secondary AU is used for recovering the primary AU.
If ASM can’t overwrite the primary AU it will write the new primary AU in other disk part.
ASM writes an entry in the alert log when a recovering process occurs.