Reducing Oracle Redo Copy Latch Contention

What is the Redo Copy Latch?

A process must hold a redo copy latch while it copies a redo entry into the log buffer. Holding the latch indicates that a copy is in progress, so LGWR waits until the copy has finished before writing the target log buffer blocks to disk.

Most gets against the redo copy latches are in no-wait mode because the process can use any redo copy latch to protect its write into the log buffer. The process first tries to get the last latch it held. If that fails, it attempts to get the next redo copy latch, again in no-wait mode. Only if it fails to get any of the copy latches does the process finally use willing-to-wait mode.

This is why V$LATCH usually shows many more IMMEDIATE_GETS and IMMEDIATE_MISSES than GETS and MISSES. Willing-to-wait mode is a last resort for Oracle on this latch.

The example below shows the miss percentages on the redo copy latch for both willing-to-wait and no-wait gets:

col name format a15
col WILLING_TO_WAIT format 999.99
col NO_WAIT format 999.99

-- NULLIF avoids a divide-by-zero error when a counter is still 0
select
       name,
       gets, misses, (misses / nullif(gets, 0)) * 100 WILLING_TO_WAIT,
       immediate_gets, immediate_misses,
       (immediate_misses / nullif(immediate_gets, 0)) * 100 NO_WAIT
from   v$latch
where  name = 'redo copy'
/
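
Because a process can use any of the copy latches, it can also help to look at each one individually. The following is only a sketch; it assumes that V$LATCH_CHILDREN is accessible to your user and that the redo copy latches are listed there as child latches:

-- one row per redo copy latch (child latch)
select child#, gets, misses, immediate_gets, immediate_misses
from   v$latch_children
where  name = 'redo copy'
order by child#
/

If the immediate misses are concentrated on a few children rather than spread evenly, that is worth noting before drawing conclusions from the totals in V$LATCH.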

Resolving Contention

If you do see a large number of gets as opposed to immediate gets for the redo copy latch, this may suggest that the redo log files are too small and log switches are occurring too frequently. A checkpoint occurs at every log file switch, and the redo log buffers are written to the redo log files.

One indicator that this is the case is the presence of “Checkpoint not complete” messages in the alert log.

Another way to check whether checkpoints are keeping up is the following query, which shows both the background checkpoints started and the background checkpoints completed. The two values should ideally be equal, although a difference of one is normal while a checkpoint is in progress. A larger difference means that a checkpoint was started before the previous checkpoint had a chance to complete.

select *
from v$sysstat
where name like 'background checkpoint%'
/
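
To make the comparison explicit, the two statistics can be placed side by side with their difference. This is a minimal sketch; the column aliases STARTED, COMPLETED and OUTSTANDING are illustrative names only:

-- pivot the two statistics into one row and subtract them
select max(decode(name, 'background checkpoints started', value)) started,
       max(decode(name, 'background checkpoints completed', value)) completed,
       max(decode(name, 'background checkpoints started', value))
     - max(decode(name, 'background checkpoints completed', value)) outstanding
from   v$sysstat
where  name in ('background checkpoints started',
                'background checkpoints completed')
/

An OUTSTANDING value that stays above 1 over repeated runs suggests checkpoints are not completing before the next one starts.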

To find the size of the redo log files and how frequently the log switch is occurring use:

Col member format a50
Col bytes format 999,999,999

select a.group#, b.member, a.bytes, a.first_time, a.status
from v$log a, v$logfile b
where a.group# = b.group#
/
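
The query above shows only when each current log group was first written. For a rough view of how often switches occur, the switch history can be counted per hour. This is a sketch, assuming V$LOG_HISTORY still holds entries for the last 24 hours; the alias names are illustrative:

-- number of log switches per hour over the last day
select to_char(trunc(first_time, 'HH24'), 'YYYY-MM-DD HH24') switch_hour,
       count(*) switches
from   v$log_history
where  first_time > sysdate - 1
group by to_char(trunc(first_time, 'HH24'), 'YYYY-MM-DD HH24')
order by 1
/

Hours with many switches are the ones to compare against the transaction volume when choosing a new log file size.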

If there is buffer write thrashing caused by too frequent log file switching, the redo logs should be increased in size to handle the transaction volumes. The redo logs can be resized with the database online. The following example resizes the redo log files to 100M and assumes that groups 1-3 already exist.

Add more log groups temporarily:

Alter database
Add logfile group 4
('/ot2_01/oradata/questd/redo04a.log',
   '/ot2_01/oradata/questd/redo04b.log')
size 100M
/

Alter database
Add logfile group 5
('/ot2_01/oradata/questd/redo05a.log',
   '/ot2_01/oradata/questd/redo05b.log')
size 100M
/

Alter database
Add logfile group 6
('/ot2_01/oradata/questd/redo06a.log',
   '/ot2_01/oradata/questd/redo06b.log')
size 100M
/

Switch logfile to make one of the new groups the current one:

alter system switch logfile
/
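
A group can be dropped only when it is no longer the current group and no longer needed for crash recovery. As a sketch of how to confirm this before dropping, a checkpoint can be forced and the group status checked:

-- force a checkpoint so that ACTIVE groups can become INACTIVE
alter system checkpoint
/

-- groups to be dropped should show INACTIVE (or UNUSED)
select group#, status
from   v$log
order by group#
/

If one of the groups to be dropped still shows CURRENT or ACTIVE, switch the log file or wait and run the status query again.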

Drop the original groups once they are inactive:

Alter database drop logfile group 1
/

Alter database drop logfile group 2
/

Alter database drop logfile group 3
/

Remove the old log files:

!rm /ot2_01/oradata/questd/redo01?.log
!rm /ot2_01/oradata/questd/redo02?.log
!rm /ot2_01/oradata/questd/redo03?.log

Re-add the log files with a larger size:

Alter database
Add logfile group 1
('/ot2_01/oradata/questd/redo01a.log',
   '/ot2_01/oradata/questd/redo01b.log')
size 100M
/

Alter database
Add logfile group 2
('/ot2_01/oradata/questd/redo02a.log',
   '/ot2_01/oradata/questd/redo02b.log')
size 100M
/

Alter database
Add logfile group 3
('/ot2_01/oradata/questd/redo03a.log',
   '/ot2_01/oradata/questd/redo03b.log')
size 100M
/

Switch the log file so that the current group is not one that is to be dropped (repeat the switch if the current group is still 4, 5 or 6):

alter system switch logfile
/

Drop the temporary groups:

Alter database drop logfile group 4
/

Alter database drop logfile group 5
/

Alter database drop logfile group 6
/

Prior to 8i, the initialization parameters LOG_SIMULTANEOUS_COPIES and LOG_SMALL_ENTRY_MAX_SIZE were used to tune access to the redo copy latch. Neither is available as a documented parameter from 8i onward.

LOG_SIMULTANEOUS_COPIES set the number of redo copy latches available on a multiple-CPU server. To tune for a large number of immediate misses, it was beneficial to add more redo copy latches.

Beginning in Oracle8i, LOG_SMALL_ENTRY_MAX_SIZE was eliminated, and LOG_SIMULTANEOUS_COPIES became an undocumented parameter (_LOG_SIMULTANEOUS_COPIES) that defaults to twice the number of CPUs (CPU_COUNT). It could be increased if there is a high number of immediate misses on the redo copy latch, but this kind of latch contention is rare. The default setting is usually adequate and should not be altered unless advised to do so by Oracle Support.
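
To look at the current value without changing it, the hidden parameter can be read from the X$ tables. This is a sketch only: it assumes a connection as SYS, that the hidden form of the parameter is named _log_simultaneous_copies, and that the X$KSPPI and X$KSPPCV structures are laid out as they commonly are; none of this is documented by Oracle and it may differ between versions.

-- read-only look at the hidden parameter; run as SYS
select a.ksppinm name, b.ksppstvl value
from   x$ksppi a, x$ksppcv b
where  a.indx = b.indx
and    a.ksppinm = '_log_simultaneous_copies'
/

The value returned can be compared with the number of 'redo copy' rows in V$LATCH_CHILDREN.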
