This article is focused on how to monitor and troubleshoot the Data Guard when managing with Broker. There are many ways to monitor Data Guard such as using Generic/Traditional views, Using Grid control/Cloud control and on top of that Data Guard can be monitored using Data Guard Broker utility DGMGRL. This article is completely based on Data Guard broker and how to deal with it and it covers even 12c new features of Data Guard .
Data Guard Broker
To create or manage Data Guard(physical or logical or snapshot standby) the broker is not an mandatory to use, but it gives lot of comfort in maintenance and administration such as switchover or failover with single line commands instead of executing multiple commands from the both primary and standby sites.
Data Guard broker has its own background process called DMON on each site of Primary and Standby database(s). This process will start working with the influence of DG_BROKER_START to TRUE. If you want to start managing Data Guard with broker you have to create configuration by mentioning what are the members of the Data Guard configuration. This configuration is maintained in the configuration files which are available in each site of the default location of "DG_BROKER_CONFIG_FILE_n", these configuration files will be having all the information of Data Guard configuration. In order to investigate any issues around Data Guard broker, it has its own log files we can find them in same diag destination where the alert log is located with the file name drc$ORACLE_SID.log and you can find this process in many posts/blogs , I would like to provide brief information on this as am going to deal with Data Guard broker throughout the article.
It is very highly recommended to manage Data Guard with Data Guard broker if enabled in order to avoid misconception in configuration and parameters.
Monitoring and Troubleshooting
I would like to start with a question before discussing about on how to monitor and toubleshooting Data Guard using Broker.
Question) If customer asked you to verify the configuration of the Data Guard when using generic method (SQL Plus) , What are the things you have to review?
Answer) When it comes to Data Guard, for every DBA the very first thing is to check is there any LAG on Standby database with Primary , For that we have to use several views and they need to run on primary and standby databases in order to justify that my Data Guard configuration is working fine. But when it comes to Data Guard Broker the life becomes easy, Data Guard can be managed with a single line command, i.e.
DGMGRL> show configuration;
Configuration – ckpt12c
Protection Mode: MaxPerformance
Databases:
canada – Primary database
canfar – Far Sync
india – Physical standby database
Fast-Start Failover: DISABLED
Configuration Status:
SUCCESS
DGMGRL>
So the configuration is pretty good and no issues found from either Primary or Standby or Far Sync Instance(12c) and you can judge simply by checking the status whether configuration status is in SUCCESS or ERROR. if i want to check the LAG between Primary and Standby no need to review the views any more, I can get all the information with a single command again.
DGMGRL> show database india;
Database – india
Role: PHYSICAL STANDBY
Intended State: APPLY-ON
Transport Lag: 0 seconds (computed 0 seconds ago)
Apply Lag: 0 seconds (computed 0 seconds ago)
Apply Rate: 8.81 MByte/s
Real Time Query: ON
Instance(s):
drmcdb
Database Status:
SUCCESS
DGMGRL>
The output illustrates the database role, whether the recovery is in progress or not, LAG rate and whether Standby is running in Real Time Query or not so on. So with Data Guard broker so handy, no need to use the traditional method in order to justify quickly.
Instance Status
If we want check the status of instance instance, we have to use the ORACLE_SID with the commmand and hence we can review brief status with the few configuration settings.
DGMGRL> show instance verbose drmcdb;
Instance 'drmcdb' of database 'india'
Host Name: CKPT-ORA-04
PFILE:
Properties:
StaticConnectIdentifier = '(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=CKPT-ORA-04(PORT=1522))(CONNECT_DATA=(SERVICE_NAME=INDIA_DGMGRL)(INSTANCE_NAME=drmcdb)(SERVER=DEDICATED)))'
StandbyArchiveLocation = 'USE_DB_RECOVERY_FILE_DEST'
AlternateLocation = ''
LogArchiveTrace = '0'
LogArchiveFormat = '%t_%s_%r.dbf'
TopWaitEvents = '(monitor)'
Instance Status:
SUCCESS
DGMGRL>
Send Queue entries
This command shows all the log files that are not successfully not archived on the standby databases (On Primary)
DGMGRL> show database canada SendQEntries
PRIMARY_SEND_QUEUE
STANDBY_NAME STATUS RESETLOGS_ID THREAD LOG_SEQ TIME_GENERATED TIME_COMPLETED FIRST_CHANGE# NEXT_CHANGE# SIZE (KBs)
india ARCHIVED 854451579 1 80 04/23/2015 22:00:40 04/24/2015 13:33:28 2355595 2386718 41995
CURRENT 854451579 1 81 04/24/2015 13:33:28 2386718 18618
Receive Queue entries This command returns the archive log sequences which are received but not applied yet. (On Standby)
DGMGRL> show database india recvqentries
STANDBY_RECEIVE_QUEUE
STATUS RESETLOGS_ID THREAD LOG_SEQ TIME_GENERATED TIME_COMPLETED FIRST_CHANGE# NEXT_CHANGE# SIZE (KBs)
Top Wait Events
Shows the top wait events of long waiting time of the instance, and it can be run either on primary or standby database.
DGMGRL> show database india topwaitevents
TOP SYSTEM WAIT EVENTS
Event Wait Time
rdbms ipc message 968194150
SQL*Net message from client 226616830
PX Idle Wait 203594395
DIAG idle wait 101791332
lreg timer 50899217
DGMGRL>
From 12c we can have Far Sync Instance and Data Guard broker can monitor even Far Sync instance.
DGMGRL> show far_sync canfar;
Far Sync – canfar
Transport Lag: 0 seconds (computed 1 second ago)
Instance(s):
canfar
Far Sync Status:
SUCCESS
DGMGRL>
Redo Transport Status
It returns if any error status of edo transport for the statndby databases which are in ENABLE mode.
DGMGRL> show database canada LogXptStatus;
LOG TRANSPORT STATUS
PRIMARY_INSTANCE_NAME STANDBY_DATABASE_NAME STATUS
mcdb india
DGMGRL>
Status Report It returns overall report of the database with oracle errors (Ex: ORA-12514) and also the severity code.
DGMGRL> show database canada statusreport
STATUS REPORT
INSTANCE_NAME SEVERITY ERROR_TEXT
DGMGRL>
If output is in ERROR status then actual troubleshooting comes into place and you have to view several monitorable properties and they are covered with the troubleshooting Data Guard part which follows.
Test Case : How to deal with Data Guard Broker Errors
So far we are limited to monitoring but when something is gone wrong and resulted the output of the configuration with ERROR, then the real troubleshooting of Data Guard broker starts, In order to demonstrate Troubleshooting Data Guard using a broker I've included with a live example which makes you feel that managing Data Guard broker is easy, whereas many others considered Data Guard broker as lousy.
Usually the error outputs from the Data Guard broker may not straight forward as we see in alert log, So we will see how to deal with them. We can confirm whether the configuration is in good status or not by below command of DGMGRL and it may result with either warning or error depends on the issue, So if we see from below output the configuration status is in Warning status and we can find the errors associated with the Primary, Physical Standby databases.
DGMGRL> show configuration;
Configuration – ckpt12c
Protection Mode: MaxAvailability
Databases:
canada – Primary database
india – Physical standby database
Warning: ORA-16792: configurable property value is inconsistent with database setting
Fast-Start Failover: DISABLED
Configuration Status:
WARNING
DGMGRL>
From the above output there are no errors or warnings from primary database and we can see warning from the standby database. Each DBA could follow a different path of troublshooting, Here in this example we will see how easy it is to diagnosis and fix the issues of Data Guard Broker. In my opinion the best utility to know the error details is "oerr" , we can know the description of error with oerr as follows.
-bash-3.2$ oerr ora 16792
16792, 0000, "configurable property value is inconsistent with database setting"
// *Cause: The values of one or more configurable properties were
// inconsistent with database in-memory settings or server parameter
// file settings. This may happen by directly altering initialization
// parameters instead of editing configurable property values using
// Data Guard broker.
// *Action: Query the InconsistentProperties property on the database or check
// the Data Guard broker log to find which properties are set
// inconsistently. Reset these properties to make them consistent
// with the database settings. Alternatively, enable the database
// or the entire configuration to allow the configurable property
// settings to be propagated to the initialization parameters.
-bash-3.2$
So from the error description it shows “Configurable property value is inconsistent with database settings” , from this we can sense the issue is around inconsistent properties. With the only output of configuration status it’s not easy to identify and fix the things, we may need to review many things to confirm. After reviewing the configuration of whole Data Guard, now we have to review the status of the database which is in trouble, we don’t need to review the status of other databases, because they are in good status.
DGMGRL> show database india
Database – india
Role: PHYSICAL STANDBY
Intended State: APPLY-ON
Transport Lag: 0 seconds (computed 0 seconds ago)
Apply Lag: 0 seconds (computed 0 seconds ago)
Apply Rate: 8.99 MByte/s
Real Time Query: ON
Instance(s):
drmcdb
Warning: ORA-16714: the value of property StandbyFileManagement is inconsistent with the database setting
Database Status:
WARNING
DGMGRL>
-bash-3.2$ oerr ora 16714
16714, 0000, "the value of property %s is inconsistent with the database setting"
// *Cause: The value of the specified configuration property was inconsistent
// with database in-memory settings or server parameter file
// settings. This may be caused by changing an initialization
// parameter that corresponds to a configuration property.
// *Action: Query the InconsistentProperties property on the database to
// determine the which properties are set inconsistently. Reset the
// properties to make them with the database settings.
-bash-3.2$
From the output of standby database, it's clear that issue is around the property StandbyFileManagement and its mismatching with the Database and the configuration file, There are several other commands of DGMGRL which can provide overall status report of the database i.e.
DGMGRL> show database india statusreport
STATUS REPORT
INSTANCE_NAME SEVERITY ERROR_TEXT
drmcdb WARNING ORA-16714: the value of property StandbyFileManagement is inconsistent with the database setting
DGMGRL>
If there are multiple errors with property values then we can query for the "Inconsistentproperties" for the database as
DGMGRL> show database india inconsistentproperties
INCONSISTENT PROPERTIES
INSTANCE_NAME PROPERTY_NAME MEMORY_VALUE SPFILE_VALUE BROKER_VALUE
drmcdb StandbyFileManagement AUTO AUTO MANUAL
DGMGRL>
If we want to review all the property values of standby then we can use "verbose" command associated with show database.
DGMGRL> show database verbose india
Database – india
Role: PHYSICAL STANDBY
Intended State: APPLY-ON
Transport Lag: 0 seconds (computed 0 seconds ago)
Apply Lag: 0 seconds (computed 0 seconds ago)
Apply Rate: 8.98 MByte/s
Real Time Query: ON
Instance(s):
drmcdb
Warning: ORA-16714: the value of property StandbyFileManagement is inconsistent with the database setting
Properties:
DGConnectIdentifier = 'india'
ObserverConnectIdentifier = ''
LogXptMode = 'sync'
RedoRoutes = ''
DelayMins = '0'
Binding = 'optional'
MaxFailure = '0'
MaxConnections = '1'
ReopenSecs = '300'
NetTimeout = '30'
RedoCompression = 'DISABLE'
LogShipping = 'ON'
PreferredApplyInstance = ''
ApplyInstanceTimeout = '0'
ApplyLagThreshold = '0'
TransportLagThreshold = '0'
TransportDisconnectedThreshold = '30'
ApplyParallel = 'AUTO'
StandbyFileManagement = 'MANUAL'
ArchiveLagTarget = '0'
LogArchiveMaxProcesses = '4'
LogArchiveMinSucceedDest = '1'
DbFileNameConvert = ''
LogFileNameConvert = ''
FastStartFailoverTarget = ''
InconsistentProperties = '(monitor)'
InconsistentLogXptProps = '(monitor)'
SendQEntries = '(monitor)'
LogXptStatus = '(monitor)'
RecvQEntries = '(monitor)'
StaticConnectIdentifier = '(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=CKPT-ORA-04(PORT=1522))(CONNECT_DATA=(SERVICE_NAME=INDIA_DGMGRL)(INSTANCE_NAME=drmcdb)(SERVER=DEDICATED)))'
StandbyArchiveLocation = 'USE_DB_RECOVERY_FILE_DEST'
AlternateLocation = ''
LogArchiveTrace = '0'
LogArchiveFormat = '%t_%s_%r.dbf'
TopWaitEvents = '(monitor)'
Database Status:
WARNING
DGMGRL>
Up to here, we are in conclusion that issue on standby is around "StandbyFileManagement" property value, We have fetched most of the information by using DGMGRL , apart from the commands we can also consult the broker log files and we can get some more information.
From Primary Broker log file
(as i mentioned above the broker log files will be located in the diag destination where the alert log is located.)
Primary completed health check
Data Guard Broker Status Summary:
Type Name Severity Status
Configuration ckpt12c Warning ORA-16608
Primary Database canada Success ORA-00000
Physical Standby Database india Warning ORA-16792
From the primary database the error code is "ORA-00000" which is in success status and the Standby database error code is "ORA-16792" which points to same output of "show configuration"
From Standby Broker Log file
By standby broker log file it is crystal clear that "StandbyFileManagement" has property value "MANUAL" in configuration file/metadata and where the value is "AUTO" at database level and the SPFILE we are using.
In order to confirm what we are viewing from the Broker log file, we can crosscheck the parameter output from the database level and the Broker level
From Broker Interface
DGMGRL> show database india StandbyFileManagement;
StandbyFileManagement = 'MANUAL'
DGMGRL>
From Database
SQL> show parameter standby_file_management
NAME TYPE VALUE
———————————— ———– ——————————
standby_file_management string AUTO
SQL>
It's now clear that parameter value differs from metadata and from the database level, So we need to fix this parameter to meet the configuration status to "SUCCESS".
DGMGRL> edit database india set property StandbyFileManagement = AUTO;
Property "standbyfilemanagement" updated
DGMGRL>
DGMGRL> show database india StandbyFileManagement;
StandbyFileManagement = 'auto'
DGMGRL>
After updating the property StandbyFileManagement , now we can check is there still any inconsistent properties with the database.
DGMGRL> show database india inconsistentproperties
INCONSISTENT PROPERTIES
INSTANCE_NAME PROPERTY_NAME MEMORY_VALUE SPFILE_VALUE BROKER_VALUE
DGMGRL>
There are no more inconsistent parameters from the standby after fixing the parameter "StandbyFileManagement", Finally we can check the configuration status and database status.
DGMGRL> show configuration;
Configuration – ckpt12c
Protection Mode: MaxAvailability
Databases:
canada – Primary database
india – Physical standby database
Fast-Start Failover: DISABLED
Configuration Status:
SUCCESS
DGMGRL>
DGMGRL> show database india ;
Database – india
Role: PHYSICAL STANDBY
Intended State: APPLY-ON
Transport Lag: 0 seconds (computed 0 seconds ago)
Apply Lag: 0 seconds (computed 0 seconds ago)
Apply Rate: 8.91 MByte/s
Real Time Query: ON
Instance(s):
drmcdb
Database Status:
SUCCESS
DGMGRL>
Now we cannot see any more errors either from configuration level or from database, This is an example and we can come accros many errors/warnings related to configuration and hence depending on the requirement(sync issues, configuration issues) we may have to consider disable/enable configuration, disable/enable database and even more disable/enable broker.
Conclusion
Managing Data Guard with broker looks not easy to manage, but if we handle broker in a simplified and correct path its more easy to fix the issue and DGMGRL makes life so easy in monitoring the configuration of the Data Guard , databases and more easy to perform role transitions. There is no surprise that DGMGRL can do all operations what we can do using Generic method.
References:
http://docs.oracle.com/cd/B28359_01/server.111/b28295/dbpropref.htm#i105769
Start the discussion at forums.toadworld.com