Oracle System Architecture

Oracle is an Object-Relational Database Management System, or, ORDBMS for short. A traditional RDBMS (Oracle Version 7 and below) stores data in tables called relations. These relations are two dimensional representations of data where the rows, called tuples in relational jargon, represent records, and the columns, called attributes, are the pieces of information contained in the record.

Oracle provides object oriented extensions to the Oracle RDBMS forming the new entity the Object-Relational database. In an Object-Relational database columns can represent either a single value (as in standard relational databases), a varray (a fixed number of additional records) or a REF to a second table where a variable amount of data can be stored. This takes the two dimensional relational view and adds a third dimension. In addition, in an object relational database procedures known as methods can be tied to the tables.

Oracle introduced a new concept in its new version called Grid Computing and named the new version as 10g. As the name suggests, Grid Computing is a step towards the most flexible and cost effective way to manage information and applications. Grid Computing internally creates a large pool of Servers, Storage and networks wherein resources from one pool can easily be allocated to another depending upon the work load.

Grid Computing uses a Workload Manager which enables the applications to share resources across the network in a Grid. It thus offers a high performance and scalable system.

The Oracle Concepts of Database and Instance

In an Oracle configuration there is the base set of files, data files, redo logs, control files, parameter files, which make up the database and then there are one or more sets of operating system processes and shared memory areas (each known as an SGA) that make up an Oracle instance.

Oracle is more than just a collection of programs that allow ease of data access. Oracle can be compared to an operating system that overlays the operating system of the computer on which it resides. Oracle has its own file structures, buffer structures, global areas, and performance tuning parameters above and beyond those provided within the operating system. Oracle controls its own processes, monitors its own records and consistencies, and cleans up after itself.

Oracle Database Files

Oracle files are either used for storage of data, transaction information or parameter values. Oracle data is stored in files known as datafiles. The tablespace, a logical unit of storage in Oracle, maps to one or more datafiles. Each Oracle instance must have at least a single datafile, for the SYSTEM tablespace, a control file, to maintain transaction control numbers and datafile information, two redo logs to contain transaction redo information and a parameter file to specify constants used to initialize the Oracle system. Therefore the minimal Oracle installation will consist of a set of 5 files and the Oracle executables.

Once Oracle starts up, more files are created. These files fall into either event or trace logging categories. A single alert log that tracks overall instance status is opened and, trace files for all baseline Oracle processes are started. There will be a minimum of four baseline Oracle processes, PMON, SMON, DBWR, LGWR with a possible fifth (RECO) if the distributed option is loaded. Generally the trace files will contain process startup information and will only contain more if there are errors with their process. You should periodically check the contents of the location specified for trace files and clean out any old trace files. The alert log should be checked frequently during the database operation (in UNIX you can place a "tail -f" against all monitored database alert logs and have them in their own windows on your monitoring screen) some errors such as the archive logging process getting stuck or redo log corruption in a mirrored set, will only show up in the alert log.

If the database has more tablespaces than just the SYSTEM tablespace (and it should) then there will be additional datafiles, at least one for each additional tablespace. If there are optional processes started (such as ARCH, CKPT, Sxxx and Dxxx, SNPxx ) then they too will have trace files started for them. In most installations there will be a minimum of 3 redo log groups of two mirrored files each, generally more.

It is suggested that as a minimum the following tablespaces be created in any instance:

USERS
TEMP
RBS
TOOLS
SYSTEM (the required tablespace)

At least two additional tablespaces, APL_DATA and APL_INDEX (or ones named for the specific application), are usually created, as well as those mentioned previously. In complex databases (especially parallel server databases) some tables may be placed in their own tablespace adding to the number of required datafiles. The only datafile required for instance startup is the SYSTEM tablespace. Tablespaces are the logical repository for physically grouped data.

The SYSTEM tablespace should never be used for mundane day-to-day storage of common indexes and tables. The SYSTEM tablespace is the strict domain of the data dictionary tables and indexes. Being the only required tablespace, if the SYSTEM tablespace is somehow taken offline, the instance will immediately shutdown.

A tablespace used for rollback segments should also only be used for rollback segments, just as a temporary tablespace, such as TEMP, should only be used for temporary segments. In the absence of any application tablespaces for tables or indexes, the USER tablespace should be used for these items.

Loss of any of the datafiles will result in the instance not starting up if it is shutdown. Loss of a datafile while active will result in several possible scenarios depending on which datafile is lost. Loss of any datafile is considered a media failure, and recovery can be either incomplete or complete depending on the file lost.

Parameter files, control files, redo logs, and Datafiles make up the physical database.

The Control Files

The control files contain information on all physical database files (the database physical structure) and their current transaction state. The control files are read to mount and open the database and transaction numbers are recorded for each datafile. If the control files and datafiles are out of sync, the database will not startup and will report either that recovery is needed or that the datafiles are out of sync with the control files. Control files are required for database startup and recovery. The database is required to have one control file, however a minimum of two on separate physical disks or on separate disk farms in a raid configuration, are recommended.

Control files play an important role in database backup and recovery. Recovery Manager (RMAN) uses the control file of a target database to store some or all, depending upon the configuration, of the backup information. Even if a recovery catalog database is created for RMAN, RMAN uses Control files of the target database to store this information, which is then synchronized at regular intervals or manually.

Hence, it is strictly recommended to multiplex the Control files.

The Parameter File (init_<SID>.ora)

While not considered to be a part of the database since the Oracle processes do not write to it, the parameter file contains the initialization parameters that instruct the Oracle instance how to set itself up with buffers, processes, caches and the like. The parameter file is read while starting the instance during the mount and open phases of startup. Thus, the parameter file sizes all SGA components either through direct parameter values or indirectly from calculations based on parameter values specified in the parameter file. The DBA is responsible for tuning the database using the initialization parameters. Depending on the version of Oracle, there will be more or less parameters available.

Click Here to view a list of the init.ora parameters.

System Parameter File (SPFILE)

SPFILE was first introduced in Oracle 9i and is a major breakthrough in tuning of initialization parameters. The major drawback of PFILE was that any change in the database parameter was to be recorded to the PFILE and the instance was to be restarted to effect that change. Any parameter value that can be changed dynamically had to be modified manually in the PFILE. SPFILE gives the ability to change the value of a parameter dynamically while updating the SPFILE with this current change. Thus any subsequent restart of the database will have the modified value in the SPFILE. Oracle introduced this SPFILE as a dynamic means of maintaining initialization parameters.

Moreover, RMAN has the ability to take the backup of the SPFILE thus enabling us to restore this important file, in case of loss of this file.

SPFILE is a binary file and cannot be modified or read using any text editor.

To check whether the database has been restarted using a SPFILE, following query can be used:

SQL> select isspecified, count(*) from v$spparameter group by isspecified;

ISSPEC   COUNT(*)
------ ----------
FALSE         227
TRUE           32

2 rows selected.

SQL>

The Alert Log (alert_<SID>.log)

The alert log contains informational, warning and error messages dealing with the Oracle core processes and the SGA. Additionally, the alert log contains a history of any physical changes to the database such as the addition or status change of datafiles, redo logs or rollback segments. Using optional initialization parameters information concerning checkpoints can also be recorded in the alert log.

The alert log is the only location where errors such as detection of a corrupted member of a mirrored redo log or the filling of the archive log destination (archiver stuck) are reported. Other informational messages useful for tuning (such as excessive archive waits for a checkpoint or waits due to the redo logs writing to archive) are also reported in the alert log. Oracle recommends that the log be checked at least once per day.

The alert log is continuously written to, thus, it continuously grows in size. If the disk location of the alert log fills up, the instance will slow down or stop until space is made available. You should periodically compress and archive the alert log, or, if it contains no important information, remove it. The instance will recreate it if it's not present.

The Redo Log Files

The redo log files are set up at instance creation. A minimum of two one-member groups in a single thread is required for instance startup. We recommend running three or more groups of two mirrored members with each member being placed on a separate disk or separately controlled disk farm if possible to reduce contention. In most cases we have found five groups optimal to prevent checkpoint and archiver wait errors. The redo logs hold records used for recovery purposes. The redo logs contain information on all data modifying transactions that occur in the database unless these transactions have been run as non-recoverable. The LGWR process writes data on changes from the redo log buffers to the redo logs. A COMMIT is not considered complete until the LGWR signals that all redo log entries have been written to disk. Remember, the redo log buffer is a part of the SGA.

Database Datafiles

The Oracle system uses logical and physical storage. Logical storage uses the concept of tablespaces. A tablespace is physically implemented through one or more datafiles. Datafiles are subdivided into segments which are subdivided into extents which may be of several types depending on what they store or their usage:

Table segment
Index segment
Rollback segment
Temporary segment

A single segment extent cannot span multiple datafiles and must be contiguous.

The table segment contains data that corresponds to a table object in the database. These will usually be the most common type of segments since they store the actual data contained in the database.

The index segment contains table index information. The index segments will contain data and ROWID pointers and will be stored as either B-tree (the most common), cluster (either B-tree or hash) or a bitmapped format, depending on the type of index object the segment is assigned to.

The rollback segment contains records of changes for multiple transactions. Each transaction gets assigned to a single rollback segment extent. The most costly statement in terms of rollback segment space usage is an update because it must capture both the before and after image; the least expensive is a delete because the rollback only captures the deleted ROWIDs.

The Trace Files (*.trc)

The trace files are created either at process startup, when a process abnormally exits, or if certain errors occur in a process. Trace files are forced to be generated by setting initialization parameters, enabling session tracing, or by turning on tracing at the process level with the ALTER SESSION command. Traces are used to track internal process errors and to track what a user process actually does in the case of session traces. A special type of trace called a core dump is used for severe errors; it's a dump of the state of the memory in the memory core area of the SGA.

Session tracing should be used to capture information for tuning session SQL processes. Using a trace file and the TKPROF utility all of a processes SQL statements can be analyzed and the statements execution plans generated. Trace files are placed in the locations specified in the initialization file using the following parameters:

BACKGROUND_DUMP_DEST
CORE_DUMP_DEST
USER_DUMP_DEST

Trace file sizes are limited by the MAX_DUMP_FILE_SIZE initialization parameter. The trace file destinations and core dump destinations should be monitored for space usage, and periodically the files should be cleaned out.

Watch out for core dumps in development environments as they take up a large amount of disk space. (Any abnormal termination of a C process using Oracle may result in a core dump, as will abnormal terminations of other third party tools, such as Visual Basic and Power Builder). If the locations for these dump files fill up, it could result in the instance stalling when it attempts to write to a trace file.

Archive Logs

If the database is created in ARCHIVELOG Mode, the redo entries in the redo log files are written into Archive Log files. Hence, these files are exact copies of the online redo log files and are often called Offline Log Files. ARCn, Archiver Process, is responsible for creating these offline log files, as and when the redo log files get full. In a single thread instance, these files are created sequentially. Archive log file plays an important role in database recovery and need to be applied in the order they were generated. Any loss of an archive log file will require the database to be opened with the resetlogs option and results in a loss of data.

Archive Log files also provide the capability to recover a tablespace to any specific point in time in the database. For any application considered to be production- or mission-critical, archive logging must be turned on. These files can be stored to disk, tape, or even optical storage such as WORM. Using operating system backups such as BACKUP on VMS or TAR on UNIX, and with the application of archive logs, a database can be quickly recovered after disaster. Archive logs can only be used to recover when cold or hot backups are used for Oracle backup.

After each successful hot or cold backup of an Oracle database, the associated archive and export files may be removed and either placed in storage or deleted. In an active database these files may average tens of megabytes per day; storage for this amount of data needs to be planned for. Just for example, at one installation doing Oracle development with no active production databases, 244 MB of archives and over 170 MB of exports were generated in a one-week period. If archive logging is turned on and you run out of archive disk space, the database stops after the last redo log is filled. Plan ahead and monitor disk usage for instances using archive logging.

Archive Log Files are also required for configuring and applying changes to the standby servers, Oracle streams and dataguard. Log Miner also reads these files to capture changes done in the database.

The Oracle Instance

Oracle defines the Oracle instance as the shared global area (SGA) and the required background processes. The base and mandatory processes are SMON, PMON, DBWR, LGWR, CKPT and MMAN (Memory Manager). MMAN process is introduced in Oracle 10g and is used when Automatic Shared Memory Managemet is used. Depending upon the database and application type, there are other optional processes. Some of them are, ARCH, RECO (required on distributed databases), LCKn (where n is 0 to 9, one per parallel instance required), Dnnn (dispatcher process for multi-threaded server MTS), Snnn (server process for MTS), Pnnnn (parallel query slave process for Parallel Query Option [PQO]), SNPn (snapshot/job queue processes). Apart from these background processes, there are some other mandatory background processes when using Real Application Clusters. These are LCKn, LMDn, LMON. Some memory from the SGA is allocated for these background processes and can be seen in FIXED VALUE when querying SHOW SGA. The combination of the SGA and processes is the instance. If any of the required processes dies, the Oracle instance dies. Loss of optional processes such as ARCH, RECO, LCKn or the server or dispatcher processes may result in instance failure. This type of failure is an instance failure and is automatically recovered on the next startup or by one of the other instances in a parallel server configuration as soon as the failed instance's heartbeat loss is detected.

The SGA and its Contents

The SGA is made up of several memory components, the largest of these is usually the database base buffer cache, followed by the shared pool area, and the redo log buffers. The shared pool consists of the shared SQL area (also known as the library cache) where SQL and PL/SQL statements are kept, the data dictionary cache where data dictionary related entries are kept, control structures (latches and locks), control structures (such as character set data, and, if MTS is being used, request and response queues). The SGA also contains the sort extent pool in non-MTS instances; in MTS instances the User Global Area (UGA) contains the sort extent pool.

The Data Base Buffer Cache

The major component of the SGA is the database buffers. The database buffers are controlled by the initialization parameters DB_BLOCK_SIZE and DB_BLOCK_BUFFERS or can be specified using DB_CACHE_SIZE. The DB_BLOCK_SIZE sets the size of database blocks used in all other components of the SGA. The DB_BLOCK_SIZE is a multiple of the operating system block size which is usually a multiple of 512 bytes. On most systems the DB_BLOCK_SIZE is a minimum of 2k bytes. Either of the two can be used to configure the buffer cache. For example:

DB_BLOCK_SIZE = 8192
DB_BLOCK_BUFFERS = 10000

So Buffer Cache Size will be = 8192 * 10000 = 78MB or a single parameter:

DB_CACHE_SIZE = 78M

We strongly recommend using DB_CACHE_SIZE. The reason being db_cache_size is dynamically modifiable and provides a dynamic performance view called Buffer Cache Advisory which helps in tuning the parameter.

Using either of the two mechanism to configure the buffer cache, the internal mechanism remains the same. The database buffer area is divided into the dirty list (buffers which have been modified and are waiting to be written to disk) and the least recently used (LRU) list. Free buffers are buffers that have not been modified and are available for use. If an Oracle server (user) process attempts to move a buffer to the dirty list and the list is full, it signals DBWR, and DBWR writes the dirty buffers back to disk, placing them on the free list. The only action which adds blocks to the database buffer cache is a read by a server process at the request of a user. The database buffer cache stores copies of data blocks and is shared by all users.

Alternatively, multiple buffer pools can be used based on the application understanding. DB_KEEP_CACHE_SIZE and DB_RECYCLE_CACHE_SIZE are two different parameters which are used to configure KEEP and RECYCLE buffer pools in the SGA. KEEP pool can be used to maintain frequently accessed objects whereas a RECYCLE pool can be used to prevent large objects from consuming unnecessary space in cache. The size of the buffers in the KEEP and RECYCLE pools is the primary block size specified by the DB_BLOCK_SIZE parameter.

Starting with Oracle 9i, Oracle introduced multiple buffer size sub-cache to accommodate and make use of multiple block size objects. DB_nK_CACHE_SIZE parameter specifies the value of nK Buffers.

The Shared Pool

The next largest component of the SGA is the shared pool. The shared pool is controlled in size by the initialization parameter SHARED_POOL_SIZE. The parameter SHARED_POOL_SIZE is set in bytes and is generally set to 10-20 percent of available memory. Although in some heavy ad hoc environments it can be much larger. The shared pool holds the parsed SQL and PL/SQL code in the area known as the shared SQL area or library caches.

In dedicated server environments the shared pool can be smaller than the equivalent multi-threaded server environments (MTS). This happens because, in a dedicated server system, the sort areas used by each process are kept in the users Process Global Area (PGA) while in an MTS environment the User Global Area (UGA) of the Shared Pool holds the sort areas. In both environments the temporary extents used for sorting are tracked using the Sort Extent Pool (SEP) which is a part of the shared pool. Anytime a process needs a temporary segment for sorting, it checks the SEP first before allocating a new segment.

The Java Pool

Java uses the shared pool and the Java pool. The default for the Java pool is 20MB. If you get memory-related errors when dealing with Java objects, increase the size of the Java pool.

Error reporting in out-of-memory situations is inconsistent and in some cases may result in a core dump. If you suspect that memory usage is the cause of a core dump, adjust the JAVA_POOL_SIZE in the init.ora file. JAVA_POOL_SIZE should be set to 50,000,000 or higher for large applications. The default value of 20,000,000 should be adequate for typical Java Stored Procedure usage.

The Stream Pool

This memory area is newly introduced in Oracle 10g and is used to scale up the Stream Replication. In previous versions of Oracle 10% of the Shared Pool was used for Streams replication thus causing unnecessary waste of memory and overhead on Shared Pool. Stream Pool, if configured using STREAMS_POOL_SIZE, is a separate memory area assigned in the SGA. This parameter can be specified in bytes, K, M or G and can be dynamically modified.

The Large Pool

This is an optional pool and is specified by using the LARGE_POOL_SIZE initialization parameter. Unlike Shared Pool, Large Pool does not have an LRU list. Hence, Oracle does not attempts to age out the objects in the large pool. The memory to this pool is used and is advisable when any of these instances are configured:

If the queries use Parallel query slaves. Otherwise memory from the shared pool will be used to cache parallel execution message buffers, or
Shared Server Configuration is used. Else, the session memory will be used from the Shared Pool, or
RMAN is used to take backups. Else memory from the shared pool will be used to cache I/O buffers during backups and recovery operations. Large Pool will only be used during RMAN backup and recovery operation, when either of these two parameters are set: DBWR_IO_SLAVES to non-zero or BACKUP_TAPE_IO_SLAVES to True.

Other Shared Global Area Sections

The rest of the SGA is set with other initialization parameters such as:

LOG_ARCHIVE_BUFFER_SIZE which sets the size of each archival buffer in log file blocks (equivalent to operating system blocks)
LOG_ARCHIVE_BUFFERS which sets the number of LOG_ARCHIVE_BUFFER_SIZE sections of memory to allocate for archive log writing
LOG_BUFFER which sets the size of the redo circular buffer.

The PCM Cache in Parallel Server

In the parallel server enabled database, the parameter GC_FILES_TO_LOCKS and GC_DB_LOCKS can control a large if not the largest section of SGA in the parallel server configuration. The PCM (parallel cache management) lock cache is controlled by the GC_FILES_TO_LOCKS parameter that maps the PCM locks to database block buffer size sets of datafile blocks. While the GC_DB_LOCKS sets the maximum number of PCM locks, the locks aren't mapped to an actual cache site until they are mapped to datafile blocks using the GC_FILES_TO_LOCKS parameter. Each cache entry can take up to 115 bytes of memory storage in the SGA, this memory is not taken from other SGA components but is additive.

The PGA and its Contents

The Process Global Area (PGA) contains the user's session variables (session information) and arrays (stack space). Each user is assigned a PGA and in multithreaded server systems, a UGA. The following parameters determine the PGA size:

OPEN_LINKS – Determines the number of database links that each process can use simultaneously.
DB_FILES – Sets the soft limit for the number of database files which can be simultaneously accessed by the database (The actual limit is set by the database build parameter MAX_DATABASE_FILES).
LOG_FILES – Sets the soft limit on the maximum number of log files (redo logs) available for the system (the actual maximum is set by the database build parameter MAX_LOG_FILES and MAX_LOG_GROUPS).

Though not a defined part of the SGA, the user's PGA interacts intimately with the SGA.

The Oracle Instance Processes

The instance consists not only of the SGA and its components, but also the set of processes which interacts with the SGA.

Oracle's processes are listed below:

DBWn – Database Writer
LGWR – Log Writer
SMON – System Monitor
PMON – Process Monitor
ARCH – Archiver Process
RECO – Distributed Transaction Recoverer
LCKn – Lock Process
Dnnn – Dispatcher
Snnn – Servers
Pnnn – Parallel Query Slave
LISTENER – TCPIP server
CKPTxx – Checkpoint
SNPxx – Snapshot process queues
QMNxx – Advanced Queueing Processes
EXTPROC – Callout queues
ARCHMON – Archive Monitor
MMON – Memory Monitor
MMNL – Memory Monitor Light
MMAN – Background Process
RBAL – Background Process
ORBn – Background Process
OSMB – Background Process
RVWR – Recovery Writer
ASMB – Automatic Storage Management
ARBx – Background Process
CTWR – Change Tracking Writer

On multi-user capable systems each user process may spawn several sub-processes, depending on the type of activities being done by that process. Depending on how Oracle is configured a single parallel query may start dozens of query slave processes for a single user!

The Oracle Concepts of Database and Instance

Oracle Database Files

The Oracle Instance

The Oracle Instance Processes

Start the discussion at forums.toadworld.com