This topic summarizes hints for tuning different components of your parallel server database.

Data Separation

Follow these rules for separating data:

  • If at all possible, ensure that database instances are not accessing the same data.
     
  • Place data that are accessed by different instances on different disks.
     
  • If possible, store the data accessed by different instances in different tables. For example, if your database runs the accounting systems for many different organizations, consider having a separate set of tables for each organization. Place each instance's tables in its own tablespace, as sketched below.
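
For example, here is a minimal sketch of this layout, assuming two hypothetical organizations (ORG_A and ORG_B) served by different instances; the names, sizes, and file paths are illustrative only:

CREATE TABLESPACE org_a_data
    DATAFILE '/u01/oradata/ops/org_a_data01.dbf' SIZE 100M;

CREATE TABLESPACE org_b_data
    DATAFILE '/u02/oradata/ops/org_b_data01.dbf' SIZE 100M;

-- Each organization's tables go into the tablespace used only by
-- that organization's instance, on its own disk
CREATE TABLE org_a.accounts (
    account_id  NUMBER PRIMARY KEY,
    balance     NUMBER
) TABLESPACE org_a_data;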

Minimizing pings and false pings (sometimes known as false collisions) is the primary job of a parallel server DBA. One thing to remember is that if the application cannot be partitioned, you will not be able to eliminate pings.

Tuning Instances

One of the most significant advantages of using parallel server is that it allows you to tune each instance to suit the types of users on that node. If you have separated your OLTP users from your batch users, the batch instance will typically have larger values for the DB_BLOCK_BUFFERS, SORT_AREA_SIZE, and LOG_BUFFER INIT.ORA parameters. The OLTP instance will most likely have a larger SHARED_POOL_SIZE and may use the multithreaded server. The OLTP machine will probably be more powerful and have more processors. The batch machine is likely to make more use of parallel query, so you will have to set PARALLEL_MIN_SERVERS and the other parallel query parameters for that instance.
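
As a sketch, the two instances' INIT.ORA files might diverge along these lines; the parameter values are illustrative only, not recommendations:

# initBATCH.ora -- batch instance: larger buffer cache, sort and
# redo buffer areas, plus parallel query slaves
db_block_buffers     = 20000
sort_area_size       = 10485760
log_buffer           = 1048576
parallel_min_servers = 4
parallel_max_servers = 16

# initOLTP.ora -- OLTP instance: larger shared pool and
# multithreaded server
db_block_buffers     = 8000
shared_pool_size     = 52428800
mts_servers          = 10
mts_max_servers      = 20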

Depending on the types and numbers of users on each of the instances, you may be able to size each machine to suit your needs. The catch is that each machine must have enough I/O slots to share the disks with the other machines that share the database. The maximum number of I/O slots, and therefore the number of disks that the machine can handle, is an essential piece of information. In some cases, the hardware vendor may advise you to use disk clusters (many disks on one channel) to overcome a lack of I/O slots, but this can prove unsatisfactory because of the high potential for disk I/O bottlenecks.

Remember that if you have partitioned your application for parallel server, users must log on to the appropriate instance. If users work with two applications, they may need two physical logins, one for each machine. Alternatively, an operating system menu can direct each user to the correct instance.
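
One way to route users is to give each application its own SQL*Net alias pointing at the appropriate instance. Here is a sketch of two tnsnames.ora entries, with host names and SIDs invented for illustration:

oltp  = (DESCRIPTION =
          (ADDRESS = (PROTOCOL = TCP)(HOST = node1)(PORT = 1521))
          (CONNECT_DATA = (SID = OLTP)))

batch = (DESCRIPTION =
          (ADDRESS = (PROTOCOL = TCP)(HOST = node2)(PORT = 1521))
          (CONNECT_DATA = (SID = BATCH)))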

Ongoing Tuning

All of the tuning that you would perform on a single-instance database is also valid for parallel server. When tuning parallel server, start up the database in a consistent instance order. Also, tune all of the individual instances that make up the parallel server environment.

To obtain the number of blocks written to disk to satisfy pings, run the following query. The output is the percentage of physical writes that result from pings; the nearer to zero, the better. Figures as high as 10% usually indicate that you should redesign your table or index access:

SELECT (a.value / b.value) * 100 "Ping Write Percentage"
  FROM v$sysstat a, v$sysstat b
 WHERE a.name = 'DBWR cross instance writes'
   AND b.name = 'physical writes';

CATPARR.sql and Its Views

Oracle provides a script, catparr.sql, that you should run to get the most out of parallel server. The directory in which the file is located depends on your operating system; on UNIX systems, it is found in the $ORACLE_HOME/rdbms/admin directory. It must be run as SYS or INTERNAL. The script creates a series of tables and views that can be queried to obtain all kinds of useful parallel server tuning information.
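
For example, on UNIX you might run the script and then query the V$PING view it creates to find the most frequently pinged blocks. This is a sketch; the XNC (ping count) column is assumed here, so verify the view's columns in your release:

-- run once, connected as SYS or INTERNAL
@$ORACLE_HOME/rdbms/admin/catparr.sql

-- list the blocks with the most pings first
SELECT name, kind, file#, block#, xnc
  FROM v$ping
 ORDER BY xnc DESC;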

Using V_$SYSSTAT

V_$SYSSTAT has several other statistics that are useful in tuning parallel server. Run the following:

SELECT substr(name, 1, 40), value
  FROM v_$sysstat
 WHERE name LIKE 'global%'
    OR name LIKE 'next scns gotten%';

 "global lock converts (async)"  are mostly PCM locks
 "global lock converts (non sync)"  are mostly used for SCN locks
 "global lock gets (non async)"  are mostly Table and Transaction
       locks opens
 "global lock releases (async)"  are mostly Table and Transaction
       lock closes
 "next scns gotten without going to DLM" are SCN locks

Continue tuning as long as it decreases the number of locks shown in the output for a consistent level of workload. Reducing lock converts will also improve your scalability. Note that this report does not show fine-grained locking, which is an option in Oracle 7.3 and later.

Here is another ratio from V$SYSSTAT that demonstrates the scalability of your parallel server site:

SELECT ((a.value + c.value + d.value) - b.value) * 100 /
        (a.value + c.value + d.value) "Lock Conversion Ratio"
  FROM v$sysstat a, v$sysstat b, v$sysstat c, v$sysstat d
 WHERE a.name = 'consistent gets'
   AND b.name = 'async lock converts'
   AND c.name = 'db block changes'
   AND d.name = 'consistent changes';

If you have fully tuned your parallel server site and your lock conversion ratio drops below 95%, there is a strong possibility that your site will not scale as new nodes are added. Once again, remember that this query does not take the fine-grained locking available in Oracle 7.3 and later into account.

Dynamic Performance Views

Oracle provides many dynamic performance views that give information about the database, instance architecture, and performance. Run the following query to view the parallel server statistics:

SELECT *
  FROM v_$sysstat
 WHERE class = 32
    OR class = 40;

The important statistics returned by this query are described below:

global lock gets (non async)
    Number of instance locks (mainly DML locks, such as TM locks) that were acquired while no other tasks could run during the wait.

global lock gets (async)
    Number of instance locks that were acquired while other tasks could run during the wait.

global lock get time
    Time, in hundredths of a second, spent waiting to acquire instance locks.

global lock converts (non async)
    Number of instance locks (mainly SCN locks) that were converted from one mode to another while no other tasks could run during the wait.

global lock converts (async)
    Number of instance locks (mainly PCM locks) that were converted from one mode to another while other tasks could run during the wait.

global lock convert time
    Time, in hundredths of a second, spent waiting for instance lock conversions.

global lock releases (non async)
    Number of instance locks that were released while no other tasks could run during the wait.

global lock releases (async)
    Number of instance locks (mainly DML locks) that were released while other tasks could run during the wait.

global lock release time
    Time, in hundredths of a second, spent waiting to release instance locks.

DBWR cross instance writes
    Number of blocks written so that another instance could do a consistent read; this equates to the number of blocks written to disk because of pings.

remote instance undo writes
    Number of times this instance wrote a dirty undo block so that another instance could read it.

remote instance undo requests
    Number of times this instance acquired an SS lock on another instance's undo in order to perform a consistent read.

cross instance CR read
    Number of times the current instance performed a consistent read in parallel mode. A large value indicates that the instance is spending too much time waiting for blocks modified by other instances.

next scns gotten without going to DLM
    Number of SCNs obtained without going to the DLM. Divide this value by the 'user commits' statistic to calculate the percentage of SCN gets satisfied from the cache, a measure of the effectiveness of OPS's SCN cache (see the example after this list).

hash latch wait gets
    Number of times latches had to be obtained with waits rather than no-wait gets.

lock element waits
    Number of waits for a PCM lock.
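
For example, the SCN cache effectiveness mentioned above can be calculated directly; a sketch using the statistic names given:

SELECT (a.value / b.value) * 100 "SCN Cache Hit %"
  FROM v$sysstat a, v$sysstat b
 WHERE a.name = 'next scns gotten without going to DLM'
   AND b.name = 'user commits';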
