Previous Table of Contents Next


CHAPTER 11
Oracle Application Monitoring

Because Oracle is the world’s leading relational database, many vendors will offer tools that claim to monitor Oracle and alert the DBA to performance problems. Even Oracle itself is entering the monitoring marketplace with an Oracle Expert tool that was released in the second quarter of 1996.

These tools fall into a couple of broad categories. First are the reactive tools that diagnose a problem after it has been reported. The second group of tools is able to proactively monitor the database and search for trouble before it ever impacts the user.

An example of the reactive class is the AdHawk tool by Eventus Software. AdHawk allows the DBA to rapidly view database locks and SQL in graphical format, quickly pinpointing the cause of a specific performance problem. This tool is especially useful for situations where only a few users are plagued with poor performance, while other users on the database are not experiencing problems.

The second, proactive category consists of tools that constantly monitor the database and create “alerts” for unusual conditions. These tools allow the developer to set predefined thresholds, which trigger an alert when the threshold has been exceeded. These alerts may take a number of different forms, including a report, a telephone call, or a pager. Some of the more sophisticated tools, such as BMC’s Patrol, allow the developer to program corrective actions directly into the rule base and automatically take corrective action when the alert occurs. For example, if a table’s next extent is 10 megabytes but only 8 megabytes of free space exist in the tablespace, the alert will alter the table’s next extent size to 5 megabytes and alert the DBA on call. This automatic action buys valuable time for the DBA to investigate the problem before a stoppage occurs. Table 11.1 offers a compact glimpse of the various monitoring tools on today’s market, complete with vendor.

Table 11.1 Representative Oracle monitoring tools.
Tool Vendor
DBAnalyzer Database Solutions
iVIEW Performance Logger Independence Technologies
Patrol BMC software
EcoTOOLS Compuware
DBVision Platinum Technology
AdHawk Eventus Software
R*SQLab R*Tech Systems

Proactive Vs. Reactive Measuring Techniques

When deciding how to monitor your servers, the first conscious choice you will need to make is whether to adopt a proactive or a reactive system. It is unfair to place a judgment on either one of these methods, since each has a legitimate place in Oracle monitoring. Let’s review each one in greater detail.

A proactive system is one that makes forecasts based upon statistical information and decides that intervention is necessary in order to prevent a performance problem. For example, a proactive system could detect that a table will reach its maximum allocation of extents tomorrow, thereby hanging the system unless the DBA compresses the table extents.

A reactive system is used after performance problems have been noticed. These types of systems give detailed information about the current status of the Oracle instance and allow the DBA to quickly see—and hopefully correct—the problem.

Creating An Oracle-Based Monitor

Now that we have covered the basics, we can take a look at designing an expert system for monitoring Oracle performance. We will need to address the following issues:

  What data will we collect, and how will we collect it?
  How do we create a centralized repository for performance information?
  How will the centralized database be accessed?
  How do we automate the exception reporting?

As we already know, the V$ tables are of limited value, since they keep running statistics for the total time that the Oracle instance has been running.

Instead, we can alter the existing statistics report to collect information, storing it into a database that we define especially for this purpose. Once we have the data in all of our local hosts, we can transfer the performance and tuning data into a single Oracle database. This database will be the foundation for the creation of automated exception reporting, as well as alerts.

Gathering The Oracle Performance Statistics

The fastest and simplest way to check the status of a database is to interrogate the V$ tables. While this provides only a crude measure of the overall health, it is usually one of the first techniques that is used. The script in Listing 11.1 is a collection of the most useful V$ scripts, all packaged into a single collection.

Listing 11.1 snapshot.sql—a full snapshot of the V$ tables.

REM   Remember, you must first run $ORACLE_HOME/rdbms/admin/catblock.sql
REM   before this script will work. . .
SET LINESIZE 75;
SET PAGESIZE 9999;
SET PAUSE OFF;
SET ECHO OFF;
SET TERMOUT ON;
SET SHOWMODE OFF;
SET FEEDBACK OFF;
SET NEWPAGE 1;
SET VERIFY OFF;

--spool /tmp/snap;

PROMPT **********************************************************
PROMPT  Hit Ratio Section
PROMPT **********************************************************
PROMPT
PROMPT         =========================
PROMPT         BUFFER HIT RATIO
PROMPT         =========================
PROMPT (SHOULD BE > 70, ELSE INCREASE db_block_buffers IN init.ora)

--SELECT trunc((1-(sum(decode(name,'physical reads',value,0))/
--                (sum(decode(name,'db block gets',value,0))+
--                (sum(decode(name,'consistent gets',value,0)))))
--             )* 100) "Buffer Hit Ratio"
--FROM v$sysstat;

COLUMN "logical_reads" FORMAT 99,999,999,999
COLUMN "phys_reads"    FORMAT 999,999,999
COLUMN "phy_writes"    FORMAT 999,999,999
SELECT a.value + b.value  "logical_reads",
       c.value            "phys_reads",
       d.value            "phy_writes",
       ROUND(100 * ((a.value+b.value)-c.value) / (a.value+b.value))
         "BUFFER HIT RATIO"
FROM v$sysstat a, v$sysstat b, v$sysstat c, v$sysstat d
WHERE
   a.statistic# = 37
AND
   b.statistic# = 38
AND
   c.statistic# = 39
AND
   d.statistic# = 40;

PROMPT
PROMPT
PROMPT         =========================
PROMPT         DATA DICT HIT RATIO
PROMPT         =========================
PROMPT (SHOULD BE HIGHER THAN 90 ELSE INCREASE shared_pool_size IN
  init.ora)
PROMPT

COLUMN "Data Dict. Gets"            FORMAT 999,999,999
COLUMN "Data Dict. cache misses"    FORMAT 999,999,999
SELECT sum(gets) "Data Dict. Gets",
       sum(getmisses) "Data Dict. cache misses",
       trunc((1-(sum(getmisses)/sum(gets)))*100)
       "DATA DICT CACHE HIT RATIO"
FROM v$rowcache;

PROMPT
PROMPT         =========================
PROMPT         LIBRARY CACHE MISS RATIO
PROMPT         =========================
PROMPT (IF > 1 THEN INCREASE THE shared_pool_size IN init.ora)
PROMPT
COLUMN "LIBRARY CACHE MISS RATIO"        FORMAT 99.9999
COLUMN "executions"                      FORMAT 999,999,999
COLUMN "Cache misses while executing"    FORMAT 999,999,999
SELECT sum(pins) "executions", sum(reloads) "Cache misses while
       executing",
       (((sum(reloads)/sum(pins)))) "LIBRARY CACHE MISS RATIO"
FROM v$librarycache;

PROMPT
PROMPT         =========================
PROMPT          LIBRARY CACHE SECTION
PROMPT         =========================
PROMPT HIT RATIO SHOULD BE > 70, AND PIN RATIO > 70 ...
PROMPT

COLUMN "reloads" FORMAT 999,999,999
SELECT namespace, trunc(gethitratio * 100) "Hit ratio",
       trunc(pinhitratio * 100) "pin hit ratio", RELOADS "reloads"
FROM v$librarycache;
PROMPT
PROMPT
PROMPT         =========================
PROMPT         REDO LOG BUFFER
PROMPT         =========================
PROMPT (SHOULD BE NEAR 0, ELSE INCREASE SIZE OF LOG_BUFFER IN init.ora)
PROMPT
SET HEADING OFF
COLUMN VALUE FORMAT 999,999,999
SELECT substr(name,1,30),
       value
FROM v$sysstat WHERE NAME = 'redo log space requests';

SET HEADING ON
PROMPT
PROMPT
PROMPT **********************************************************
PROMPT free memory should be > 1,000
PROMPT **********************************************************
PROMPT

COLUMN BYTES FORMAT 999,999,999
SELECT NAME, BYTES FROM v$sgastat WHERE NAME = 'free memory';

PROMPT
PROMPT **********************************************************
PROMPT  SQL SUMMARY SECTION
PROMPT **********************************************************
PROMPT
COLUMN "Tot SQL run since startup"    FORMAT 999,999,999
COLUMN "SQL executing now"            FORMAT 999,999,999
SELECT sum(executions) "Tot SQL run since startup",
       sum(users_executing) "SQL executing now"
       from v$sqlarea;

prompt
prompt
prompt **********************************************************
prompt  Lock Section
prompt **********************************************************
prompt
prompt         =========================
PROMPT          SYSTEM-WIDE LOCKS - all requests for locks or latches
PROMPT         =========================
PROMPT
SELECT substr(username,1,12)  "User",
       substr(lock_type,1,18) "Lock Type",
       substr(mode_held,1,18) "Mode Held"
FROM sys.dba_lock a, v$session b
WHERE lock_type not in ('Media Recovery','Redo Thread')
AND a.session_id = b.sid;
PROMPT
PROMPT         =========================
PROMPT          DDL LOCKS - These are usually triggers or other DDL
PROMPT         =========================
PROMPT
SELECT substr(username,1,12) "User",
       substr(owner,1,8) "Owner",
       substr(name,1,15)  "Name",
       substr(a.type,1,20)  "Type",
       substr(mode_held,1,11) "Mode held"
FROM sys.dba_ddl_locks a, v$session b
WHERE a.session_id = b.sid;

PROMPT
PROMPT         =========================
PROMPT           DML LOCKS - These are table and row locks...
PROMPT         =========================
PROMPT
SELECT substr(username,1,12) "User",
       substr(owner,1,8) "Owner",
       substr(name,1,20)  "Name",
       substr(mode_held,1,21) "Mode held"
FROM sys.dba_dml_locks a, v$session b
WHERE a.session_id = b.sid;

PROMPT
PROMPT
PROMPT **********************************************************
PROMPT  LATCH SECTION
PROMPT **********************************************************
PROMPT If miss_ratio or IMMEDIATE_MISS_RATIO > 1 then  latch
PROMPT Contention exists, decrease LOG_SMALL_ENTRY_MAX_SIZE in init.ora
PROMPT
COLUMN "miss_ratio"           FORMAT .99
COLUMN "immediate_miss_ratio" FORMAT .99
SELECT substr(l.name,1,30) name,
       (misses/(gets+.001))*100 "miss_ratio",
       (immediate_misses/(immediate_gets+.001))*100
       "immediate_miss_ratio"
FROM v$latch l, v$latchname ln
WHERE l.latch# = ln.latch#
AND (
(misses/(gets+.001))*100 > .2
OR
(immediate_misses/(immediate_gets+.001))*100 > .2
)
ORDER BY l.name;

PROMPT
PROMPT
PROMPT **************************************************************
PROMPT  ROLLBACK SEGMENT SECTION
PROMPT **************************************************************
PROMPT If any count below is > 1% of the total number of requests for
  data
PROMPT then more rollback segments are needed
PROMPT If free list > 1% then increase freelist in init.ora

--COLUMN COUNT FORMAT 999,999,999
SELECT CLASS, COUNT
FROM v$waitstat
WHERE CLASS IN ('free list','system undo header','system undo block',
                'undo header','undo block')
GROUP BY class,count;

COLUMN "Tot # of Requests for Data" FORMAT 999,999,999
SELECT sum(value) "Tot # of Requests for Data" FROM v$sysstat WHERE
NAME IN ('db block gets', 'consistent gets');

PROMPT
PROMPT          ===========================
PROMPT          ROLLBACK SEGMENT CONTENTION
PROMPT          ===========================
PROMPT
PROMPT          If any ratio is > .01 then more rollback segments are needed

COLUMN "Ratio" FORMAT 99.99999
SELECT name, waits, gets, waits/gets "Ratio"
FROM v$rollstat a, v$rollname b
WHERE a.usn = b.usn;

COLUMN "total_waits"    FORMAT 999,999,999
COLUMN "total_timeouts" FORMAT 999,999,999
PROMPT
PROMPT
SET FEEDBACK ON;
PROMPT **********************************************************
PROMPT  SESSION EVENT SECTION
PROMPT **********************************************************
PROMPT IF AVERAGE-WAIT > 0 THEN CONTENTION EXISTS
PROMPT
SELECT substr(event,1,30) event,
       total_waits, total_timeouts, average_wait
FROM v$session_event
WHERE average_wait > 0 ;
--OR total_timeouts > 0;

PROMPT
PROMPT
PROMPT **********************************************************
PROMPT QUEUE SECTION
PROMPT **********************************************************
PROMPT Average wait for queues should be near zero ...
PROMPT
COLUMN "totalq"   FORMAT 999,999,999
COLUMN "# queued" FORMAT 999,999,999
SELECT paddr, type "Queue type", queued "# queued", wait, totalq,
DECODE(totalq,0,0,wait/totalq) "AVG WAIT" FROM v$queue;

SET FEEDBACK ON;
PROMPT
PROMPT
PROMPT **********************************************************
PROMPT  MULTI-THREADED SERVER SECTION
PROMPT **********************************************************
PROMPT
PROMPT    If the following number is > 1
PROMPT    Then increase MTS_MAX_SERVERS parm in init.ora
PROMPT
SELECT DECODE( totalq, 0, 'No Requests',
               wait/totalq || ' hundredths of seconds')
               "Avg wait per request queue"
FROM v$queue
WHERE TYPE = 'COMMON';

PROMPT
PROMPT If the following number increases, consider adding dispatcher
  processes
PROMPT
SELECT DECODE( sum(totalq), 0, 'No Responses',
               sum(wait)/sum(totalq) || ' hundredths of seconds')
               "Avg wait per response queue"
FROM v$queue q, v$dispatcher d
WHERE q.type = 'DISPATCHER'
AND q.paddr = d.paddr;

SET FEEDBACK OFF;
PROMPT
PROMPT
PROMPT         =========================
PROMPT          DISPATCHER USAGE
PROMPT         =========================
PROMPT (If Time Busy > 50, then change MTS_MAX_DISPATCHERS in init.ora)
COLUMN "Time Busy" FORMAT 999,999.999
COLUMN busy        FORMAT 999,999,999
COLUMN idle        FORMAT 999,999,999
SELECT name, status, idle, busy,
       (busy/(busy+idle))*100 "Time Busy"
FROM v$dispatcher;

PROMPT
PROMPT
SELECT COUNT(*) "Shared Server Processes"
FROM v$shared_server
WHERE STATUS = 'QUIT';

PROMPT
PROMPT
PROMPT high-water mark for the multi-threaded server
PROMPT

SELECT * FROM v$mts;

PROMPT
PROMPT *****************************************************************
PROMPT file i/o should be evenly distributed across drives.
PROMPT

SELECT
       substr(a.file#,1,2) "#",
       substr(a.name,1,30) "Name",
       a.status,
       a.bytes,
       b.phyrds,
       b.phywrts
FROM v$datafile a, v$filestat b
WHERE a.file# = b.file#;

SELECT substr(name,1,55) system_statistic, VALUE
FROM v$sysstat
ORDER BY name;
SPOOL OFF;


Previous Table of Contents Next