Previous | Table of Contents | Next |
Because Oracle is the worlds leading relational database, many vendors will offer tools that claim to monitor Oracle and alert the DBA to performance problems. Even Oracle itself is entering the monitoring marketplace with an Oracle Expert tool that was released in the second quarter of 1996.
These tools fall into a couple of broad categories. First are the reactive tools that diagnose a problem after it has been reported. The second group of tools is able to proactively monitor the database and search for trouble before it ever impacts the user.
An example of the reactive class is the AdHawk tool by Eventus Software. AdHawk allows the DBA to rapidly view database locks and SQL in graphical format, quickly pinpointing the cause of a specific performance problem. This tool is especially useful for situations where only a few users are plagued with poor performance, while other users on the database are not experiencing problems.
The second, proactive category consists of tools that constantly monitor the database and create alerts for unusual conditions. These tools allow the developer to set predefined thresholds, which trigger an alert when the threshold has been exceeded. These alerts may take a number of different forms, including a report, a telephone call, or a pager. Some of the more sophisticated tools, such as BMCs Patrol, allow the developer to program corrective actions directly into the rule base and automatically take corrective action when the alert occurs. For example, if a tables next extent is 10 megabytes but only 8 megabytes of free space exist in the tablespace, the alert will alter the tables next extent size to 5 megabytes and alert the DBA on call. This automatic action buys valuable time for the DBA to investigate the problem before a stoppage occurs. Table 11.1 offers a compact glimpse of the various monitoring tools on todays market, complete with vendor.
Tool | Vendor |
---|---|
DBAnalyzer | Database Solutions |
iVIEW Performance Logger | Independence Technologies |
Patrol | BMC software |
EcoTOOLS | Compuware |
DBVision | Platinum Technology |
AdHawk | Eventus Software |
R*SQLab | R*Tech Systems |
When deciding how to monitor your servers, the first conscious choice you will need to make is whether to adopt a proactive or a reactive system. It is unfair to place a judgment on either one of these methods, since each has a legitimate place in Oracle monitoring. Lets review each one in greater detail.
A proactive system is one that makes forecasts based upon statistical information and decides that intervention is necessary in order to prevent a performance problem. For example, a proactive system could detect that a table will reach its maximum allocation of extents tomorrow, thereby hanging the system unless the DBA compresses the table extents.
A reactive system is used after performance problems have been noticed. These types of systems give detailed information about the current status of the Oracle instance and allow the DBA to quickly seeand hopefully correctthe problem.
Now that we have covered the basics, we can take a look at designing an expert system for monitoring Oracle performance. We will need to address the following issues:
As we already know, the V$ tables are of limited value, since they keep running statistics for the total time that the Oracle instance has been running.
Instead, we can alter the existing statistics report to collect information, storing it into a database that we define especially for this purpose. Once we have the data in all of our local hosts, we can transfer the performance and tuning data into a single Oracle database. This database will be the foundation for the creation of automated exception reporting, as well as alerts.
The fastest and simplest way to check the status of a database is to interrogate the V$ tables. While this provides only a crude measure of the overall health, it is usually one of the first techniques that is used. The script in Listing 11.1 is a collection of the most useful V$ scripts, all packaged into a single collection.
Listing 11.1 snapshot.sqla full snapshot of the V$ tables.
REM Remember, you must first run $ORACLE_HOME/rdbms/admin/catblock.sql REM before this script will work. . . SET LINESIZE 75; SET PAGESIZE 9999; SET PAUSE OFF; SET ECHO OFF; SET TERMOUT ON; SET SHOWMODE OFF; SET FEEDBACK OFF; SET NEWPAGE 1; SET VERIFY OFF; --spool /tmp/snap; PROMPT ********************************************************** PROMPT Hit Ratio Section PROMPT ********************************************************** PROMPT PROMPT ========================= PROMPT BUFFER HIT RATIO PROMPT ========================= PROMPT (SHOULD BE > 70, ELSE INCREASE db_block_buffers IN init.ora) --SELECT trunc((1-(sum(decode(name,'physical reads',value,0))/ -- (sum(decode(name,'db block gets',value,0))+ -- (sum(decode(name,'consistent gets',value,0))))) -- )* 100) "Buffer Hit Ratio" --FROM v$sysstat; COLUMN "logical_reads" FORMAT 99,999,999,999 COLUMN "phys_reads" FORMAT 999,999,999 COLUMN "phy_writes" FORMAT 999,999,999 SELECT a.value + b.value "logical_reads", c.value "phys_reads", d.value "phy_writes", ROUND(100 * ((a.value+b.value)-c.value) / (a.value+b.value)) "BUFFER HIT RATIO" FROM v$sysstat a, v$sysstat b, v$sysstat c, v$sysstat d WHERE a.statistic# = 37 AND b.statistic# = 38 AND c.statistic# = 39 AND d.statistic# = 40; PROMPT PROMPT PROMPT ========================= PROMPT DATA DICT HIT RATIO PROMPT ========================= PROMPT (SHOULD BE HIGHER THAN 90 ELSE INCREASE shared_pool_size IN init.ora) PROMPT COLUMN "Data Dict. Gets" FORMAT 999,999,999 COLUMN "Data Dict. cache misses" FORMAT 999,999,999 SELECT sum(gets) "Data Dict. Gets", sum(getmisses) "Data Dict. cache misses", trunc((1-(sum(getmisses)/sum(gets)))*100) "DATA DICT CACHE HIT RATIO" FROM v$rowcache; PROMPT PROMPT ========================= PROMPT LIBRARY CACHE MISS RATIO PROMPT ========================= PROMPT (IF > 1 THEN INCREASE THE shared_pool_size IN init.ora) PROMPT COLUMN "LIBRARY CACHE MISS RATIO" FORMAT 99.9999 COLUMN "executions" FORMAT 999,999,999 COLUMN "Cache misses while executing" FORMAT 999,999,999 SELECT sum(pins) "executions", sum(reloads) "Cache misses while executing", (((sum(reloads)/sum(pins)))) "LIBRARY CACHE MISS RATIO" FROM v$librarycache; PROMPT PROMPT ========================= PROMPT LIBRARY CACHE SECTION PROMPT ========================= PROMPT HIT RATIO SHOULD BE > 70, AND PIN RATIO > 70 ... PROMPT COLUMN "reloads" FORMAT 999,999,999 SELECT namespace, trunc(gethitratio * 100) "Hit ratio", trunc(pinhitratio * 100) "pin hit ratio", RELOADS "reloads" FROM v$librarycache; PROMPT PROMPT PROMPT ========================= PROMPT REDO LOG BUFFER PROMPT ========================= PROMPT (SHOULD BE NEAR 0, ELSE INCREASE SIZE OF LOG_BUFFER IN init.ora) PROMPT SET HEADING OFF COLUMN VALUE FORMAT 999,999,999 SELECT substr(name,1,30), value FROM v$sysstat WHERE NAME = 'redo log space requests'; SET HEADING ON PROMPT PROMPT PROMPT ********************************************************** PROMPT free memory should be > 1,000 PROMPT ********************************************************** PROMPT COLUMN BYTES FORMAT 999,999,999 SELECT NAME, BYTES FROM v$sgastat WHERE NAME = 'free memory'; PROMPT PROMPT ********************************************************** PROMPT SQL SUMMARY SECTION PROMPT ********************************************************** PROMPT COLUMN "Tot SQL run since startup" FORMAT 999,999,999 COLUMN "SQL executing now" FORMAT 999,999,999 SELECT sum(executions) "Tot SQL run since startup", sum(users_executing) "SQL executing now" from v$sqlarea; prompt prompt prompt ********************************************************** prompt Lock Section prompt ********************************************************** prompt prompt ========================= PROMPT SYSTEM-WIDE LOCKS - all requests for locks or latches PROMPT ========================= PROMPT SELECT substr(username,1,12) "User", substr(lock_type,1,18) "Lock Type", substr(mode_held,1,18) "Mode Held" FROM sys.dba_lock a, v$session b WHERE lock_type not in ('Media Recovery','Redo Thread') AND a.session_id = b.sid; PROMPT PROMPT ========================= PROMPT DDL LOCKS - These are usually triggers or other DDL PROMPT ========================= PROMPT SELECT substr(username,1,12) "User", substr(owner,1,8) "Owner", substr(name,1,15) "Name", substr(a.type,1,20) "Type", substr(mode_held,1,11) "Mode held" FROM sys.dba_ddl_locks a, v$session b WHERE a.session_id = b.sid; PROMPT PROMPT ========================= PROMPT DML LOCKS - These are table and row locks... PROMPT ========================= PROMPT SELECT substr(username,1,12) "User", substr(owner,1,8) "Owner", substr(name,1,20) "Name", substr(mode_held,1,21) "Mode held" FROM sys.dba_dml_locks a, v$session b WHERE a.session_id = b.sid; PROMPT PROMPT PROMPT ********************************************************** PROMPT LATCH SECTION PROMPT ********************************************************** PROMPT If miss_ratio or IMMEDIATE_MISS_RATIO > 1 then latch PROMPT Contention exists, decrease LOG_SMALL_ENTRY_MAX_SIZE in init.ora PROMPT COLUMN "miss_ratio" FORMAT .99 COLUMN "immediate_miss_ratio" FORMAT .99 SELECT substr(l.name,1,30) name, (misses/(gets+.001))*100 "miss_ratio", (immediate_misses/(immediate_gets+.001))*100 "immediate_miss_ratio" FROM v$latch l, v$latchname ln WHERE l.latch# = ln.latch# AND ( (misses/(gets+.001))*100 > .2 OR (immediate_misses/(immediate_gets+.001))*100 > .2 ) ORDER BY l.name; PROMPT PROMPT PROMPT ************************************************************** PROMPT ROLLBACK SEGMENT SECTION PROMPT ************************************************************** PROMPT If any count below is > 1% of the total number of requests for data PROMPT then more rollback segments are needed PROMPT If free list > 1% then increase freelist in init.ora --COLUMN COUNT FORMAT 999,999,999 SELECT CLASS, COUNT FROM v$waitstat WHERE CLASS IN ('free list','system undo header','system undo block', 'undo header','undo block') GROUP BY class,count; COLUMN "Tot # of Requests for Data" FORMAT 999,999,999 SELECT sum(value) "Tot # of Requests for Data" FROM v$sysstat WHERE NAME IN ('db block gets', 'consistent gets'); PROMPT PROMPT =========================== PROMPT ROLLBACK SEGMENT CONTENTION PROMPT =========================== PROMPT PROMPT If any ratio is > .01 then more rollback segments are needed COLUMN "Ratio" FORMAT 99.99999 SELECT name, waits, gets, waits/gets "Ratio" FROM v$rollstat a, v$rollname b WHERE a.usn = b.usn; COLUMN "total_waits" FORMAT 999,999,999 COLUMN "total_timeouts" FORMAT 999,999,999 PROMPT PROMPT SET FEEDBACK ON; PROMPT ********************************************************** PROMPT SESSION EVENT SECTION PROMPT ********************************************************** PROMPT IF AVERAGE-WAIT > 0 THEN CONTENTION EXISTS PROMPT SELECT substr(event,1,30) event, total_waits, total_timeouts, average_wait FROM v$session_event WHERE average_wait > 0 ; --OR total_timeouts > 0; PROMPT PROMPT PROMPT ********************************************************** PROMPT QUEUE SECTION PROMPT ********************************************************** PROMPT Average wait for queues should be near zero ... PROMPT COLUMN "totalq" FORMAT 999,999,999 COLUMN "# queued" FORMAT 999,999,999 SELECT paddr, type "Queue type", queued "# queued", wait, totalq, DECODE(totalq,0,0,wait/totalq) "AVG WAIT" FROM v$queue; SET FEEDBACK ON; PROMPT PROMPT PROMPT ********************************************************** PROMPT MULTI-THREADED SERVER SECTION PROMPT ********************************************************** PROMPT PROMPT If the following number is > 1 PROMPT Then increase MTS_MAX_SERVERS parm in init.ora PROMPT SELECT DECODE( totalq, 0, 'No Requests', wait/totalq || ' hundredths of seconds') "Avg wait per request queue" FROM v$queue WHERE TYPE = 'COMMON'; PROMPT PROMPT If the following number increases, consider adding dispatcher processes PROMPT SELECT DECODE( sum(totalq), 0, 'No Responses', sum(wait)/sum(totalq) || ' hundredths of seconds') "Avg wait per response queue" FROM v$queue q, v$dispatcher d WHERE q.type = 'DISPATCHER' AND q.paddr = d.paddr; SET FEEDBACK OFF; PROMPT PROMPT PROMPT ========================= PROMPT DISPATCHER USAGE PROMPT ========================= PROMPT (If Time Busy > 50, then change MTS_MAX_DISPATCHERS in init.ora) COLUMN "Time Busy" FORMAT 999,999.999 COLUMN busy FORMAT 999,999,999 COLUMN idle FORMAT 999,999,999 SELECT name, status, idle, busy, (busy/(busy+idle))*100 "Time Busy" FROM v$dispatcher; PROMPT PROMPT SELECT COUNT(*) "Shared Server Processes" FROM v$shared_server WHERE STATUS = 'QUIT'; PROMPT PROMPT PROMPT high-water mark for the multi-threaded server PROMPT SELECT * FROM v$mts; PROMPT PROMPT ***************************************************************** PROMPT file i/o should be evenly distributed across drives. PROMPT SELECT substr(a.file#,1,2) "#", substr(a.name,1,30) "Name", a.status, a.bytes, b.phyrds, b.phywrts FROM v$datafile a, v$filestat b WHERE a.file# = b.file#; SELECT substr(name,1,55) system_statistic, VALUE FROM v$sysstat ORDER BY name; SPOOL OFF;
Previous | Table of Contents | Next |