Previous | Table of Contents | Next |
Once a good logical model has been created, it is critical to translate the logical model into an Oracle implementation capable of performing at optimal speed. Because Oracle involves so many design issues, this chapter will take a detailed look at how specific physical design tools can be used to exploit client/server performance. Topics include:
In Oracle, an index is used to speed up the time that is required to access table information. Internally, Oracle indexes are B-tree data structures in which each tree node may contain many sets of key values and row-IDs.
In general, Oracle indexes exist for the purpose of preventing full-table scans. Full-table scans have two problems, the most important being the time lost in servicing the request as each and every row of the table is read into Oracles buffer pool. In addition to causing the individual task performance to suffer, a full-table scan will also cause performance degradation at the system level and all other tasks on the system may have to incur additional I/O, because the buffer block held by competing tasks will have been flushed by the full-table scan. As blocks are flushed from the buffer pool, other tasks are required to incur additional I/Os to reread information that would have remained in the buffer pool if the full-table scan had not been invoked. More information on optimizing full-table scans can be found in Chapter 6, Oracle DBA Performance And Tuning.
In general, just about any Oracle table will benefit from the use of indexes. The only exception to this rule would be a very small table that can be read in less than two-block I/Os. Two-block I/Os are used as this guideline because Oracle will need to perform at least one I/O to access the root node of the index tree and another I/O to retrieve the requested data. For example, assume that a lookup table contains rows of 25 bytes each, and you have configured Oracle to use 4 K block sizes. Since each data block would hold about 150 rows, using an index for up to 300 rows would not make the processing any faster than a full-table scan.
If you plan to use the Oracle parallel query facility, all tables specified in the SQL query must be optimized for a full-table scan. If an index exists, the cost-based optimizer must be used with a hint to invalidate the index in order to use parallel query. For the rule-based optimizer, indexes can be turned off by using an Oracle function in the WHERE clause. Look for details on this in Chapter 4, Tuning Oracle SQL.
One important concept in indexing is the selectivity or the uniqueness of the values in a column. To be the most effective, an index column must have many unique values. Columns that have only a few values (e.g., sex = m/f, status = y/n) would not be good candidates for indexing. Similar to the index itself, the sparse distribution of values would be less efficient than a full-table scan. To see the selectivity for a column, compare the total number of rows in the table with the number of distinct values for the column:
SELECT count(*) FROM CUSTOMER; SELECT DISTINCT STATUS FROM CUSTOMER;
Another concept used in indexing is called distribution, which refers to the frequency that each unique value is distributed within the database. For example, we may have a state_abbreviation column that contains one of 50 possible values. This would be acceptable to use as an index column, provided that the state abbreviations are uniformly distributed across the rows. However, if 90 percent of the values are for New York, then the index will not be very effective. Oracle has addressed the index data distribution issue with the ANALYZE TABLE command. When using Oracles cost-based SQL optimizer, ANALYZE TABLE will look at both the selectivity and distribution of the column values. If they are found to be out-of-bounds, Oracle may decide not to use the index.
Oracle recommends the following guidelines when considering whether to index on a column:
Most programmers do not realize that database deadlocks occur frequently within the database indexes. It is important to note that a SELECT of a single row from the database may cause more than one lock entry to be placed in the storage pool as all affected index rows are also locked. In other words, the individual row receives a lock, but each index node that contains the value for that row will also have locks assigned (see Figure 2.1). If the last entry in a sorted index is retrieved, the database will lock all index nodes that reference the indexed value, in case the user changes that value. Since many indexing schemes always carry the high-order key in multiple index nodes, an entire branch of the index tree can be lockedall the way up to the root node of the index. While each databases indexing scheme is different, some relational database vendors recommend that tables with ascending keys be loaded in descending order, so that the rows are loaded from Z to A on an alphabetic key field. Other databases such as Oracle recommend that the indexes be dropped and re-created after the rows have been loaded into an empty table.
Figure 2.1 An overview of Oracle locking.
Previous | Table of Contents | Next |