Previous | Table of Contents | Next |
Fortunately, some general rules are available for writing efficient SQL in Oracle. These rules may seem simplistic, but following them in a diligent manner will relieve more than half of the SQL tuning problems:
One of the historic problems with SQL involves queries. Simple queries can be written in many different ways, each variant of the query producing the same resultbut with widely different access methods and query speeds.
For example, a simple query such as What students received an A last semester? can be written in three ways, as shown in Listings 4.1, 4.2, and 4.3, each returning an identical result.
Listing 4.1 A standard join.
SELECT * FROM STUDENT, REGISTRATION WHERE STUDENT.student_id = REGISTRATION.student_id AND REGISTRATION.grade = 'A';
Listing 4.2 A nested query.
SELECT * FROM STUDENT WHERE student_id = (SELECT student_id FROM REGISTRATION WHERE grade = 'A' );
Listing 4.3 A correlated subquery.
SELECT * FROM STUDENT WHERE 0 < (SELECT count(*) FROM REGISTRATION WHERE grade = 'A' AND student_id = STUDENT.student_id );
Each of these queries will return identical results. The following discussion will review the basic components of an SQL query, showing how to optimize a query for remote execution.
It is important to note that several steps are required to understand how SQL is used in a distributed database. Distributed SQL queries function in the same way as queries within a single database, with the exception that cross-database joins and updates may utilize indexes that reside on different databases. Regardless, a basic understanding of the behavior of SQL can lead to dramatic performance improvements.
As a general rule, indexes will always increase the performance of a database query. In some databases, such as DB2, in situations where a query intends to sweep a table in the same sequence that the rows are physically stored, the indexes may actually hinder performance. For Oracle, indexes are recommended for two reasons: to speed the retrieval of a small set of rows from a table, and to presort result sets so that the SQL ORDER BY clause does not cause an internal sort.
In order to use an index, the SQL optimizer must recognize that the column has a valid value for index use. This is called a sargeable predicate, and is used to determine the index access. Listing 4.4 shows some valid predicates, and Listing 4.5 shows invalid predicates.
Listing 4.4 Valid predicates.
SELECT * FROM EMPLOYEE WHERE emp_no = 123; SELECT * FROM EMPLOYEE WHERE dept_no = 10;
Listing 4.5 Invalid predicates.
SELECT * FROM EMPLOYEE WHERE emp_no = "123"; SELECT * FROM EMPLOYEE WHERE salary * 2 < 50000; SELECT * FROM EMPLOYEE WHERE dept_no != 10;
Whenever a transformation to a field value takes place, the Oracle database will not be able to use the index for that column.
Some databases, such as DB2, will recognize a linear search and invoke a sequential prefetch to look ahead, reading the next data block while the previous data block is being fetched by the application. As a general rule, an SQL query that retrieves more than 15 percent of the table rows in a table will run faster if the optimizer chooses a full-table scan than if it chooses to use an index.
For example, assume that a student table has 1,000 rows, representing 900 undergraduate students and 100 graduate students. A non-unique index has been built on the student_level field that indicates UNDERGRAD or GRAD. The same query will benefit from different access methods depending upon the value of the literal in the WHERE clause. The following query will retrieve 90 percent of the rows in the table, and will run faster with a full-table scan than it will if the SQL optimizer chooses to use an index:
SELECT * FROM STUDENT WHERE student_level = 'UNDERGRAD';
This next query will only access 10 percent of the table rows, and will run faster by using the index on the student_level field:
SELECT * FROM STUDENT WHERE student_level = 'GRAD';
Unfortunately, the Oracle database cannot predict in advance the number of rows that will be returned from a query. Many SQL optimizers will invoke an index access even though it may not always be the fastest access method.
Previous | Table of Contents | Next |