Previous Table of Contents Next


General Tips For Efficient SQL

Fortunately, some general rules are available for writing efficient SQL in Oracle. These rules may seem simplistic, but following them in a diligent manner will relieve more than half of the SQL tuning problems:

  Never do a calculation on an indexed column (e.g., WHERE salary*5 > :myvalue).
  Whenever possible, use the UNION statement instead of OR.
  Avoid the use of NOT IN or HAVING in the WHERE clause. Instead use the NOT EXISTS clause.
  Always specify numeric values in numeric form, and character values in character form (e.g., WHERE emp_number = 565, WHERE emp_name = ‘Jones’).
  Avoid specifying NULL in an indexed column.
  Avoid the LIKE parameter if = will suffice. Using any Oracle function will invalidate the index, causing a full-table scan.
  Never mix data types in Oracle queries, as it will invalidate the index. If the column is numeric, remember not to use quotes (e.g., salary = 50000). For char index columns, always use quotes (e.g., name = ‘BURLESON’).
  Remember that Oracle’s rule-based optimizer looks at the order of table name in the FROM clause to determine the driving table. Always make sure that the last table specified in the FROM clause is the table that will return the smallest number of rows. In other words, specify multiple tables with the largest result set table specified first in the WHERE clause.
  Avoid using subqueries when a JOIN will do the job.
  Use the Oracle decode function to minimize the number of times that a table has to be selected.
  To turn off an index that you do not want to use (only with cost-based), concatenate a null string to the index column name (e.g., name||'), or add zero to a numeric column name (e.g., salary+0).
  If your query will return more than 20 percent of the rows in the table, use a full-table scan rather than an index scan.
  Always use table aliases when referencing columns.

One of the historic problems with SQL involves queries. Simple queries can be written in many different ways, each variant of the query producing the same result—but with widely different access methods and query speeds.

For example, a simple query such as “What students received an A last semester?” can be written in three ways, as shown in Listings 4.1, 4.2, and 4.3, each returning an identical result.

Listing 4.1 A standard join.

SELECT *
FROM STUDENT, REGISTRATION
WHERE
    STUDENT.student_id = REGISTRATION.student_id
AND
    REGISTRATION.grade = 'A';

Listing 4.2 A nested query.

SELECT *
FROM STUDENT
WHERE
    student_id =
    (SELECT student_id
        FROM REGISTRATION
        WHERE
        grade = 'A'
    );

Listing 4.3 A correlated subquery.

SELECT *
FROM STUDENT
WHERE
    0 <
    (SELECT count(*)
        FROM REGISTRATION
        WHERE
        grade = 'A'
        AND
        student_id = STUDENT.student_id
    );

Each of these queries will return identical results. The following discussion will review the basic components of an SQL query, showing how to optimize a query for remote execution.

It is important to note that several steps are required to understand how SQL is used in a distributed database. Distributed SQL queries function in the same way as queries within a single database, with the exception that cross-database joins and updates may utilize indexes that reside on different databases. Regardless, a basic understanding of the behavior of SQL can lead to dramatic performance improvements.

Tuning SQL With Indexes

As a general rule, indexes will always increase the performance of a database query. In some databases, such as DB2, in situations where a query intends to “sweep” a table in the same sequence that the rows are physically stored, the indexes may actually hinder performance. For Oracle, indexes are recommended for two reasons: to speed the retrieval of a small set of rows from a table, and to “presort” result sets so that the SQL ORDER BY clause does not cause an internal sort.

In order to use an index, the SQL optimizer must recognize that the column has a valid value for index use. This is called a sargeable predicate, and is used to determine the index access. Listing 4.4 shows some valid predicates, and Listing 4.5 shows invalid predicates.

Listing 4.4 Valid predicates.

SELECT * FROM EMPLOYEE WHERE emp_no = 123;

SELECT * FROM EMPLOYEE WHERE dept_no = 10;

Listing 4.5 Invalid predicates.

SELECT * FROM EMPLOYEE WHERE emp_no = "123";

SELECT * FROM EMPLOYEE WHERE salary * 2 < 50000;

SELECT * FROM EMPLOYEE WHERE dept_no != 10;

Whenever a transformation to a field value takes place, the Oracle database will not be able to use the index for that column.

Some databases, such as DB2, will recognize a linear search and invoke a “sequential prefetch” to look ahead, reading the next data block while the previous data block is being fetched by the application. As a general rule, an SQL query that retrieves more than 15 percent of the table rows in a table will run faster if the optimizer chooses a full-table scan than if it chooses to use an index.

For example, assume that a student table has 1,000 rows, representing 900 undergraduate students and 100 graduate students. A non-unique index has been built on the student_level field that indicates UNDERGRAD or GRAD. The same query will benefit from different access methods depending upon the value of the literal in the WHERE clause. The following query will retrieve 90 percent of the rows in the table, and will run faster with a full-table scan than it will if the SQL optimizer chooses to use an index:

SELECT * FROM STUDENT WHERE student_level = 'UNDERGRAD';

This next query will only access 10 percent of the table rows, and will run faster by using the index on the student_level field:

SELECT * FROM STUDENT WHERE student_level = 'GRAD';

Unfortunately, the Oracle database cannot predict in advance the number of rows that will be returned from a query. Many SQL optimizers will invoke an index access even though it may not always be the fastest access method.


Previous Table of Contents Next