Oracle7 Server Application Developer's Guide

Managing Hash Clusters and Clustered Tables

The following sections explain how to create, alter, and drop hash clusters and clustered tables using SQL commands.

Creating Hash Clusters and Clustered Tables

A hash cluster is used to store individual tables or a group of clustered tables that are static and often queried by equality queries. Once you create a hash cluster, you can create tables. To create a hash cluster, use the SQL command CREATE CLUSTER. The following statement creates a cluster named TRIAL_CLUSTER that is used to store the TRIAL table, clustered by the TRIALNO column:

CREATE CLUSTER trial_cluster (trialno NUMBER(5,0))
   PCTUSED 80     PCTFREE 5
   SIZE 2K
   HASH IS trialno HASHKEYS 100000;

CREATE TABLE trial (
   trialno NUMBER(5) PRIMARY KEY,
   ...)
   CLUSTER trial_cluster (trialno);

Controlling Space Usage Within a Hash Cluster

When you create a hash cluster, it is important that you correctly choose the cluster key and set the HASH IS, SIZE, and HASHKEYS parameters to achieve the desired performance and space usage for the cluster. The following sections provide guidance, as well as examples of setting these parameters.

Choosing the Key

Choosing the correct cluster key is dependent on the most common types of queries issued against the clustered tables. For example, consider the EMP table in a hash cluster. If queries often select rows by employee number, the EMPNO column should be the cluster key; if queries often select rows by department number, the DEPTNO column should be the cluster key. For hash clusters that contain a single table, the cluster key is typically the entire primary key of the contained table. A hash cluster with a composite key must use Oracle's internal hash function.

Setting HASH IS

Only specify the HASH IS parameter if the cluster key is a single column of the NUMBER datatype, and contains uniformly distributed integers. If the above conditions apply, you can distribute rows in the cluster such that each unique cluster key value hashes to a unique hash value (with no collisions). If the above conditions do not apply, you should use the internal hash function.

Dropping Hash Clusters

Drop a hash cluster using the SQL command DROP CLUSTER:

DROP CLUSTER emp_dept;

Drop a table in a hash cluster using the SQL command DROP TABLE. The implications of dropping hash clusters and tables in hash clusters are the same as for index clusters. See for more information about dropping clusters and the required privileges.

When to Use Hashing

Storing a table in a hash cluster is an alternative to storing the same table with an index. Hashing is useful in the following situations:

Most queries are equality queries on the cluster key. For example:

		SELECT . . . WHERE cluster_key = . . . ;

In such cases, the cluster key in the equality condition is hashed, and the corresponding hash key is usually found with a single read. With an indexed table, the key value must first be found in the index (usually several reads), and then the row is read from the table (another read).

The table or tables in the hash cluster are primarily static in size such that you can determine the number of rows and amount of space required for the tables in the cluster. If tables in a hash cluster require more space than the initial allocation for the cluster, performance degradation can be substantial because overflow blocks are required.

A hash cluster with the HASH IS col, HASHKEYS n, and SIZE m clauses is an ideal representation for an array (table) of n items (rows) where each item consists of m bytes of data. For example:

      ARRAY X[100] OF NUMBER(8)

could be represented as

      CREATE CLUSTER c(subscript INTEGER)
         HASH IS subscript HASHKEYS 100 SIZE 10;
      CREATE TABLE x(subscript NUMBER(2)), value NUMBER(8))
         CLUSTER c(subscript);

Alternatively, hashing is not advantageous in the following situations:

Most queries on the table retrieve rows over a range of cluster key values. For example, in full table scans, or queries such as

      SELECT . . . WHERE cluster_key < . . . ;

A hash function cannot be used to determine the location of specific hash keys; instead, the equivalent of a full table scan must be done to fetch the rows for the query. With an index, key values are ordered in the index, so cluster key values that satisfy the WHERE clause of a query can be found with relatively few I/Os.

A table is not static, but is continually growing. If a table grows without limit, the space required over the life of the table (thus, of its cluster) cannot be predetermined.

Applications frequently perform full table scans on the table and the table is sparsely populated. A full table scan in this situation takes longer under hashing.

You cannot afford to preallocate the space the hash cluster will eventually need.

In most cases, you should decide (based on the above information) whether to use hashing or indexing. If you use indexing, consider whether it is best to store a table individually or as part of a cluster; see

for guidance.

If you decide to use hashing, a table can still have separate indexes on any columns, including the cluster key. For additional guidelines on the performance characteristics of hash clusters, see the Oracle7 Server Tuning manual.