Previous | Table of Contents | Next |
Bitmaps are especially important for data warehouse/decision support systems, where ad-hoc, unanticipated queries make it impractical for the Oracle DBA to index on all possible combinations of columns. Assume that a manager wants to know the average income for all college-educated customers who drive red or blue cars in Wyoming or Nevada. Furthermore, assume that there are 1 million rows in the customer table. The following query would be very hard to service using traditional indexing:
SELECT avg(yearly_income) FROM CUSTOMER WHERE education IN ('B','M','D') AND car_color IN ('RED','BLUE') AND state_residence IN ('WY','NV') ORDER BY AVG(yearly_income);
In a bitmapped index, it is not necessary to read all 1 million rows in the customer table. Instead, the query manager will build row ID lists for all 1 values for education, car_color, and state_residence, and then match up the row IDs for those that appear in all three columns. When the query is ready to access the rows, it already has a list of row IDs for all rows that meet the selection criteria.
To understand bitmapped indexes, imagine a very wide, fat table with only a few rows. In a bitmapped index, each unique value has one row, such that our REGION index contains only four rows. Across the bitmap, each row in the base table is represented by a column, with a 1 in the bitmap array if the value is true, and a 0 if it is false. Because of the high amount of repeating ones and zeros, bitmapped indexes can be very effectively compressed and expanded at runtime. In fact, the lower the cardinality, the better the compression, such that we can expect a higher compression of a GENDER index with two distinct values than with a STATE index with 50 distinct values. Uncompressed, the STATE index would be 48 times larger than the GENDER bitmap, because one row in the bitmap array is required for each unique value.
Oracle bitmapped indexes can consume far less space than a traditional B-tree Oracle index. In fact, their size can easily be computed as follows:
bitmap size = (cardinality_of_column * rows_in_table)/8
For example, suppose that the REGION index would have four distinct values with 800,000 rows. The entire index would only consume 100,000 bytes uncompressedand with Oracles compression, this index would be far smaller than 100,000 bytes. In fact, it could probably be read entirely into the Oracle buffer with a few I/Os.
As we can see from the diagram in Figure 6.2, Oracle bitmapped indexes can dramatically reduce I/O for certain types of operations. For example, assume that we are interested in knowing the number of corporations in the western region. Because all of this information is contained entirely within the bitmapped indexes, we have no need to access the table. In other words, the query can be resolved entirely within the index.
Figure 6.2 Oracle bitmapped indexes.
How do we identify candidates for bitmapped indexes? Your existing database is the starting place. Listing 6.5 will check all of your B-tree indexes and provide you with a list of candidates in ascending order of cardinality.
Listing 6.5 bitmap.sql identifies low cardinality indexes for bitmapped indexes.
REM Written by Don Burleson PROMPT Be patient. This can take awhile . . . SET PAUSE OFF; SET ECHO OFF; SET TERMOUT OFF; SET LINESIZE 300; SET PAGESIZE 999; SET NEWPAGE 0; SET FEEDBACK OFF; SET HEADING OFF; SET VERIFY OFF; REM First create the syntax to determine the cardinality . . . SPOOL idx1.sql; SELECT 'set termout off;' FROM DUAL; SELECT 'spool idx2.lst;' FROM DUAL; SELECT 'column card format 9,999,999;' FROM DUAL; SELECT 'select distinct count(distinct ' ||a.column_name ||') card, ' ||''' is the cardinality of ' ||'Index ' ||a.index_name ||' on column ' ||a.column_name ||' of table ' ||a.table_owner ||'.' ||a.table_name ||''' from ' ||index_owner||'.'||a.table_name ||';' FROM dba_ind_columns a, dba_indexes b WHERE a.index_name = b.index_name AND tablespace_name NOT IN ('SYS','SYSTEM') ; SELECT 'spool off;' from dual; SPOOL OFF; SET TERMOUT ON; @idx1 !SORT idx2.lst
Listing 6.6 shows the listing from bitmap.sql.
Listing 6.6 The results of bitmap.sql.
3 is the cardinality of Index GEO_LOC_PK on column GEO_LOC_TY_CD of table RPT.GEM_LCC 4 is the cardinality of Index REGION_IDX on column REHION of table RPT.CUSTOMER 7 is the cardinality of Index GM_LCK on column GEO_LC_TCD of table RPT.GEM_LCC 8 is the cardinality of Index USR_IDX on column USR_CD of table RPT.CUSTOMER 50 is the cardinality of Index STATE_IDX on column STATE_ABBR of table RPT.CUSTOMER 3117 is the cardinality of Index ZIP_IDX on column ZIP_CD of table RPT.GEM_LCC 71,513 is the cardinality of Index GEO_LOC_PK on column GEO_LOC_CD of table RPT.GEM_LCC 83,459 is the cardinality of Index GEO_KEY_PK on column GEO_LOC_TY_CD of table RPT.GEM_LCC
Of course, columns such as GENDER and REGION should be made into bitmapsbut what about STATE with 50 values, or area_code with a few hundred values? Intuition tells us that the benefits from bitmapped indexes are a function of the cardinality and the number of rows in the table, but no hard and fast rule exists for identifying what type of index is always best. An heuristic approach is called for, and it is relatively easy for the DBA to create a B-tree index, run a timed query, re-create the index with the BITMAP option, and re-execute the timed query.
Regardless, bitmapped indexes are critical to the performance of decision support system, especially those that are queried in an ad hoc fashion against hundreds of column values. See Chapter 10, Tuning Oracle Data Warehouse And OLAP Applications, for a discussion of the applications of bitmapped indexes.
While this chapter has been devoted to the obvious DBA issues relating to performance and tuning, it is imperative that the Oracle DBA become intimate with all of the various areas of Oracle tuning. The DBA, more than any other participant in the development of the system, has the ability to influence the systems overall performance.
Previous | Table of Contents | Next |