Oracle7 Server Application Developer's Guide

Contents Index Home Previous Next

Oracle Datatypes

The following sections describe the Oracle datatypes that can be used for column definitions.

Character Datatypes

The CHAR and VARCHAR2 datatypes are used to store alphanumeric data; any character can be stored in a column of these datatypes. Character data is stored in strings with byte values corresponding to the character encoding scheme (usually called a character set or code page) defined for the database when it was created; for example, 7-bit ASCII, EBCDIC Code Page 500, or Japan Extended UNIX. Oracle supports both single-byte and multi-byte encoding schemes. See the Oracle7 Server Reference manual for more information about National Language Support features of Oracle and support for different character encoding schemes.

Because Oracle blank-pads values stored in CHAR columns but not in VARCHAR2 columns, a value stored in a VARCHAR2 column may take up less space than if it were stored in a CHAR column. For this reason, a full table scan on a large table containing VARCHAR2 columns may read fewer data blocks than a full table scan on a table containing the same data stored in CHAR columns. If your application often performs full table scans on large tables containing character data, you might be able to improve performance by storing this data in VARCHAR2 columns rather than in CHAR columns.

However, performance is not the only factor to consider when deciding which of these datatypes to use. Oracle uses different semantics to compare values of each datatype. You might choose one datatype over the other if your application is sensitive to the differences between these semantics. For example, if you want Oracle to ignore trailing blanks when comparing character values, you must store these values in CHAR columns. For information on the comparison semantics for these datatypes, see the Oracle7 Server SQL Reference manual.

CHAR Datatype

The CHAR datatype is used to store fixed-length character strings. When you create a table with a CHAR column, specify a column length (in bytes, not characters) between 1 and 255 (default is 1). Oracle then guarantees the following:

Oracle compares CHAR values using blank-padded comparison semantics. If two values have different lengths, Oracle adds blanks to the shorter value, until the two values are the same length. Two values that differ only in the number of trailing blanks are considered equal.

VARCHAR2 Datatype

Use the VARCHAR2 datatype to store variable-length character strings. When you create a table with a VARCHAR2 column, specify a maximum column length (in bytes, not characters) between 1 and 2000. For each row, each value in the column is stored as a variable-length field. For example, assume a column is declared VARCHAR2 with a maximum size of 50 characters. If only 10 single-byte characters are given for the VARCHAR2 column value in a particular row, the column in the row's row piece only stores the 10 characters (10 bytes), not 50.

Oracle compares VARCHAR2 values using non-padded comparison semantics. Two values are only considered equal if they have the same characters and are of equal length.

VARCHAR Datatype

The VARCHAR datatype is currently synonymous with the VARCHAR2 datatype. However, in a future version of Oracle, the VARCHAR datatype might be changed to use different comparison semantics. Therefore, you should use the VARCHAR2 datatype to store variable-length character strings.

Column Lengths for Character Datatypes

The lengths of CHAR and VARCHAR2 columns are specified in bytes rather than characters, and are constrained as such. If the database character encoding scheme is single-byte, the number of bytes and the number of characters in a column is the same. If it is multi-byte, there generally is no such correspondence. A character might be comprised of one or more bytes depending upon the specific multi-byte encoding scheme, and whether shift-in/shift-out control codes are present. When using a multi-byte database character encoding scheme, consider carefully the space required for tables with character columns.

NUMBER Datatype

Use the NUMBER datatype to store real numbers in a fixed-point format. Numbers using this datatype are guaranteed to be portable among different Oracle platforms, and offer up to 38 decimal digits of precision. You can store positive and negative numbers of magnitude 1 x 10^-130 to 9.99...x10^125, as well as zero, in a NUMBER column.

For numeric columns you can specify the column as

column_name NUMBER

or you can specify a precision (total number of digits) and scale (number of digits to right of decimal point):

column_name NUMBER (precision, scale)

If a precision is not specified, the column stores values as given. If no scale is specified, it is implied that the scale is zero. The scale can range from -84 to 127.

Although not required, specifying the precision and scale for numeric fields provides extra integrity checking on input. Table 5 - 1 shows examples of how data would be stored using different scale factors.

Notice that if a negative scale is specified, the actual data is rounded to the specified number of places to the left of the decimal point. For example, a specification of (7,-2) means to round to the nearest hundreds, as shown in Table 5 - 1.

Input Data Specified As Stored As
7,456,123.89 NUMBER 7456123.89
7,456,123.89 NUMBER (9) 7456124
7,456,123.89 NUMBER (9,2) 7456123.89
7,456,123.89 NUMBER (9,1) 7456123.9
7,456,123.89 NUMBER (6) (not accepted, exceeds precision)
7,456,123.89 NUMBER (7, -2) 7456100
Table 5 - 1. How Scale Factors Affect Numeric Data Storage

DATE Datatype

Use the DATE datatype to store point-in-time values (dates and times) in a table. The DATE datatype stores the century, year, month, day, hours, minutes, and seconds. Oracle can store dates in the Julian era, ranging from January 1, 4712 BCE through December 31, 4712 CE (Common Era). Unless BCE ('BC' in the format mask) is specifically used, CE date entries are the default.

Oracle uses its own internal format to store dates. Date data is stored in fixed-length fields of seven bytes each, corresponding to century, year, month, day, hour, minute, and second. See Chapter 3 in the Programmer's Guide to the Oracle Call Interface for a complete description of the Oracle internal date format.

For input and output of dates, the standard Oracle default date format is DD-MON-YY, as in

'13-NOV-92'

To change this default date format on an instance-wide basis, use the NLS_DATE_FORMAT parameter. To change the format during a session, use the ALTER SESSION statement. To enter dates that are not in the current default date format, use the TO_DATE function with a format mask, as in

TO_DATE ('November 13, 1992', 'MONTH DD, YYYY')

If the date format DD-MON-YY is used, YY indicates the year in the 20th century (for example, 31-DEC-92 is December 31, 1992). If you want to indicate years in any century other than the 20th century, use a different format mask, as shown above.

Time is stored in 24-hour format--HH:MM:SS. By default, the time in a date field is 12:00:00 A.M. if no time portion is entered. In a time-only entry, the date portion defaults to the first day of the current month. To enter the time portion of a date, use the TO_DATE function with a format mask indicating the time portion, as in

INSERT INTO birthdays (bname, bday) VALUES
('ANNIE',TO_DATE('13-NOV-92 10:56 A.M.','DD-MON-YY HH:MI A.M.'));

To compare dates that have time data, use the SQL function TRUNC if you want to ignore the time component. Use the SQL function SYSDATE to return the system date and time. The FIXED_DATE initialization parameter allows you to set SYSDATE to a constant; this can be useful for testing.

Using Julian Dates

Julian dates allow continuous dating from a common reference. (The epoch is 1 January 4712 BCE, so current dates are somewhere in the 2.4 million range.) A Julian date is nominally a non-integer, the fractional part being a portion of a day. Oracle uses a simplified approach that results in integer values. Julian dates can be calculated and interpreted differently; the calculation method used by Oracle results in a seven-digit number for dates most often used, as in 2448355 for 08-APR-1991.

Use the format mask `J' with date functions (TO_DATE or TO_CHAR, but not TO_NUMBER) to convert date data into Julian dates. For example, the following query returns all dates in Julian date format:

SELECT TO_CHAR (hiredate, 'J') FROM emp;

Use the TO_NUMBER function to use Julian dates in calculations. You can use the TO_DATE function to enter Julian dates:

INSERT INTO emp (hiredate) VALUES (TO_DATE(2448921, 'J'));

Note: Oracle Julian dates might not be compatible with Julian dates generated by other date algorithms.

Date Arithmetic

Oracle date arithmetic takes into account the anomalies of the Western calendars used throughout history. For example, the switch from the Julian to the Gregorian calendar, 15 October 1582, eliminated the previous 10 days (05 October - 14 October). Additionally, the year 0 does not exist.

Missing dates can be entered into the database, but are ignored in date arithmetic and treated as the next date. For example, the next day after 04 October 1582 is 15 October 1582 and the day following 05 October 1582 is 16 Oct 1582.

Note: This discussion of date arithmetic may not apply to all countries' date standards (for example, Asian countries).

LONG Datatype

Columns defined as LONG can store variable-length character data containing up to 2 gigabytes of information. LONG columns have many of the characteristics of VARCHAR2 columns. The length of LONG values might be limited by the memory available on your computer.

Uses of LONG Data

Columns defined with the LONG datatype are used in the data dictionary to store the text of view definitions. You can use columns defined as LONG in SELECT lists, SET clauses of UPDATE statements, and VALUES clauses of INSERT statements.

Restrictions on LONG and LONG RAW Data

Though LONG (and LONG RAW) columns have many uses, there are some restrictions on their use:

Suggestion: When you design tables containing LONG or LONG RAW data, you should place each LONG or LONG RAW column in a table separate from any other data associated with it, rather than storing the LONG or LONG RAW column and its associated data together in the same table. You can then relate the two tables with a referential integrity constraint. This design allows SQL statements that access only the associated data to avoid reading through LONG or LONG RAW data.

Example

To store information on magazine articles, including the texts of each article, create two tables:

CREATE TABLE article_header (id             NUMBER
                                            PRIMARY KEY,
                             title          VARCHAR2(200),
                             first_author   VARCHAR2(30),
                             journal        VARCHAR2(50),
                             pub_date       DATE)
CREATE TABLE article_text   (id             NUMBER
                                            REFERENCES
                                            article_header,
                             text           LONG)

The ARTICLE_TEXT table stores only the text of each article. The ARTICLE_HEADER table stores all other information about the article, including the title, first author, and journal and date of publication. The tables are related by the referential integrity constraint on the ID column of each table.

This design allows SQL statements to query data other than the text of an article without reading through the text. If you want to select all first authors published in Nature magazine during July 1991, you can issue this statement that queries the ARTICLE_HEADER table:

SELECT first_author
   FROM article_header
   WHERE journal = 'NATURE'
      AND TO_CHAR(pub_date, 'MM YYYY') = '07 1991')

If the text of each article were stored in the same table with the first author, publication, and publication date, Oracle would have to read through the text to perform this query.

RAW and LONG RAW Datatypes

Use the RAW and LONG RAW datatypes for data that is not to be interpreted (not to be converted when moving data between different systems) by Oracle. These datatypes are intended for binary data and byte strings. For example, LONG RAW can be used to store graphics, sound, documents, and arrays of binary data; the interpretation is dependent on the use.

Like the character datatype VARCHAR2, RAW and LONG RAW are variable length; however, the length of RAW cannot exceed 255 bytes, and there is no character set conversion done for RAW or LONG RAW data. CHAR, VARCHAR2, and LONG data is automatically converted from the database character set to the character set defined for the user session by the NLS_LANG parameter, where these are different.

When Oracle automatically converts RAW or LONG RAW data to and from CHAR data (as is the case when entering RAW data as a literal in an INSERT statement), the data is represented as one hexadecimal character representing the bit pattern for every four bits of RAW data. For example, one byte of RAW data with bits 11001011 is displayed and entered as 'CB'.

LONG RAW data cannot be indexed, but RAW data can be indexed.

ROWIDs and the ROWID Datatype

Every row in a non-clustered table of an Oracle database is assigned a unique ROWID that corresponds to the physical address of a row's row piece (initial row piece if the row is chained among multiple row pieces). In the case of clustered tables, rows in different tables that are in the same data block can have the same ROWID.

Each table in an Oracle database internally has a pseudo-column named ROWID; this pseudo-column is not evident when listing the structure of a table by executing a SELECT * FROM . . . statement, or a DESCRIBE . . . statement using SQL*Plus. However, each row's address can be retrieved with a SQL query using the reserved word ROWID as a column name, as in

SELECT ROWID, ename FROM emp;

ROWIDs use a binary representation of the physical address for each row selected. When queried using SQL*Plus or SQL*DBA, the binary representation is converted to a VARCHAR2/hexadecimal representation. The above query might return the following row information:

ROWID              ENAME
------------------ ----------
00000DD5.0000.0001 SMITH
00000DD5.0001.0001 ALLEN
00000DD5.0002.0001 WARD

As shown above, a ROWID's VARCHAR2/hexadecimal representation is divided into three pieces: block.row.file.

A row's assigned ROWID remains unchanged unless the row is exported and imported (using the IMPORT and EXPORT utilities). When you delete a row from a table (and commit the encompassing transaction), the deleted row's associated ROWID can be assigned to a row inserted in a subsequent transaction.

You cannot set the value of the pseudo-column ROWID in INSERT or UPDATE statements. The ROWIDs in the pseudo-column ROWID are used internally by Oracle for various operations (see the next section). Though you can reference ROWIDs in the pseudo-column ROWID like other table columns (used in SELECT lists and WHERE clauses), ROWIDs in this pseudo-column are not stored in the database, nor are they database data.

ROWIDs and Non-Oracle Databases To execute Oracle database applications against non-Oracle database servers, use the Oracle Open Gateway. In such cases, the binary format of ROWIDs varies according to the characteristics of the non-Oracle system. Furthermore, no standard translation to VARCHAR2/hexadecimal format is available. Programs can still use the ROWID datatype; however, they must use a non-standard translation to hexadecimal format of length up to 256 bytes. Refer to the relevant manual for OCIs or Precompilers for further details on handling ROWIDs with non-Oracle systems.

How ROWIDs Are Used

ROWIDs are used internally by Oracle in the construction of indexes. Each key in an index is associated with a ROWID that points to the associated row's address for fast access. Some of the characteristics of ROWIDs include the following:

Before using ROWIDs in DML statements, ROWIDs should be verified and guaranteed not to change; that is, the intended rows should be locked so they cannot be deleted. Attempting to retrieve a row using an invalid ROWID results in either no row being returned or error 1410, invalid ROWID.

You can also create tables with columns defined using the ROWID datatype; for example, you might define an exception table with a column of datatype ROWID to store the ROWIDs of rows in the database that violate integrity constraints. Such columns defined using the ROWID datatype behave like other table columns; that is, values can be updated, etc. All values in a column defined as datatype ROWID require six bytes to store pertinent column data.

Examples of Using ROWIDs

Using some group functions with ROWID, you can see how data is internally stored in an Oracle database. The function SUBSTR can be used to break the data in ROWID into its three components (file, block, and row). For example, the query

SELECT ROWID, SUBSTR(ROWID,15,4) "FILE",
      SUBSTR(ROWID,1,8) "BLOCK",
      SUBSTR(ROWID,10,4) "ROW"
      FROM emp;

might return the following data:

ROWID               FILE  BLOCK     ROW
------------------  ----  --------  ----
00000DD5.0000.0001  0001  00000DD5  0000
00000DD5.0001.0001  0001  00000DD5  0001
00000DD5.0002.0001  0001  00000DD5  0002

ROWIDs can be useful to reveal information about the physical storage of a table's data. For example, if you are interested in the physical location of a table's rows (for example, for table striping), the following query tells how many datafiles contain rows of a given table:

SELECT COUNT(DISTINCT(SUBSTR(ROWID,15,4))) "FILES" FROM tablename;

which might return

   FILES
--------

       2

Summary of Oracle Datatype Information

For quick reference, Table 5 - 2 summarizes the important information about each Oracle datatype.

Datatype Description
CHAR (size) Fixed-length character data of length size. Maximum size is 255 bytes. Default size is 1 byte.
VARCHAR2 (size) Variable-length character data. Maximum size is 2000 bytes.
FLOAT (p) A floating-point number with binary precision p. FLOAT with no precision is the same as FLOAT (126).
NUMBER (p, s) Variable-length numeric data. The precision p (total number of digits) can range from 1 to 38. The scale is s (number of decimal places) and can range from -84 to 127.
DATE Fixed-length date and time data, ranging from January 1, 4712 BC to December 31, 4712 AD. Default format is the value of the NLS_DATE_FORMAT or ALTER SESSION parameter.
LONG Variable-length character date up to 2^31 * 1 bytes - 1, or 2 gigabytes - 1 byte.
RAW (size) Variable-length raw binary data. A maximum size must be specified, up to 255.
LONG RAW Variable-length raw binary data up to 2^31 * 1 bytes - 1, or 2 gigabytes - 1 byte.
ROWID Binary data representing row addresses.
MLSLABEL Variable-length tag (2 - 5 bytes) that maps to a binary operating system label. For use with Trusted Oracle; see [*] for details.
Table 5 - 2. Summary of Oracle Datatype Information


Contents Index Home Previous Next