SQL*Loader extracts data from a field in the input file, guided by the datatype specification in the control file. SQL*Loader then sends the field to the server to be stored in the appropriate column (as part of an array of row inserts). The server does any necessary data conversion to store the data in the proper internal format. The "Data Conversion and Datatype Specification" section beginning contains diagrams that illustrate these points.
The datatype of the data in the file does not necessarily have to be the same as the datatype of the column in the Oracle table. Oracle automatically performs conversions, but you need to ensure that the conversion makes sense and does not generate errors. For instance, when a datafile field with datatype CHAR is loaded into a database column with datatype NUMBER, you must make sure that the contents of the character field represent a valid number.
Note: SQL*Loader does not contain datatype specifications for Oracle internal datatypes such as NUMBER or VARCHAR2. SQL*Loader's datatypes describe data that can be produced with text editors (character datatypes) and with standard programming languages (native datatypes). However, although SQL*Loader does not recognize datatypes like NUMBER and VARCHAR2, any data that Oracle is capable of converting may be loaded into these or other database columns.
INTEGER ZONED SMALLINT VARCHAR FLOAT GRAPHIC DOUBLE GRAPHIC EXTERNAL BYTEINT VARGRAPHIC (packed) DECIMAL RAW MLSLABEL (Trusted Oracle)
Since these datatypes contain binary data, most of them do not readily transport across operating systems. (See "Loading Data Across Different Operating Systems" .) RAW data and GRAPHIC data is the exceptions. SQL*Loader does not attempt to interpret these datatypes, but simply stores them "as is".
Additional Information: Native datatypes cannot be specified with delimiters. The size of the native datatypes INTEGER, SMALLINT, FLOAT, and DOUBLE are determined by the host operating system. Their size is fixed -- it cannot be overridden in the control file. (Refer to your Oracle operating system-specific documentation for more information.) The sizes of the other native datatypes may be specified in the control file.
INTEGER
SMALLINT
Additional Information: This is the SHORT INT datatype in the C programming language. One way to determine its length is to make a small control file with no data and look at the resulting log file. This length cannot be overridden in the control file. See your Oracle operating system-specific documentation for details.
FLOAT
DOUBLE
The syntax for this datatype is
BYTEINT
An example is
(column1 position(1) BYTEINT,
column2 BYTEINT,
...
)
The syntax for this datatype is:
where precision is the number of digits in the number, and scale (if given) is the number of digits to the right of the (implied) decimal point. For example:
sal POSITION(32) ZONED(8),
specifies an 8-digit integer starting at position 32.
The syntax for the this datatype is:
where:
precision | Is the number of digits in the value. The character length of the field, as computed from digits, is (digits + 2) / 2, rounded down. |
scale | Is the scaling factor, or number of digits to the right of the decimal point. Default is zero (indicating an integer). May be greater than the number of digits but may not be negative. |
sal DECIMAL (7,2)
would load a number equivalent to +12345.67. In the data record, this field would take up 4 bytes, as shown in Figure 5 - 1. (The byte length of a DECIMAL field is equivalent to (N+1)/2, rounded up, where N is the number of digits in the value, and one is added for the sign.)
Figure 5 - 1. Packed Decimal Representation of +12345.67
The syntax for this datatype is
The length of this field is the number of bytes specified in the control file. This length is limited only by the length of the target column in the database and by memory resources.
The syntax for this datatype is
For both GRAPHIC and GRAPHIC EXTERNAL, if you specify POSITION(start:end) you give the exact location of the field in the logical record. If you specify the length after the GRAPHIC (EXTERNAL) keyword, however, then you give the number of double-byte graphic characters. That value is multiplied by 2 to find the length of the field in bytes. If the number of graphic characters is specified, then any length derived from POSITION is ignored.
The syntax for this datatype is:
where:
GRAPHIC | Data is double-byte characters. |
EXTERNAL | First and last characters are ignored. |
graphic_char_length | Length in DBCS (see GRAPHIC, above) |
To describe ####, use "POSITION(1:4) GRAPHIC" or "POSITION(1) GRAPHIC(2)".
To describe [####], use "POSITION(1:6) GRAPHIC EXTERNAL" or "POSITION(1) GRAPHIC EXTERNAL(2)".
Additional Information: The size of the length subfield is the size of the SQL*Loader SMALLINT datatype on your system (C type SHORT INT). See SMALLINT for more information.
The length of the current field is given in the first two bytes. This length is a count of graphic (double-byte) characters. So it is multiplied by two to determine the number of bytes to read.
The syntax for this datatype is
A maximum length specified after the VARGRAPHIC keyword does not include the size of the length subfield. The maximum length specifies the number of graphic (double byte) characters. So it is also multiplied by two to determine the maximum length of the field in bytes.
The default maximum field length is 4K graphic characters, or 8 Kb (2 * 4K). It is a good idea to specify a maximum length for such fields whenever possible, to minimize memory requirements. See "Determining the Size of the Bind Array" for more details.
The POSITION clause, if used, gives the location of the length subfield, not of the first graphic character. If you specify POSITION(start:end), the end location determines a maximum length for the field. Both start and end identify single-character (byte) positions in the file. Start is subtracted from (end + 1) to give the length of the field in bytes. If a maximum length is specified, it overrides any maximum length calculated from POSITION.
If a VARGRAPHIC field is truncated by the end of the logical record before its full length is read, a warning is issued. Because a VARCHAR field's length is embedded in every occurrence of the input data for that field, it is assumed to be accurate.
VARGRAPHIC data cannot be delimited.
Additional Information: The size of the length subfield is the size of the SQL*Loader SMALLINT datatype on your system (C type SHORT INT). See SMALLINT for more information.
The syntax for this datatype is:
A maximum length specified in the control file does not include the size of the length subfield. If you specify the optional maximum length after the VARCHAR keyword, then a buffer of that size is allocated for these fields.
The default buffer size is 4 Kb. Specifying the smallest maximum length that is needed to load your data can minimize SQL*Loader's memory requirements, especially if you have many VARCHAR fields. See "Determining the Size of the Bind Array" for more details.
The POSITION clause, if used, gives the location of the length subfield, not of the first text character. If you specify POSITION(start:end), the end location determines a maximum length for the field. Start is subtracted from (end + 1) to give the length of the field in bytes. If a maximum length is specified, it overrides any length calculated from POSITION.
If a VARCHAR field is truncated by the end of the logical record before its full length is read, a warning is issued. Because a VARCHAR field's length is embedded in every occurrence of the input data for that field, it is assumed to be accurate.
VARCHAR data cannot be delimited.
column1 POSITION(1:6) INTEGER
then a warning is issued, and the proper length (4) is used. In this case, the log file shows the actual length used under the heading "Len" in the column table:
Column Name Position Len Term Encl Datatype
----------------------- --------- ----- ---- ---- ---------
COLUMN1 1:6 4 INTEGER
Additional Information: In addition, the MLSLABEL character datatype exists in Trusted Oracle. See the Trusted Oracle7 Server Administrator's Guide for more information on this datatype.
A field of datatype CHAR may also be variable-length delimited or enclosed. See "Specifying Delimiters" .
Attention: If the column in the database table is defined as LONG, you must explicitly specify a maximum length (maximum for a LONG is 2 gigabytes) either with a length specifier on the CHAR keyword or with the POSITION keyword. This guarantees that a large enough buffer is allocated for the value and is necessary even if the data is delimited or enclosed.
For example:
LOAD DATA
INTO TABLE DATES (COL_A POSITION (1:15) DATE "DD-Mon-YYYY")
BEGINDATA
1-Jan-1991
1-Apr-1991 28-Feb-1991
Attention: Whitespace is ignored and dates are parsed from left to right unless delimiters are present.
The length specification is optional, unless a varying-length date mask is specified. In the example above, the date mask specifies a fixed-length date format of 11 characters. SQL*Loader counts 11 characters in the mask, and therefore expects a maximum of 11 characters in the field, so the specification works properly. But, with a specification such as
DATE "Month dd, YYYY"
the date mask is 14 characters, while the maximum length of a field such as
September 30, 1991
is 18 characters. In this case, a length must be specified. Similarly, a length is required for any Julian dates (date mask "J")--a field length is required any time the length of the date string could exceed the length of the mask (that is, the count of characters in the mask).
If an explicit length is not specified, it can be derived from the POSITION clause. It is a good idea to specify the length whenever you use a mask, unless you are absolutely sure that the length of the data is less than, or equal to, the length of the mask.
An explicit length specification, if present, overrides the length in the POSITION clause. Either of these overrides the length derived from the mask. The mask may be any valid Oracle date mask. If you omit the mask, the default Oracle date mask of "dd-mon-yy" is used. The length must be enclosed in parentheses and the mask in quotation marks. Case 3 has an example of the DATE datatype.
A field of datatype DATE may also be specified with delimiters. For more information, see "Specifying Delimiters" .
A date field that consists entirely of whitespace produces an error unless NULLIF BLANKS is specified. For more information, see "Loading All-Blank Fields" .
The data is a number in character form (not binary representation). As such, these datatypes are identical to CHAR and are treated identically, with one exception: the use of DEFAULTIF. If you want the default to be null, use CHAR. If you want it to be zero, use EXTERNAL.
Numeric EXTERNAL may be specified with lengths and delimiters, just like CHAR data. Length is optional, but if specified, overrides POSITION.
The syntax for this datatype is:
Delimited data can be TERMINATED or ENCLOSED.
If two delimiter characters are encountered next to each other, a single occurrence of the delimiter character is used in the data value. For example, 'DON''T' is stored as DON'T. However, if the field consists of just two delimiter characters, its value is null. You may specify a TERMINATED BY clause, an ENCLOSED BY clause, or both. If both are used, the TERMINATED BY clause must come first.
The syntax for delimiter specifications is:
where:
TERMINATED BY ',' a data string,
ENCLOSED BY '"' "a data string"
TERMINATED BY ',' ENCLOSED BY '"' "a data string",
ENCLOSED BY "(" AND ')' (a data string)
(The delimiters are left paren's, ((, and right paren's, )).)
with this field specification:
ENCLOSED BY "(" AND ")"
puts the following string into the database:
The delimiters are left paren's, (, and right paren's, ).
For this reason, problems can arise when adjacent fields use the same delimiters. For example, the following specification:
field1 TERMINATED BY "/"
field2 ENCLOSED by "/"
the following data will be interpreted properly:
This is the first string/ /This is the second string/
But if field1 and field2 were adjacent, then the results would be incorrect, because
This is the first string//This is the second string/
would be interpreted as a single character string with a "/" in the middle of it, and that string would belong to field1.
For example, if
position(1:10) char(15)
is specified, then the length of the field is 15.
If the expected delimiter is absent and no maximum length has been specified, then the end of record terminates the field. If TRAILING NULLCOLS is specified, remaining fields are null. If either the delimiter or the end of record produce a field that is longer than the specified maximum, SQL*Loader generates an error.
"Month dd, yyyy"
then "May 3, 1991" would occupy 11 character positions in the record, while "January 31, 1992" would occupy 16.
If starting and ending positions are specified, however, then the length calculated from the position specification overrides a length derived from the mask. A specified length such as "DATE (12)" overrides either of those. If the date field is also specified with terminating or enclosing delimiters, then the length specified in the control file is interpreted as a maximum length for the field.