Oracle7 Server Utilities

Contents Index Home Previous Next

Trimming of Blanks and Tabs

Blanks and tabs constitute whitespace. Depending on how the field is specified, whitespace at the start of a field (leading whitespace) and at the end of a field (trailing whitespace) may, or may not be, included when the field is inserted into the database. This section describes the way character data fields are recognized, and how they are loaded. In particular, it describes the conditions under which whitespace is trimmed from fields.

Note: Specifying PRESERVE BLANKS changes this behavior. See "Preserving Whitespace" [*] for more information.

Datatypes

The information in this section applies only to fields specified with one of the character-data datatypes:

VARCHAR Fields

Although VARCHAR fields also contain character data, these fields are never trimmed. A VARCHAR field includes all whitespace that is part of the field in the datafile.

Field Length Specifications

There are two ways to specify field length. If a field has a constant length that is defined in the control file, then it has a predetermined size. If a field's length is not known in advance, but depends on indicators in the record, then the field is delimited.

Record Formats

When all of the fields in a record have known positions and lengths, then the record is in fixed format, as shown in Case 2 [*]. If the size or position of any fields in the record vary from record to record, then the record has a variable format, as shown in Case 1 [*]. When the record size also varies, then the file is in stream format. Variable format fields are specified with delimiters. Fixed format fields are specified with predetermined sizes and fixed positions.

Predetermined Size Fields

Fields that have a predetermined size are specified with a starting position and ending position, or with a length, as in the following examples:

loc POSITION(19:31) 
loc CHAR(14) 

In the second case, even though the field's exact position is not specified, the field's length is predetermined.

Delimited Fields

Delimiters are characters that demarcate field boundaries. Enclosure delimiters surround a field, like the quotes in:

"__aa__" 

where "__" represents blanks or tabs. Termination delimiters signal the end of a field, like the comma in:

__aa__, 

Delimiters are specified with the control clauses TERMINATED BY and ENCLOSED BY, as shown in the following examples:

loc POSITION(19) TERMINATED BY "," 
loc POSITION(19) ENCLOSED BY '"' 
loc TERMINATED BY "." OPTIONALLY ENCLOSED BY '|' 

Combining Delimiters with Predetermined Size

If predetermined size is specified for a delimited field, and the delimiter is not found within the boundaries indicated by the size specification; then an error is generated. For example, if you specify:

loc POSITION(19:31) CHAR TERMINATED BY "," 

and no comma is found between positions 19 and 31 of the input record, then the record is rejected. If a comma is found, then it delimits the field.

Relative Positioning of Fields

When a starting position is not specified for a field, it begins immediately after the end of the previous field. Figure 5 - 2 illustrates this situation when the previous field has a predetermined size.

Figure 5 - 2. Relative positioning after a fixed field

If the previous field is terminated by a delimiter, then the next field begins immediately after the delimiter, as shown in Figure 5 - 3.

Figure 5 - 3. Relative positioning after a delimited field

When a field is specified both with enclosure delimiters and a termination delimiter, then the next field starts after the termination delimiter, as shown in Figure 5 - 4. If a non-whitespace character is found after the enclosure delimiter, but before the terminator, then SQL*Loader generates an error.

Figure 5 - 4. Relative positioning after enclosure delimiters

Leading Whitespace

In Figure 5 - 4, both fields are stored with leading whitespace. Fields do not include leading whitespace in the following cases:

These cases are illustrated in the following sections.

Previous Field Terminated by Whitespace

If the previous field is TERMINATED BY WHITESPACE, then all the whitespace after the field acts as the delimiter. The next field starts at the next non-whitespace character. Figure 5 - 5 illustrates this case.

Figure 5 - 5. Fields terminated by whitespace

This situation occurs when the previous field is explicitly specified with the TERMINATED BY WHITESPACE clause, as shown in the example. It also occurs when you use the global FIELDS TERMINATED BY WHITESPACE clause.

Optional Enclosure Delimiters

Leading whitespace is also removed from a field when optional enclosure delimiters are specified but not present.

Whenever optional enclosure delimiters are specified, SQL*Loader scans forward, looking for the first delimiter. If none is found, then the first non-whitespace character signals the start of the field. SQL*Loader skips over whitespace, eliminating it from the field. This situation is shown in Figure 5 - 6.

Figure 5 - 6. Fields terminated by optional enclosing delimiters

Unlike the case when the previous field is TERMINATED BY WHITESPACE, this specification removes leading whitespace even when a starting position is specified for the current field.

Note: If enclosure delimiters are present, leading whitespace after the initial enclosure delimiter is kept, but whitespace before this delimiter is discarded. See the first quote in FIELD1, Figure 5 - 6.

Trailing Whitespace

Trailing whitespace is only trimmed from character-data fields that have a predetermined size. It is always trimmed from those fields.

Enclosed Fields

If a field is enclosed, or terminated and enclosed, like the first field shown in Figure 5 - 6, then any whitespace outside the enclosure delimiters is not part of the field. Any whitespace between the enclosure delimiters belongs to the field, whether it is leading or trailing whitespace.

Trimming Whitespace: Summary

Table 5 - 5 summarizes when and how whitespace is removed from input data fields when PRESERVE BLANKS is not specified. See the following section, "Preserving Whitespace", for details on how to prevent trimming.

Specification Data Result Leading Whitespace Present(1) Trailing Whitespace Present(1)
Predetermined Size __aa__ __aa Y N
Terminated __aa__, __aa__ Y Y(2)
Enclosed "__aa__" __aa__ Y Y
Terminated and Enclosed "__aa__", __aa__ Y Y
Optional Enclosure (present) "__aa__", __aa__ Y Y
Optional Enclosure (absent) __aa__, aa__ N Y
Previous Field Terminated by Whitespace __aa__ aa(3) N (3)
(1) When an allow-blank field is trimmed, its value is null.
(2) Except for fields that are TERMINATED BY WHITESPACE
(3) Presence of trailing whitespace depends on the current field's specification, as shown by the other entries in the table.
Table 5 - 5. Trim Table


Contents Index Home Previous Next