SQL*Loader uses the SQL array-interface option to transfer data to the RDBMS. Multiple rows are read at one time and stored in the bind array. When SQL*Loader sends Oracle an INSERT command, the entire array is inserted at one time. After the rows in the bind array are inserted, a COMMIT is issued.
Although the entire bind array need not be in contiguous memory, the buffer for each field in the bind array must occupy contiguous memory. If the operating system cannot supply enough contiguous memory to store a field, SQL*Loader generates an error.
In general, any reasonably large size will permit SQL*Loader to operate effectively. It is not usually necessary to perform the detailed calculations described in this section. This section should be read when maximum performance is desired, or when an explanation of memory usage is needed.
As part of its initialization, SQL*Loader determines the space required to load a single row. If that size is too large to fit within the specified maximum, the load terminates with an error.
SQL*Loader then multiplies that size by the number of rows for the load, whether that value was specified with the command-line parameter ROWS (see ) or the OPTIONS clause in the control file (see ). If that size fits within the bind array maximum, the load continues--SQL*Loader does not try to expand the number of rows to reach the maximum bind array size. That is, if the number of rows and the maximum bind array size are both specified, SQL*Loader always uses the smaller value for the bind array.
If the maximum bind array size is too small to accommodate the initial number of rows, SQL*Loader uses a smaller number of rows that fits within the maximum.
bind array size = (number of rows) * (maximum row length)
where:
(maximum row length) = SUM(fixed field lengths) +
SUM(maximum varying field lengths) +
SUM(overhead for varying length fields)
Many fields do not vary in size. These fixed-length fields are the same for each loaded row. For those fields, the maximum length of the field is the field size, in bytes, as described in "Specifying Datatypes" . There is no overhead for these fields.
The fields that can vary in size from row to row are
VARCHAR VARGRAPHIC
CHAR DATE
numeric EXTERNAL
The maximum length of these datatypes is described in "Specifying Datatypes" . The maximum lengths describe the number of bytes, or character positions, that the fields can occupy in the input data record. That length also describes the amount of storage that each field occupies in the bind array, but the bind array includes additional overhead for fields that can vary in size.
When the character datatypes (CHAR, DATE, and numeric EXTERNAL) are specified with delimiters, any lengths specified for these fields are maximum lengths. When specified without delimiters, the size in the record is fixed, but the size of the inserted field may still vary, due to whitespace trimming. So internally, these datatypes are always treated as varying-length fields--even when they are fixed-length fields.
A length indicator is included for each of these fields in the bind array. The space reserved for the field in the bind array is large enough to hold the longest possible value of the field. The length indicator gives the actual length of the field for each row.
In summary:
bind array size =
(number of rows) * ( SUM(fixed field lengths)
+ SUM(maximum varying field lengths)
+ ( (number of varying length fields)
* (size of length-indicator) )
)
OPTIONS (ROWS=1)
LOAD DATA
INFILE *
APPEND
INTO TABLE DEPT
(deptno POSITION(1:1) CHAR)
BEGINDATA
a
This control file "loads" a one-character field using a one-row bind array. No data is actually loaded, due to the numeric conversion error that occurs when "a" is loaded as a number. The bind array size shown in the log file, minus one (the length of the character field) is the value of the length indicator.
Note: A similar technique can determine bind array size without doing any calculations. Run your control file without any data and with ROWS=1 to determine the memory requirements for a single row of data. Multiply by the number of rows you want in the bind array to get the bind array size.
Datatype | Size |
INTEGER | OS-dependent |
SMALLINT | |
FLOAT | |
DOUBLE | |
Datatype | Default Size | Length Specified with POSITION | Length Specified with DATATYPE |
GRAPHIC | None | L | 2*L |
GRAPHIC EXTERNAL | None | L - 2 | 2*(L-2) |
VARGRAPHIC | 4Kb*2 | L+S | (2*L)+S |
Datatype | Default Size | Maximum Length Specified (L) |
VARCHAR | 4Kb | L+S |
CHAR (delimited) DATE (delimited) numeric EXTERNAL (delimited) MLSLABEL (delimited) | 255 | L+S |
CHAR(10) TERMINATED BY ","
uses (10 + 2) * 64 = 768 bytes in the bind array, assuming that the length indicator is two bytes long. However:
CHAR TERMINATED BY ","
uses (255 + 2) * 64 = 16,448 bytes, because the default maximum size for a delimited field is 255. This can make a considerable difference in the number of rows that fit into the bind array.
If the same field in the data record is mentioned in multiple INTO TABLE clauses, it requires additional space in the bind array each time it is mentioned. So, it is especially important to minimize the buffer allocations for fields like these.