A Beginner’s Guide to SQL*Loader

English

SQL*Loader is an essential tool for efficiently loading large volumes of data into an Oracle database. This article provides a complete guide to meet all needs, from basic to practical applications, troubleshooting, and performance optimization.


1.What is SQL*Loader?

SQL*Loader is a utility used to load data from external files (such as CSV or fixed-width files) into an Oracle database. It has the following features:

  • High Speed: Capable of loading large-scale data quickly.
  • Flexibility: Supports conditional loading and data transformation.
  • Powerful Error Handling: Records details of error rows and ignored rows.

When is it used?

  • Data integration from external systems
  • Loading large amounts of data in batch processes
  • Data import during system migration

2.Basic Components of SQL*Loader

To use SQL*Loader, the following files are required.

File TypeRole
Data FileStores the data to be loaded.
Control FileDescribes the loading method (target table, data format, conditions, etc.).
Log FileRecords the processing results (auto-generated).
Bad File (.bad)Records data that could not be loaded due to errors.
Discard File (.dis)Records data that was discarded because it did not meet conditions.

Example: Loading Employee Data

Here is a specific example of loading employee data into the database.

1. Data File (employees.csv)

Create the file with the following content:

1,John,Sales,5000
2,Jane,HR,5500
3,Bob,IT,6000
4,Alice,Sales,7000
[oracle@v19single ~]$ vi employees.csv  ★Create employees.csv
[oracle@v19single ~]$ cat employees.csv
1,John,Sales,5000
2,Jane,HR,5500
3,Bob,IT,6000
4,Alice,Sales,7000
[oracle@v19single ~]$

2. Control File (employees.ctl)

Describe the load settings in the control file.

LOAD DATA
INFILE 'employees.csv'
INTO TABLE employees
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
TRAILING NULLCOLS
(
    id INTEGER EXTERNAL,
    name CHAR,
    department CHAR,
    salary INTEGER EXTERNAL
)

Key Points:

  • INFILE: Path to the data file.
  • FIELDS TERMINATED BY ',': Specifies comma-delimited.
  • TRAILING NULLCOLS: Inserts NULL if data is missing at the end of a record.
[oracle@v19single ~]$ vi employees.ctl  ★Create employees.ctl
[oracle@v19single ~]$ cat employees.ctl
LOAD DATA
INFILE 'employees.csv'
INTO TABLE employees
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
TRAILING NULLCOLS
(
    id INTEGER EXTERNAL,
    name CHAR,
    department CHAR,
    salary INTEGER EXTERNAL
)
[oracle@v19single ~]$

3. Table Creation (Database-side Preparation)

Create the table to store the data using the following SQL.

CREATE TABLE employees (
    id NUMBER PRIMARY KEY,
    name VARCHAR2(50),
    department VARCHAR2(50),
    salary NUMBER
);
[oracle@v19single ~]$ sqlplus user1/user1

SQL*Plus: Release 19.0.0.0.0 - Production on Tue Dec 10 20:36:04 2024
Version 19.21.0.0.0

Copyright (c) 1982, 2022, Oracle.  All rights reserved.



Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.21.0.0.0

SQL> CREATE TABLE employees (  ★Create table
  2      id NUMBER PRIMARY KEY,
  3      name VARCHAR2(50),
  4      department VARCHAR2(50),
  5      salary NUMBER
  6  );

Table created.

4. Executing SQL*Loader

Run SQL*Loader with the following command.

sqlldr userid=username/password control=employees.ctl log=employees.log
[oracle@v19single ~]$ sqlldr user1/user1 control=employees.ctl log=employees.log

SQL*Loader: Release 19.0.0.0.0 - Production on Tue Dec 10 20:39:05 2024
Version 19.21.0.0.0

Copyright (c) 1982, 2019, Oracle and/or its affiliates.  All rights reserved.

Path used:      Conventional
Commit point reached - logical record count 4

Table EMPLOYEES:
  4 Rows successfully loaded.

Check the log file:
  employees.log
for more information about the load.
[oracle@v19single ~]$

Verifying the Execution Results

Check the loaded data.

SELECT * FROM employees;

Result:

IDNAMEDEPARTMENTSALARY1JohnSales50002JaneHR55003BobIT60004AliceSales7000
[oracle@v19single ~]$ sqlplus user1/user1

SQL*Plus: Release 19.0.0.0.0 - Production on Tue Dec 10 20:40:32 2024
Version 19.21.0.0.0

Copyright (c) 1982, 2022, Oracle.  All rights reserved.

Last successful login time: Tue Dec 10 2024 20:39:15 +09:00


Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.21.0.0.0

SQL> set lin 1000 pages 1000
SQL> col name for a20
SQL> col department for a20
SQL> col salary for 999999999
SQL> SELECT * FROM employees;

        ID NAME                 DEPARTMENT               SALARY
---------- -------------------- -------------------- ----------
         1 John                 Sales                      5000
         2 Jane                 HR                         5500
         3 Bob                  IT                         6000
         4 Alice                Sales                      7000

3.Practical Application Examples

Conditional Data Loading

An example of loading only data where the department is ‘Sales’:

WHEN department = 'Sales'

Data Manipulation and Transformation

An example of transforming data during the load. This converts the department name to uppercase:

FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
(
    id INTEGER EXTERNAL,
    name CHAR,
    department CHAR "UPPER(:department)",
    salary INTEGER EXTERNAL
)

Speeding up with Parallel Processing

To load large amounts of data at high speed, use the following options:

sqlldr userid=username/password@dbname control=employees.ctl direct=true parallel=true

Option Explanations:

  • direct=true: Uses a Direct Path load.
  • parallel=true: Enables parallel processing.

Automating Error Handling

Error rows are recorded in the Bad File (.bad). By preparing a script to automatically reprocess this data, you can improve work efficiency.


4.Troubleshooting

SQL*Loader errors are recorded in the log file and the bad file. Common errors and their solutions are shown below.

Error Example 1: Data Type Mismatch

Error Log:

Record 3: Rejected - Error on table EMPLOYEES, column SALARY.
ORA-01722: invalid number

Cause:

String data exists in the SALARY column.

Solution:

  • Correct the data file.
  • Specify data type conversion in the control file.

Error Example 2: Unique Constraint Violation

Error Log:

Record 1: Rejected - Error on table EMPLOYEES.
ORA-00001: unique constraint (EMPLOYEES_PK) violated

Cause:

Duplicate data in the id column.

Solution:

  • Exclude duplicate data beforehand.

5.SQL*Loader Performance Optimization

Adjusting the Bind Array Size

Improve load speed by adding the following to the control file:

OPTIONS (BINDSIZE=1048576, ROWS=5000)

Temporarily Disabling Indexes and Constraints

Speed improves if you disable indexes and constraints before the load and rebuild them afterward.


6.SQL*Loader Comparison: Differences from Data Pump

FeatureSQL*LoaderData Pump
Data FormatCSV, Fixed-widthOracle-specific binary format
Processing SpeedDepends on file sizeOptimal for very large data
Conditional LoadPossibleNot possible

7.Summary

SQL*Loader is a flexible and powerful data loading tool. By applying the contents introduced in this article to your work, you can improve the efficiency and accuracy of data imports.

Review of Key Points:

  • Understand basic operations to load data reliably.
  • Address business needs with conditional loading and data transformation.
  • Maximize operational efficiency with error handling and high-speed techniques.

Let’s put this into practice and master SQL*Loader!

[reference]
SQL*Loader – Utilities

コメント

Copied title and URL