Oracle SQL Scalar Subqueries: From Basics to Optimization Strategies

English

When extracting data with SQL, do you ever struggle with requirements like “I want to fetch just one value from another table” or “I want to filter based on a comparison with an aggregation result”?

Scalar Subqueries are a powerful technique that uses a subquery within SQL to return a “single row and single column” value, handling it just like a constant or a column.

In this article, assuming Oracle Database 19c, we will explain everything from the basic syntax of scalar subqueries to practical performance (speed-up) points, making it easy for beginners to understand.

Conclusion: Key Points of Scalar Subqueries

For those short on time, here is a summary of the characteristics and precautions of this feature.

  • Golden Rule for Return Values: It must strictly return a result of “1 row and 1 column” (if 0 rows, it becomes NULL).
  • Placement: Can be used in the SELECT clause, WHERE clause, ORDER BY clause, CASE expressions, etc.
  • Operational Image: Like a loop process, the subquery is executed for every single row of the main query (in the case of a correlated subquery).
  • Caution: While convenient, overuse can lead to slowness, so understanding execution plans and caching effects is necessary.

1. What is a Scalar Subquery?

A scalar subquery is a subquery that returns a “single value (scalar value).”

Usually, a SELECT statement returns a result in a table format (multiple rows and multiple columns), but a scalar subquery can be embedded as part of SQL just as if it were a numerical value 100 or a string ‘Tokyo’.

Scenarios where it is used

  • Master Reference: Retrieving a name corresponding to a code value from another table without writing a JOIN.
  • Aggregation Comparison: Using an aggregated value, such as “average salary per department,” as a condition.
  • Rank Calculation: Counting and displaying the number of records that meet specific conditions.

2. Basic Syntax and Mechanism

The most common syntax is as follows. The rule is to enclose it in parentheses ( ).

SELECT
    ColumnA,
    (SELECT TargetColumn FROM OtherTable WHERE Condition) AS AliasName
FROM
    MainTable;

Operational Flow

  1. Read 1 row from the MainTable.
  2. Execute the subquery (inside the parentheses) using the value from that row.
  3. Display the result of the subquery (only one value) as a result column of the main query.
  4. Proceed to the next row and repeat.

3. Practice: Scalar Subquery Execution Examples

Let’s check with actual working SQL. Here, we will use the Employees table and Departments table.

Prerequisites: Table Creation and Data Preparation

First, create the environment.

-- Delete existing tables if they exist (ignore errors)
-- DROP TABLE employees PURGE;
-- DROP TABLE departments PURGE;

-- Departments Table
CREATE TABLE departments (
    dept_id   NUMBER PRIMARY KEY,
    dept_name VARCHAR2(50)
);

-- Employees Table
CREATE TABLE employees (
    emp_id    NUMBER PRIMARY KEY,
    emp_name  VARCHAR2(50),
    dept_id   NUMBER,
    salary    NUMBER
);

-- Data Registration
INSERT INTO departments VALUES (1, 'Sales');
INSERT INTO departments VALUES (2, 'Dev');

INSERT INTO employees VALUES (1, 'Tanaka Taro', 1, 300000);
INSERT INTO employees VALUES (2, 'Yamada Hanako', 1, 350000);
INSERT INTO employees VALUES (3, 'Sato Jiro', 2, 400000);
INSERT INTO employees VALUES (4, 'Suzuki Ichiro', 2, 450000);
COMMIT;
SQL> SELECT * FROM departments;

DEPT_ID DEPT_NAME
---------- --------------------
1 Sales
2 Dev

SQL> SELECT * FROM employees;

EMP_ID EMP_NAME DEPT_ID SALARY
---------- -------------------- ---------- ----------
1 Tanaka Taro 1 300000
2 Yamada Hanako 1 350000
3 Sato Jiro 2 400000
4 Suzuki Ichiro 2 450000

Case 1: Retrieving values from another table as a “column” (SELECT clause)

Display the “Department Name” in the employee list without using JOIN.

SELECT 
    e.emp_name,
    e.salary,
    -- This is the scalar subquery
    (SELECT d.dept_name 
     FROM   departments d 
     WHERE  d.dept_id = e.dept_id) AS dept_name
FROM 
    employees e;

Execution Result:

SQL> SELECT
2 e.emp_name,
3 e.salary,
4 (SELECT d.dept_name
5 FROM departments d
6 WHERE d.dept_id = e.dept_id) AS dept_name
7 FROM
8 employees e;

EMP_NAME SALARY DEPT_NAME
-------------------- ---------- --------------------
Tanaka Taro 300000 Sales
Yamada Hanako 350000 Sales
Sato Jiro 400000 Dev
Suzuki Ichiro 450000 Dev

Explanation:

We use e.dept_id (value from the main query) inside the subquery (this is called “correlation”). If there is no corresponding ID in the departments table, the result becomes NULL. This behavior is similar to a LEFT OUTER JOIN.

Case 2: Filtering by comparing with aggregated values (WHERE clause)

Find employees who receive a salary higher than the “average salary of their department”.

SELECT 
    e1.emp_name,
    e1.salary
FROM 
    employees e1
WHERE 
    e1.salary > (
        -- Calculate and return the average salary per department
        SELECT AVG(e2.salary) 
        FROM   employees e2 
        WHERE  e2.dept_id = e1.dept_id
    );

Execution Result:

SQL> SELECT
2 e1.emp_name,
3 e1.salary
4 FROM
5 employees e1
6 WHERE
7 e1.salary > (
8 SELECT AVG(e2.salary)
9 FROM employees e2
10 WHERE e2.dept_id = e1.dept_id
11 );

EMP_NAME SALARY
-------------------- ----------
Yamada Hanako 350000
Suzuki Ichiro 450000

Explanation:

For each employee row, the “average of the department to which that employee belongs” is calculated, and a comparison is performed.

(Sales Dept Average: 325,000 / Dev Dept Average: 425,000)


4. Troubleshooting (Common Errors)

Here are frequent errors encountered when using scalar subqueries and how to deal with them.

Error CodeError MessageCause and Solution
ORA-01427single-row subquery returns more than one rowCause: The result of the subquery hits 2 or more records.
Solution: Review the join conditions, or narrow it down to 1 record using ROWNUM = 1 or MAX(), etc.
ORA-00936missing expressionCause: No column is specified in the SELECT clause inside the parentheses, or a comma error, etc.
Solution: Check the syntax.
(Incorrect Result)Unintentionally returning NULLCause: Data matching the condition of the subquery does not exist on the sub-side (0 rows).
Solution: Set a default value for NULL using NVL( (Subquery), 0 ).

5. Operations, Monitoring, and Performance Considerations

Scalar subqueries are convenient, but in professional environments, careful attention must be paid to “performance.”

Merits

  • SQL can be read intuitively (Join descriptions do not become complex).
  • Individual aggregated values can be attached without performing GROUP BY in the main query.

Demerits & Pitfalls

  • Behavior similar to N+1 problem: Since the subquery may be executed for the number of rows, if the main table has millions of rows, it will be severely delayed.
  • Context Switch: Context switching occurs within the SQL engine, which can become overhead.

Optimization Hints (Oracle Features)

Oracle Database has a “Scalar Subquery Caching feature.”

If the input value (value of the correlation condition) is the same, the past execution result is reused from the hash table in memory.

  • Effective: When the types of Department IDs are few (e.g., 10 types) but there are many employees (e.g., 10,000 people). The cache hit rate increases, making it fast.
  • Ineffective: When the value of the correlation condition is almost unique (all records are different). The cache does not work, and it runs for every record, becoming extremely slow. In this case, consider rewriting to a LEFT JOIN.

6. FAQ: Frequently Asked Questions

Q1. If the result of the subquery is 0 rows, does it cause an error?

A. No, it does not cause an error.

If the result of a scalar subquery is 0 rows, Oracle returns NULL. It is recommended to convert NULL to 0 or ‘-‘ using the NVL function, etc.

Q2. Which is better, scalar subquery or normal join (JOIN)?

A. It depends on the number of columns to retrieve and the data volume.

  • If you want to reference only 1 column and the variety of data (cardinality) is low: Scalar subqueries may be simple and fast.
  • If you want to reference multiple columns or the data volume is huge: A normal LEFT JOIN is more efficient.

Q3. Can it be used outside of the SELECT clause?

A. Yes.

It can be used in the SET clause (UPDATE statement), VALUES clause (INSERT statement), ORDER BY clause, etc. especially, ORDER BY (Subquery) is useful when you want to sort by a “Display Order Column” from a master table.


7. Summary

Scalar subqueries are an important technique that expands the expressiveness of SQL.

  • The absolute condition is to return 1 row and 1 column (returns ORA-01427 if multiple rows are returned).
  • Can be used for dynamic value retrieval and condition setting in SELECT and WHERE clauses.
  • It is fast when the cache function works effectively, but consider JOIN if there are many unique values.
  • Implement with care noting that it becomes NULL if there is no result.

Please try utilizing it in appropriate situations by first checking the behavior with a small dataset and looking at the execution plan.

This article explains concepts targeting Oracle Database 19c (screens and default values may differ in other versions).

[reference]
Oracle Database SQL Language Reference, 19c

コメント

Copied title and URL