SQL UNION and UNION ALL: Combining Query Results in SQL
Introduction
The UNION
and UNION ALL
clauses in SQL are used to combine the results of two or more SELECT queries
into a single result set. These operators are valuable for aggregating data from multiple sources or queries,
but they handle duplicates differently, making each suitable for specific scenarios.
1. What are UNION and UNION ALL?
- UNION combines the results of two or more queries into a single result set, removing duplicate rows. It performs a distinct operation, ensuring that each row in the result set is unique.
- UNION ALL combines the results of two or more queries into a single result set, including all duplicates. It does not perform a distinct operation, so all rows from the queries are included.
Syntax:
1
2
3
4
5
6
7
8
9
10
11
12
13
-- Using UNION
SELECT column1, column2, ...
FROM table1
UNION
SELECT column1, column2, ...
FROM table2;
-- Using UNION ALL
SELECT column1, column2, ...
FROM table1
UNION ALL
SELECT column1, column2, ...
FROM table2;
SELECT column1, column2, ...
: Specifies the columns to retrieve.FROM table1
andFROM table2
: Indicate the tables from which to retrieve data.UNION
orUNION ALL
: Combines the results of the queries.
2. Differences Between UNION and UNION ALL
2.1. Handling Duplicates
- UNION removes duplicate rows from the combined result set. This involves sorting and deduplication, which can impact performance.
- UNION ALL includes all rows from the combined result set, including duplicates. This is more efficient as it does not involve sorting or deduplication.
2.2. Performance
- UNION generally slower because it requires additional processing to eliminate duplicates.
- UNION ALL faster because it does not perform duplicate removal.
3. Using UNION and UNION ALL
3.1. Combining Results with UNION
Combine the results of two queries to retrieve a unique list of values.
Example:
Retrieve a unique list of customer names from two different tables, current_customers
and former_customers
:
1
2
3
4
5
SELECT customer_name
FROM current_customers
UNION
SELECT customer_name
FROM former_customers;
In this example:
- The query returns a distinct list of customer names from both tables.
3.2. Combining Results with UNION ALL
Combine the results of two queries and include all records, even duplicates.
Example: Retrieve all customer names from both tables, including duplicates:
1
2
3
4
5
SELECT customer_name
FROM current_customers
UNION ALL
SELECT customer_name
FROM former_customers;
In this example:
- The query returns all customer names from both tables, including those that appear in both.
4. Requirements for Using UNION and UNION ALL
- Column Matching
The columns in each SELECT query must have the same number of columns and compatible data types. The column names in the result set are derived from the first SELECT query. - Order of Columns
The order and data types of columns must match in all SELECT statements being combined.
Example:
Retrieve product information from two different tables with matching columns:
1
2
3
4
5
SELECT product_id, product_name, price
FROM electronics
UNION
SELECT product_id, product_name, price
FROM clothing;
In this example:
- Both queries must select
product_id
,product_name
, andprice
with the same data types.
5. Practical Use Cases
5.1. Merging Data from Multiple Tables
Combine data from similar tables to get a unified view.
Example: Combine current and archived orders:
1
2
3
4
5
SELECT order_id, order_date, amount
FROM current_orders
UNION
SELECT order_id, order_date, amount
FROM archived_orders;
5.2. Aggregating Results from Multiple Queries
Aggregate results for reporting purposes.
Example: Get a list of all unique products sold this year and last year:
1
2
3
4
5
SELECT product_name
FROM sales_2023
UNION
SELECT product_name
FROM sales_2024;
5.3. Handling Large Data Sets
Retrieve data for large datasets where duplicates are not a concern.
Example: Combine large result sets from different sources:
1
2
3
4
5
SELECT employee_id, department
FROM department_A
UNION ALL
SELECT employee_id, department
FROM department_B;
Conclusion
The UNION and UNION ALL clauses are powerful tools for combining results from multiple SQL queries. Understanding their differences and appropriate use cases helps in crafting efficient queries for aggregating and analyzing data. While UNION provides a distinct result set by eliminating duplicates, UNION ALL offers faster performance by including all rows. By leveraging these clauses effectively, you can manage and analyze data more comprehensively and efficiently.