UNION
and UNION ALL
are very similar - both combines result-set of SELECT
queries, but the final sets are different. Let me allow to show you differences.
The background
In the beginning, I want to show how my tables in DB look. For my tests, I have 2 tables: posts
and pages
. Both contains 2 rows:
Pages:
Posts:
The result-sets
I want to count how many contents (posts and pages) was created by every user (created_by_id
). Firstly, I'm trying using UNION. It's my query:
SELECT
created_by_id,
SUM(count) as total
FROM
(
SELECT
created_by_id,
count(*) as count
FROM pages
GROUP BY created_by_id
UNION
SELECT
created_by_id,
count(*) as count
FROM posts
GROUP BY created_by_id
) as counts
GROUP BY created_by_id
There's the result-set:
The result-set is wrong! It should contain 2 for every user.
Why the result-set is wrong?
SELECT created_by_id, count(*) as count FROM posts GROUP BY created_by_id
and
SELECT created_by_id, count(*) as count FROM pages GROUP BY created_by_id
return the same result-sets. Both tables contain 1 row for every user. UNION
combines 2 result-sets, but it only returns unique rows. In my case, both tables return the same set, so UNION
skips one of them.
How to solve it?
If you want to return all data (including duplicated results), you need to use UNION ALL
. Have a look on the query:
SELECT
created_by_id,
SUM(count) as total
FROM
(
SELECT
created_by_id,
count(*) as count
FROM pages
GROUP BY created_by_id
UNION ALL
SELECT
created_by_id,
count(*) as count
FROM posts
GROUP BY created_by_id
) as counts
GROUP BY created_by_id
I just replaced UNION
to UNION ALL
. Now the result-set looks correctly.
What is faster?
I inserted 10 000 rows into both tables. On my local machine, UNION ALL
is executed by 10-15 ms, UNION
is executed in the same time.
Summary
You have to be very careful while using UNION
and UNION ALL
. They return different data, so you have to know what exactly you expect. Debugging these issues is difficult.