Differences between UNION and UNION ALL in SQL

UNION and UNION ALL are very similar - both combines result-set of SELECT queries, but the final sets are different. Let me allow to show you differences.

The background

In the beginning, I want to show how my tables in DB look. For my tests, I have 2 tables: posts and pages. Both contains 2 rows:

Pages:

pages-table

Posts:

posts-table

The result-sets

I want to count how many contents (posts and pages) was created by every user (created_by_id). Firstly, I'm trying using UNION. It's my query:

SELECT
    created_by_id,
    SUM(count) as total
FROM
    (
    SELECT
        created_by_id,
        count(*) as count
    FROM pages
    GROUP BY created_by_id
    UNION
    SELECT
        created_by_id,
        count(*) as count
    FROM posts
    GROUP BY created_by_id
    ) as counts
GROUP BY created_by_id

There's the result-set:

result-set-for-union

The result-set is wrong! It should contain 2 for every user.

Why the result-set is wrong?

SELECT created_by_id, count(*) as count FROM posts GROUP BY created_by_id

and

SELECT created_by_id, count(*) as count FROM pages GROUP BY created_by_id

return the same result-sets. Both tables contain 1 row for every user. UNION combines 2 result-sets, but it only returns unique rows. In my case, both tables return the same set, so UNION skips one of them.

How to solve it?

If you want to return all data (including duplicated results), you need to use UNION ALL. Have a look on the query:

SELECT
    created_by_id,
    SUM(count) as total
FROM
    (
    SELECT
        created_by_id,
        count(*) as count
    FROM pages
    GROUP BY created_by_id
    UNION ALL
    SELECT
        created_by_id,
        count(*) as count
    FROM posts
    GROUP BY created_by_id
    ) as counts
GROUP BY created_by_id

I just replaced UNION to UNION ALL. Now the result-set looks correctly.

result-set-for-union-all

What is faster?

I inserted 10 000 rows into both tables. On my local machine, UNION ALL is executed by 10-15 ms, UNION is executed in the same time.

Summary

You have to be very careful while using UNION and UNION ALL. They return different data, so you have to know what exactly you expect. Debugging these issues is difficult.

Comments

Blog Comments powered by Disqus.

Older Post Newer Post