EXISTS vs IN vs JOIN in SQL: Which Should You Use?
Three ways to ask "does a related row exist?" — and they behave differently with NULLs and on large tables. Learn when EXISTS wins, when IN is fine, and when a JOIN is clearer.
A huge share of real queries boil down to one question: does a related row exist? "Customers who have placed an order", "products never sold", "accounts with a transaction this month". SQL gives you three tools for this — IN, EXISTS, and JOIN — and choosing well affects both correctness and speed. Keep the SQL cheat sheet handy as you read.
The Same Question, Three Ways
Take "users who have placed at least one order". You can express it by checking a list of IDs (IN), by checking for the existence of a matching row (EXISTS), or by joining and de-duplicating (JOIN). All three can return the same answer; they differ in how they handle NULLs and how the planner executes them.
IN with a Subquery
IN compares a column against the set of values returned by a subquery. It reads naturally and is perfectly fine when the subquery returns a modest list of non-NULL values.
-- Users who have placed an order
SELECT id, name
FROM users
WHERE id IN (SELECT user_id FROM orders);
EXISTS (Correlated)
EXISTS takes a correlated subquery — one that references the outer row — and returns true as soon as a single matching row is found. It does not build or compare a full list; it just asks "is there at least one?" and stops at the first hit.
-- Same result, expressed with EXISTS
SELECT id, name
FROM users u
WHERE EXISTS (
SELECT 1 FROM orders o WHERE o.user_id = u.id
);
The NOT IN NULL Trap
This is the single most important reason to prefer NOT EXISTS over NOT IN. If the subquery behind NOT IN returns even one NULL, the whole condition becomes unknown for every row, and the query returns no rows at all — silently.
-- DANGER: if any user_id in orders is NULL, this returns nothing
SELECT id FROM users
WHERE id NOT IN (SELECT user_id FROM orders);
-- Safe: NOT EXISTS is unaffected by NULLs
SELECT id FROM users u
WHERE NOT EXISTS (
SELECT 1 FROM orders o WHERE o.user_id = u.id
);
JOIN as an Alternative
You can also answer the question with a JOIN. An INNER JOIN keeps only matching rows (then you de-duplicate), and a LEFT JOIN with IS NULL finds the non-matches. JOINs shine when you also need columns from the other table.
-- Users with orders, via JOIN
SELECT DISTINCT u.id, u.name
FROM users u
JOIN orders o ON o.user_id = u.id;
-- Users with NO orders (anti-join)
SELECT u.id, u.name
FROM users u
LEFT JOIN orders o ON o.user_id = u.id
WHERE o.id IS NULL;
Which Should You Use?
Modern PostgreSQL often optimizes IN, EXISTS, and semi-joins into the same plan, so for existence checks they perform similarly. The practical guidance: use EXISTS / NOT EXISTS for pure existence tests (especially the "no match" case, to dodge the NULL trap); use IN for a short, known, non-NULL list; and use a JOIN when you actually need columns from the related table.
Common Mistakes & Tips
- Using NOT IN against a subquery that can contain NULL — it silently returns nothing; use NOT EXISTS
- Forgetting to correlate the EXISTS subquery to the outer row (the o.user_id = u.id part)
- Using a JOIN for an existence check and then needing DISTINCT to undo the row multiplication
- Assuming IN is always slower than EXISTS — the planner usually treats them alike
- Writing SELECT * inside EXISTS — harmless, but SELECT 1 signals intent
Existence checks are interview favourites and everyday tools. Practice them on real tables to internalise the NOT IN / NOT EXISTS difference.