Stage 10. EXPLAIN and Query Optimization
A query can feel fast during development and become slow in production. On 1,000 rows, even a weak query may finish instantly. On 1,000,000 rows, the same query can suddenly take seconds. Sometimes indexes exist, but the database still does not use the one you expected.
The main rule is simple: you do not know what the database really does until you inspect the execution plan. Query optimization should start with facts, not guesses.
What EXPLAIN Shows
EXPLAIN shows the plan the database intends to use for a query. In PostgreSQL, it can show scan type, join strategy, sorting, estimated row count, and estimated cost. Think of it as an X-ray for SQL: the query text is what you wrote, but the plan shows how the database will actually execute it.
There are two common forms:
EXPLAIN
SELECT * FROM orders WHERE customer_id = 42;
EXPLAIN estimates the plan without actually running the query.
EXPLAIN ANALYZE
SELECT * FROM orders WHERE customer_id = 42;
EXPLAIN ANALYZE runs the query and shows real execution time and actual row counts. Use it carefully on heavy UPDATE, DELETE, or INSERT statements, because the statement is actually executed unless you wrap it in a transaction and roll it back.
How to Read a Plan Without Overload
Beginners do not need to understand every node immediately. Start with four signals.
| Plan item | What it means | Good or bad |
|---|---|---|
| Seq Scan | Full table scan | Bad on large tables when a selective index should exist |
| Index Scan | Database uses an index to find rows | Usually good for selective filters |
| Rows | How many rows the database expects or actually reads | Important for understanding selectivity |
| Cost / Actual time | Estimated work vs real runtime | Shows where the real problem is |
The most useful comparison is often estimated rows versus actual rows. If PostgreSQL expected 100 rows but found 500,000, the optimizer may choose a poor plan because its statistics or assumptions were wrong.
Main Anti-Pattern: Seq Scan on a Large Table
Suppose users has 5 million rows, and the application searches by email:
EXPLAIN ANALYZE
SELECT id, email, name
FROM users
WHERE email = '[email protected]';
If the plan says Seq Scan on users, PostgreSQL is reading the table row by row until it finds matching data. On a large table, this is usually the “aha” moment: the query is slow not because SQL is mysterious, but because the database has no efficient path to the row.
The usual fix is a targeted index:
CREATE INDEX idx_users_email ON users(email);
Then run the plan again:
EXPLAIN ANALYZE
SELECT id, email, name
FROM users
WHERE email = '[email protected]';
The plan should move from Seq Scan toward Index Scan or another index-based access path. Example result:
| State | Plan | Time |
|---|---|---|
| Before index | Seq Scan | 2.3s |
| After index | Index Scan | 12ms |
The exact numbers depend on hardware, cache, data size, and selectivity, but the workflow is the same: change one thing, measure again.
When an Index Does Not Help
Indexes are not magic. A query can ignore an index or use it poorly.
A common reason is a function around the indexed column:
WHERE LOWER(email) = '[email protected]'
If the index is on email, the database may not be able to use it for LOWER(email) as a normal lookup. Possible fixes are storing normalized email values, using a functional index, or changing the query pattern:
CREATE INDEX idx_users_lower_email ON users (LOWER(email));
Another reason is the wrong column order in a composite index. An index on (status, created_at) is useful for filtering by status and sorting by created_at, but it may not help much for a query that filters only by created_at.
Selectivity also matters. If a condition matches most of the table, an index may be slower than a sequential scan. For example, status = 'ACTIVE' is not selective if 95% of users are active. The database may correctly choose a full scan because reading most of the table through an index would be more expensive.
Optimize by Understanding, Not Guessing
Use a repeatable loop:
- Find the slow query.
- Run
EXPLAIN ANALYZE. - Identify the expensive node or wrong estimate.
- Change one thing: index, query shape, filter, selected columns, or statistics.
- Run
EXPLAIN ANALYZEagain. - Compare before and after.
Do not change five things at once. If performance improves, you will not know which change helped. If performance gets worse, you will not know which change caused it.
Signals You Will See in Plans
Some plan nodes are not automatically bad, but they are important signals.
| Plan signal | What it may indicate |
|---|---|
| Nested Loop on large inputs | A join may be repeatedly scanning many rows |
| Hash Join | Database builds a hash table for joining; often good for larger equality joins |
| Merge Join | Inputs are sorted and joined in order; useful when sorted data or indexes fit |
| Sort | Database must order rows; can be expensive on large result sets |
| Sort spilling to disk | Work memory is not enough, so sorting uses disk and slows down |
| Bitmap Index Scan | Database combines index matches before reading table rows; common for medium-selective filters |
The goal is not to memorize every node. The goal is to ask better questions: why is this table scanned, why are so many rows read, why is there a big sort, and why are estimates far from reality?
Mini Checklist
Before changing SQL or adding an index, check:
- is there a
Seq Scanon a large table? - is the expected index actually used?
- are there functions around indexed columns in
WHERE? - is the condition selective enough?
- is the composite index column order correct?
- are you selecting only the columns the application needs?
- did you compare
EXPLAIN ANALYZEbefore and after?
Practice Before the Next Lesson
Re-run the examples from this article on a tiny dataset of 5-10 rows. This step matters because SQL becomes much easier once you execute queries yourself and inspect real tabular output. Start with a plain version of the query, then change one thing at a time and compare behavior or execution plan. If results look wrong, validate each part: filters, selected columns, join conditions, and expected NULL handling.
Quick self-check:
- I can explain the lesson in one simple sentence.
- I can write a basic SQL example from memory.
- I can read query output and explain each column.
- I can name one common mistake and how to avoid it.
If you work in a team, store before/after measurements near the query change in the pull request description. This creates shared optimization history and prevents repeated trial-and-error.
Extra Scenario: Post-Release Plan Check
After shipping a new filter or sorting mode, re-check 2-3 critical queries with EXPLAIN. Even a small predicate change can switch a plan from index scan to full scan. Quick post-release review catches regressions early.
The main conclusion: professionals do not optimize queries by eye. They read what the database is doing. EXPLAIN is the move from guesses to facts.