Stage 9 - N+1 Problem: Detecting and Fixing Excess Queries
N+1 means one query for the main list plus one extra relation query for each item in that list. If one query loads 100 orders and Hibernate then runs 100 additional queries to load users, the application executed 1 + 100 queries. The code may look clean, but the database work is inefficient.
Why N+1 happens
N+1 often happens with lazy loading. Lazy loading is not wrong; it prevents unnecessary data from loading upfront. The problem appears when code iterates over many entities and touches a lazy relation for each entity.
List<Order> orders = orderRepository.findAll();
for (Order order : orders) {
System.out.println(order.getUser().getEmail());
}
The first line loads orders. The loop touches user for every order. If users were not already loaded, Hibernate sends more selects. With a small database this is easy to miss. With real data and network latency it becomes a performance issue.
How to see the problem
Enable SQL logging while learning:
spring.jpa.show-sql=true
logging.level.org.hibernate.SQL=DEBUG
Then execute the endpoint and count the queries. You are looking for a pattern: one select for the main list followed by many similar selects for the relation. Do not stop at "the endpoint works." Persistence performance must be observed through actual SQL.
Solutions
JOIN FETCH tells JPQL to load the relation in the same query for a specific use case:
@Query("SELECT o FROM Order o JOIN FETCH o.user")
List<Order> findAllWithUsers();
@EntityGraph lets you describe which relations should be fetched for a repository method without writing full JPQL:
@EntityGraph(attributePaths = "user")
List<Order> findAll();
DTO projection can be even better for read-only APIs because it loads exactly the fields needed by the response. Batch fetching reduces query count by loading related rows in groups. In Hibernate, hibernate.default_batch_fetch_size=20 can turn many single-row relation queries into fewer batched queries.
| What you see in SQL logs | What it means | Best first fix |
|---|---|---|
select * from orders followed by many select * from users where id = ? | The code loaded orders, then touched order.user inside a loop. | Add a use-case query with JOIN FETCH o.user or a DTO projection. |
| One endpoint needs users, another endpoint does not | The mapping should not become globally eager for one screen. | Keep relation lazy and create a fetch plan only for the endpoint that needs it. |
| A list response needs only order id, status, and user email | Full entities load more data than the response needs. | Use DTO projection and select exactly those fields. |
| Many lazy relations are accessed in small groups | There are still extra queries, but they can be grouped. | Configure batch fetching as a mitigation, not as the only design. |
Tradeoffs
Do not solve every N+1 by making mappings eager. That spreads the cost to every query. JOIN FETCH can duplicate rows when fetching collections and may require DISTINCT or pagination care. Entity graphs are clean but still need understanding of what is loaded. DTOs are fast for reads but are not managed entities. Batch fetching helps, but it does not replace use-case-specific queries.
Practice
Create orders linked to users. Load all orders and print user emails with SQL logging enabled. Count the queries and confirm the N+1 pattern. Then fix the repository with JOIN FETCH and count again. Repeat with @EntityGraph. Finally create a DTO query that returns order id, status, and user email, and compare the SQL and loaded data.
Understanding checklist
- I can explain why N+1 means one main query plus N relation queries.
- I know how lazy loading can produce N+1.
- I can identify N+1 from Hibernate SQL logs.
- I can fix N+1 with
JOIN FETCH,@EntityGraph, DTO projection, or batch fetching. - I understand why global EAGER loading is usually not the right fix.