Engineering · 7 min read · March 3, 2026

Database Query Performance: Finding and Fixing the Slow Ones

Slow database queries are the most common cause of sluggish API responses. Here's a systematic approach to finding them, understanding why they're slow, and fixing them.

James Ross Jr.

Strategic Systems Architect & Enterprise Software Developer

Most API Performance Problems Are Database Problems

When an API endpoint is slow, profiling usually tells the same story: one or two database queries account for 80-95% of the response time. Everything else — network, serialization, business logic — is noise by comparison. This means that optimizing your database queries is typically the highest-leverage performance work available to you.

The challenge is that slow queries often hide. They don't throw exceptions. They don't fail visibly. They just make your users wait, and without active monitoring, you may not know which queries are slow or why.

This guide covers the systematic approach to finding slow queries and the common patterns that cause them.


Finding Slow Queries

Enable server-side query statistics. PostgreSQL's pg_stat_statements extension tracks per-query statistics, including total execution time, number of calls, and mean duration. Enable it in your database configuration and query the view regularly (for logging individual slow statements as they happen, the complementary setting is log_min_duration_statement):

SELECT query, calls, total_exec_time, mean_exec_time, stddev_exec_time
FROM pg_stat_statements
ORDER BY mean_exec_time DESC
LIMIT 20;

This shows you the 20 slowest queries by average execution time. Run this query after your application has been under normal load and you'll immediately see where the time is going.
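If the extension isn't already enabled, the setup is roughly the following. Paths, the database name, and restart mechanics vary by installation, so treat this as a sketch:

```shell
# pg_stat_statements must be preloaded at server start: add it to
# postgresql.conf, restart PostgreSQL, then create the extension
# in the target database ("mydb" here is a placeholder).
echo "shared_preload_libraries = 'pg_stat_statements'" >> postgresql.conf
# (restart PostgreSQL here)
psql -d mydb -c "CREATE EXTENSION IF NOT EXISTS pg_stat_statements;"
```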

Application-level query timing. In your ORM or database client, enable query logging with timing:

For Prisma:

const prisma = new PrismaClient({
  log: [{ emit: 'event', level: 'query' }],
})
prisma.$on('query', (e) => {
  if (e.duration > 100) { // log queries over 100ms
    console.warn(`Slow query (${e.duration}ms): ${e.query}`)
  }
})

For production, send these events to your observability platform (Datadog, Grafana, etc.) rather than logging to console.

N+1 query detection. The N+1 problem is one of the most common performance anti-patterns when using ORMs. It occurs when loading a list of N records triggers N additional queries to load related data — one for each record.

Example of N+1 with Prisma:

// This fires 1 query for users, then 1 query per user for their posts
const users = await prisma.user.findMany()
for (const user of users) {
  const posts = await prisma.post.findMany({ where: { authorId: user.id } })
}

// This batches the lookup: one query for users plus one for all
// their posts, regardless of N. This is the correct approach.
const users = await prisma.user.findMany({
  include: { posts: true }
})

Some ORM ecosystems offer query-inspection tooling that can flag N+1 patterns automatically. In tests, you can assert on query count to catch regressions.
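The test-time idea can be sketched without a real database. Here is a minimal, hand-rolled in-memory stand-in (all names are illustrative, not a real ORM API) that counts issued queries, so a test can assert the batched path stays constant:

```typescript
// In-memory stand-in for a database client that counts queries issued.
type User = { id: number };
type Post = { authorId: number; title: string };

const users: User[] = [{ id: 1 }, { id: 2 }];
const posts: Post[] = [
  { authorId: 1, title: "a" },
  { authorId: 2, title: "b" },
];

let queryCount = 0;

const fakeDb = {
  findUsers(): User[] {
    queryCount++;
    return users;
  },
  findPostsByAuthor(authorId: number): Post[] {
    queryCount++; // one query per call: this is the N+1 shape
    return posts.filter((p) => p.authorId === authorId);
  },
  findPostsByAuthors(authorIds: number[]): Post[] {
    queryCount++; // one batched query, e.g. WHERE author_id IN (...)
    return posts.filter((p) => authorIds.includes(p.authorId));
  },
};

// N+1 pattern: one query for the users, then one per user.
queryCount = 0;
for (const user of fakeDb.findUsers()) {
  fakeDb.findPostsByAuthor(user.id);
}
const n1Queries = queryCount; // 1 + N

// Batched pattern: two queries total, however many users exist.
queryCount = 0;
const allUsers = fakeDb.findUsers();
fakeDb.findPostsByAuthors(allUsers.map((u) => u.id));
const batchedQueries = queryCount;

console.log(n1Queries, batchedQueries);
```

A test asserting `batchedQueries` stays at 2 while the user count grows would catch an accidental reintroduction of the per-row loop.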


Understanding EXPLAIN

EXPLAIN ANALYZE is the most important diagnostic tool for slow queries. It shows the query execution plan — how PostgreSQL decided to execute your query — along with actual execution statistics.

EXPLAIN ANALYZE
SELECT u.*, p.*
FROM users u
JOIN posts p ON p.author_id = u.id
WHERE u.organization_id = '123'
AND p.published_at > NOW() - INTERVAL '30 days'
ORDER BY p.published_at DESC
LIMIT 20;

Key things to look for in the output:

Sequential scan (Seq Scan): The database is reading every row in the table. On a large table with a selective filter, this is the usual culprit. Fix: add an index on the filtered column. (On small tables, or when the query genuinely touches most rows, a Seq Scan is normal and fine.)

Index scan: The database is using an index to find rows directly. This is what you want.

Nested loop join: For small result sets, this is fine. For large tables, this can be slow if the inner loop executes many times. Consider the join order and indexing of join keys.

Estimated vs. actual row mismatch: If the planner estimates 10 rows but actually fetches 10,000, the statistics are stale. Run ANALYZE tablename to update statistics, or tune autovacuum to run more frequently.

Costs: The numbers in parentheses are planner cost estimates, expressed in arbitrary planner units — they are not milliseconds. With EXPLAIN ANALYZE, judge planning accuracy by comparing estimated rows to actual rows on each node; large mismatches usually point to stale statistics.
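For orientation, a sequential-scan node in the output looks roughly like this (the numbers are invented for illustration):

```
Seq Scan on users  (cost=0.00..4325.00 rows=10 width=72) (actual time=0.020..85.310 rows=9840 loops=1)
  Filter: (organization_id = '123')
  Rows Removed by Filter: 190160
```

Here `cost=startup..total` is in planner units, `actual time` is in milliseconds, and the gap between `rows=10` (estimated) and `rows=9840` (actual) is exactly the stale-statistics signal described above.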


The Indexes That Matter

An index is a data structure that allows the database to find rows quickly without scanning the entire table. Building the right indexes is the most impactful optimization for read-heavy queries.

Index the columns in your WHERE clauses:

-- Slow without index
WHERE organization_id = '123'

-- Add an index on the filtered column
CREATE INDEX idx_users_org_id ON users(organization_id);

Composite indexes for multi-column filters:

-- This query benefits from a composite index
WHERE organization_id = '123' AND status = 'active'

-- Column order matters: queries can use the index only through its
-- leading column(s), so a filter on status alone won't use this index
CREATE INDEX idx_users_org_status ON users(organization_id, status);

Index for sort operations:

-- Without an index, sorting requires a full scan + sort
ORDER BY created_at DESC

-- With an index on the sort key, rows come back already ordered
-- (PostgreSQL can scan a B-tree in either direction; explicit DESC
-- matters mainly for mixed-direction multi-column sorts)
CREATE INDEX idx_posts_created_at ON posts(created_at DESC);

Partial indexes for common filter patterns:

-- If you frequently query for active subscriptions
WHERE status = 'active'

-- A partial index is smaller and faster than a full index
CREATE INDEX idx_subscriptions_active ON subscriptions(user_id) WHERE status = 'active';

The index that doesn't help: An index on a very low-cardinality column (few distinct values), such as a boolean is_deleted flag, rarely helps. Because the index doesn't meaningfully narrow the search, the planner usually concludes a sequential scan is cheaper anyway.


Common Query Anti-Patterns

SELECT * when you need a few columns. Fetching all columns transfers more data than necessary and prevents covering index optimizations. Select only the columns you actually use.

Functions in WHERE clauses that prevent index use:

-- This can't use an index on created_at
WHERE EXTRACT(YEAR FROM created_at) = 2025

-- This can
WHERE created_at >= '2025-01-01' AND created_at < '2026-01-01'

LIKE with a leading wildcard:

-- Can't use a B-tree index
WHERE email LIKE '%gmail.com'

-- Can use an index
WHERE email LIKE 'james%'

For suffix or substring matching, use PostgreSQL's trigram indexes (pg_trgm) or its full-text search capabilities (tsvector/tsquery), or an external search index (Elasticsearch, Meilisearch).

DISTINCT to cover up a bad join. If you're adding DISTINCT because your query is returning duplicate rows, that's usually a symptom of an incorrect join that needs to be fixed rather than filtered.

Large OFFSET for pagination. OFFSET 10000 LIMIT 20 requires the database to scan and discard 10,000 rows before returning 20. Use cursor-based pagination (keyset pagination) instead:

-- Instead of OFFSET
WHERE id > :lastSeenId
ORDER BY id
LIMIT 20
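The keyset idea can be sketched in plain TypeScript against an in-memory array (names here are illustrative): instead of skipping a growing offset, each page resumes strictly after the last id the client saw.

```typescript
type Row = { id: number; title: string };

// 100 rows already sorted by id, as an index scan would return them.
const rows: Row[] = Array.from({ length: 100 }, (_, i) => ({
  id: i + 1,
  title: `post ${i + 1}`,
}));

// Keyset (cursor) pagination: resume after the last seen id.
// The cursor column must be unique (or made unique with a tiebreaker)
// so no row is skipped or repeated between pages.
function keysetPage(all: Row[], lastSeenId: number, limit: number): Row[] {
  return all.filter((r) => r.id > lastSeenId).slice(0, limit);
}

const page1 = keysetPage(rows, 0, 20);
const page2 = keysetPage(rows, page1[page1.length - 1].id, 20);
console.log(page1[0].id, page2[0].id);
```

In SQL, `WHERE id > :lastSeenId ORDER BY id LIMIT 20` lets the database seek directly into the index at the cursor position, so page 500 costs the same as page 1.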

Query Optimization Workflow

When a query is identified as slow, I work through this sequence:

  1. Run EXPLAIN ANALYZE on the exact query with representative parameters.
  2. Identify the most expensive node (largest actual time; sequential scans on large tables are the usual suspects).
  3. Check whether an index would help — look at the columns in the WHERE clause, JOIN conditions, and ORDER BY.
  4. Add the index and re-run EXPLAIN ANALYZE to verify the plan changed.
  5. Measure the actual improvement in production with query timing metrics.
  6. Check whether denormalization would help for frequently-accessed aggregates (cache a computed column rather than aggregating on every request).

Database query performance is one of the most tractable performance problems in web development — the diagnostic tools are good, the fixes are often straightforward, and the improvements are measurable and permanent. If you're working on an application with slow API response times and suspect the database is the bottleneck, book a call at calendly.com/jamesrossjr and let's find the slow queries together.

