Database Performance Tuning Techniques
Why fast queries matter when user expectations and data volumes keep rising

Most developers meet their first real performance problem not in a benchmark but in a production incident. One morning, an API that used to respond in 150ms starts timing out. The dashboard shows CPU is fine, memory is stable, but database latency spikes and the app server queues requests. It is rarely a single cause; it is a combination of missing indexes, inefficient queries, and a design that does not match the workload. I have lived through this pattern more than once, and the lesson is always the same: performance is a product of design and discipline, not luck.
In this post, I will walk through practical, battle-tested techniques for tuning database performance. I will avoid abstract theory and focus on patterns you can apply today, using PostgreSQL and MySQL as concrete examples. I will show code for realistic scenarios, explain tradeoffs, and share the kind of small mistakes that become big outages. If you are a developer building APIs, ETL jobs, or internal tools, this should help you diagnose issues confidently and fix them with intent.
Context: where database tuning fits in modern development
Databases are the backbone of most applications, and they remain the most common bottleneck. While frameworks and cloud services have made scaling compute easier, data access patterns are harder to get right. Teams lean on managed databases (Amazon RDS, Google Cloud SQL, Azure Database), but managed does not mean magic. The database still executes your SQL, and poor SQL remains the top cause of performance problems.
Developers who work on backend services, data pipelines, and analytics frequently touch the database directly. The most effective practitioners understand both the application code and the database engine. They reason about queries like engineers reason about algorithms. Compared to alternatives like NoSQL document stores, relational databases are not slower by default; they are often more predictable once you teach them how to serve your workload. NoSQL can be faster for narrow, high-throughput patterns but can be trickier to maintain as queries diversify. For most business applications, a relational database with tuned indexes and clean queries offers a strong blend of performance, consistency, and tooling.
If you are optimizing a small service, you might not need advanced strategies. But as data grows, the cost of ignoring tuning multiplies. A single missing index can turn a fast endpoint into a latency sink. A query that joins without care can saturate I/O and stall unrelated traffic. The techniques below address these realities.
Core concepts: what performance means at the database layer
Performance at the database layer usually breaks down into several measurable components:
- Latency: how long a query takes to return.
- Throughput: how many queries the database can serve per second.
- Resource utilization: CPU, memory, I/O, and network.
- Concurrency: how many clients can run queries simultaneously without thrashing.
In practice, you care about tail latency more than averages. A p50 of 10ms means little if the p99 is 100ms, because user experience is driven by the outliers. Databases are sensitive to contention: locks, hot pages, and plan instability can cause latency spikes. Tuning is about eliminating contention and making the engine do less work per query.
A useful mental model is the query lifecycle:
- The application sends a SQL statement.
- The database parses and plans the execution.
- It reads data from memory or disk, applies filters, and performs joins.
- It writes changes if needed and commits, releasing locks.
Each step can be slow. Parsing is rarely the culprit. Planning can be slow if statistics are stale or queries are complex. Reading data is usually the heavy part. Writing can be slow if checkpoints stall or replication lags. Your job is to minimize work in the planning and reading steps and avoid contention at commit.
Practical techniques with real-world examples
Indexing: the first lever that usually works
Indexes speed up reads but add overhead to writes and storage. The goal is to index for your most common query patterns.
PostgreSQL example: a typical orders table with a foreign key to customers and a timestamp. Queries often filter by customer and date.
-- Table definition (PostgreSQL)
CREATE TABLE orders (
id BIGSERIAL PRIMARY KEY,
customer_id BIGINT NOT NULL REFERENCES customers(id),
status VARCHAR(32) NOT NULL,
created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
amount NUMERIC(12,2) NOT NULL
);
-- Indexes tuned for common access patterns
CREATE INDEX idx_orders_customer_created
ON orders (customer_id, created_at DESC);
-- Partial index for active orders only
CREATE INDEX idx_orders_active
ON orders (customer_id)
WHERE status IN ('pending', 'processing');
Why this helps:
- The composite index supports WHERE customer_id = ? ORDER BY created_at DESC, which is common in user dashboards.
- The partial index keeps active orders fast without bloating storage for archived rows.
In MySQL, the approach is similar, though MySQL lacks partial indexes (a secondary index always covers every row), and be mindful of how InnoDB handles secondary indexes:
-- Table definition (MySQL InnoDB)
CREATE TABLE orders (
id BIGINT AUTO_INCREMENT PRIMARY KEY,
customer_id BIGINT NOT NULL,
status VARCHAR(32) NOT NULL,
created_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
amount DECIMAL(12,2) NOT NULL,
INDEX idx_customer_created (customer_id, created_at),
INDEX idx_active_customer (customer_id)
) ENGINE=InnoDB;
-- MySQL can use index hints if the optimizer misbehaves
SELECT * FROM orders USE INDEX (idx_customer_created)
WHERE customer_id = 123 ORDER BY created_at DESC;
Tradeoffs:
- Indexes speed reads but slow writes. In write-heavy tables, too many indexes can hurt throughput.
- Composite index order matters: put equality filters first, then range or sort columns.
- Partial indexes are great for skewed workloads (e.g., 90% of queries target active rows).
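The effect of adding the right composite index is easy to see in a query plan. As a quick, self-contained illustration, the sketch below uses SQLite as a stand-in engine (the table and index names mirror the PostgreSQL example; the scan-versus-search distinction carries over conceptually):

```python
import sqlite3

# In-memory SQLite stands in for a full engine; the planner behavior
# illustrated here (full scan vs index search) carries over conceptually.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL,
        status TEXT NOT NULL,
        created_at TEXT NOT NULL
    )
    """
)

query = "SELECT * FROM orders WHERE customer_id = 123 ORDER BY created_at DESC"

# Without an index, the plan falls back to a full table scan.
plan_before = [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + query)]

# Composite index: equality column first, then the sort column.
conn.execute(
    "CREATE INDEX idx_orders_customer_created ON orders (customer_id, created_at)"
)

# With the index, the plan becomes an index search that also satisfies ORDER BY.
plan_after = [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + query)]
```

Printing plan_before and plan_after shows the detail column switch from a SCAN to a SEARCH ... USING INDEX, which is the same shift you would look for in a PostgreSQL or MySQL EXPLAIN.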
Query rewriting: doing less work is faster
Let’s start with a slow query I have seen in many codebases: fetching pages of records with OFFSET. It works until it doesn’t.
-- Anti-pattern: deep pagination with OFFSET
SELECT *
FROM orders
WHERE status = 'pending'
ORDER BY created_at
LIMIT 50 OFFSET 50000; -- Slow; reads and discards 50,000 rows
A better pattern is keyset pagination:
-- Keyset pagination: stable and fast
SELECT *
FROM orders
WHERE status = 'pending'
AND created_at > '2025-01-15 10:00:00+00' -- last seen value
ORDER BY created_at
LIMIT 50;
This avoids scanning rows you do not need. It pairs well with a composite index on (status, created_at), which this query shape calls for; if created_at is not unique, add id as a tie-breaker to both the key and the ORDER BY. In a backend service, you can expose this via a cursor.
Example in a Python endpoint using FastAPI and SQLAlchemy (the get_session dependency is assumed to come from your app's db module):
from datetime import datetime, timezone

from fastapi import Depends, FastAPI, HTTPException, Query
from sqlalchemy import select
from sqlalchemy.ext.asyncio import AsyncSession

from .db import get_session  # assumed: your app's session factory dependency
from .models import Order

app = FastAPI()

@app.get("/orders/pending")
async def list_pending_orders(
    session: AsyncSession = Depends(get_session),
    cursor: str | None = Query(None),
    limit: int = Query(50, ge=1, le=100),
):
    # Decode the cursor as a created_at timestamp in ISO format
    after = None
    if cursor:
        try:
            after = datetime.fromisoformat(cursor).replace(tzinfo=timezone.utc)
        except ValueError:
            raise HTTPException(status_code=400, detail="Invalid cursor")
    stmt = (
        select(Order)
        .where(Order.status == "pending")
        .order_by(Order.created_at)
        .limit(limit)
    )
    if after is not None:
        # Keyset condition: only rows after the last page's final timestamp
        stmt = stmt.where(Order.created_at > after)
    result = await session.execute(stmt)
    orders = result.scalars().all()
    # Build the next cursor from the last row
    next_cursor = orders[-1].created_at.isoformat() if orders else None
    return {
        "data": [o.to_dict() for o in orders],
        "next_cursor": next_cursor,
    }
Notes:
- This pattern yields stable performance, even for deep pages.
- It avoids the scan cost that grows linearly with page depth under OFFSET.
- For admin dashboards with complex filters, consider materialized views or search indexes.
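A common refinement over passing raw ISO timestamps is to hand clients an opaque, URL-safe cursor. A minimal sketch (encode_cursor and decode_cursor are illustrative helper names, not a library API):

```python
import base64
from datetime import datetime, timezone

def encode_cursor(created_at: datetime) -> str:
    """Wrap the keyset position in an opaque, URL-safe token."""
    raw = created_at.isoformat().encode("utf-8")
    return base64.urlsafe_b64encode(raw).decode("ascii")

def decode_cursor(cursor: str) -> datetime:
    """Reverse of encode_cursor; raises ValueError on a malformed token."""
    try:
        raw = base64.urlsafe_b64decode(cursor.encode("ascii"))
        return datetime.fromisoformat(raw.decode("utf-8"))
    except Exception as exc:
        raise ValueError("invalid cursor") from exc

ts = datetime(2025, 1, 15, 10, 0, tzinfo=timezone.utc)
token = encode_cursor(ts)
```

Opaque cursors keep clients from depending on the cursor's internal format, which lets you change the keyset columns later without breaking callers.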
Join strategies: match the shape of the data
Joins are powerful but expensive. Align your schema and query to the workload.
Scenario: a user dashboard needs a list of orders with customer name and total line items.
-- PostgreSQL: 1:N join then aggregation in two steps
SELECT o.id, o.created_at, o.amount, c.name
FROM orders o
JOIN customers c ON c.id = o.customer_id
WHERE o.customer_id = 123
ORDER BY o.created_at DESC
LIMIT 20;
-- Then fetch line items in a second query (avoid large cross-join expansion)
SELECT order_id, line_item_id, quantity, price
FROM order_items
WHERE order_id IN (:order_ids);
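In application code, the two result sets are stitched back together in a single pass. A sketch, assuming the second query also returns order_id so children can be matched to their parents (the sample rows are illustrative):

```python
from collections import defaultdict

# Rows as a driver might return them; the second query must also select
# order_id so each child row can be matched back to its parent.
orders = [
    {"id": 1, "customer": "Acme", "amount": 120.0},
    {"id": 2, "customer": "Acme", "amount": 80.0},
]
items = [
    (1, 901, 2, 30.0),   # (order_id, line_item_id, quantity, price)
    (1, 902, 1, 60.0),
    (2, 903, 4, 20.0),
]

# Single pass: bucket children by parent key instead of joining in SQL.
items_by_order = defaultdict(list)
for order_id, line_item_id, quantity, price in items:
    items_by_order[order_id].append(
        {"line_item_id": line_item_id, "quantity": quantity, "price": price}
    )

# Attach children; each parent row appears exactly once in the payload,
# avoiding the row duplication a one-to-many join would produce.
for order in orders:
    order["items"] = items_by_order.get(order["id"], [])
```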
If the join is one-to-many and you fetch all children in one query, the result set can explode due to row duplication. That increases I/O and network cost. Sometimes a lateral join or a CTE is better:
-- PostgreSQL: LATERAL to get top N items per order
SELECT
o.id,
o.created_at,
o.amount,
li.line_item_id,
li.quantity,
li.price
FROM orders o
JOIN customers c ON c.id = o.customer_id
LEFT JOIN LATERAL (
SELECT line_item_id, quantity, price
FROM order_items oi
WHERE oi.order_id = o.id
ORDER BY oi.line_item_id
LIMIT 3
) li ON true
WHERE o.customer_id = 123
ORDER BY o.created_at DESC
LIMIT 20;
The lateral join avoids pulling all items for all orders at once. It is a good pattern for paged reports where you preview details.
EXPLAIN plans: your performance microscope
Never guess. Read the plan.
PostgreSQL:
EXPLAIN (ANALYZE, BUFFERS, VERBOSE)
SELECT *
FROM orders
WHERE customer_id = 123
AND created_at > '2025-01-01'
ORDER BY created_at DESC
LIMIT 50;
Look for:
- Seq Scan on large tables: likely missing index or predicate not sargable.
- High costs or actual time: focus on the top expensive nodes.
- Buffers: high shared hit vs read ratio shows cache efficiency.
- Sort nodes: if sorting is heavy, improve the index.
MySQL:
EXPLAIN FORMAT=JSON
SELECT * FROM orders
WHERE customer_id = 123
AND created_at > '2025-01-01'
ORDER BY created_at DESC LIMIT 50;
Pay attention to:
- type: aim for ref, eq_ref, or range; avoid ALL (full scan).
- key: which index is used.
- rows: estimate of scanned rows.
- Extra: filesort indicates sorting outside index.
If the plan is unstable due to data skew or parameter sniffing, consider hints or restructure the query. In PostgreSQL, you can influence planner behavior with configuration or cost constants, but query design is usually the better path.
Partitioning: scale by splitting
When tables grow large, partitioning reduces scan sizes and improves maintenance. PostgreSQL native partitioning is robust for time-based data.
-- PostgreSQL: create partitioned table by month
CREATE TABLE orders_partitioned (
id BIGSERIAL,
customer_id BIGINT NOT NULL,
status VARCHAR(32) NOT NULL,
created_at TIMESTAMPTZ NOT NULL,
amount NUMERIC(12,2) NOT NULL,
PRIMARY KEY (id, created_at)
) PARTITION BY RANGE (created_at);
-- Example partitions
CREATE TABLE orders_y2025m01 PARTITION OF orders_partitioned
FOR VALUES FROM ('2025-01-01') TO ('2025-02-01');
CREATE TABLE orders_y2025m02 PARTITION OF orders_partitioned
FOR VALUES FROM ('2025-02-01') TO ('2025-03-01');
-- Index each partition as needed
CREATE INDEX ON orders_y2025m01 (customer_id, created_at);
With partitions, a query like WHERE created_at BETWEEN '2025-01-01' AND '2025-01-31' hits only the relevant partition, reducing I/O. Partitioning also lets you drop old data quickly: detach the partition (ALTER TABLE ... DETACH PARTITION) and drop the child table, rather than running expensive DELETEs.
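Monthly partitions are repetitive to write by hand, so teams often script the DDL. A sketch that emits statements matching the naming convention above (month_partition_ddl is an illustrative helper, not a built-in):

```python
from datetime import date

def month_partition_ddl(parent: str, prefix: str, year: int, month: int) -> str:
    """DDL for one monthly range partition; bounds are [first, next-first)."""
    start = date(year, month, 1)
    # The first day of the following month is the exclusive upper bound.
    end = date(year + 1, 1, 1) if month == 12 else date(year, month + 1, 1)
    name = f"{prefix}_y{year}m{month:02d}"
    return (
        f"CREATE TABLE {name} PARTITION OF {parent}\n"
        f"    FOR VALUES FROM ('{start}') TO ('{end}');"
    )

# One statement per month for the first quarter of 2025:
ddl = [
    month_partition_ddl("orders_partitioned", "orders", 2025, m)
    for m in (1, 2, 3)
]
```

Generating partitions a few months ahead (via a scheduled job or migration) avoids the failure mode where inserts arrive for a month that has no partition yet.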
MySQL supports partitioning as well, though syntax and limitations differ. InnoDB partitioning can be effective for time-based ranges; verify engine compatibility and index behavior.
Concurrency and locks: reduce contention
Most production stalls stem from lock contention rather than CPU. Long-running transactions and bulk updates block others.
Common pitfalls:
- Updating rows in a loop without batching: each statement acquires row-level locks and can cause deadlocks.
- SELECT FOR UPDATE without a WHERE clause that touches many rows: locks too much.
- Missing indexes on foreign keys: can lock more than necessary during deletes or updates.
PostgreSQL example of safe batch updates:
-- Update status in batches to limit lock duration
WITH batch AS (
SELECT id
FROM orders
WHERE status = 'pending'
ORDER BY created_at
LIMIT 1000
FOR UPDATE SKIP LOCKED
)
UPDATE orders
SET status = 'processing'
FROM batch
WHERE orders.id = batch.id;
The SKIP LOCKED clause avoids blocking on rows already being processed by other workers. It is ideal for job queues. MySQL 8.0 and later offer the same behavior with SKIP LOCKED in SELECT ... FOR UPDATE on InnoDB tables.
Transaction hygiene:
- Keep transactions short: read, compute, write, commit.
- Avoid long-running reads inside transactions that also write.
- Use READ COMMITTED unless you need stricter guarantees.
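The batching idea is independent of the engine. A runnable sketch using SQLite from the standard library (SKIP LOCKED itself is PostgreSQL/MySQL; the point here is that each slice is its own short transaction that commits and releases locks before the next one starts):

```python
import sqlite3

# SQLite as a runnable stand-in: the pattern is "claim a slice, update,
# commit, repeat" so no single transaction holds locks for long.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT NOT NULL)")
conn.executemany("INSERT INTO orders (status) VALUES (?)", [("pending",)] * 2500)
conn.commit()

BATCH_SIZE = 1000
batches = 0
while True:
    cur = conn.execute(
        """
        UPDATE orders
        SET status = 'processing'
        WHERE id IN (
            SELECT id FROM orders WHERE status = 'pending'
            ORDER BY id LIMIT ?
        )
        """,
        (BATCH_SIZE,),
    )
    conn.commit()  # end the transaction, releasing locks before the next slice
    batches += 1
    if cur.rowcount < BATCH_SIZE:
        break  # final partial batch drained the queue
```

With 2,500 pending rows and a batch size of 1,000, this runs three short transactions instead of one long one.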
Caching and materialization: trade compute for latency
Caching at the database level reduces repeated work.
- Materialized views precompute expensive aggregations. Refresh them on a schedule or incrementally.
- Query result caching in the application can be effective for hot endpoints.
PostgreSQL materialized view example:
CREATE MATERIALIZED VIEW mv_daily_order_metrics AS
SELECT
date_trunc('day', created_at) AS day,
COUNT(*) AS orders,
SUM(amount) AS revenue
FROM orders
GROUP BY 1;
-- Refresh strategy
REFRESH MATERIALIZED VIEW CONCURRENTLY mv_daily_order_metrics;
Use CONCURRENTLY to avoid locking reads during refresh; it requires a unique index on the materialized view. Schedule refreshes during off-peak hours or after significant data changes.
In application code, caching with Redis or Memcached is common. A simple pattern:
import json
import redis.asyncio as redis
from fastapi import FastAPI
app = FastAPI()
redis_client = redis.Redis(host="localhost", port=6379, db=0)
@app.get("/orders/metrics/daily")
async def daily_metrics():
cache_key = "orders:metrics:daily"
cached = await redis_client.get(cache_key)
if cached:
return json.loads(cached)
# Compute metrics from DB (could use materialized view)
data = await fetch_daily_metrics_from_db()
await redis_client.setex(cache_key, 300, json.dumps(data)) # 5 minutes TTL
return data
The right TTL depends on your freshness requirements. For highly dynamic data, use event-driven invalidation instead of time-based TTL.
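The cache-aside flow above can be sketched without external services. The TTLCache below is a stand-in for Redis SETEX semantics, with an invalidate hook for the event-driven case:

```python
import time

class TTLCache:
    """Minimal cache-aside store; a stand-in for Redis SETEX semantics."""

    def __init__(self):
        self._store = {}  # key -> (expires_at, value)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if time.monotonic() >= expires_at:
            del self._store[key]  # lazy expiry once the TTL has elapsed
            return None
        return value

    def set(self, key, value, ttl_seconds):
        self._store[key] = (time.monotonic() + ttl_seconds, value)

    def invalidate(self, key):
        # Event-driven invalidation: call when the underlying data changes.
        self._store.pop(key, None)

cache = TTLCache()

def get_daily_metrics(compute):
    """Cache-aside read: serve from cache, else compute and store for 5 min."""
    cached = cache.get("orders:metrics:daily")
    if cached is not None:
        return cached
    data = compute()  # the expensive DB query or materialized-view read
    cache.set("orders:metrics:daily", data, ttl_seconds=300)
    return data
```

Calling invalidate from the write path (e.g., after an order insert) gives fresher data than waiting out the TTL, at the cost of coupling writers to cache keys.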
Connection pooling: keep the lights on
Applications that open a new connection per request can overwhelm the database with connection churn. Use a pooler like PgBouncer for PostgreSQL or a built-in pool in the driver.
Example setup for PgBouncer:
# pgbouncer.ini
[databases]
myapp = host=db.example.com port=5432 dbname=myapp
[pgbouncer]
listen_port = 6432
listen_addr = *
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction
max_client_conn = 2000
default_pool_size = 25
server_idle_timeout = 30
Key choices:
- pool_mode: transaction is usually best for OLTP apps.
- default_pool_size: set based on expected concurrency; start small and measure.
- max_client_conn: safe upper bound for your app servers.
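For an initial default_pool_size, one widely cited rule of thumb (popularized by the HikariCP documentation) is connections ≈ cores × 2 + effective spindles; treat it strictly as a starting point to measure against:

```python
def suggested_pool_size(cpu_cores: int, effective_spindles: int = 1) -> int:
    """Starting-point heuristic: cores * 2 + effective spindles.
    SSD-backed hosts are usually counted as one effective spindle."""
    return cpu_cores * 2 + effective_spindles

# An 8-core database host backed by SSD storage:
pool = suggested_pool_size(8)  # 17 connections as a first guess
```

Bigger pools are not faster: beyond the database's ability to run queries concurrently, extra connections just queue inside the engine instead of the pooler.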
Application code should target the pooler port (6432) and hold a connection only for the duration of a transaction. Configure driver pools accordingly (e.g., SQLAlchemy's create_async_engine with pool_size and max_overflow).
Honest evaluation: strengths, weaknesses, and tradeoffs
Relational databases are strong when:
- Your queries are diverse and require ad hoc filtering, sorting, and joins.
- You need ACID guarantees and mature tooling (backups, replication, observability).
- You can invest in index design and query tuning.
Where they may struggle:
- Extremely high write throughput with wide, sparse schemas. Partitioning and careful indexing help but may not match specialized NoSQL.
- Highly asymmetric workloads (e.g., write-heavy analytics). Columnar stores or data lakes may be better for analytics, while OLTP remains relational.
- Rapidly evolving schemas with frequent large backfills. Use migrations carefully, and consider blue-green schema changes.
Compared to NoSQL, relational databases often provide better developer ergonomics for complex queries and data integrity. NoSQL can be faster for narrow access patterns, but you may end up writing application-level join logic and dealing with eventual consistency. Pick based on workload, not fashion.
Personal experience: lessons learned the hard way
A few years ago, I inherited a reporting API that joined four tables and returned thousands of rows per request. It worked fine until the dataset grew. The p95 latency crept from 200ms to 2 seconds, then 8 seconds. My first instinct was to throw more cache at it. That helped briefly. The real fix was threefold:
- Replaced OFFSET pagination with keyset pagination.
- Added a composite index on the primary filter and sort columns.
- Split the query into two: a paged fetch of the main entities and a second query for related data, using a lateral join for previews.
The result: p99 latency dropped to 150ms, and CPU utilization fell by half. The second change was the most impactful: understanding the query plan showed we were sorting large in-memory sets because the index order did not match the ORDER BY. I had ignored the importance of index column order for months.
Another common mistake I have made is over-partitioning too early. Partitioning is powerful but adds complexity for backups, indexing, and query planning. Start with indexes and query rewriting; move to partitioning when table size or maintenance windows force it.
Getting started: workflow and mental models
Set up a reproducible environment to measure and iterate. Treat performance as a workflow, not a one-time fix.
project/
├── sql/
│ ├── migrations/
│ │ ├── 001_create_orders.sql
│ │ └── 002_add_indexes.sql
│ ├── explain/
│ │ └── orders_customer_date.sql
│ └── views/
│ └── mv_daily_order_metrics.sql
├── app/
│ ├── db.py # Engine, pooling, session factory
│ ├── models.py # SQLAlchemy or GORM models
│ └── routes.py # FastAPI or similar endpoints
├── tests/
│ └── test_performance.py
├── scripts/
│ └── bench.py # Basic load test
└── docker-compose.yml
Workflow:
- Create baseline metrics. Write a small bench script that exercises the endpoint or query for 60 seconds. Capture p50, p95, p99, and throughput.
- Inspect the EXPLAIN plan. Identify scans, sorts, and join explosion. Fix index order first.
- Rewrite the query to reduce work (keyset pagination, selective columns, avoid SELECT *).
- Apply pooling and connection limits. Measure again.
- If needed, introduce partitioning or materialized views. Do this when data size or refresh cadence justifies it.
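The baseline step can be a few dozen lines of Python. A sketch that hammers a callable for a fixed wall-clock window and reports tail percentiles (fake_query simulates the endpoint under test; swap in a real query call):

```python
import random
import time

def bench(run_query, seconds=1.0):
    """Run a callable for a fixed wall-clock window; report tail latency."""
    latencies_ms = []
    deadline = time.monotonic() + seconds
    while time.monotonic() < deadline:
        start = time.perf_counter()
        run_query()
        latencies_ms.append((time.perf_counter() - start) * 1000)
    latencies_ms.sort()
    n = len(latencies_ms)

    def pct(p):
        # Nearest-rank percentile over the sorted sample.
        return latencies_ms[min(n - 1, int(n * p))]

    return {
        "p50": pct(0.50),
        "p95": pct(0.95),
        "p99": pct(0.99),
        "throughput_qps": n / seconds,
    }

# Simulated workload standing in for the real endpoint or query call.
def fake_query():
    time.sleep(random.uniform(0.0005, 0.002))

stats = bench(fake_query, seconds=1.0)
```

Run it before and after each change; a fix that improves p50 but worsens p99 is usually not a fix.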
When reading plans, favor simple heuristics:
- Avoid sequential scans on large, hot tables.
- Prefer index scans that match WHERE and ORDER BY.
- Prefer nested loops for small result sets, hash joins for larger sets, but let the planner decide; your job is to provide the right indexes and statistics.
Tooling suggestions:
- Postgres: pg_stat_statements, auto_explain, pgbouncer for pooling.
- MySQL: Performance Schema, slow query log, ProxySQL for pooling.
- Observability: Prometheus and Grafana for metrics, structured logs with query context.
Free learning resources
- PostgreSQL EXPLAIN documentation: https://www.postgresql.org/docs/current/using-explain.html A concise guide to reading plans, including ANALYZE and BUFFERS.
- MySQL EXPLAIN documentation: https://dev.mysql.com/doc/refman/8.0/en/explain-output.html Explains plan fields and how to interpret them.
- pg_stat_statements: https://www.postgresql.org/docs/current/pgstatstatements.html Identifies the most expensive queries in production safely.
- Use The Index, Luke by Markus Winand: https://use-the-index-luke.com/ A practical, engine-agnostic guide to indexing with clear examples.
- Percona’s MySQL performance blog: https://www.percona.com/blog/ Real-world case studies and troubleshooting guides.
- PgBouncer documentation: https://www.pgbouncer.org/ Connection pooling patterns for PostgreSQL.
- Redis caching patterns: https://redis.io/docs/latest/develop/get-started/ Caching strategies and TTL design.
Summary: who should use these techniques and when
Developers building services that touch a relational database should learn performance tuning. It is not a DBA-only skill. If you write queries, you own performance. These techniques are most valuable when:
- Your application serves interactive users and needs consistent latency.
- Data volume is growing or already large.
- You are operating in production with measurable SLAs.
You might skip advanced tactics like partitioning if your tables are small and change rarely. Similarly, heavy caching may not be necessary if queries are already fast and infrequent. Start with measurement, then apply the simplest fix that solves the problem. Indexes and query rewriting are the highest-leverage moves for most teams.
In the end, database performance tuning is about respect for the workload. It asks you to reason about the shape of your data and the cost of your queries. When you treat the database as a collaborator rather than a black box, it responds with predictability and speed.




