Database Replication: Strategies for High Availability
Understand database replication patterns — primary-replica, multi-primary, synchronous vs asynchronous, failover strategies, and choosing the right approach.
Strategic Systems Architect & Enterprise Software Developer
A single database server is a single point of failure. When it goes down — and it will, whether from hardware failure, network partition, or a bad migration — your entire application goes down with it. Database replication eliminates this single point of failure by maintaining copies of your data on multiple servers, so that if one fails, another can take over.
But replication is not just about availability. It also enables read scaling (spreading query load across replicas), geographic distribution (placing data closer to users), and disaster recovery (having a copy in a different data center). Each use case favors different replication configurations, and the trade-offs between them are fundamental to database architecture.
Primary-Replica Replication
The most common pattern. One server (the primary) handles all writes. Changes propagate to one or more replicas, which handle read queries. If the primary fails, a replica is promoted to take its place.
        ┌──────────────┐
        │   Primary    │ ◄── All writes go here
        │  (writable)  │
        └──────┬───────┘
               │ replication stream
       ┌───────┴────────┐
       │                │
 ┌─────▼─────┐    ┌─────▼─────┐
 │ Replica 1 │    │ Replica 2 │
 │  (read)   │    │  (read)   │
 └───────────┘    └───────────┘
PostgreSQL implements this with streaming replication. The primary sends its write-ahead log (WAL) to replicas, which replay it to stay in sync. The configuration is straightforward:
-- On the primary
ALTER SYSTEM SET wal_level = 'replica';
ALTER SYSTEM SET max_wal_senders = 3;
-- On the replica (PostgreSQL 12+): create an empty standby.signal file,
-- then set primary_conninfo in postgresql.conf
-- (in PG 11 and earlier, both settings lived in recovery.conf)
primary_conninfo = 'host=primary-host port=5432 user=replicator'
The application needs to route writes to the primary and reads to replicas. Connection poolers like PgBouncer can handle this routing, or you can manage it in the application layer with separate connection strings for read and write operations.
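The routing itself can live in a thin helper in the application layer. A minimal sketch (the class name and DSNs are illustrative, not a specific library's API; a real setup would hand these DSNs to a connection pool):

```python
# Sketch of application-level read/write routing (hypothetical DSNs).
import itertools

class ConnectionRouter:
    """Route writes to the primary DSN and reads round-robin across replicas."""

    def __init__(self, primary_dsn, replica_dsns):
        self.primary_dsn = primary_dsn
        # Fall back to the primary for reads if no replicas are configured.
        self._replicas = itertools.cycle(replica_dsns or [primary_dsn])

    def dsn_for(self, operation):
        if operation == "write":
            return self.primary_dsn
        return next(self._replicas)

router = ConnectionRouter(
    "host=primary-host dbname=app",
    ["host=replica-1 dbname=app", "host=replica-2 dbname=app"],
)
print(router.dsn_for("write"))  # always the primary
print(router.dsn_for("read"))   # rotates through replicas
```

The same split works with two PgBouncer pools (one pointed at the primary, one at the replicas) instead of application code.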
The catch is replication lag. In asynchronous replication, there is a delay between a write on the primary and its appearance on the replica. A user who creates a record and immediately reads it might not see it if the read hits a replica that has not yet received the write. This "read-your-writes" consistency problem is the most common issue teams encounter with primary-replica setups.
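One common mitigation is session stickiness: after a user writes, pin that user's reads to the primary for longer than the worst expected replication lag. A minimal sketch (the class name and the 5-second window are illustrative assumptions):

```python
import time

class ReadYourWritesRouter:
    """Pin a session's reads to the primary briefly after it writes."""

    def __init__(self, pin_seconds=5.0, clock=time.monotonic):
        self.pin_seconds = pin_seconds  # should exceed worst observed replica lag
        self.clock = clock
        self._last_write = {}           # session_id -> timestamp of last write

    def record_write(self, session_id):
        self._last_write[session_id] = self.clock()

    def target_for_read(self, session_id):
        last = self._last_write.get(session_id)
        if last is not None and self.clock() - last < self.pin_seconds:
            return "primary"            # replica may not have the write yet
        return "replica"
```

The trade-off is that a write-heavy session sends all of its reads to the primary, so the window should be tuned against measured lag, not guessed.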
Synchronous vs Asynchronous Replication
Asynchronous replication — the primary confirms a write to the client as soon as the data is written locally, without waiting for replicas. This is fast but means data can be lost if the primary fails before replication completes. The window of potential data loss is typically under a second, but it exists.
Synchronous replication — the primary waits for at least one replica to confirm receipt before acknowledging the write to the client. No data loss, but every write pays the latency cost of the network round trip to the replica.
-- PostgreSQL synchronous replication
ALTER SYSTEM SET synchronous_commit = 'on';
ALTER SYSTEM SET synchronous_standby_names = 'replica1';
The latency cost of synchronous replication depends on network distance. Within the same data center (sub-millisecond network latency), the overhead is small. Across regions (50-100ms network latency), it makes every write 50-100ms slower, which is often unacceptable.
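The impact on a single connection is easy to estimate: a session issuing sequential transactions can commit at most once per round trip. A back-of-the-envelope sketch:

```python
def max_commits_per_second(local_commit_ms, sync_rtt_ms):
    """Upper bound on sequential commits for one session under sync replication."""
    return 1000.0 / (local_commit_ms + sync_rtt_ms)

# Same data center: a ~1 ms round trip barely matters.
print(round(max_commits_per_second(1.0, 1.0)))   # 500 commits/s
# Cross-region: a 75 ms round trip caps one session near 13 commits/s.
print(round(max_commits_per_second(1.0, 75.0)))  # 13 commits/s
```

Aggregate throughput can still be higher with many concurrent sessions, but per-session write latency is paid on every commit.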
A pragmatic compromise is synchronous_commit = 'on' with one synchronous replica in the same data center and additional asynchronous replicas in other regions. Local writes are confirmed once the nearby replica has flushed the data, providing durability without cross-region latency, and the remote replicas catch up asynchronously. (The stricter 'remote_apply' setting additionally waits for the replica to replay the change, which guarantees read-your-writes on that replica at a further latency cost.)
Whichever mode you choose, test the failure path before production: with a single synchronous standby configured, PostgreSQL blocks commits entirely if that standby becomes unreachable, so synchronous setups generally need at least two candidate standbys.
Failover and Promotion
When the primary fails, a replica needs to be promoted to primary. This can happen manually or automatically, and the choice has significant implications.
Manual failover — an operator decides which replica to promote and triggers the switch. This is safer because a human verifies the situation before making changes, but it depends on someone being available and responding quickly. Overnight failures might go unaddressed for hours.
Automatic failover — a monitoring system detects the primary failure and promotes a replica automatically. Tools like Patroni (PostgreSQL), Orchestrator (MySQL), or managed services handle this. Automatic failover is faster but introduces the risk of false positives — the system might promote a replica when the primary is merely experiencing a network hiccup, causing a split-brain situation where two servers both think they are the primary.
# Patroni configuration for automatic failover
bootstrap:
  dcs:
    ttl: 30
    loop_wait: 10
    retry_timeout: 10
    maximum_lag_on_failover: 1048576
    postgresql:
      parameters:
        wal_level: replica
        max_wal_senders: 5
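The guard rails behind settings like ttl and maximum_lag_on_failover boil down to two checks before promotion: the primary must have been unreachable for longer than a lease TTL (not just one missed heartbeat), and the candidate replica must not be too far behind. A simplified sketch of that decision (names and thresholds are illustrative, not Patroni's internals):

```python
def should_promote(seconds_since_heartbeat, ttl_seconds,
                   replica_lag_bytes, max_lag_bytes):
    """Promote only after the lease expires AND the candidate is nearly caught up."""
    primary_presumed_dead = seconds_since_heartbeat > ttl_seconds
    candidate_acceptable = replica_lag_bytes <= max_lag_bytes
    return primary_presumed_dead and candidate_acceptable

# A brief hiccup (heartbeat 5 s old, TTL 30 s): do not promote.
print(should_promote(5, 30, 0, 1048576))         # False
# Lease expired and lag under 1 MiB: promotion is allowed.
print(should_promote(45, 30, 200_000, 1048576))  # True
```

The lag check matters because promoting a badly lagged replica silently discards the transactions it never received.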
Split-brain prevention is essential. Fencing mechanisms ensure the old primary cannot accept writes after a new primary is promoted. Network fencing (blocking the old primary's connections), STONITH (Shoot The Other Node In The Head — powering off the old primary), or write-ahead log divergence detection all serve this purpose.
After failover, the application must connect to the new primary. DNS-based endpoints that update automatically, connection poolers that reroute, or application-level retry logic with service discovery all solve this. The worst outcome is an application that continues writing to the old (now stale) primary because its connection string is hardcoded.
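On the application side, the retry logic is a loop: on a write failure, re-resolve the primary endpoint and try again with backoff. A sketch under assumed helper names (discover_primary would be a DNS lookup or service-registry call in practice):

```python
import time

def write_with_failover(execute_write, discover_primary,
                        retries=3, backoff_seconds=0.5):
    """Retry a write, re-discovering the primary endpoint between attempts."""
    last_error = None
    for attempt in range(retries):
        primary = discover_primary()  # DNS / service discovery, never hardcoded
        try:
            return execute_write(primary)
        except ConnectionError as exc:
            last_error = exc
            time.sleep(backoff_seconds * (2 ** attempt))  # exponential backoff
    raise last_error
```

The key point is that discovery happens inside the loop, so a failover that updates DNS or the registry is picked up on the next attempt.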
Choosing the Right Strategy
The right replication strategy depends on three factors: how much data loss is acceptable, how much latency overhead is acceptable, and how quickly failover must happen.
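Those factors can be framed as a small decision table; a toy sketch (thresholds and return strings are illustrative, not prescriptive):

```python
def pick_replication_strategy(rpo_seconds, write_latency_budget_ms, sync_rtt_ms):
    """Toy helper: RPO = tolerable data loss; budget = extra write latency allowed."""
    if rpo_seconds == 0:
        if sync_rtt_ms <= write_latency_budget_ms:
            return "synchronous"
        # Correctness requirements beat the latency budget, not the other way around.
        return "synchronous (accept latency over budget)"
    return "asynchronous with automatic failover"

print(pick_replication_strategy(0, 5, 1))   # e.g. financial or healthcare systems
print(pick_replication_strategy(1, 5, 1))   # most web applications
```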
For most web applications with a PostgreSQL backend, asynchronous primary-replica replication with automatic failover (via Patroni or a managed service like AWS RDS) is the right default. The sub-second potential data loss window is acceptable for nearly all business applications, and the zero write-latency overhead keeps the application fast.
For financial systems, healthcare records, or any application where losing a single committed transaction is unacceptable, synchronous replication to at least one replica is necessary. Accept the latency cost as a business requirement.
For global applications that need low-latency reads in multiple regions, geographic read replicas reduce latency for read-heavy workloads. CockroachDB and Spanner offer multi-region writes, but the complexity and cost are justified only for applications that genuinely need them.
The complexity of your replication setup should match the availability requirements of your application. Most applications are fine with managed database services that handle replication internally — the cloud cost implications of managed versus self-hosted replication usually favor the managed approach for teams without dedicated database administrators.