Postgres WAL and Replication
The Write-Ahead Log (WAL) is Postgres's durability mechanism: every change is written to the WAL before the corresponding heap page is modified. WAL is also the foundation of replication — both physical streaming replication and logical replication derive their data from the same WAL stream. Understanding WAL explains replication lag, standby failover, and why certain operations are replication-incompatible.
How WAL provides durability
When a transaction commits, Postgres guarantees that WAL records for that transaction have been flushed to durable storage before returning success to the client. The heap pages themselves may still be dirty in shared_buffers — a background writer and checkpointer flush them asynchronously. On crash, Postgres replays WAL records from the last checkpoint forward, restoring all committed transactions.
WAL is written to segment files in pg_wal/, each 16 MB by default. The LSN (Log Sequence Number) is a 64-bit monotonically increasing byte offset into this stream. Every operation — inserts, updates, deletes, DDL — has an LSN.
wal_level: what gets written
The wal_level GUC controls how much information goes into WAL records:
minimal— minimum for crash recovery. Skips WAL for operations that can be trivially replayed (bulk COPY, CREATE TABLE AS SELECT). Cannot support replication.replica(default) — adds information required for physical streaming replication and base backups. Sufficient for read replicas and high availability.logical— adds the column-level before/after values needed to decode row changes. Required for logical replication, CDC pipelines, and tools like Debezium. Increases WAL volume by roughly 30–50% on write-heavy workloads.
Changing wal_level requires a server restart. Check current value:
SHOW wal_level;Physical streaming replication
Physical replication sends raw WAL byte-for-byte to standby servers. The standby continuously replays WAL records against an exact byte-copy of the primary's data directory. This produces an identical Postgres instance — same page layout, same file structure.
Setup requires a replication slot or manual primary_conninfo in recovery.conf / standby.signal (Postgres 12+). The key parameters on the primary:
# postgresql.conf
wal_level = replica
max_wal_senders = 10 # max concurrent replication connections
wal_keep_size = 1GB # retain this much WAL for slow standbys
hot_standby = on # allow reads on standby (set on standby)Physical standbys can serve read queries but cannot have different schemas, additional databases, or run DDL independently.
Logical replication
Logical replication decodes WAL into row-level changes (INSERT/UPDATE/DELETE) and streams them to subscribers. Unlike physical replication, subscribers can be different Postgres versions, have their own schemas, apply only a subset of tables, or even be non-Postgres systems via logical decoding plugins.
Create a publication on the primary:
-- publish specific tables
CREATE PUBLICATION my_pub FOR TABLE orders, customers;
-- or publish everything
CREATE PUBLICATION my_pub FOR ALL TABLES;Subscribe on the target:
CREATE SUBSCRIPTION my_sub
CONNECTION 'host=primary port=5432 dbname=mydb user=replicator password=...'
PUBLICATION my_pub;Logical replication requires wal_level = logical and that replicated tables have a replica identity. By default this is the primary key; tables without a primary key need ALTER TABLE t REPLICA IDENTITY FULL (writes the entire old row to WAL on updates).
Replication slots and WAL retention risk
A replication slot causes the primary to retain WAL until the subscriber has consumed it. If a subscriber goes offline, the primary accumulates WAL indefinitely. On a write-heavy database, this can fill the disk and crash the primary.
Monitor slot lag:
SELECT slot_name,
active,
pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), confirmed_flush_lsn)) AS lag_bytes
FROM pg_replication_slots;Set a hard limit on retained WAL:
-- drop a slot that's been inactive too long
SELECT pg_drop_replication_slot('my_slot');max_slot_wal_keep_size (Postgres 13+) caps how much WAL a slot can retain before Postgres invalidates it automatically:
# postgresql.conf
max_slot_wal_keep_size = 10GBMonitoring replica lag
On the primary, pg_stat_replication shows all connected standbys:
SELECT client_addr,
state,
sent_lsn,
write_lsn,
flush_lsn,
replay_lsn,
write_lag,
flush_lag,
replay_lag
FROM pg_stat_replication;The replay_lag column shows how far behind the standby is in wall-clock time. A standby with replay_lag > 30s on a production HA setup warrants investigation. Common causes:
- Network bandwidth between primary and standby is saturated.
- Standby is I/O-bound applying WAL — check
pg_stat_activityfor thestartupprocess. - Long-running queries on the standby (hot_standby_feedback enabled) conflict with WAL application.
recovery_min_apply_delayis set intentionally (delayed standby for accidental-delete protection).
Synchronous vs asynchronous replication
By default, replication is asynchronous: the primary commits as soon as WAL is flushed locally, without waiting for standbys. This is fastest but means standby lag is theoretically unbounded — a failover could lose recently-committed transactions.
Synchronous replication waits for at least one standby to acknowledge WAL receipt before returning to the client:
# postgresql.conf
synchronous_standby_names = 'ANY 1 (standby1, standby2)'
synchronous_commit = remote_write # or 'on' for full flushremote_write waits for the standby to write WAL to its OS buffer (not flush to disk). on waits for a full fsync on the standby. Synchronous replication adds commit latency equal to the network round-trip to the standby.
Quick checklist
- Set
wal_level = logicalonly if needed for CDC or logical replication — it adds WAL volume. - Monitor
pg_replication_slots— an inactive slot accumulating lag can fill disk. - Set
max_slot_wal_keep_sizeto cap WAL retention from dead slots. - Alert on
replay_lag > 30sfor HA standbys. - Use
synchronous_commit = remote_writefor durability without full-fsync latency. - Tables in logical replication must have a replica identity (primary key recommended).
Related: Postgres MVCC · Reading EXPLAIN Plans