
Postgres Query Regression Detection

A query that was fast last week is slow today. The schema changed, the data distribution shifted, or an index change nudged the planner onto a different plan. This guide shows how to detect those regressions systematically using pg_stat_statements and delta analysis.

Why regressions are hard to catch

Most teams notice query regressions via user complaints or an on-call page — long after the damage is done. The problem is that a slow query often looks fine in development (small dataset, warm cache) and only degrades under production load or after a data volume threshold is crossed.

The standard tool for tracking query performance over time is pg_stat_statements. It accumulates per-fingerprint statistics since the last pg_stat_statements_reset() call. The challenge: cumulative numbers are hard to compare across deployments or schema changes. Delta analysis fixes that.

pg_stat_statements basics

Enable it in postgresql.conf (changing shared_preload_libraries requires a server restart):

shared_preload_libraries = 'pg_stat_statements'
pg_stat_statements.track = all

Then create the extension:

CREATE EXTENSION IF NOT EXISTS pg_stat_statements;

The key columns for regression detection:

  • queryid — stable fingerprint (same query with different literals = same queryid)
  • calls — total invocation count
  • mean_exec_time — mean execution time in ms
  • total_exec_time — total wall time, most useful for finding high-impact queries
  • rows — total rows returned

The baseline + delta pattern

The core idea: take a snapshot of pg_stat_statements before a deployment, take another after, and compute the delta for each queryid. A query that was averaging 2ms and is now averaging 20ms is a regression — even if its absolute time is still "low".

-- Snapshot schema (store in a separate DB or schema)
CREATE TABLE query_snapshots (
  snapshot_id   BIGSERIAL PRIMARY KEY,
  taken_at      TIMESTAMPTZ NOT NULL DEFAULT now(),
  label         TEXT,        -- e.g. 'pre-deploy-v2.3.1'
  queryid       BIGINT,
  query         TEXT,
  calls         BIGINT,
  mean_exec_ms  DOUBLE PRECISION,
  total_exec_ms DOUBLE PRECISION
);

-- Take a snapshot
INSERT INTO query_snapshots (label, queryid, query, calls, mean_exec_ms, total_exec_ms)
SELECT
  'pre-deploy-v2.3.1',
  queryid,
  query,
  calls,
  mean_exec_time,
  total_exec_time
FROM pg_stat_statements
WHERE calls > 10;  -- ignore one-shot queries
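If you snapshot from application code rather than a scheduled SQL job, the same filter-and-label step is easy to express in Python. This is a minimal sketch, assuming the pg_stat_statements rows have already been fetched as dicts; `take_snapshot` is a hypothetical helper, not part of any library:

```python
def take_snapshot(stat_rows, label):
    """Mirror the INSERT above: keep only queries with more than 10 calls
    and attach a deploy label to each retained row.

    stat_rows: dicts shaped like pg_stat_statements rows (assumed shape).
    """
    return [
        {
            "label": label,
            "queryid": r["queryid"],
            "query": r["query"],
            "calls": r["calls"],
            # Rename to match the query_snapshots column names.
            "mean_exec_ms": r["mean_exec_time"],
            "total_exec_ms": r["total_exec_time"],
        }
        for r in stat_rows
        if r["calls"] > 10  # ignore one-shot queries, as in the SQL version
    ]
```

The resulting dicts can then be inserted into query_snapshots with any Postgres driver.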

Computing the delta

-- Compare two snapshots
SELECT
  a.queryid,
  a.query,
  a.mean_exec_ms                              AS before_ms,
  b.mean_exec_ms                              AS after_ms,
  ROUND(((b.mean_exec_ms / NULLIF(a.mean_exec_ms, 0) - 1) * 100)::numeric, 1)
                                              AS pct_change,
  b.calls - a.calls                           AS new_calls
FROM query_snapshots a
JOIN query_snapshots b USING (queryid)
WHERE a.label = 'pre-deploy-v2.3.1'
  AND b.label = 'post-deploy-v2.3.1'
  AND b.mean_exec_ms > a.mean_exec_ms * 1.5   -- 50% regression threshold
ORDER BY (b.mean_exec_ms - a.mean_exec_ms) * (b.calls - a.calls) DESC;

The ORDER BY clause ranks by total impact: a query that got 10× slower but runs once a minute ranks below one that got 2× slower but runs 10,000 times a minute.
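The same impact score is easy to verify outside SQL. A minimal sketch, assuming snapshot rows as dicts with `mean_exec_ms` and `calls` keys (`impact_score` is a hypothetical helper mirroring the ORDER BY expression):

```python
def impact_score(before, after):
    """Total-impact ranking: per-call slowdown times call volume.

    Matches (b.mean_exec_ms - a.mean_exec_ms) * (b.calls - a.calls)
    from the comparison query above.
    """
    per_call_delta_ms = after["mean_exec_ms"] - before["mean_exec_ms"]
    new_calls = after["calls"] - before["calls"]
    return per_call_delta_ms * new_calls

# A query that got 10x slower but ran 60 times...
rare = impact_score({"mean_exec_ms": 10.0, "calls": 0},
                    {"mean_exec_ms": 100.0, "calls": 60})
# ...ranks far below one that got 2x slower but ran 600,000 times.
hot = impact_score({"mean_exec_ms": 5.0, "calls": 0},
                   {"mean_exec_ms": 10.0, "calls": 600_000})
```

Sorting candidates by this score descending surfaces the regressions that cost the most total wall time.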

Setting the regression threshold

A fixed percentage threshold (e.g. 50% slower) fires false positives on sub-millisecond queries and misses already-slow queries whose absolute degradation is large but whose relative change stays under the bar. A better approach combines several filters:

  • Relative threshold: flag if mean_after / mean_before > 1.5 AND absolute delta > 5ms
  • Minimum call count: require at least 100 new calls in the post-deploy window to filter noise
  • Significance filter: ignore queries under 1ms in both snapshots — regressions there rarely matter
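The three filters above compose naturally into a single predicate. A sketch with the thresholds from the bullets as defaults; `is_regression` is a hypothetical helper and the exact cutoffs should be tuned per workload:

```python
def is_regression(before_ms, after_ms, new_calls,
                  rel_threshold=1.5,   # flag at 50% slower...
                  abs_delta_ms=5.0,    # ...and at least 5 ms absolute delta
                  min_calls=100,       # require enough post-deploy calls
                  floor_ms=1.0):       # ignore sub-millisecond queries
    """Apply the relative, absolute, call-count, and significance filters."""
    # Significance filter: sub-millisecond in both snapshots rarely matters.
    if before_ms < floor_ms and after_ms < floor_ms:
        return False
    # Too few new calls to distinguish a regression from noise.
    if new_calls < min_calls:
        return False
    # Require BOTH a relative and an absolute degradation.
    return (after_ms > before_ms * rel_threshold
            and after_ms - before_ms > abs_delta_ms)
```

Requiring both conditions is what keeps a 0.2 ms to 0.6 ms jump (200% relative, negligible absolute) out of the report.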

Handling pg_stat_statements_reset()

If someone (or a scheduled maintenance script) calls pg_stat_statements_reset(), your baselines are invalidated. Always store snapshots to a separate table before resetting. Some teams reset on every deploy to get clean per-version stats; others never reset and rely on cumulative deltas.

Integrating into CI/CD

A practical integration:

  1. Before deploy: pg_stat_statements snapshot labeled pre-{sha}
  2. Run the deploy + migration
  3. Run your integration/load test suite against production-like data
  4. After tests: snapshot labeled post-{sha}
  5. Compute delta; fail CI if any query crosses the regression threshold
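Step 5 can be a small gate script that exits non-zero when the delta query returns regressions. A minimal sketch, assuming the comparison query's rows have been fetched as dicts (`ci_gate` and the 1.5x / 5 ms thresholds are illustrative, not a fixed API):

```python
def ci_gate(deltas):
    """Return 1 (fail the pipeline) if any query crossed the threshold.

    deltas: dicts with 'queryid', 'before_ms', 'after_ms' keys, as
    produced by the snapshot comparison query (assumed shape).
    """
    regressed = [
        d for d in deltas
        if d["after_ms"] > d["before_ms"] * 1.5        # relative threshold
        and d["after_ms"] - d["before_ms"] > 5.0        # absolute threshold
    ]
    for d in regressed:
        print(f"REGRESSION queryid={d['queryid']}: "
              f"{d['before_ms']:.1f}ms -> {d['after_ms']:.1f}ms")
    return 1 if regressed else 0
```

In a CI job, wire the result to the process exit code (e.g. `sys.exit(ci_gate(rows))`) so the deploy pipeline fails automatically.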

This catches regressions introduced by schema changes — new indexes that change plan shape, dropped indexes that force seq scans, statistics changes from large inserts.

What TuneStack automates

TuneStack's slow log collector streams pg_stat_statements snapshots continuously and computes rolling deltas. When a query regresses by more than the configured threshold, it surfaces in the dashboard with before/after plans side by side — no manual SQL required.


Related: Reading EXPLAIN Plans
