Tune PostgreSQL Background Writer with pg_stat_bgwriter
Tune PostgreSQL Background Writer with pg_stat_bgwriter
Where pg_stat_statements does query-level aggregation and pgBadger does after-the-fact log analysis, pg_stat_bgwriter operates at the buffer-cache layer — answering whether the bgwriter and checkpointer are keeping shared_buffers clean enough that foreground queries never have to pay for a write inline. On a healthy OLTP cluster the answer is almost always yes; on a struggling one the counters say which knob to turn first.
Purpose and Overview
When a PostgreSQL backend needs a clean buffer in shared_buffers and none is available, something has to flush a dirty page back to disk before the backend can continue. Three subsystems share that workload. The checkpointer flushes every dirty buffer at checkpoint time, spreading the writes over checkpoint_completion_target × checkpoint_timeout to avoid I/O spikes. The background writer ("bgwriter") runs continuously, sweeping a small window of the LRU list each bgwriter_delay and writing out dirty buffers to keep a runway of clean buffers available. And when neither of those has kept up, the user-facing backend writes the buffer itself — inline, on the critical path of the query.
The third case is the expensive one. A high share of buffers_backend relative to buffers_checkpoint and buffers_clean means foreground queries are doing the bgwriter's job, paying latency for I/O that the background processes should have absorbed.
pg_stat_bgwriter exposes the counters that distinguish the three writers, plus checkpoint timing and request counts — the data you need to size bgwriter_lru_maxpages, bgwriter_delay, checkpoint_timeout, and max_wal_size against your actual workload rather than against generic defaults.
Sample Code
1-- PostgreSQL 9.2 through 16
2SELECT
3 checkpoints_timed,
4 checkpoints_req,
5 checkpoint_write_time,
6 checkpoint_sync_time,
7 buffers_checkpoint,
8 buffers_clean,
9 buffers_backend,
10 buffers_backend_fsync,
11 buffers_alloc,
12 maxwritten_clean,
13 ROUND(
14 100.0 * buffers_backend /
15 GREATEST(
16 buffers_checkpoint + buffers_clean + buffers_backend,
17 1
18 ),
19 2
20 ) AS pct_backend_flushed,
21 stats_reset
22FROM pg_stat_bgwriter;
Notes: Available on PostgreSQL versions prior to 17. PostgreSQL 17 split this view in two: pg_stat_bgwriter now holds only the bgwriter counters, and the checkpointer counters moved to pg_stat_checkpointer. On 17+ the query above will fail on the checkpoint columns; see the Version Compatibility section for the split variant.
Code Breakdown
The query is a single-row SELECT against cumulative counters. The interesting work is in interpreting the ratios, not the columns themselves.
The Three Buffer Writers
buffers_checkpoint, buffers_clean, and buffers_backend are the count of dirty buffers each subsystem has flushed since the last counter reset. Together they decompose total write activity into "scheduled" (checkpointer), "background" (bgwriter), and "synchronous" (backend) shares. The ratio between them — not the absolute values — is the tuning signal.
Checkpoint Counters
checkpoints_timed counts checkpoints triggered by checkpoint_timeout — the healthy majority. checkpoints_req counts checkpoints triggered by hitting max_wal_size. A high checkpoints_req share means WAL is filling faster than checkpoints are spreading their I/O; the response is to raise max_wal_size. checkpoint_write_time and checkpoint_sync_time are total milliseconds spent in the write and sync phases respectively.
bgwriter Counters
buffers_clean is the bgwriter's flush count. maxwritten_clean is the number of times the bgwriter stopped a single sweep early because it hit bgwriter_lru_maxpages — the per-round cap. A non-zero (and growing) maxwritten_clean indicates the bgwriter is at its per-round ceiling and would benefit from raising bgwriter_lru_maxpages.
Backend Counters
buffers_backend is the foreground-flushed count — the one to minimize. buffers_backend_fsync is the count of backends that had to fsync themselves because the fsync request queue was full; this should be zero in steady state. A non-zero value is a saturated-I/O signal.
The pct_backend_flushed Derived Column
pct_backend_flushed is buffers_backend / (buffers_checkpoint + buffers_clean + buffers_backend) expressed as a percentage, with a GREATEST(..., 1) guard against division by zero on a freshly-reset cluster. Values above 15-20% on a write-heavy OLTP cluster are the classic "tune bgwriter" trigger.
Key Buffer-Flushing Counters
buffers_checkpoint and Checkpoint I/O Spreading
The checkpointer flushes every dirty buffer at checkpoint time. checkpoint_completion_target (default 0.9) controls how much of checkpoint_timeout the writes are spread over — higher values smooth the I/O, lower values complete faster but cause spikier writes.
buffers_clean and the bgwriter Sweep
The bgwriter wakes every bgwriter_delay (default 200ms), looks at the LRU list, and writes up to bgwriter_lru_maxpages (default 100) dirty buffers per pass. Raising either knob increases the bgwriter's throughput; lowering them throttles it.
buffers_backend and Foreground I/O
Every flush counted here is one a query had to do itself. A small share is normal; a large share is the unmistakable signal that the bgwriter is undersized or the checkpointer is too aggressively spread.
buffers_backend_fsync and Saturated fsync Queues
This counter increments only when the fsync request queue to the checkpointer is full and a backend has to fsync its own write. Persistently non-zero values point at I/O subsystem saturation or undersized wal_buffers.
maxwritten_clean as a bgwriter Ceiling Signal
A maxwritten_clean count rising in step with buffers_clean says the bgwriter is consistently hitting its per-round cap. Raising bgwriter_lru_maxpages lets it write more per sweep and typically lowers pct_backend_flushed as a result.
Practical Applications
Tuning bgwriter_lru_maxpages and bgwriter_delay
If pct_backend_flushed exceeds 15% and maxwritten_clean is growing, the bgwriter is the bottleneck. Common production values: bgwriter_lru_maxpages = 500 (up from 100), bgwriter_delay = 50ms (down from 200ms). Apply one change at a time, reset the counters with SELECT pg_stat_reset_shared('bgwriter'), and re-measure over a representative window.
Diagnosing Checkpoint Storms from checkpoints_req
checkpoints_req substantially exceeding checkpoints_timed is the storm signal: WAL is filling faster than the checkpoint spread accommodates, every checkpoint compresses into a short I/O burst, and buffers_backend rises as foreground queries pick up the slack between bursts. Raising max_wal_size (and accepting longer crash recovery as the cost) restores spread-checkpoint behavior.
Sizing max_wal_size on Write-Heavy OLTP
Compute the cluster's WAL generation rate by sampling pg_current_wal_lsn() over a 5-minute window during peak load. max_wal_size should be at least 1× to 2× the peak rate times checkpoint_timeout — large enough that checkpoints_req stays near zero.
Resetting bgwriter Counters for A/B Tuning
pg_stat_reset_shared('bgwriter') (and 'checkpointer' on 17+) zeroes the counters without touching live activity. The A/B pattern: capture current ratios, change one setting, reset counters, wait one workload cycle, re-capture, compare. The counters in this view are the only inexpensive way to A/B-test bgwriter tuning short of pgbench.
1-- Average rate since last reset
2SELECT
3 ROUND(buffers_checkpoint /
4 GREATEST(EXTRACT(EPOCH FROM NOW() - stats_reset), 1),
5 2) AS checkpoint_buf_per_sec,
6 ROUND(buffers_clean /
7 GREATEST(EXTRACT(EPOCH FROM NOW() - stats_reset), 1),
8 2) AS bgwriter_buf_per_sec,
9 ROUND(buffers_backend /
10 GREATEST(EXTRACT(EPOCH FROM NOW() - stats_reset), 1),
11 2) AS backend_buf_per_sec
12FROM pg_stat_bgwriter;
Version Compatibility
pg_stat_bgwriter has existed since PostgreSQL 8.3, with columns added across versions. buffers_backend_fsync arrived in 9.1; checkpoint_write_time and checkpoint_sync_time in 9.2.
PostgreSQL 17 introduced the major split. The view now contains only bgwriter-specific counters (buffers_clean, maxwritten_clean, buffers_alloc, stats_reset). The checkpointer counters moved to a new view, pg_stat_checkpointer, with renamed columns: num_timed (was checkpoints_timed), num_requested (was checkpoints_req), write_time, sync_time, buffers_written (was buffers_checkpoint). Migrate any dashboard or alert reading pg_stat_bgwriter.checkpoints_* before upgrading to 17.
1-- PostgreSQL 17+ split variant
2SELECT
3 b.buffers_clean,
4 b.maxwritten_clean,
5 b.buffers_alloc,
6 c.num_timed AS checkpoints_timed,
7 c.num_requested AS checkpoints_req,
8 c.write_time AS checkpoint_write_time,
9 c.sync_time AS checkpoint_sync_time,
10 c.buffers_written AS buffers_checkpoint
11FROM pg_stat_bgwriter b
12CROSS JOIN pg_stat_checkpointer c;
pg_stat_reset_shared accepts 'bgwriter' on all supported versions and additionally 'checkpointer' on 17+ — reset them independently to A/B-test bgwriter tuning without losing checkpointer baselines.
Best Practices
- Watch
pct_backend_flushedas the primary tuning indicator — below 5% is excellent, 5-15% is fine, above 15% on write-heavy OLTP says tune the bgwriter - Raise
max_wal_sizewhencheckpoints_reqexceedscheckpoints_timed— accept longer crash recovery as the cost of spread checkpoint I/O - Change one setting at a time and reset counters between —
bgwriter_lru_maxpages,bgwriter_delay,max_wal_size,checkpoint_completion_targetall interact - Plan the PostgreSQL 17 dashboard migration before the upgrade —
pg_stat_bgwriter.checkpoints_*no longer exists on 17 - Treat
buffers_backend_fsync > 0as an I/O alarm — it means the fsync queue saturated, which is rarely the right operating mode - Sample twice and subtract for instantaneous rates — the raw counters are cumulative since reset; deltas are what dashboards should plot
References
- PostgreSQL: pg_stat_bgwriter — official reference for the bgwriter statistics view (pre-17 column set and 17+ slimmed set)
- PostgreSQL: pg_stat_checkpointer — new view in PostgreSQL 17 holding the checkpointer counters that moved out of
pg_stat_bgwriter - PostgreSQL: Background Writer Configuration — reference for
bgwriter_delay,bgwriter_lru_maxpages, and related settings - PostgreSQL: WAL Checkpoint Configuration — reference for
checkpoint_timeout,max_wal_size, andcheckpoint_completion_target - EDB blog: Basics of Tuning Checkpoints — Tomas Vondra's operational primer on
checkpoint_timeoutandmax_wal_sizeinteraction, directly relevant to thecheckpoints_reqvscheckpoints_timedratio
Posts in this series
- Identify Underutilized PostgreSQL Columns for Optimization
- Identifying Unused PostgreSQL Indexes to Declutter & Optimize
- Monitor PostgreSQL HOT Updates and Fillfactor
- Find PostgreSQL Index Bloat and Wasted Space
- Estimate PostgreSQL Table Bloat with SQL
- Find Missing Indexes in PostgreSQL with SQL
- Monitor PostgreSQL Table Cache Hit Ratio with SQL
- Tune PostgreSQL Background Writer with pg_stat_bgwriter