Query PostgreSQL I/O Statistics with pg_stat_io
Query PostgreSQL I/O Statistics with pg_stat_io
An I/O spike that nobody can attribute is a frustrating thing to chase. Was it autovacuum, a checkpoint, a bulk load, or ordinary client queries? Before PostgreSQL 16 the answer required stitching together several partial views; pg_stat_io gives a single cluster-wide breakdown of reads, writes, extends, and cache activity by backend type and I/O context.
Purpose and Overview
pg_stat_io is a cumulative statistics view that decomposes the server's block I/O along two axes at once. The first is backend_type — client backends, autovacuum workers, the background writer, the checkpointer, and so on. The second is context — normal activity, vacuum, bulkread, and bulkwrite. The cross-product of those two axes is what makes the view powerful: it does not just say "the server did 40 GB of reads," it says which subsystem did them and under what kind of operation.
That attribution is the difference between guessing and knowing. If read volume is concentrated in the vacuum context under the autovacuum backend type, the remedy points at vacuum tuning. If it is concentrated in bulkread for client backends, the remedy points at the queries doing large sequential scans. The same I/O total tells a completely different operational story depending on where the view places it.
Beyond raw reads and writes, the view exposes buffer-management metrics that were previously hard to obtain: hits (a desired block found already in shared buffers), evictions (a buffer reused to make room, displacing another page), and reuses (ring-buffer reuse during bulk operations). Together these illuminate shared-buffer pressure and whether a workload is churning the cache. When track_io_timing is enabled, the timing columns add how long each class of I/O actually took.
Sample Code
1SELECT
2 backend_type,
3 object,
4 context,
5 reads,
6 writes,
7 extends,
8 hits,
9 evictions
10FROM
11 pg_stat_io
12WHERE
13 reads > 0 OR writes > 0 OR extends > 0
14ORDER BY
15 reads DESC, writes DESC;
Notes: Requires PostgreSQL 16 or later, where pg_stat_io was introduced. The WHERE clause drops the many all-zero rows that exist for backend/context combinations that never do I/O, leaving only the active ones. Counters are cumulative since the last statistics reset; subtract two snapshots taken minutes apart to see a rate rather than lifetime totals.
Code Breakdown
The query is a filtered, ordered read of the view. Its value comes from the columns it surfaces and from suppressing the noise of inactive rows.
Filtering Out Idle Rows
1WHERE reads > 0 OR writes > 0 OR extends > 0
pg_stat_io materializes a row for every valid combination of backend type, object, and context, and many of those combinations are structurally impossible to use — they stay at zero forever. This predicate keeps only rows where real I/O happened, which turns a wall of dozens of rows into a short, readable list.
Ordering by Read Then Write Volume
1ORDER BY reads DESC, writes DESC
Most I/O investigations start with reads, so the heaviest readers float to the top, with write volume as the tiebreaker. Swap the order when chasing a write-heavy problem such as checkpoint or WAL-related pressure.
Snapshot Differencing
The counters are cumulative since stats_reset, so a single query shows lifetime totals — useful for proportions but not for "what is happening right now." Capturing the view into a temp table, waiting a defined interval, and subtracting converts the totals into a per-interval rate, which is what you want when correlating with a live incident.
Key pg_stat_io Dimensions
backend_type
This column attributes I/O to the process class that performed it: client backend, autovacuum worker, background writer, checkpointer, walwriter, and others. It answers "who" — and on a busy server the answer is frequently a background process rather than the client queries operators instinctively blame first.
context
The context refines the "how": normal for routine buffered access, vacuum for the ring-buffer strategy vacuum uses, bulkread for large sequential scans, and bulkwrite for operations like COPY and CREATE TABLE AS. The same backend type behaves very differently across contexts, and separating them prevents a bulk load from being mistaken for a chronic problem.
reads, writes, and extends
reads counts blocks pulled from storage into buffers; writes counts dirty blocks pushed back out; extends counts operations that grow a relation by adding new blocks. A high extends rate signals heavy insert or load activity, distinct from the read/write churn of ordinary access.
hits, evictions, and reuses
hits records blocks found already cached — high hits relative to reads is a healthy cache. evictions records buffers reclaimed by displacing an existing page, a direct indicator of shared-buffer pressure when it climbs. reuses reflects buffers recycled within a bulk operation's dedicated ring, which is normal behavior for large scans rather than a warning sign.
Practical Applications
Attributing an I/O Spike
When storage metrics jump, query the view (ideally as a differenced snapshot) and read down the backend_type/context breakdown. The rows carrying the spike name the responsible subsystem directly, turning a vague alert into a specific target.
Tuning Autovacuum I/O
If reads and writes concentrate in the vacuum context, autovacuum is the I/O consumer. That points at cost-delay settings, worker counts, and scheduling rather than at query tuning, and the view confirms the diagnosis before any parameter changes.
Spotting Shared-Buffer Pressure
A rising evictions count alongside a falling hit ratio indicates the working set no longer fits comfortably in shared_buffers. The view provides the evidence needed to justify a memory increase or a workload change.
Distinguishing Bulk Work from Steady State
bulkread and bulkwrite contexts isolate one-off heavy operations from ongoing load. An I/O total that looks alarming may be a scheduled bulk job in its own context — clearly separated here from the normal activity that represents the system's baseline.
Version Compatibility
pg_stat_io is new in PostgreSQL 16; it does not exist on 15 or earlier. On older releases the closest substitutes are the per-table pg_statio_user_tables and pg_statio_user_indexes views and the cluster-level counters in pg_stat_bgwriter, none of which provide the unified backend-type-by-context matrix that makes pg_stat_io so direct.
Within 16 and onward the view has been refined — additional columns such as byte-denominated read/write/extend totals appear in later releases — but the core dimensions of backend type, object, context, and the read/write/extend/hit/eviction counters are consistent. Timing columns populate only when track_io_timing is enabled, which carries a small overhead, so confirm that setting before relying on the time-based columns. Counters reset whenever statistics are reset, recorded in the view's stats_reset timestamp.
Best Practices
- Difference two snapshots for rates — lifetime totals show proportion, not what is happening now; subtract snapshots to get per-interval I/O.
- Read both axes together — backend type alone or context alone can mislead; the combination is what attributes I/O correctly.
- Enable track_io_timing deliberately — it unlocks the timing columns but adds overhead; turn it on when you need latency data, not by default everywhere.
- Watch evictions for memory pressure — a sustained rise is a strong signal to revisit
shared_bufferssizing. - Treat bulk contexts separately — exclude
bulkread/bulkwritewhen characterizing steady-state load so scheduled jobs do not skew the baseline.
References
- PostgreSQL Documentation — Monitoring Statistics — full column reference for
pg_stat_ioand the cumulative statistics system. - PostgreSQL Documentation — Cumulative Statistics Configuration —
track_io_timingand the settings that govern what the view records. - pganalyze Blog — Postgres 16: Cumulative I/O statistics with pg_stat_io — introduction to the view and what each metric reveals.
- pganalyze Blog — How Postgres DBAs can use pg_stat_io — practical DBA-oriented walkthrough of interpreting the view.
Posts in this series
- Identify Underutilized PostgreSQL Columns for Optimization
- Identifying Unused PostgreSQL Indexes to Declutter & Optimize
- Monitor PostgreSQL HOT Updates and Fillfactor
- Find PostgreSQL Index Bloat and Wasted Space
- Estimate PostgreSQL Table Bloat with SQL
- Find Missing Indexes in PostgreSQL with SQL
- Monitor PostgreSQL Table Cache Hit Ratio with SQL
- Tune PostgreSQL Background Writer with pg_stat_bgwriter
- Query PostgreSQL I/O Statistics with pg_stat_io