Identify Running PostgreSQL Autovacuum Worker Sessions
Identify Running PostgreSQL Autovacuum Worker Sessions
Autovacuum lag is a slow accumulation. Every UPDATE rewrites the row and creates a new HOT chain, the launcher falls behind during write spikes, and over weeks a busy OLTP cluster ends up with three workers stuck on a handful of huge tables while smaller ones wait their turn. Knowing which tables each worker is processing — and how far along — is the first step in deciding whether to wait it out, raise autovacuum_max_workers, or run a manual VACUUM.
Purpose and Overview
PostgreSQL's autovacuum subsystem reclaims dead tuples, refreshes planner statistics, and prevents transaction-ID wraparound. The autovacuum launcher periodically spawns worker processes that pick eligible tables off a priority queue and run VACUUM (and optionally ANALYZE) on them. The number of concurrent workers is capped by autovacuum_max_workers (default 3). Each worker shows up in pg_stat_activity as a real backend with backend_type = 'autovacuum worker' and writes per-vacuum progress data to pg_stat_progress_vacuum.
The operational frame is straightforward: when DDL hangs, when bloat climbs against expectations, or when manual VACUUM is about to step on autovacuum's toes, the DBA needs to know which tables autovacuum is touching right now and how far each worker has progressed. The query below joins the two views to answer that question in a single row per worker.
The cost of the query is trivial — both views are in-memory representations of cluster state, neither traverses on-disk structures. It is safe to run repeatedly from monitoring scripts at second-level cadence.
Sample Code
1SELECT
2 a.pid,
3 a.datname,
4 a.query_start,
5 NOW() - a.query_start AS run_time,
6 a.wait_event_type,
7 a.wait_event,
8 p.phase,
9 p.relid::regclass AS table_name,
10 p.heap_blks_total,
11 p.heap_blks_scanned,
12 p.heap_blks_vacuumed,
13 CASE p.heap_blks_total
14 WHEN 0 THEN NULL
15 ELSE ROUND(
16 100.0 * p.heap_blks_scanned / p.heap_blks_total, 1
17 )
18 END AS pct_scanned
19FROM pg_stat_activity a
20LEFT JOIN pg_stat_progress_vacuum p
21 ON p.pid = a.pid
22WHERE a.backend_type = 'autovacuum worker'
23ORDER BY a.query_start;
Notes: Requires PostgreSQL 10+ for the backend_type column and PostgreSQL 9.6+ for pg_stat_progress_vacuum. Run as superuser or as a role with pg_monitor to see every worker; unprivileged roles see only their own activity, which means no autovacuum workers will appear.
Code Breakdown
The query is a pg_stat_activity row restriction joined to per-vacuum progress data. Each piece below is part of why this returns a usable per-worker picture instead of a row dump.
Filtering pg_stat_activity by backend_type
backend_type = 'autovacuum worker' is the discriminator between launcher-spawned vacuums and everything else. The launcher itself shows backend_type = 'autovacuum launcher' — distinct, and not what this query wants. Manual VACUUM runs as client backend, not autovacuum worker, even though it also writes to pg_stat_progress_vacuum; combine both with backend_type IN ('autovacuum worker','client backend') when you want a single view of all in-flight vacuums.
Joining pg_stat_progress_vacuum
The join key is pid. pg_stat_progress_vacuum has one row per running VACUUM, and the row is keyed by the running process's PID — so a LEFT JOIN ... ON p.pid = a.pid lines each worker up with its progress. The LEFT is important: a worker briefly between phases can appear in pg_stat_activity without a row in the progress view for a fraction of a second, and dropping it from the result would create a false negative.
Phase and Progress Counters
phase reports the current vacuum phase (scanning heap, vacuuming indexes, etc.); the value lags reality by milliseconds, so transient mismatch with wait_event is expected. heap_blks_total, heap_blks_scanned, and heap_blks_vacuumed are block-level counters in 8 KB pages, useful for sizing the remaining work on large tables.
The pct_scanned Derived Column
pct_scanned is computed defensively: the CASE guards against division by zero on initializing workers where heap_blks_total is briefly 0. The value is not a true completion percentage because the index passes (which happen after the heap scan) are not represented in this counter — but for the dominant scanning heap phase on a large table it is the right operational signal.
Key Autovacuum Worker Phases
initializing
The worker has been spawned but has not yet locked the table or begun reading. Brief — typically subsecond.
scanning heap
The worker is reading the heap pages and recording dead-tuple references in the dead-tuple list. This phase scales with heap size and is the longest single phase on large tables. heap_blks_scanned increments here.
vacuuming indexes
After the heap scan completes, the worker visits each index on the table and removes dead-tuple references. This phase is I/O-heavy and scales with the total size of all indexes — a multi-billion-row table with five indexes can spend hours here. A worker stuck in this phase for hours on a huge table is normal, not a hang.
vacuuming heap and cleaning up indexes
The worker returns to the heap to mark dead tuples free, then revisits indexes for final cleanup. Both phases are typically shorter than the initial heap scan and index pass.
truncating heap and final cleanup
The worker attempts to truncate empty pages at the end of the heap (if vacuum_truncate is on for the table) and writes the final statistics. The truncate step requires an ACCESS EXCLUSIVE lock briefly and can be the contention point that blocks application queries on otherwise-idle tables; setting vacuum_truncate = off on hot tables avoids it.
Practical Applications
Diagnosing a Blocked DDL Statement
When ALTER TABLE or CREATE INDEX hangs, the most common cause is an autovacuum worker holding a lock on the target table. Running this query and looking for wait_event_type = 'Lock' on a worker against the same table_name confirms the diagnosis; pg_cancel_backend(pid) releases it cleanly and the launcher will reschedule the table shortly.
Sizing autovacuum_max_workers Against Table Count
If the query consistently returns three rows on a cluster with thousands of tables, the workers are saturated. The next step is checking which tables are eligible (high n_dead_tup against pg_stat_user_tables) and raising autovacuum_max_workers. The setting requires a restart, so plan it alongside a maintenance window.
1SELECT
2 schemaname,
3 relname,
4 n_dead_tup,
5 last_autovacuum
6FROM pg_stat_user_tables
7WHERE n_dead_tup > 1000
8ORDER BY n_dead_tup DESC
9LIMIT 20;
Coordinating Manual VACUUM with In-Flight Autovacuum
Before running a manual VACUUM <table>, check whether autovacuum is already on the same table. Two concurrent vacuums on the same relation are not destructive but waste cycles. The query's WHERE clause loosened to backend_type IN ('autovacuum worker','client backend') surfaces both.
Tracking Long-Running Vacuum on Multi-Billion-Row Tables
For huge tables the pct_scanned column gives the operational signal needed to decide whether to wait or intervene. A worker at 5% after an hour on a billion-row table is on pace; a worker at 5% after eight hours is symptomatic of I/O saturation or an aggressive autovacuum_vacuum_cost_delay setting throttling the pace.
Version Compatibility
backend_type was added to pg_stat_activity in PostgreSQL 10. On PostgreSQL 9.6, the query can be approximated by filtering on query LIKE 'autovacuum:%' — autovacuum workers' synthetic query text begins with autovacuum: VACUUM. The approximation is fragile and was removed when backend_type landed.
pg_stat_progress_vacuum was added in PostgreSQL 9.6 alongside the broader progress-reporting framework. Earlier versions have no equivalent; on those clusters the only signal is pg_stat_activity's query column and pg_stat_user_tables.last_autovacuum.
VACUUM FULL rewrites the table rather than vacuuming it in place, and reports to a separate view: pg_stat_progress_cluster. It will not appear in this query's results. VACUUM ... PARALLEL (PostgreSQL 13+) uses additional parallel worker backends that show backend_type = 'parallel worker'; the leading autovacuum worker still owns the pg_stat_progress_vacuum row.
autovacuum_vacuum_cost_limit controls how much I/O each worker is allowed before sleeping. The default -1 shares a single global budget — the effective per-worker limit becomes vacuum_cost_limit / autovacuum_max_workers. Raising autovacuum_max_workers without raising autovacuum_vacuum_cost_limit makes each worker individually slower. Set autovacuum_vacuum_cost_limit to a positive value (2000 is a common production starting point) when you need both more parallelism and faster per-table vacuums.
Best Practices
- Run as
pg_monitoror superuser — autovacuum-worker rows are filtered out for unprivileged roles - Combine with
pg_stat_user_tables— when no workers are active and bloat is climbing, the eligibility query above tells you whether autovacuum has nothing to do or is starved - Prefer
pg_cancel_backend()overpg_terminate_backend()— repeated cancellation of the same table indicates autovacuum settings are too aggressive, not a problem with the table - Disable
vacuum_truncateon hot tables — avoids theACCESS EXCLUSIVElock at the end of a vacuum that occasionally blocks application queries - Raise
autovacuum_vacuum_cost_limitalongsideautovacuum_max_workers— adding workers without lifting the global budget slows every worker proportionally - Sample on second-level cadence — both views are in-memory; a 1-second polling loop is operationally cheap
References
- PostgreSQL: pg_stat_activity — the live activity view exposing
backend_typeand the per-process state - PostgreSQL: pg_stat_progress_vacuum — progress reporting view detailing each vacuum's phase and block counters
- PostgreSQL: Routine Vacuuming — conceptual guide to VACUUM, dead tuples, and the autovacuum subsystem
- PostgreSQL: Autovacuum Configuration — reference for
autovacuum_max_workers,autovacuum_vacuum_cost_limit, and related settings - depesz: Is my autovacuum configured properly? — operational guidance on autovacuum tuning and monitoring from Hubert "depesz" Lubaczewski
Posts in this series
- Identify Blocking PostgreSQL Queries with pg_stat_activity
- PostgreSQL query to find columns containing only NULL values
- Analyze PostgreSQL Cache-Hit Ratio with pg_stat_statements
- How to Monitor Slow Running Queries in PostgreSQL
- Find Idle in Transaction Sessions in PostgreSQL
- Monitor PostgreSQL Index Build Progress with SQL
- Find PostgreSQL Tables That Need VACUUM FREEZE
- Detect PostgreSQL Transaction ID Wraparound Risk
- Cancel vs Terminate PostgreSQL Backends Explained
- Identify Running PostgreSQL Autovacuum Worker Sessions