Monitor PostgreSQL Index Build Progress with SQL

How to Monitor PostgreSQL Index Build Progress

Building an index on a large table can take minutes or even hours. Without progress tracking, it is hard to know if the build is almost done or barely started. PostgreSQL 12 introduced the pg_stat_progress_create_index view to fix this. It shows real-time progress for any index creation currently in progress.

This query joins that view with pg_stat_activity to give you a complete picture: how far along the build is, how long it has been running, and what phase it is in.

Sample Code

 1SELECT
 2    age(clock_timestamp(), a.query_start) AS duration,
 3    a.query,
 4    p.phase,
 5    round(p.blocks_done / NULLIF(p.blocks_total::numeric, 0) * 100, 2) AS "% blocks done",
 6    p.blocks_total,
 7    p.blocks_done,
 8    round(p.tuples_done / NULLIF(p.tuples_total::numeric, 0) * 100, 2) AS "% tuples done",
 9    p.tuples_total,
10    p.tuples_done,
11    p.command,
12    p.lockers_total,
13    p.lockers_done,
14    p.current_locker_pid,
15    index_relid::regclass AS index_name,
16    relid::regclass AS table_name,
17    pg_size_pretty(pg_relation_size(relid)) AS table_size,
18    a.pid
19FROM
20    pg_stat_progress_create_index p
21JOIN
22    pg_stat_activity a ON p.pid = a.pid;

Notes: Requires PostgreSQL 12 or later. Returns one row per active index build. Run this query while a CREATE INDEX or REINDEX is in progress. Returns no rows if no index build is currently running.

Code Breakdown

  • pg_stat_progress_create_index β€” A system view added in PostgreSQL 12 that tracks the progress of active index builds.
  • pg_stat_activity β€” Joined to get the query text, start time, and process ID of the session running the build.
  • age(clock_timestamp(), a.query_start) β€” Calculates how long the index build has been running. clock_timestamp() returns the current time, not the transaction start time.
  • phase β€” The current stage of the index build. Common phases include initializing, building index, waiting for writers before build, and waiting for old snapshots.
  • blocks_done / blocks_total β€” Page-level progress. Shows what percentage of the table's data pages have been scanned.
  • tuples_done / tuples_total β€” Row-level progress. Shows how many rows have been indexed so far.
  • NULLIF(..., 0) β€” Prevents division by zero if totals are not yet known at the start of the build.
  • lockers_total / lockers_done β€” For concurrent index builds, shows how many conflicting transactions need to complete before the build can advance.
  • current_locker_pid β€” The PID of a transaction currently blocking the concurrent index build.
  • index_relid::regclass β€” Converts the index OID to its name.
  • relid::regclass β€” Converts the table OID to its name.
  • pg_size_pretty(pg_relation_size(relid)) β€” Shows the size of the table being indexed.

Key Points

  • Run this query periodically while a long index build is in progress to check on it.
  • If % blocks done stops advancing for a long time, check current_locker_pid β€” a long-running transaction may be blocking the build.
  • During a CREATE INDEX CONCURRENTLY, the build goes through multiple phases. It needs to wait for all active transactions to finish at certain points before it can move on.
  • The pid column lets you cancel or terminate the build if needed using pg_cancel_backend(pid) or pg_terminate_backend(pid).

Insights and Explanations

Phases of a concurrent index build:

PostgreSQL creates a concurrent index in several passes. It scans the table once to build an initial index, then waits for old transactions to finish, then scans again to catch changes made during the first scan. This is why concurrent builds take longer than regular builds β€” but they do not lock the table.

A regular (non-concurrent) index build only shows one phase and holds an AccessShareLock on the table. Writes are blocked during this time.

If you see the build stuck on waiting for writers before build or waiting for old snapshots, find the blocking transaction:

1SELECT pid, now() - xact_start AS duration, query, state
2FROM pg_stat_activity
3WHERE xact_start IS NOT NULL
4ORDER BY duration DESC;

You can also estimate time remaining by watching the % blocks done value change over two readings:

1rows_remaining = blocks_total - blocks_done
2time_per_block = elapsed_time / blocks_done
3estimated_time_left = rows_remaining * time_per_block

Additional Considerations

  • PostgreSQL version: This view only exists in PostgreSQL 12 and later. On older versions, there is no built-in way to check index build progress.
  • REINDEX: This query also tracks REINDEX and REINDEX CONCURRENTLY operations, not just CREATE INDEX.
  • Permissions: Superusers see all index builds. Regular users only see their own sessions in pg_stat_activity, but can see all rows in pg_stat_progress_create_index.

References

Posts in this series