Find PostgreSQL Index Bloat and Wasted Space
How to Find Index Bloat and Wasted Space in PostgreSQL
Just like tables, PostgreSQL indexes can become bloated over time. When rows are updated or deleted, the old index entries are not removed immediately. They stay in the index as dead entries, wasting space and slowing down index scans.
B-tree indexes are the most common type in PostgreSQL and the most prone to bloat. This query estimates how much space is wasted in each B-tree index, so you can decide which ones need to be rebuilt.
Sample Code
1WITH btree_index_atts AS (
2 SELECT nspname, relname, reltuples, relpages, indrelid, relam,
3 regexp_split_to_table(indkey::text, ' ')::smallint AS attnum,
4 indexrelid AS index_oid
5 FROM pg_index
6 JOIN pg_class ON pg_class.oid = pg_index.indexrelid
7 JOIN pg_namespace ON pg_namespace.oid = pg_class.relnamespace
8 JOIN pg_am ON pg_class.relam = pg_am.oid
9 WHERE pg_am.amname = 'btree'
10),
11index_item_sizes AS (
12 SELECT
13 i.nspname, i.relname, i.reltuples, i.relpages, i.relam,
14 s.starelid, a.attrelid AS table_oid, index_oid,
15 current_setting('block_size')::numeric AS bs,
16 CASE
17 WHEN version() ~ 'mingw32' OR version() ~ '64-bit' THEN 8
18 ELSE 4
19 END AS maxalign,
20 24 AS pagehdr,
21 CASE WHEN max(coalesce(s.stanullfrac, 0)) = 0
22 THEN 2
23 ELSE 6
24 END AS index_tuple_hdr,
25 sum((1 - coalesce(s.stanullfrac, 0)) * coalesce(s.stawidth, 2048)) AS nulldatawidth
26 FROM pg_attribute AS a
27 JOIN pg_stats pgs
28 JOIN btree_index_atts AS i ON i.indrelid = a.attrelid AND a.attnum = i.attnum
29 WHERE a.attnum > 0
30 AND pgs.schemaname = 'public'
31 GROUP BY 1, 2, 3, 4, 5, 6, 7, 8, 9
32),
33index_aligned AS (
34 SELECT maxalign, bs, nspname, relname AS index_name, reltuples,
35 relpages, relam, table_oid, index_oid,
36 (2 +
37 maxalign - CASE
38 WHEN index_tuple_hdr % maxalign = 0 THEN maxalign
39 ELSE index_tuple_hdr % maxalign
40 END
41 + nulldatawidth + maxalign - CASE
42 WHEN nulldatawidth::integer % maxalign = 0 THEN maxalign
43 ELSE nulldatawidth::integer % maxalign
44 END
45 )::numeric AS nulldatahdrwidth,
46 pagehdr
47 FROM index_item_sizes AS s1
48),
49otta_calc AS (
50 SELECT bs, nspname, table_oid, index_oid, index_name, relpages,
51 coalesce(
52 ceil((reltuples * (4 + nulldatahdrwidth)) / (bs - pagehdr::float)) +
53 CASE WHEN am.amname IN ('hash', 'btree') THEN 1 ELSE 0 END, 0
54 ) AS otta
55 FROM index_aligned AS s2
56 LEFT JOIN pg_am am ON s2.relam = am.oid
57),
58raw_bloat AS (
59 SELECT
60 current_database() AS dbname, nspname,
61 c.relname AS table_name, index_name,
62 bs * (sub.relpages)::bigint AS totalbytes,
63 CASE
64 WHEN sub.relpages <= otta THEN 0
65 ELSE bs * (sub.relpages - otta)::bigint
66 END AS wastedbytes,
67 CASE
68 WHEN sub.relpages <= otta THEN 0
69 ELSE bs * (sub.relpages - otta)::bigint * 100 / (bs * (sub.relpages)::bigint)
70 END AS realbloat,
71 pg_relation_size(sub.table_oid) AS table_bytes,
72 stat.idx_scan AS index_scans
73 FROM otta_calc AS sub
74 JOIN pg_class AS c ON c.oid = sub.table_oid
75 JOIN pg_stat_user_indexes AS stat ON sub.index_oid = stat.indexrelid
76)
77SELECT
78 dbname AS database_name,
79 nspname AS schema_name,
80 table_name,
81 index_name,
82 round(realbloat, 1) AS bloat_pct,
83 wastedbytes AS bloat_bytes,
84 pg_size_pretty(wastedbytes::bigint) AS bloat_size,
85 totalbytes AS index_bytes,
86 pg_size_pretty(totalbytes::bigint) AS index_size,
87 table_bytes,
88 pg_size_pretty(table_bytes) AS table_size,
89 index_scans
90FROM raw_bloat
91ORDER BY wastedbytes DESC
92LIMIT 10;
Notes: Works on PostgreSQL 9.0 and later. Covers B-tree indexes only, which are the most common index type. Change pgs.schemaname = 'public' to match your schema. Results are estimates based on table statistics; run ANALYZE first for best accuracy.
Code Breakdown
btree_index_attsCTE — Collects all B-tree indexes, their column numbers, and table relationships. Filters toamname = 'btree'to skip other index types like GIN and GiST.index_item_sizesCTE — Calculates the estimated size of each index entry based on column statistics frompg_stats.stanullfracandstawidthare the null fraction and average column width.index_alignedCTE — Adds memory alignment padding to each entry size. PostgreSQL aligns data to 4 or 8 byte boundaries depending on the platform.otta_calcCTE — Calculates the optimal number of pages the index should use (otta) based on the actual data. Anything above this is potential bloat.raw_bloatCTE — Compares actual pages (relpages) to optimal pages (otta). The difference, multiplied by block size, is the wasted bytes.wastedbytes— The raw number of bytes being wasted in the index.bloat_pct— Wasted space as a percentage of total index size.index_scans— How many times the index has been used. An index with high bloat and low scans is a candidate for removal, not just rebuilding.
Key Points
- Results are sorted by
wastedbytesso the largest absolute waste appears first. - A
bloat_pctover 30% is a common threshold for considering a rebuild. - The
index_scanscolumn helps you decide: a bloated index that is never used may be worth dropping instead of rebuilding. - This query is limited to the top 10 results. Remove the
LIMIT 10clause to see all indexes.
Insights and Explanations
Once you identify bloated indexes, you can rebuild them. The simplest approach is REINDEX:
1REINDEX INDEX your_index_name;
This locks the index (and blocks reads on that index) while it rebuilds. For production use, prefer the concurrent version:
1REINDEX INDEX CONCURRENTLY your_index_name;
The concurrent option takes longer but does not block queries. It was added in PostgreSQL 12.
If you want to rebuild all indexes on a table at once:
1REINDEX TABLE CONCURRENTLY your_table_name;
For indexes on very busy tables, pg_repack is another option. It rebuilds the index online without any locking.
Index bloat builds up faster on tables with many updates and deletes. If an index keeps bloating quickly after rebuilding, check whether autovacuum is keeping up with the table it belongs to.
Additional Considerations
- Permissions: This query reads from
pg_stats,pg_index,pg_class, andpg_stat_user_indexes. Superuser or the table owner can see all indexes. Regular users may see fewer results. - GIN and GiST indexes: This query does not cover GIN or GiST indexes, which have different internal structures. Use
pgstattuplefor those. - Statistics accuracy: If
ANALYZEhas not run recently,pg_statsmay be stale and the bloat estimate will be off.