Estimate PostgreSQL Table Bloat with SQL
How to Estimate Table Bloat in PostgreSQL
When you update or delete rows in PostgreSQL, the old row versions are not removed immediately. They stay in the table as dead tuples until VACUUM cleans them up. Over time, if VACUUM does not keep up, these dead rows pile up and the table grows larger than it needs to be. This is called table bloat.
Bloat wastes disk space and slows down queries because PostgreSQL has to read through more pages to find live data. This query estimates how much of each table is wasted space, so you can decide which tables need attention.
Sample Code
1SELECT
2 current_database(),
3 schemaname,
4 tblname,
5 bs * tblpages AS real_size,
6 (tblpages - est_tblpages) * bs AS extra_size,
7 CASE WHEN tblpages - est_tblpages > 0
8 THEN 100 * (tblpages - est_tblpages) / tblpages::float
9 ELSE 0
10 END AS extra_pct,
11 fillfactor,
12 CASE WHEN tblpages - est_tblpages_ff > 0
13 THEN (tblpages - est_tblpages_ff) * bs
14 ELSE 0
15 END AS bloat_size,
16 CASE WHEN tblpages - est_tblpages_ff > 0
17 THEN 100 * (tblpages - est_tblpages_ff) / tblpages::float
18 ELSE 0
19 END AS bloat_pct,
20 is_na
21FROM (
22 SELECT
23 ceil(reltuples / ((bs - page_hdr) / tpl_size)) + ceil(toasttuples / 4) AS est_tblpages,
24 ceil(reltuples / ((bs - page_hdr) * fillfactor / (tpl_size * 100))) + ceil(toasttuples / 4) AS est_tblpages_ff,
25 tblpages, fillfactor, bs, tblid, schemaname, tblname, heappages, toastpages, is_na
26 FROM (
27 SELECT
28 (4 + tpl_hdr_size + tpl_data_size + (2 * ma)
29 - CASE WHEN tpl_hdr_size % ma = 0 THEN ma ELSE tpl_hdr_size % ma END
30 - CASE WHEN ceil(tpl_data_size)::int % ma = 0 THEN ma ELSE ceil(tpl_data_size)::int % ma END
31 ) AS tpl_size,
32 bs - page_hdr AS size_per_block,
33 (heappages + toastpages) AS tblpages, heappages,
34 toastpages, reltuples, toasttuples, bs, page_hdr, tblid, schemaname, tblname, fillfactor, is_na
35 FROM (
36 SELECT
37 tbl.oid AS tblid, ns.nspname AS schemaname, tbl.relname AS tblname, tbl.reltuples,
38 tbl.relpages AS heappages, coalesce(toast.relpages, 0) AS toastpages,
39 coalesce(toast.reltuples, 0) AS toasttuples,
40 coalesce(substring(
41 array_to_string(tbl.reloptions, ' ')
42 FROM 'fillfactor=([0-9]+)')::smallint, 100) AS fillfactor,
43 current_setting('block_size')::numeric AS bs,
44 CASE WHEN version() ~ 'mingw32' OR version() ~ '64-bit|x86_64|ppc64|ia64|amd64' THEN 8 ELSE 4 END AS ma,
45 24 AS page_hdr,
46 23 + CASE WHEN MAX(coalesce(s.null_frac, 0)) > 0 THEN (7 + count(s.attname)) / 8 ELSE 0::int END
47 + CASE WHEN bool_or(att.attname = 'oid' AND att.attnum < 0) THEN 4 ELSE 0 END AS tpl_hdr_size,
48 sum((1 - coalesce(s.null_frac, 0)) * coalesce(s.avg_width, 0)) AS tpl_data_size,
49 bool_or(att.atttypid = 'pg_catalog.name'::regtype)
50 OR sum(CASE WHEN att.attnum > 0 THEN 1 ELSE 0 END) <> count(s.attname) AS is_na
51 FROM pg_attribute AS att
52 JOIN pg_class AS tbl ON att.attrelid = tbl.oid
53 JOIN pg_namespace AS ns ON ns.oid = tbl.relnamespace
54 LEFT JOIN pg_stats AS s
55 ON s.schemaname = ns.nspname
56 AND s.tablename = tbl.relname
57 AND s.inherited = false
58 AND s.attname = att.attname
59 LEFT JOIN pg_class AS toast ON tbl.reltoastrelid = toast.oid
60 WHERE NOT att.attisdropped
61 AND tbl.relname NOT LIKE '%pg%'
62 AND tbl.relkind IN ('r', 'm')
63 GROUP BY 1, 2, 3, 4, 5, 6, 7, 8, 9, 10
64 ORDER BY 2, 3
65 ) AS s
66 ) AS s2
67) AS s3
68ORDER BY bloat_pct DESC;
Notes: Compatible with PostgreSQL 9.0 and later. Results are estimates based on table statistics. Run ANALYZE first for more accurate results. The is_na column flags rows where the estimate may not be reliable.
Code Breakdown
real_size— The actual disk space the table is using right now, in bytes.extra_size— The extra space beyond what the live data actually needs. This is the overhead from dead rows and alignment padding.extra_pct— Extra space as a percentage of the total table size.fillfactor— The percentage of each page PostgreSQL fills with live data (default: 100). A lower fillfactor intentionally leaves free space for updates.bloat_size— The space that is truly wasted after accounting for the fillfactor. This is the amount you could reclaim.bloat_pct— Bloat as a percentage of the total table size. This is the key metric to watch.is_na— True when the estimate is not reliable, usually because column statistics are incomplete. Re-runANALYZEon those tables.tpl_size— The estimated size of one row, including alignment overhead.est_tblpages_ff— The estimated number of pages the table would need if it were compacted to its fillfactor.
Key Points
- A
bloat_pctabove 20–30% is generally worth investigating. - Results are sorted by
bloat_pctdescending so the worst tables appear first. - Tables with
is_na = trueshould be analyzed withANALYZE your_tablebefore trusting the estimate. - The estimate is based on statistics from
pg_statsandpg_class. It is not 100% exact, but it is accurate enough to guide decisions.
Insights and Explanations
Once you identify bloated tables, you have a few options:
Regular VACUUM: Marks dead rows as reusable but does not shrink the table file on disk. Best for ongoing maintenance.
1VACUUM your_table_name;
VACUUM FULL: Rewrites the table completely, reclaiming all dead space. It locks the table exclusively during the operation, so do this during a maintenance window.
1VACUUM FULL your_table_name;
pg_repack: A third-party tool that reclaims bloat without locking the table. It is the preferred option for large, busy tables in production.
For tables with a bloat percentage consistently above 30%, consider tuning autovacuum to run more aggressively on those specific tables:
1ALTER TABLE your_table SET (
2 autovacuum_vacuum_scale_factor = 0.01,
3 autovacuum_analyze_scale_factor = 0.005
4);
This tells autovacuum to trigger a vacuum after just 1% of rows change, rather than the default 20%.
Additional Considerations
- Permissions: Running this query as a non-superuser will only show tables you have permission to read.
- Statistics freshness: The bloat estimate depends on
pg_stats. IfANALYZEhas not run recently, the estimates may be off. RunANALYZEbefore using this in a capacity planning decision. - Write-heavy tables: Tables that receive many updates and deletes will naturally accumulate more bloat. Tune autovacuum settings for those tables individually.