PostgreSQL Table Row Count Estimates with SQL

PostgreSQL Table Row Count Estimates with SQL

This PostgreSQL query returns estimated row counts for all user tables, sorted by the largest tables first. It reads from the statistics collector rather than scanning the tables, making it near-instant even on tables with billions of rows.

Purpose and Overview

Counting rows with SELECT COUNT(*) FROM table requires a full sequential scan of the table — on large tables this can run for minutes and generate significant I/O load. PostgreSQL's statistics collector maintains a continuously updated row estimate for every table in pg_stat_user_tables. This estimate is fast to read and accurate enough for most operational purposes.

Use this query when you need to compare table sizes, identify the largest tables in the database, or get a quick sense of data volume without the cost of exact counts.

SQL Script

1SELECT
2    schemaname,
3    relname,
4    n_live_tup
5FROM
6    pg_stat_user_tables
7ORDER BY
8    n_live_tup DESC;

Code Breakdown

pg_stat_user_tables

The pg_stat_user_tables view provides one row per user table (excluding system catalog tables) with statistics gathered by the statistics collector. It exposes counters for sequential scans, index scans, row inserts, updates, deletes, and live/dead tuple estimates.

n_live_tup

The n_live_tup column is the statistics collector's estimate of the number of live (non-deleted) rows in the table. It is updated by VACUUM, ANALYZE, and the autoanalyze daemon. It is not updated in real time on every insert or delete — the estimate drifts between ANALYZE runs but is generally accurate enough for capacity planning and relative size comparisons.

ORDER BY n_live_tup DESC

Sorts results so the table with the most rows appears first. This makes it easy to identify the largest tables in a single glance.

schemaname and relname

schemaname is the schema containing the table. relname is the table name. Including both allows the query to work correctly in databases with multiple schemas containing tables of the same name.

How Accurate Are the Estimates?

The accuracy of n_live_tup depends on how recently ANALYZE has run on the table:

  • Immediately after ANALYZE: Very accurate, typically within 1-2% of the true count.
  • On a busy table between ANALYZE runs: Can drift by the number of rows modified since the last ANALYZE. Autovacuum triggers ANALYZE when 20% of the rows have changed (default threshold).
  • On a table that has never been analyzed: The value may be 0 or reflect the initial estimate from pg_class.reltuples.

For exact counts, use SELECT COUNT(*) FROM table — but only when precision is required.

Practical Applications

Identifying Large Tables

Sort by n_live_tup to quickly find the tables with the most rows. Combine with the tables-by-size query to see whether large row counts correlate with large storage consumption.

Checking After Bulk Loads

After a large data import, compare n_live_tup before and after (with an ANALYZE in between) to verify the expected number of rows landed correctly.

Monitoring Table Growth

Run this query periodically and track changes to n_live_tup over time to understand which tables are growing fastest and plan for capacity accordingly.

Dead Row Detection

The companion column n_dead_tup shows the estimated number of dead (deleted or updated-but-not-yet-vacuumed) rows. A high n_dead_tup relative to n_live_tup indicates the table needs vacuuming:

 1SELECT
 2    schemaname,
 3    relname,
 4    n_live_tup,
 5    n_dead_tup,
 6    round(n_dead_tup::numeric / nullif(n_live_tup + n_dead_tup, 0) * 100, 1) AS dead_pct
 7FROM
 8    pg_stat_user_tables
 9ORDER BY
10    n_dead_tup DESC;

Extending the Query

Include Dead Rows and Last Analyze

 1SELECT
 2    schemaname,
 3    relname,
 4    n_live_tup,
 5    n_dead_tup,
 6    last_analyze,
 7    last_autoanalyze
 8FROM
 9    pg_stat_user_tables
10ORDER BY
11    n_live_tup DESC;

Compare with pg_class Estimate

pg_class.reltuples is a similar estimate updated at checkpoint time. Comparing it to n_live_tup can reveal tables where the two estimates have diverged:

 1SELECT
 2    N.nspname AS schemaname,
 3    C.relname,
 4    C.reltuples::bigint AS reltuples,
 5    S.n_live_tup
 6FROM
 7    pg_class C
 8LEFT JOIN pg_namespace N ON N.oid = C.relnamespace
 9LEFT JOIN pg_stat_user_tables S ON S.relname = C.relname AND S.schemaname = N.nspname
10WHERE
11    C.relkind = 'r'
12    AND N.nspname NOT IN ('pg_catalog', 'information_schema')
13ORDER BY
14    S.n_live_tup DESC NULLS LAST;

Version Compatibility

Tested on PostgreSQL 8.4, 9.x, 10.x, 11.x, 12.x, and 13.0. pg_stat_user_tables and n_live_tup have been available since early PostgreSQL versions. No version-specific features are used.

Best Practices

Run ANALYZE before relying on estimates: If you need reasonably accurate counts, run ANALYZE table_name first to refresh the statistics, then query n_live_tup.

Use for relative comparisons, not audit counts: Estimates are appropriate for capacity planning and size rankings. For compliance, reconciliation, or exact reporting, use COUNT(*).

Monitor n_dead_tup alongside n_live_tup: High dead row counts indicate autovacuum is not keeping up with the workload, which can cause table bloat and query slowdowns.

Track over time: Store periodic snapshots of this query's results to build a growth trend for each table and anticipate when partitioning or archiving will be needed.

References

Posts in this series