Prioritize PostgreSQL Table Analysis, Unlocking Performance Insights

Prioritize PostgreSQL Table Analysis: Unlocking Performance Insights

Maintaining optimal PostgreSQL database performance requires regular table analysis. This process lets the query planner make informed decisions based on accurate table statistics. This article dives into a valuable PostgreSQL script revealing tables needing analysis, explains its components, and highlights how to optimize your database.

Sample Code from Command Line

 1SELECT
 2  schemaname,
 3  relname,
 4  -- not available on PostgreSQL <= 9.3
 5  n_mod_since_analyze,
 6  last_analyze,
 7  last_autoanalyze,
 8  -- not available on PostgreSQL <= 9.0
 9  analyze_count,
10  autoanalyze_count
11FROM pg_stat_user_tables
12ORDER BY
13  n_mod_since_analyze DESC,
14  last_analyze DESC,
15  last_autoanalyze DESC;

Postgres Pre 9.4 SQL

 1SELECT
 2  schemaname,
 3  relname,
 4  -- not available on PostgreSQL <= 9.3
 5  --n_mod_since_analyze,
 6  last_analyze,
 7  last_autoanalyze
 8  -- not available on PostgreSQL <= 9.0
 9  --analyze_count,
10  --autoanalyze_count
11FROM pg_stat_user_tables
12ORDER BY
13  --n_mod_since_analyze DESC;
14  last_analyze DESC,
15  last_autoanalyze DESC;

Notes PostgreSQL Analyze info for tables with rows modified since last analyze. Requires PostgreSQL 9.4+ (see postgres_last_analyze_pre94.sql for earlier versions). Tested on PostgreSQL 9.4+, 10.x, 11.x, 12.x, 13.0.

Code Breakdown:

  • SELECT: Chooses specific columns from the pg_stat_user_tables view. Key columns include:
    • schemaname: The schema containing the table.
    • relname: The table name.
    • n_mod_since_analyze: Approximate rows modified since the last ANALYZE.
    • last_analyze: Timestamp of the last manual ANALYZE.
    • last_autoanalyze: Timestamp of the last autoanalyze run.
    • analyze_count: Total manual ANALYZE executions.
    • autoanalyze_count: Total autoanalyze executions.
  • FROM pg_stat_user_tables: Specifies the view providing statistics on user tables.
  • ORDER BY: Sorts results primarily by rows modified since analysis, and secondarily by analysis timestamps.

Purpose

This script's primary purpose is to help DB administrators identify tables where changes since the last ANALYZE might impact query performance. The greater the n_mod_since_analyze value, the more outdated the statistics, potentially leading to suboptimal query plans.

Insights and Explanations

  • The Importance of ANALYZE: PostgreSQL relies on table statistics to choose the fastest query execution methods. ANALYZE gathers these statistics. Outdated statistics risk the planner making poor choices, hindering performance.
  • Manual vs. Automatic Analysis: ANALYZE can be triggered manually or automatically (autoanalyze). This script reveals both execution times and counts.
  • Prioritizing Analysis: Focus on tables with high n_mod_since_analyze values and older last_analyze timestamps.

References:

PostgreSQL documentation on ANALYZE https://www.postgresql.org/docs/current/sql-analyze.html

PostgreSQL documentation on statistics: https://www.postgresql.org/docs/current/monitoring-stats.html

Source:

https://github.com/HariSekhon/SQL-scripts/blob/master/postgres_last_analyze.sql

https://github.com/HariSekhon/SQL-scripts/blob/master/postgres_last_analyze_pre94.sql