Optimize Your PostgreSQL Queries, Index Cardinality

PostgreSQL Index Cardinality: A Guide to Understanding Index Efficiency

This blog post delves into the world of PostgreSQL index cardinality, a crucial metric for optimizing database performance. We'll dissect the provided SQL code, explain its purpose, and equip you with valuable insights to make informed decisions about your PostgreSQL indexes.

Sample Code from Command Line

 1SELECT
 2  relname,
 3  --relkind,
 4  reltuples AS "cardinality (reltuples)",
 5  relpages
 6FROM
 7  pg_class
 8WHERE
 9  relkind = 'i'
10--    AND
11--  relname LIKE 'someprefix%';
12    AND
13  relname NOT ILIKE 'pg_%'
14ORDER BY
15  2 DESC;

Notes PostgreSQL index cardinality. Tested on PostgreSQL 8.4, 9.x, 10.x, 11.x, 12.x, 13.0.

What is Index Cardinality?

In PostgreSQL, indexes act like shortcuts to efficiently locate specific data within tables. Cardinality refers to the estimated number of distinct values in a particular column or the number of rows an index can help retrieve quickly. Understanding index cardinality is essential because:

  • Effective Index Selection: Choosing indexes with high cardinality (many distinct values) leads to faster searches, while low cardinality indexes (few distinct values) might not offer significant performance benefits.
  • Query Optimization: The PostgreSQL query optimizer utilizes cardinality estimations to determine the most efficient execution plan for your queries.

Code Breakdown:

Let's explore the code line by line:

  • SELECT: This clause retrieves data from the pg_class system catalog table, which stores information about database objects.
  • Selected Columns:
    • relname: Name of the relation (index in this case).
    • --relkind: (Commented out) This column indicates the relation kind ('i' for index).
    • reltuples AS "cardinality (reltuples)": Estimated number of distinct values in the indexed column (aliased for clarity).
    • relpages: Number of data pages allocated for the index storage.
  • FROM pg_class: Specifies the data source as the pg_class table.
  • WHERE: This clause filters the results based on specific conditions:
    • relkind = 'i': Selects only indexes ('i' stands for index).
    • --  AND: (Commented out) This section allows filtering by index name prefix (e.g., LIKE 'someprefix%'). Uncomment and modify as needed.
    • relname NOT ILIKE 'pg_%': Excludes internal PostgreSQL indexes (starting with "pg_").
  • ORDER BY 2 DESC: Sorts the results by the estimated cardinality (second column) in descending order, showing high cardinality indexes first.

Key Points and Insights:

  • This code provides a handy way to analyze the estimated cardinality of your PostgreSQL indexes.
  • Indexes with high cardinality generally offer better performance benefits for specific queries.
  • Consider reviewing indexes with low cardinality to assess if they're still valuable or can be dropped to optimize storage and query planning.
  • Remember, cardinality is an estimation. Analyze actual query performance to validate the effectiveness of your indexes.

Conclusion

By understanding index cardinality and using the provided code, you can effectively evaluate and optimize your PostgreSQL indexes for peak database performance.

References:

Source:

https://github.com/HariSekhon/SQL-scripts/blob/master/postgres_index_cardinality.sql