Partial indexes in PostgreSQL. Part 2: Query performance

Consider using partial indexes to improve performance for tables with significant percentage of "not interesting" rows — the ones filtered out in typical queries with a common WHERE condition with a constant.

The more rows are excluded from a partial index, the faster it is. It improves both read and write performance of the table.

Let's look at an example. Suppose we have the same table as in the previous post.

CREATE TABLE payments (  
  id              SERIAL  NOT NULL,
  status          VARCHAR NOT NULL,
  order_id        INTEGER NOT NULL,
  amount_in_cents INTEGER NOT NULL
);

The partial index over order_id column that was created in the previous post

CREATE UNIQUE INDEX payments_order_id_idx ON payments (order_id) WHERE status = 'success';  

excludes rows for failed payments. It should improve performance of queries like this:

SELECT * FROM payments WHERE order_id = 123 AND status = 'success';  

compared to a similar index over order_id that includes all table rows. However, queries like:

SELECT * FROM payments WHERE order_id = 123 AND status = 'failure';  
SELECT * FROM payments WHERE order_id = 123;  

will not use the partial index and will be slower.

The query condition doesn't have to strictly match the predicate (condition) of the partial index. If it returns a subset of rows in the partial index it may still be used by query planner. For example, if an index has predicate (amount_in_cents > 50000) then queries with condition (amount_in_cents > 100000) may still use it.

Deterministic (or "immutable") functions can also be used in the partial index predicate. That means that LOWER() can be used, but NOW() cannot.

Read more about partial indexes in PostgreSQL: