Partial indexes in PostgreSQL. Part 1: Data Integrity

A partial index is an index that is built over a subset of table rows that meet some condition. A partial index can ensure data integrity by enforcing constraints on that subset of table rows.

Let's look at an example. Suppose we have to track each successful and failed payment for every order in an e-commerce application. An order may have any number of failed payments but no more than one successful payment. We can model payments with the following table, assuming that status column has success and failure values for successful and failed payments respectively:

CREATE TABLE payments (
  id              SERIAL  NOT NULL,
  status          VARCHAR NOT NULL,
  order_id        INTEGER NOT NULL,
  amount_in_cents INTEGER NOT NULL
);

To allow the table to have no more than one record with success status and any number of records with failure status for each order_id we can create a unique partial index:

CREATE UNIQUE INDEX payments_order_id_idx ON payments (order_id)
WHERE status = 'success';

Let's see how this constraint is applied. We can add two rows to the table with the same order_id and different statuses:

INSERT INTO payments (status, order_id, amount_in_cents) VALUES ('failure', 1, 1000);
INSERT INTO payments (status, order_id, amount_in_cents) VALUES ('success', 1, 1000);

But if we try to add one more row with the same order_id and status equal to success:

INSERT INTO payments (status, order_id, amount_in_cents) VALUES ('success', 1, 1000);

we get an error that says duplicate key value violates unique constraint "payments_order_id_idx". However, we can still add rows with failure status for the same order:

INSERT INTO payments (status, order_id, amount_in_cents) VALUES ('failure', 1, 1000);

Partial indexes can make queries even faster than the same "full" indexes without a condition. I'll be talking about that in the next posts.

Also, check PostgreSQL docs to learn more about partial indexes.