Primary keys (and all indexed columns for that matter) take advantage of indexes when those column(s) are used as filter predicates, like WHERE
and JOIN...ON
clauses.
As a real world example, my application has a table called Log_Games
, which is a table with millions of rows, ID
as the primary key, and a number of other non-indexed columns such as ParsedAt
. Compare the below:
INDEXED QUERY
EXPLAIN ANALYZE
SELECT *
FROM "Log_Games"
WHERE "ID" = 792046
INDEXED QUERY PLAN
Index Scan using "Log_Games_pkey" on "Log_Games" (cost=0.43..8.45 rows=1 width=4190) (actual time=0.024..0.024 rows=1 loops=1)
Index Cond: ("ID" = 792046)
Planning time: 1.059 ms
Execution time: 0.066 ms
NON-INDEXED QUERY
EXPLAIN ANALYZE
SELECT *
FROM "Log_Games"
WHERE "ParsedAt" = '2015-05-07 07:31:24+00'
NON-INDEXED QUERY PLAN
Seq Scan on "Log_Games" (cost=0.00..141377.34 rows=18 width=4190) (actual time=0.013..793.094 rows=1 loops=1)
Filter: ("ParsedAt" = '2015-05-07 07:31:24+00'::timestamp with time zone)
Rows Removed by Filter: 1924676
Planning time: 0.794 ms
Execution time: 793.135 ms
The query with the indexed clause uses the index Log_Games_pkey
, resulting in a query that executes in 0.066ms. The query with the non-indexed clause reverts to a sequential scan, which means it goes from the start to the finish of the table to see which columns match, an operation that causes the execution time to blow out to 793.135ms.
There are plenty of good resources around the web that can help you read execution plans and decide when they may need supporting indexes. A good place to start is the PostgreSQL documentation:
https://www.postgresql.org/docs/9.6/static/using-explain.html