There are lots of questions on this topic, but they all seem to cover more complex cases than mine, and the answers don't seem applicable.
OHDSI=> \d record_counts
Table "results2.record_counts"
Column | Type | Modifiers
------------------------+-----------------------+-----------
concept_id | integer |
schema | text |
table_name | text |
column_name | text |
column_type | text |
descendant_concept_ids | bigint |
rc | numeric |
drc | numeric |
domain_id | character varying(20) |
vocabulary_id | character varying(20) |
concept_class_id | character varying(20) |
standard_concept | character varying(1) |
Indexes:
"rc_dom" btree (domain_id, concept_id)
"rcdom" btree (domain_id)
"rcdomvocsc" btree (domain_id, vocabulary_id, standard_concept)
The table has 3,133,778 records, so Postgres shouldn't be ignoring the index because of small table size.
I filter on domain_id, which is indexed, and the index is ignored:
OHDSI=> explain select * from record_counts where domain_id = 'Drug';
QUERY PLAN
------------------------------------------------------------------------
Seq Scan on record_counts (cost=0.00..76744.81 rows=2079187 width=87)
Filter: ((domain_id)::text = 'Drug'::text)
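In case it's relevant, the planner's row estimate in that plan is roughly two-thirds of the whole table (this is just the arithmetic, run as a query for convenience):
OHDSI=> select 2079187::numeric / 3133778;  -- estimated rows / total rows, roughly 0.66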
I turn off seqscan and:
OHDSI=> set enable_seqscan=false;
SET
OHDSI=> explain select * from record_counts where domain_id = 'Drug';
QUERY PLAN
-------------------------------------------------------------------------------------
Bitmap Heap Scan on record_counts (cost=42042.13..105605.97 rows=2079187 width=87)
Recheck Cond: ((domain_id)::text = 'Drug'::text)
-> Bitmap Index Scan on rcdom (cost=0.00..41522.33 rows=2079187 width=0)
Index Cond: ((domain_id)::text = 'Drug'::text)
Indeed, the plan says it's going to be more expensive to use the index than not, but why? If the index lets it handle many fewer records, shouldn't it be quicker to use it?
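For completeness, here is how I would check what the planner thinks the distribution of domain_id looks like, using the standard pg_stats view (assuming statistics are up to date after ANALYZE):

OHDSI=> select most_common_vals, most_common_freqs, n_distinct
        from pg_stats
        where tablename = 'record_counts' and attname = 'domain_id';

I can post that output too if it would help.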