I'm querying a table with a few millions rows. When I use a sub select with =
select *
from "parcels"
where "parcel_id" = (select "parcel_id"
from "parcels_properties"
where "property_id" = '178528')
the index will be used and it returns in 2 secs.
This, obviously when the subquery has more than one results, I need to use WHERE IN
. Then it won't use the index and gives me 10+ second.
select *
from "parcels"
where "parcel_id" in (select "parcel_id"
from "parcels_properties"
where "property_id" = '178528')
explain:
"Hash Join (cost=82047.73..357097.98 rows=2019856 width=170)"
" Hash Cond: ((parcels.parcel_id)::text = (parcels_properties.parcel_id)::text)"
" -> Seq Scan on parcels (cost=0.00..241975.11 rows=4039711 width=170)"
" -> Hash (cost=82045.23..82045.23 rows=200 width=38)"
" -> HashAggregate (cost=82043.23..82045.23 rows=200 width=38)"
" Group Key: (parcels_properties.parcel_id)::text"
" -> Gather (cost=1000.00..81970.73 rows=28999 width=38)"
" Workers Planned: 2"
" -> Parallel Seq Scan on parcels_properties (cost=0.00..78070.83 rows=12083 width=38)"
" Filter: ((property_id)::text = '178528'::text)"
On line 3 of this explain, a Seq Scan
on parcels
table is used, instead of the parcel_id index
?
One more thing, this is on the AWS RDS.
If I run this same SQL on my local database with almost same setup (only RDS is v14, local is v12) it uses the index and returns instantly.
Created index on parcels table:
- parcel_id
Created index on parcels_properties table:
- parcel_id
- property_id
So could anyone help me out with this issue?
Thank you.
UPDATE
Doing so will sorta force index scan on Parcels table
SET enable_seqscan = OFF;
SELECT *
FROM parcels p
WHERE EXISTS (
SELECT 1
FROM parcels_properties pp
WHERE pp.parcel_id = p.parcel_id AND property_id = '178528'
);