0

I'm querying a table with a few millions rows. When I use a sub select with =

select *
from "parcels" 
where "parcel_id" = (select "parcel_id" 
                     from "parcels_properties" 
                     where "property_id" = '178528')

the index will be used and it returns in 2 secs.

This, obviously when the subquery has more than one results, I need to use WHERE IN. Then it won't use the index and gives me 10+ second.

select * 
from "parcels" 
where "parcel_id" in (select "parcel_id" 
                      from "parcels_properties" 
                      where "property_id" = '178528')

explain:

"Hash Join  (cost=82047.73..357097.98 rows=2019856 width=170)"
"  Hash Cond: ((parcels.parcel_id)::text = (parcels_properties.parcel_id)::text)"
"  ->  Seq Scan on parcels  (cost=0.00..241975.11 rows=4039711 width=170)"
"  ->  Hash  (cost=82045.23..82045.23 rows=200 width=38)"
"        ->  HashAggregate  (cost=82043.23..82045.23 rows=200 width=38)"
"              Group Key: (parcels_properties.parcel_id)::text"
"              ->  Gather  (cost=1000.00..81970.73 rows=28999 width=38)"
"                    Workers Planned: 2"
"                    ->  Parallel Seq Scan on parcels_properties  (cost=0.00..78070.83 rows=12083 width=38)"
"                          Filter: ((property_id)::text = '178528'::text)"

On line 3 of this explain, a Seq Scan on parcels table is used, instead of the parcel_id index ?

One more thing, this is on the AWS RDS.

If I run this same SQL on my local database with almost same setup (only RDS is v14, local is v12) it uses the index and returns instantly.

Created index on parcels table:

  • parcel_id

Created index on parcels_properties table:

  • parcel_id
  • property_id

So could anyone help me out with this issue?

Thank you.

UPDATE

Doing so will sorta force index scan on Parcels table

SET enable_seqscan = OFF;

SELECT *
FROM parcels p
WHERE EXISTS (
    SELECT 1
    FROM parcels_properties pp
    WHERE pp.parcel_id = p.parcel_id AND property_id = '178528'
);
Yunwei.W
  • 1,589
  • 1
  • 14
  • 31

2 Answers2

1

You never told us which indices you have defined, but in any case I would express your query using exists logic:

SELECT *
FROM parcels p
WHERE EXISTS (
    SELECT 1
    FROM parcels_properties pp
    WHERE pp.parcel_id = p.parcel_id AND property_id = '12345'
);

This query should benefit from the following index on parcels_properties:

CREATE INDEX idx_pp ON parcel_properties (property_id, parcel_id);
Tim Biegeleisen
  • 502,043
  • 27
  • 286
  • 360
0

For anyone might have this same problem in the future. Try VACUUM your database.

I did a full VACUUM of the database and it worked.

 vacuumdb --echo --full --verbose --analyze -h yourdbhost -p 5432 -U your username -d yourdbname
Yunwei.W
  • 1,589
  • 1
  • 14
  • 31