3

I have a setup where I would like to support a user defined subquery inside of a spatial query to identify overlapping polygons. Below is an example:

SELECT i1.id, i2.id
FROM
    (SELECT * FROM Images WHERE id > 600) i1,
    (SELECT * FROM Images WHERE id > 600) i2
WHERE ST_INTERSECTS(i1.footprint_latlon, i2.footprint_latlon) = TRUE
AND i1.id > i2.id

The above is working, but I more canonical way to do this exists.

The inner query is completely arbitrary here and the important component is that I am defining i1 and i2 using an identical query. I am doing this so that I have i1 and i2 aliases for the outer, spatial query.

Is it necessary to execute the inner query twice or does a better way to create the i1, and i2 aliases exist? Lots of example on executing subqueries with a single (mandatory alias), but I am not seeing any examples of 'multi-aliases'.

Durgpal Singh
  • 11,481
  • 4
  • 37
  • 49
Jzl5325
  • 3,898
  • 8
  • 42
  • 62
  • It is merely your ability to get experts to answer your question that causes you to raise these questions. There are thousands of examples, only if you spend time you can learn. – Jay Kumar R Mar 28 '18 at 05:04
  • 1
    Even with `WITH i AS (foo) SELECT bah FROM i AS i1, i AS i2 WHERE...` the sub query is effectively resolved twice. To avoid that YOU need to preserve the result, such as with a temp table. But doing so may not be a good idea; you need to write the results somewhere and lose the indexes, etc. For that reason caching the results may be more expensive than resolving it twice. Also, note that sql is a declarative language and the rdbms is free to rearrange the execution, so as to make use of indexes, etc. In short, don't prematurely optimise. Be concerned with maintainable code. – MatBailie Mar 28 '18 at 05:28
  • @MatBailie I very much appreciate the point about premature optimization and the nature of SQL withr espect to how the query is performed. Interesting to know that indices are stripped, but that makes total sense once you highlighted that issue. – Jzl5325 Mar 28 '18 at 13:27
  • 1
    @Jzl5325 No, the indices are not *stripped* (what is that?) The CTE is executed *exactly once*, and every time the CTE is referenced, its (materialized) result set is used. And: this resultset has no (sub)structure, at most it will have a defined order. [see my answer on how to *avoid* the CTE] – wildplasser Mar 28 '18 at 13:53
  • @wildplasser Thanks! Makes sense. I am reading about CTE right now and trying out each of the answers to make sure I understand how they are working. By stripped, I mean removed and not used, but it sounds like that is not the case. [This question](https://stackoverflow.com/questions/4740748/when-to-use-common-table-expression-cte) looks to be pretty good for filling in my knowledge gap on a CTE. – Jzl5325 Mar 28 '18 at 14:00
  • @wildplasser - My comment noted that if the OP persists the intermediate results ***such as with a temp table*** then he would ***lose the indexes***. – MatBailie Mar 30 '18 at 13:55

4 Answers4

4

I don't think there is a clean easy way to "alias an alias". You could do something like this:

WITH 
   i1 AS (SELECT * FROM Images WHERE id > 600),
   i2 AS (SELECT * FROM i1)
SELECT i1.id, i2.id
FROM
    i1, i2
WHERE ST_INTERSECTS(i1.footprint_latlon, i2.footprint_latlon) = TRUE
AND i1.id > i2.id

EDIT and much nicer, as suggested by @MatBailie :

WITH 
   i AS (SELECT * FROM Images WHERE id > 600)
SELECT i1.id, i2.id
FROM
    i AS i1, i AS i2
WHERE ST_INTERSECTS(i1.footprint_latlon, i2.footprint_latlon) = TRUE
AND i1.id > i2.id
Turophile
  • 3,367
  • 1
  • 13
  • 21
2

My inclination would be one of the following:

select i1.id, i2.id
from images i1 join
     images i2
     on st_intersects(i1.footprint_latlon, i2.footprint_latlon)
where i1.id > 600 and i2.id > i1.id;

Or:

with ix as (
      select i.*
      from images i
      where i.id > 600
     )
select i1.id, i2.id
from ix AS i1 join
     ix AS i2
     on st_intersects(i1.footprint_latlon, i2.footprint_latlon)
where i2.id > i1.id;

I don't think there is a shortcut on performing the self join.

wildplasser
  • 43,142
  • 8
  • 66
  • 109
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
1

If you don't want to repeat the (complicated) subquery, you could either use a CTE (see Gordon Linoff's answer) or use a (temp) view containing the subquery [a CTE behaves differently for the optimiser/planner, for larger resultsets this could cause subobtimal performance] :


CREATE TEMP VIEW vv AS
     -- Put your complex subquery here ...
SELECT * FROM Images WHERE id > 600
        ;

SELECT i1.id, i2.id
FROM vv i1
JOIN vv i2 ON ST_INTERSECTS(i1.footprint_latlon, i2.footprint_latlon)
AND i1.id > i2.id
        ;
wildplasser
  • 43,142
  • 8
  • 66
  • 109
0
SELECT i1.id, i2.id
FROM Images i1,Images i2, 
WHERE i2.id > 600
  AND i1.id > i2.id
  AND ST_INTERSECTS(i1.footprint_latlon, i2.footprint_latlon) = TRUE
Jay Kumar R
  • 537
  • 2
  • 7
  • 3
    This totally misses the point of the question. `The inner query is completely arbitrary here and the important component is that I am defining i1 and i2 using an identical query.` – MatBailie Mar 28 '18 at 05:22