5

Using PostgreSQL-9.1 and PostGIS 2.0.1, when doing a SELECT query containing a sub-query that returns multiple columns, I am getting the error subquery must return only one column.

How can the query/subquery be modified to return multiple columns?

Query

SELECT l.id, l.lat, l.lng, l.geom,
        (SELECT g.id, g.lat, g.lng, ST_Distance(l.geom, g.geom)
        FROM stage.dogs as g
        LIMIT 1)

FROM stage.users As l

Full Query

SELECT l.id, l.lat, l.lng, l.geom,
    g.id, g.lat, g.lng, ST_Distance(l.geom, g.geom) 
FROM stage.users As l
CROSS JOIN (SELECT *
    FROM stage.dogs as g
    ORDER BY g.geom <-> l.geom 
    LIMIT 1) as g

Error

ERROR: invalid reference to FROM-clause entry for table "l"
SQL state: 42P01
Hint: There is an entry for table "l", but it cannot be referenced from this part of the query.
Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
Nyxynyx
  • 61,411
  • 155
  • 482
  • 830

2 Answers2

2
SELECT l.id, l.lat, l.lng, l.geom,
       g.id, g.lat, g.lng, ST_Distance(l.geom, g.geom)
FROM stage.users As l
CROSS JOIN (SELECT * FROM stage.dogs LIMIT 1) as g

This is literally what you had (assuming stage.dogs) is not empty. Not sure if there should be a correlation between users and dogs though.


To find the closest dog to a user, you can use this query. The scalar subquery finds the dog's ID, and is joined back to the table to retrieve the other columns.

SELECT l.id, l.lat, l.lng, l.geom,
       g.id, g.lat, g.lng, ST_Distance(l.geom, g.geom)
FROM (
    SELECT l1.*, (SELECT g1.id
                  FROM stage.dogs as g
                  ORDER BY g.geom <-> l.geom 
                  LIMIT 1) g_id
    FROM stage.users As l1
) l
JOIN stage.dogs as g ON g.id = l.g_id;

Fair warning that this will NOT be a fast query.


At the risk of performing even slower, see query below for multiple tables

SELECT l.id, l.lat, l.lng, l.geom,
       g.id, g.lat, g.lng, ST_Distance(l.geom, g.geom) dog_distance,
       c.id, c.lat, c.lng, ST_Distance(l.geom, c.geom) cat_distance,
       b.id, b.lat, b.lng, ST_Distance(l.geom, b.geom) bird_distance
FROM (
    SELECT l1.*, (SELECT g1.id
                  FROM stage.dogs as g1
                  ORDER BY g1.geom <-> l.geom 
                  LIMIT 1) dog_id,
                 (SELECT c1.id
                  FROM stage.cats as c1
                  ORDER BY c1.geom <-> l.geom 
                  LIMIT 1) cat_id,
                 (SELECT b1.id
                  FROM stage.cats as b1
                  ORDER BY b1.geom <-> l.geom 
                  LIMIT 1) bird_id
    FROM stage.users As l1
) l
LEFT JOIN stage.dogs as g ON g.id = l.dog_id
LEFT JOIN stage.dogs as c ON c.id = l.cat_id
LEFT JOIN stage.dogs as b ON b.id = l.bird_id;
RichardTheKiwi
  • 105,798
  • 26
  • 196
  • 262
  • I tried expanding the query (updated question) and now theres the error `ERROR: invalid reference to FROM-clause entry for table "l"` `Hint: There is an entry for table "l", but it cannot be referenced from this part of the query.` – Nyxynyx May 09 '13 at 06:39
  • For the second query, it took 200ms to return 500 rows. However like Erwin's answer, this query returned the `g.geom` that was furthest away from each `l.geom`. Then again maybe I am using the wrong CRS? Both`geom` cols are currently `POINT,2163` – Nyxynyx May 10 '13 at 06:04
  • Clutching straws here, try `ORDER BY ST_Distance(l.geom, g.geom)` in the subquery? – RichardTheKiwi May 10 '13 at 08:42
  • I discovered there was an error with the coordinate system used. How do you extend the query to also select the closest geometry from more tables, eg: `cats`, `birds` etc – Nyxynyx May 11 '13 at 05:15
2

This gives you one row per user with the closest dog:

SELECT DISTINCT ON (l.id)
       l.id, l.lat, l.lng, l.geom
      ,g.id, g.lat, g.lng, ST_Distance(l.geom, g.geom) 
FROM   stage.users     l
CROSS  JOIN stage.dogs g
ORDER  BY l.id, (l.geom <-> g.geom)

More information on the technique with DISTINCT ON in this related answer:

I guess if you have GiST index on g.geom, the planner might be smart enough to just pick the closes item from it. Not sure, didn't test. Otherwise, this kind of CROSS JOIN would result in O(N²) and performance may get out of hand quickly with a bigger table.

I quote the Postgis manual here:

Index only kicks in if one of the geometries is a constant (not in a subquery/cte). e.g. 'SRID=3005;POINT(1011102 450541)'::geometry instead of a.geom

So you may be out of luck here.

According to the manual you may need to order by ST_Distance() to get precise sort order, but you shouldn't be getting the one furthest away. That makes no sense.

Community
  • 1
  • 1
Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228