How to ORDER BY an unaggregated column in subquery?

Question

I'm trying to create a function in Hasura that queries the posts near a user. I got that working but I would also like to order the posts. I get a Postgres error that says:

"postgres-error : column "b.Volume" must appear in the GROUP BY clause or be used in an aggregate function"

I'm new to Hasura and Postgres and I'm not sure if you are allowed to do something like this or not. Here is my code:

CREATE OR REPLACE FUNCTION public.search_posts_near_user(id uuid, distance_kms integer, volume integer)
 RETURNS SETOF user_posts
 LANGUAGE sql
 STABLE
AS $function$
  SELECT  A.id, A.location,
  (SELECT json_agg(row_to_json(B)) FROM "Posts" B
   WHERE (
     ST_Distance(
       ST_Transform(B.location::Geometry, 3857),
       ST_Transform(A.location::Geometry, 3857)
     ) /1000) < distance_kms AND B."Volume" < volume
    ORDER BY B."Volume" Desc
   ) AS nearby_Posts
  FROM users A where A.id = id
$function$

Does this solve your problem? https://stackoverflow.com/questions/19601948/must-appear-in-the-group-by-clause-or-be-used-in-an-aggregate-function — YFl, Jun 20 '21 at 22:39

Erwin Brandstetter · Accepted Answer · 2021-06-21T02:47:46.430

This should work:

CREATE OR REPLACE FUNCTION public.search_posts_near_user(
      _id uuid
    , _distance_kms integer
    , _volume integer)
  RETURNS SETOF user_posts
  LANGUAGE sql STABLE AS
$func$
SELECT  A.id, A.location
     , (SELECT json_agg(B.* ORDER BY B."Volume" DESC)
        FROM   "Posts" B
        WHERE  ST_Distance(ST_Transform(B.location::Geometry, 3857)
                         , ST_Transform(A.location::Geometry, 3857))
               < _distance_kms * 1000
        AND    B."Volume" < _volume
      ) AS nearby_Posts
FROM    users A
WHERE   A.id = _id
$func$;

Make sure that function parameters (and variables) do not collide with table column names. I prefixed with _. Not strictly necessary, but a good habit.

Note how I multiply _distance_kms * 1000 instead of ST_Distance() / 1000 like you had it. That's better for multiple reasons. Multiplication is cheaper to begin with, and it avoids rounding errors. And it is substantially cheaper to manipulate the constant value once than the computed values for every row. Finally, your original expression might prevent index support (if available), being non-"sargable", which would be the strongest reason to fix it.

Basically, this is a KNN (k nearest neighbors) type of problem, that can benefit a lot from the right query technique in combination with the right index, typically a GiST index. See:

Unique assignment of closest points between two tables

Also, while ORDER BY attached to the aggregate expression works conveniently, it is typically faster to sort in a subquery with ORDER BY, and then aggregate. See:

How to apply ORDER BY and LIMIT in combination with an aggregate function?

Aside:
My standing advice is to use legal, lower-case, unquoted names in Postgres exclusively. See:

Are PostgreSQL column names case-sensitive?

How to ORDER BY an unaggregated column in subquery?

1 Answers1