1

I have a SQL subquery that always returns a single array. I want my outer query to filter where a value is in that array. Nothing I expect to work is working for this.

Setup for a minimal test:

CREATE TABLE test (num int);
INSERT INTO test (num) values (1);
INSERT INTO test (num) values (2);

My attempted query, which should return two rows, "1" and "2":

SELECT * FROM test WHERE num = ANY(SELECT array_agg(num) from test);

Of course... ERROR: operator does not exist: integer = integer[]. The subquery returns one row containing an array, and I need to just get the array.

Also tried this without success:

SELECT * FROM test WHERE num = ANY((SELECT array_agg(num) from test)[0]);

I get ERROR: op ANY/ALL (array) requires array on right side. THAT doesn't make sense to me, but I'm no expert.

What am I supposed to do here?

sudo
  • 5,604
  • 5
  • 40
  • 78

2 Answers2

4

The first error is due to the fact that PostgreSQL sees

num = ANY(SELECT array_agg(num) from test)

as a form of

expression operator ANY (subquery)

instead of the one you want:

expression operator ANY (array expression)

As such, it tries to compare num to any ROW returned by the subquery (which in your example returns a single row ARRAY[1,2]), instead of every element of the array (hence the operator does not exist: integer = integer[] error you're getting). See the documentation for more details.


The second errors simply comes from the fact that [0] accesses an element of an integer[], and as such returns an integer. Since the right operand has to be an array (or a subquery, but PostgreSQL can't see one here), PostgreSQL returns ERROR: op ANY/ALL (array) requires array on right side.
(As an aside, arrays are 1-based in PostgreSQL, not 0-based).


If you are sure that your function will always return a single array, then you can simply force Postgre to see SELECT array_agg(num) from test as an integer[]:

SELECT * FROM test 
WHERE num = ANY((SELECT array_agg(num) from test)::integer[]);
┌─────┐
│ num │
├─────┤
│   1 │
│   2 │
└─────┘
(2 rows)

Note however that if your function returns multiple arrays, you will get an error (since a setof integer[] cannot be seen as an integer[]):

SELECT * FROM test
WHERE num = ANY((SELECT array_agg(num) from test GROUP BY num)::integer[]);
ERROR:  21000: more than one row returned by a subquery used as an expression
LOCATION:  ExecSetParamPlan, nodeSubplan.c:970

Another solution is to use the unnest function to transform your array to a set of integer, using the expression operator ANY (subquery) form. This will work even if your function returns multiple array, though it is a bit slower than the previous query.

SELECT * 
FROM test 
WHERE num = ANY(
  SELECT unnest(sub.array_agg)
  FROM (
    SELECT array_agg(num) FROM test GROUP BY num -- GROUP BY num to show off the multiple array scenario
  ) AS sub
);
┌─────┐
│ num │
├─────┤
│   1 │
│   2 │
└─────┘
(2 rows)
Marth
  • 23,920
  • 3
  • 60
  • 72
  • 1
    Also, a simple `where num = any(array(select num from test))` would work too. – Daniel Vérité Sep 04 '15 at 23:38
  • Thanks, this is an excellent answer! I did it with a type cast because I'm sure it's only returning one array, and it's working now. – sudo Sep 05 '15 at 00:19
1

@Marth already explained details and some applications for the ANY construct. Closely related:

There is simpler variant: use the subquery as derived table in the FROM clause and join to it:

SELECT t.*
FROM   test t
JOIN  (SELECT array_agg(num) arr FROM test) a ON t.num = ANY(a.arr);

This even works if the subselect should return more than one row (with an array each). You get every row from test that matches any element of the array, but you get each row only once. For an array '{1,1,2}', you get the same two rows as for '{1,2}'.

Or unnest() and join to unnested elements without ANY. Conveniently use the same column name and join with the USING clause to merge the joined column. Then you can just SELECT * to get rows of just test:

SELECT *
FROM   test t
JOIN   unnest((SELECT array_agg(num) arr FROM test)) num USING (num);

Here you get one row for every element of the array, so you get three result rows for '{1,1,2}'.

SQL Fiddle.

(SELECT array_agg(num) arr FROM test) is just a dummy subquery for the proof of concept. But it should be mentioned at least once: if at all possible, don't aggregate in the first place, then you don't have to unnest later. Like:

SELECT *
FROM   test t
JOIN   (SELECT num FROM test) t1 USING (num);
Community
  • 1
  • 1
Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228