7

I use postgres & have a table like this :

id   | arr
-------------------
 1   | [A,B,C]
 2   | [C,B,A]
 3   | [A,A,B]
 4   | [B,A,B]

I created a GROUP BY 'arr' query.

SELECT COUNT(*) AS total, "arr" FROM "table" GROUP BY "arr"

... and the result :

total | arr
-------------------
 1    | [A,B,C]
 1    | [C,B,A]
 1    | [A,A,B]
 1    | [B,A,B]

BUT, since [A,B,C] and [C,B,A] have the same elements, so i expected the result should be like this :

total | arr
-------------------
  2   |   [A,B,C]
  2   |   [A,A,B]

Did i miss something (in query) or else? Please help me..

Raditya Arya
  • 73
  • 2
  • 3

2 Answers2

7

You do not need to create a separate function to do this. It can all be done in a single statement:

select array(select unnest(arr) order by 1) as sorted_arr, count(*)
from t
group by sorted_arr;

Here is a rextester.

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
3

[A,B,C] and [C,B,A] are different arrays even if they have the same elements they are not in the same position, they will never be grouped by a group by clause, in case you want to make them equivalent you'd need to sort them first.

On this thread you have info abour sorting arrays.

You should do something like:

SELECT COUNT(*) AS total, array_sort("arr") FROM "table" GROUP BY array_sort("arr")

After creating a sort function like the one proposed in there:

CREATE OR REPLACE FUNCTION array_sort (ANYARRAY)
RETURNS ANYARRAY LANGUAGE SQL
AS $$
SELECT ARRAY(SELECT unnest($1) ORDER BY 1)
$$;
L. Amigo
  • 384
  • 1
  • 10
  • Wow its working. But, i still get an error if i JOIN it with other table. Let's say table1 and table2. I tried group it like this : array_sort("table1"."arr"), and got "ERROR: column "table1.arr" must appear in tht GROUP BY clause ......". can you explain why? – Raditya Arya Jun 03 '18 at 07:49
  • 1
    `SELECT count(id),ARRAY(SELECT unnest(arr) ORDER BY 1) from your_table group by 2` this will also produce the same result – Vivek S. Jun 03 '18 at 07:50
  • Every column that is referenced must be grouped, if you are used table1.arr in your join clause you need to group it too, just add it after the first group part: ...GROUP BY array_sort("arr"), table1.arr – L. Amigo Jun 03 '18 at 08:00
  • @w͏̢in̡͢g͘̕ed̨p̢͟a͞n͏͏t̡͜͝he̸r̴ It works as well! But how to use your solution in joined table? it will get "ERROR: subquery uses ungrouped column "table1.arr" from outer query". – Raditya Arya Jun 03 '18 at 08:03
  • Which is the join you are trying to use? – L. Amigo Jun 03 '18 at 08:04
  • @L. Amigo I have tried all of your solutions and work with my joined tables. Thanks a lot, sir! ;) – Raditya Arya Jun 03 '18 at 08:28