1

I want to compare two arrays, ignoring the order of the elements and duplicates.

I could do a containment check in both directions, but are there better/faster solutions?

select * from my_table where my_table.a1 @> my_table.a2 and
                             my_table.a2 @> my_table.a1
  • 1
    If it is one dimensional array of integers with no nulls inside then you can use [intarray](https://www.postgresql.org/docs/current/static/intarray.html) to sort its contents, remove duplicates and then compare. I don't know if this will be faster tho. Alternative way is to `unnest` and work with set, this could be faster or slower, depending on your data, conditions and indexes. – Łukasz Kamiński Apr 13 '17 at 07:47

2 Answers2

3

From my internal testing, it seems that these two variants are the fastest (even faster than the @> + <@ check), and they both can handle NULLs too:

where (select array_agg(distinct e order by e) from unnest(arr1) e)
    = (select array_agg(distinct e order by e) from unnest(arr2) e)

where array(select distinct e from unnest(arr1) e order by e)
    = array(select distinct e from unnest(arr2) e order by e)

The latter is usually a little faster, but sometimes not. This can depend on a lot of things, f.ex. the cardinality of the arrays, etc.

pozs
  • 34,608
  • 5
  • 57
  • 63
1

using unnest and distinct example:

t=# create or replace function so62(a1 anyarray,a2 anyarray) returns boolean as
$$
declare
 _r boolean := false;
 _a text;
begin
  with p as (select distinct unnest(a1) order by 1) select array_agg(unnest) into _a from p;
  with p as (select distinct unnest(a2) order by 1) select array_agg(unnest)::text = _a into _r from p;
  return _r;
end;
$$ language plpgsql;
CREATE FUNCTION

(function updated following @poz notice that it would require two except for comparison)

test:

t=# with c as (
  select '{1,null,2,2}'::int[] a1,'{2,1,null}'::int[] a2
) 
select a1 @> a2, a2 @> a1,so62(a1,a2) 
from c;
 ?column? | ?column? | so62
----------+----------+------
 f        | f        | t
(1 row)

in this example @> and <@ do not work at all. Also read Compare arrays for equality, ignoring order of elements

Community
  • 1
  • 1
Vao Tsun
  • 47,234
  • 13
  • 100
  • 132
  • You'll also need to do the `except` both ways. You're technically right, **if** the array can contain `NULL` values, this is one of the few possible way to do it, but if it does not contain any `NULL`s, this can also be a [huge performance killer](http://rextester.com/TUBW76742). – pozs Apr 13 '17 at 08:36
  • yes, it is - but I assumed nulls from the question, my logic: if not nulls not mentioned, then there can be nulls – Vao Tsun Apr 13 '17 at 08:39
  • @pozs thanks for your notice! to avoid except twice casting to text – Vao Tsun Apr 13 '17 at 09:14