eliminate duplicate array values in postgres

Question

I have an array of type bigint, how can I remove the duplicate values in that array?

Ex: array[1234, 5343, 6353, 1234, 1234]

I should get array[1234, 5343, 6353, ...]

I tested out the example SELECT uniq(sort('{1,2,3,2,1}'::int[])) in the postgres manual but it is not working.

score 146 · Answer 1 · edited Jun 02 '16 at 12:35

146

I faced the same. But an array in my case is created via array_agg function. And fortunately it allows to aggregate DISTINCT values, like:

  array_agg(DISTINCT value)

This works for me.

edited Jun 02 '16 at 12:35

ρяσѕρєя K

132,198
53
198
213

answered Jun 02 '16 at 12:09

Mikhail Lisakov

1,600
1
9
7

10

Note that DISTINCT is not supported for window functions. – Thinkable Jun 28 '16 at 17:38
tks guy `trim(string_agg(distinct to_char(z.dat_codigo,'0000000000'),'')) as dat_codigo,` – Fábio Zangirolami Jul 21 '17 at 02:44
4

select array_agg(DISTINCT Array[1,2,2,3]) gives "{{1,2,2,3}}" – user48956 Aug 25 '17 at 17:02
2

@user48956, that is logical, as you enter an array as value, you need to set a single column as value that is grouped by in the query – Daniël Tulp Jun 01 '18 at 19:44

score 104 · Answer 2 · edited Oct 08 '19 at 12:03

The sort(int[]) and uniq(int[]) functions are provided by the intarray contrib module.

To enable its use, you must install the module.

If you don't want to use the intarray contrib module, or if you have to remove duplicates from arrays of different type, you have two other ways.

If you have at least PostgreSQL 8.4 you could take advantage of unnest(anyarray) function

SELECT ARRAY(SELECT DISTINCT UNNEST('{1,2,3,2,1}'::int[]) ORDER BY 1);
 ?column? 
----------
 {1,2,3}
(1 row)

Alternatively you could create your own function to do this

CREATE OR REPLACE FUNCTION array_sort_unique (ANYARRAY) RETURNS ANYARRAY
LANGUAGE SQL
AS $body$
  SELECT ARRAY(
    SELECT DISTINCT $1[s.i]
    FROM generate_series(array_lower($1,1), array_upper($1,1)) AS s(i)
    ORDER BY 1
  );
$body$;

Here is a sample invocation:

SELECT array_sort_unique('{1,2,3,2,1}'::int[]);
 array_sort_unique 
-------------------
 {1,2,3}
(1 row)

The solution of the problem ("eliminate duplicate array values") not need to *sort*. Although usually be a useful feature, it is unnecessary (CPU cost) in this context/requirement. — Peter Krauss, Oct 19 '16 at 12:26
Well it is actually needed, uniq only "eliminate neighbor duplicates". Take care! — lzap, Aug 31 '21 at 07:10

Peter Krauss · Answer 3 · 2016-07-22T22:14:51.063

... Where the statandard libraries (?) for this kind of array_X utility??

Try to search... See some but no standard:

postgres.cz/wiki/Array_based_functions: good reference!
JDBurnZ/postgresql-anyarray, good initiative but needs some collaboration to enhance.
wiki.postgresql.org/Snippets, frustrated initiative, but "offcial wiki", needs some collaboration to enhance.
MADlib: good! .... but it is an elephant, not an "pure SQL snippets lib".

Simplest and faster `array_distinct()` snippet-lib function

Here the simplest and perhaps faster implementation for array_unique() or array_distinct():

CREATE FUNCTION array_distinct(anyarray) RETURNS anyarray AS $f$
  SELECT array_agg(DISTINCT x) FROM unnest($1) t(x);
$f$ LANGUAGE SQL IMMUTABLE;

NOTE: it works as expected with any datatype, except with array of arrays,

SELECT  array_distinct( array[3,3,8,2,6,6,2,3,4,1,1,6,2,2,3,99] ), 
        array_distinct( array['3','3','hello','hello','bye'] ), 
        array_distinct( array[array[3,3],array[3,3],array[3,3],array[5,6]] );
 -- "{1,2,3,4,6,8,99}",  "{3,bye,hello}",  "{3,5,6}"

the "side effect" is to explode all arrays in a set of elements.

PS: with JSONB arrays works fine,

SELECT array_distinct( array['[3,3]'::JSONB, '[3,3]'::JSONB, '[5,6]'::JSONB] );
 -- "{"[3, 3]","[5, 6]"}"

Edit: more complex but useful, a "drop nulls" parameter

CREATE FUNCTION array_distinct(
      anyarray, -- input array 
      boolean DEFAULT false -- flag to ignore nulls
) RETURNS anyarray AS $f$
      SELECT array_agg(DISTINCT x) 
      FROM unnest($1) t(x) 
      WHERE CASE WHEN $2 THEN x IS NOT NULL ELSE true END;
$f$ LANGUAGE SQL IMMUTABLE;

can you please explain what is t(x) doing in FROM unnest($1) t(x) ...also how can i keep the order of items in which they inserted — abhirathore2006, Oct 18 '16 at 06:59
@abhirathore2006 this answer is a Wiki, you can write the explanations that you suggested. About "keep the order", no, it is a destructive solution, See PLpgSQL solutions in this page to preserve original array order. It is also commom the two requiriments, *sort* and *distinct* (see sucess of [main answer here](http://stackoverflow.com/a/3995141/287948) and my comment there). — Peter Krauss, Oct 19 '16 at 12:32
no worries, i already found the solution from somewhere else, yes that is plsql solution — abhirathore2006, Oct 19 '16 at 13:05

tbussmann · Answer 4 · 2017-03-07T09:31:45.053

20

Using DISTINCT implicitly sorts the array. If the relative order of the array elements needs to be preserved while removing duplicates, the function can be designed like the following: (should work from 9.4 onwards)

CREATE OR REPLACE FUNCTION array_uniq_stable(anyarray) RETURNS anyarray AS
$body$
SELECT
    array_agg(distinct_value ORDER BY first_index)
FROM 
    (SELECT
        value AS distinct_value, 
        min(index) AS first_index 
    FROM 
        unnest($1) WITH ORDINALITY AS input(value, index)
    GROUP BY
        value
    ) AS unique_input
;
$body$
LANGUAGE 'sql' IMMUTABLE STRICT;

edited Mar 07 '17 at 09:31

answered Feb 22 '17 at 18:17

tbussmann

593
7
11

2

best answer! see also: https://dba.stackexchange.com/questions/211501/how-to-remove-duplicates-from-any-array-and-preserve-ordering-in-postgresql – fjsj Jan 07 '19 at 18:48

Joshua Burns · Answer 5 · 2013-09-16T16:00:35.853

I have assembled a set of stored procedures (functions) to combat PostgreSQL's lack of array handling coined anyarray. These functions are designed to work across any array data-type, not just integers as intarray does: https://www.github.com/JDBurnZ/anyarray

In your case, all you'd really need is anyarray_uniq.sql. Copy & paste the contents of that file into a PostgreSQL query and execute it to add the function. If you need array sorting as well, also add anyarray_sort.sql.

From there, you can peform a simple query as follows:

SELECT ANYARRAY_UNIQ(ARRAY[1234,5343,6353,1234,1234])

Returns something similar to: ARRAY[1234, 6353, 5343]

Or if you require sorting:

SELECT ANYARRAY_SORT(ANYARRAY_UNIQ(ARRAY[1234,5343,6353,1234,1234]))

Return exactly: ARRAY[1234, 5343, 6353]

score 10 · Answer 6 · edited Jun 23 '15 at 08:26

10

Here's the "inline" way:

SELECT 1 AS anycolumn, (
  SELECT array_agg(c1)
  FROM (
    SELECT DISTINCT c1
    FROM (
      SELECT unnest(ARRAY[1234,5343,6353,1234,1234]) AS c1
    ) AS t1
  ) AS t2
) AS the_array;

First we create a set from array, then we select only distinct entries, and then aggregate it back into array.

edited Jun 23 '15 at 08:26

Patrick

29,357
6
62
90

answered May 14 '14 at 09:01

alexkovelsky

3,880
1
27
21

10

Or "more inline" ;-) `SELECT array_agg(DISTINCT c1) FROM unnest(ARRAY[1234,5343,6353,1234,1234]) t(c1)` – Peter Krauss Apr 19 '16 at 18:56

score 6 · Answer 7 · answered Apr 08 '20 at 12:51

6

In a single query i did this:

SELECT (select array_agg(distinct val) from ( select unnest(:array_column) as val ) as u ) FROM :your_table;

answered Apr 08 '20 at 12:51

Gregorio Freidin

101
1
3

score 2 · Answer 8 · answered Jan 11 '13 at 23:41

For people like me who still have to deal with postgres 8.2, this recursive function can eliminate duplicates without altering the sorting of the array

CREATE OR REPLACE FUNCTION my_array_uniq(bigint[])
  RETURNS bigint[] AS
$BODY$
DECLARE
    n integer;
BEGIN

    -- number of elements in the array
    n = replace(split_part(array_dims($1),':',2),']','')::int;

    IF n > 1 THEN
        -- test if the last item belongs to the rest of the array
        IF ($1)[1:n-1] @> ($1)[n:n] THEN
            -- returns the result of the same function on the rest of the array
            return my_array_uniq($1[1:n-1]);
        ELSE
            -- returns the result of the same function on the rest of the array plus the last element               
            return my_array_uniq($1[1:n-1]) || $1[n:n];
        END IF;
    ELSE
        -- if array has only one item, returns the array
        return $1;
    END IF;
END;
$BODY$
  LANGUAGE 'plpgsql' VOLATILE;

for exemple :

select my_array_uniq(array[3,3,8,2,6,6,2,3,4,1,1,6,2,2,3,99]);

will give

{3,8,2,6,4,1,99}

score 0 · Answer 9 · answered Jul 04 '23 at 09:48

0

AFAICT if you just want this inline, you can do:

ARRAY(SELECT value FROM UNNEST(%%someArrayHere%%) WITH ORDINALITY AS entry(value, idx) GROUP BY value ORDER BY min(idx)))

answered Jul 04 '23 at 09:48

Paul

335
1
3
16

eliminate duplicate array values in postgres

9 Answers9

Simplest and faster `array_distinct()` snippet-lib function

Linked

eliminate duplicate array values in postgres

9 Answers9

Simplest and faster array_distinct() snippet-lib function

Linked

Simplest and faster `array_distinct()` snippet-lib function