Return multiple values and populate column in array like manner

Question

I am using Postgres 9.3 on MacOSX.

I am wondering how I can return multiple values (depending on certain criterion) and use them to populate a column in a list/array like manner?

--DUMMY DATA

CREATE TABLE tbl (
   id VARCHAR(2) PRIMARY KEY
  ,name TEXT
  ,year_born NUMERIC
  ,nationality TEXT
);
INSERT INTO tbl(id, name, year_born, nationality)
VALUES ('A1','Bill',2001,'American')
      ,('B1','Anna',1997,'Swedish')
      ,('A2','Bill',1991,'American')
      ,('B2','Anna',2004,'Swedish')
      ,('B3','Anna',1989,'Swedish')
      ,('A3','Bill',1995,'American');
SELECT * FROM tbl;

id | name | year_born | nationality
---+------+-----------+------------
A1 | Bill |   2001    |  American
B1 | Anna |   1997    |  Swedish
A2 | Bill |   1991    |  American
B2 | Anna |   2004    |  Swedish
B3 | Anna |   1989    |  Swedish
A3 | Bill |   1995    |  American

I pool over column name, nationality by using SELECT DISTINCT ON clause as in the below code

CREATE TABLE another_tbl ( name TEXT, nationality TEXT, ids VARCHAR ); 

CREATE FUNCTION f1() RETURNS SETOF another_tbl AS
$$ SELECT DISTINCT ON (name, nationality) name, nationality, id
   FROM tbl
   GROUP BY name, nationality, ID;
$$ LANGUAGE sql

SELECT * FROM f1();

 name | nationality | ids 
------+-------------+-----
 Anna |  Swedish    | B1
 Bill |  American   | A1

So, here is the thing which I do not know how to achieve, but which I reckon is fairly easy. I want column ids to be populated by all the id's corresponding to the names in the name column as seen below.

Desired output:

 SELECT * FROM f1();

 name | nationality | ids 
------+-------------+-----
 Anna |  Swedish    | B1, B2, B3
 Bill |  American   | A1, A2, A3

Update

Found out about ARRAY which I use together with class VARCHAR for column ids in another_tbl. However, I get a mismatch call saying Final statement returns character varying instead ofcharacter varying[]at column 3.

Erwin Brandstetter · Accepted Answer · 2013-11-20T04:22:43.793

8

Use GROUP BY and the aggregate function string_agg() if you want a text column as result.
Or array_agg() to construct an array.
But drop the now redundant DISTINCT ON.

SELECT name, nationality, string_agg(id, ',') AS ids
FROM   tbl
GROUP  BY 1, 2
ORDER  BY 1, 2;

The RETURNS clause of your function definition has to match, like @ozczecho suggested:

CREATE FUNCTION f1()
  RETURNS TABLE(name text, nationality text, ids text) AS
                                              -- varchar[] for array_agg()
$func$
SELECT t.name, t.nationality, string_agg(t.id, ',') AS ids
FROM   tbl t
GROUP  BY 1, 2
ORDER  BY 1, 2;
$func$ LANGUAGE sql;

edited Nov 20 '13 at 04:22

answered Nov 20 '13 at 04:15

Erwin Brandstetter

605,456
145
1,078
1,228

Cool, beats my answer..Thanks for clarifying! – jO. Nov 20 '13 at 04:31
I'm attempting to do the same, but using MS SQL. Is there any equivalent for string_agg() in MS SQL? Or another approach for MS SQL that would work to give similar results as this question asks? – Hamman Samuel Dec 16 '13 at 20:30
1

@HammanSamuel: I asked this [similar question](http://stackoverflow.com/questions/8109475/sql-server-collect-values-in-an-aggregation-temporarily-and-re-use-in-the-same) quite some time ago and got an excellent answer. – Erwin Brandstetter Dec 16 '13 at 20:52

score 2 · Answer 2 · answered Nov 20 '13 at 02:23

2

I believe you should change:

RETURNS SETOF another_tbl

to:

RETURNS TABLE(name TEXT, nationality TEXT, ids VARCHAR[])

answered Nov 20 '13 at 02:23

ozczecho

8,649
8
36
42

Thanks for commenting. Your suggestion gives the same mismatch error message however, since `RETURNS TABLE(name...)` in this case should be equivalent to `another_tbl`. And by default `VARCHAR[]` will set the array to one-dimension. In my case I do not know in advance how many dimensions the array needs to have. I think you can use `ARRAY` for that. – jO. Nov 20 '13 at 02:44
@jO.: Currently, there is only *one* array type for any given base type in Postgres, covering all dimensions. [Details in this related answer.](http://stackoverflow.com/questions/18707455/passing-multidimensional-array-as-parameter-postgresql-function/18709721#18709721) – Erwin Brandstetter Nov 20 '13 at 04:33

jO. · Answer 3 · 2013-11-20T03:52:53.450

Ok, it works using the array_agg function.

CREATE FUNCTION f1() RETURNS SETOF another_tbl AS
$$ SELECT DISTINCT ON (name, nationality) name, nationality, array_agg(id)
   FROM tbl
   GROUP BY name, nationality
$$ LANGUAGE sql

SELECT * FROM f();

name | nationality |    ids     
-----+-------------+-----------
Anna | Swedish     | {B1,B2,B3}
Bill | American    | {A1,A2,A3}

And after adding the function array_to_string to array_agg(id) we get the desired output

name | nationality |   ids    
-----+-------------+---------
Anna | Swedish     | B1,B2,B3
Bill | American    | A1,A2,A3

Return multiple values and populate column in array like manner

3 Answers3