2

I have multiple tables with each two rows of interest: connection_node_start_id and connection_node_end_id. My goal is to get a collection of all those IDs, either as a flat ARRAY or as a new TABLE consisting of one row.

Example output ARRAY:

result = {1,4,7,9,2,5}

Example output TABLE:

IDS
-------
1
4
7
9
2 
5

My fist attempt is somewhat clumsy and does not work properly as the SELECT statement just returns one row. It seems there must be a simple way to do this, can someone point me into the right direction?

CREATE OR REPLACE FUNCTION get_connection_nodes(anyarray)
  RETURNS anyarray AS
$$
DECLARE
  table_name varchar;
  result integer[];
  sel integer[];
BEGIN
  FOREACH table_name IN ARRAY $1
  LOOP
     RAISE NOTICE 'table_name(%)',table_name;
     EXECUTE 'SELECT ARRAY[connection_node_end_id, 
                           connection_node_start_id] FROM ' || table_name INTO sel;
    RAISE NOTICE 'sel(%)',sel;
    result  := array_cat(result, sel);  
  END LOOP;
  RETURN result;            
END
$$
  LANGUAGE 'plpgsql';

Test table:

connection_node_start_id | connection_node_end_id
--------------------------------------------------
 1                       | 4 
 7                       | 9 

Call:

SELECT get_connection_nodes(ARRAY['test_table']);

Result:

{1,4}  -- only 1st row, rest is missing
Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
LarsVegas
  • 6,522
  • 10
  • 43
  • 67

3 Answers3

2
CREATE OR REPLACE FUNCTION get_connection_nodes(text[])
  RETURNS TABLE (ids int)
  LANGUAGE plpgsql AS
$func$
DECLARE
   _tbl text;
BEGIN
   FOREACH _tbl IN ARRAY $1
   LOOP
      RETURN QUERY EXECUTE format('
         SELECT t.id
         FROM   %I, LATERAL (VALUES (connection_node_start_id)
                                  , (connection_node_end_id)) t(id)'
       , _tbl);
   END LOOP;
END
$func$;

Related answer on dba.SE:

Or drop the loop and concatenate a single query. Probably fastest:

CREATE OR REPLACE FUNCTION get_connection_nodes2(text[])
  RETURNS TABLE (ids int)
  LANGUAGE plpgsql AS
$func$
BEGIN
RETURN QUERY EXECUTE (    
   SELECT string_agg(format(
             'SELECT t.id FROM %I, LATERAL (VALUES (connection_node_start_id)
                                                 , (connection_node_end_id)) t(id)'
           , tbl), ' UNION ALL ')
   FROM   unnest($1) tbl
   );
END
$func$;

Related:

LATERAL was introduced with Postgres 9.3.

For very old Postgres version

You can use the set-returning function unnest() in the SELECT list, too:

CREATE OR REPLACE FUNCTION get_connection_nodes2(text[])
  RETURNS TABLE (ids int)
  LANGUAGE plpgsql AS
$func$
BEGIN
   RETURN QUERY EXECUTE (
   SELECT string_agg(
            'SELECT unnest(ARRAY[connection_node_start_id
                               , connection_node_end_id]) FROM ' || tbl
          , ' UNION ALL '
          )
   FROM (SELECT quote_ident(tbl) AS tbl FROM unnest($1) tbl) t
   );
END
$func$;

Should work with pg 8.4+ (or maybe even older). Works with current Postgres (9.4) as well, but LATERAL is much cleaner.

Or make it very simple:

CREATE OR REPLACE FUNCTION get_connection_nodes3(text[])
  RETURNS TABLE (ids int)
  LANGUAGE plpgsql AS
$func$
BEGIN
   RETURN QUERY EXECUTE (
   SELECT string_agg(format(
             'SELECT connection_node_start_id FROM %1$I
              UNION ALL
              SELECT connection_node_end_id FROM %1$I'
           , tbl), ' UNION ALL ')
   FROM   unnest($1) tbl
   );
END
$func$;

format() was introduced with pg 9.1.

Might be a bit slower with big tables because each table is scanned once for every column (so 2 times here). Sort order in the result is different, too - but that does not seem to matter for you.

Be sure to sanitize escape identifiers to defend against SQL injection and other illegal syntax. Details:

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
  • this looks very interesting. I have to read up on the details, though. Thanks for sharing – LarsVegas Aug 11 '15 at 11:17
  • your code is not working out of the box. Without having further investigated, i gives me this error: ``ERROR: syntax error at or near "(" LINE 3: FROM x, LATERAL (VALUES (x.connection_...`` – LarsVegas Aug 11 '15 at 11:24
  • Sorry, was interrupted. A semicolon was missing. Fixed. Does the sort order matter in any way? Do you prefer a set or an array. I would go with a set like in my answer. – Erwin Brandstetter Aug 11 '15 at 12:24
  • No worries. The semicolon I already had found. I still get the same error. – LarsVegas Aug 11 '15 at 12:29
  • hmm. could it be a version problem? Lateral seems to be quite new...yeah, since version 9.3. I think the version we use in this project is 9.1 – LarsVegas Aug 11 '15 at 12:32
  • @LarsVegas: Your Postgres version is one of the items that should be in *every* question. Even more so if it's not the current version. In older versions you can substitute with `unnest(ARRAY[connection_node_start_id, connection_node_end_id])` like outlined in the linked answer. Adding another alternative ... – Erwin Brandstetter Aug 11 '15 at 12:39
1

The EXECUTE ... INTO statement can only return data from a single row:

If multiple rows are returned, only the first will be assigned to the INTO variable.

In order to concatenate values from all rows you have to aggregate them first by column and then append the arrays:

EXECUTE 'SELECT array_agg(connection_node_end_id) || 
                array_agg(connection_node_start_id) FROM ' || table_name INTO sel;
Patrick
  • 29,357
  • 6
  • 62
  • 90
  • which data type has ``sel`` to be then? When I leave the declaration like this ``sel integer[];`` I'll get this error ``ERROR: could not find array type for data type integer[]`` – LarsVegas Aug 11 '15 at 10:50
  • Ah right, double array. Instead, first aggregate, then append the two resulting arrays. The order of the id's will be different though. See updated answer. – Patrick Aug 11 '15 at 10:57
  • Order does not matter, so that's no problem. Thanks, things are looking brighter now ;-) – LarsVegas Aug 11 '15 at 11:00
1

You're probably looking for something like this:

  CREATE OR REPLACE FUNCTION d (tblname TEXT [])
  RETURNS TABLE (c INTEGER) AS $$

  DECLARE sql TEXT;

  BEGIN
    WITH x
    AS (SELECT unnest(tblname) AS tbl),

      y AS (
      SELECT FORMAT('
      SELECT connection_node_end_id
      FROM %s

        UNION ALL

      SELECT connection_node_start_id
      FROM %s
      ', tbl, tbl) AS s
      FROM x)

    SELECT string_agg(s, ' UNION ALL ')
      INTO sql
    FROM y;

    RETURN QUERY EXECUTE sql;
  END;$$

LANGUAGE plpgsql;

CREATE TABLE a (connection_node_end_id INTEGER, connection_node_start_id INTEGER);
INSERT INTO A VALUES (1,2);

CREATE TABLE b (connection_node_end_id INTEGER, connection_node_start_id INTEGER);
INSERT INTO B VALUES (100, 101);

SELECT * from d(array['a','b']);
  c
-----
   1
   2
 100
 101
(4 rows)
Robins Tharakan
  • 2,209
  • 19
  • 17