how to convert a dataset list to a set of rows in redshift

Question

I have data set that I am uploading in to two columns in a Redshift database table:

{A, 1|2|3|4}

Once loaded into a table I need to convert it to a set of rows and insert it into another table:

{A, 1},{A,2},{A,3},{A,4}

I am wondering if there is a specific function in Redshift that I can use or if there is a specific SQL plan anyone has written to perform this type of conversion. Any help is appreciated.

NOTES: The data in the second column can vary in range from 1 data point to 1000 data points.

You write *"uploading in to two columns in a Postgresql database table"*, yet I see the Redshift tag. What is it Postgres or Redshift? — Erwin Brandstetter, Nov 26 '16 at 03:18
Nope, Redshift is a fork based on the ancient PostgreSQL version 8.0. It's ok to tag the question [postgres] additionally, but be clear about your actual RDBMS and version. — Erwin Brandstetter, Nov 26 '16 at 03:43

Winter Faulk · Answer 1 · 2016-11-26T07:56:01.857

EDIT2: Redshift doesn't support Function nor Triggers so the below is only useful if you are using a PostgreSQL database

EDIT: The following function can be called to do the insert and it should do what you need, after adapting it to your schema

CREATE OR REPLACE FUNCTION insert_function(title TEXT, data TEXT)
RETURNS BOOLEAN AS $$
  BEGIN
    INSERT INTO table_one(title, data) VALUES (title, data);
    INSERT INTO table_two(title, data) SELECT title, unnest(string_to_array(data, '|'));
    RETURN TRUE;
  END;
$$ LANGUAGE plpgsql

To do an insert you would do the following:

SELECT insert_function('A', '1|2|3');

You will need to adapt the following Function and Trigger to fit your schema but it should do what you want:

CREATE OR REPLACE FUNCTION insert_function()
RETURNS trigger AS $$
  BEGIN
    INSERT INTO table_two(title, data) SELECT NEW.title, unnest(string_to_array(NEW.data, '|'));
    RETURN NEW;
  END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER insert_trigger AFTER INSERT ON table_one FOR EACH ROW EXECUTE PROCEDURE insert_function();

Unfortunately, Amazon Redshift does not support Triggers. See: [Unsupported PostgreSQL Features](http://docs.aws.amazon.com/redshift/latest/dg/c_unsupported-postgresql-features.html) — John Rotenstein, Nov 26 '16 at 05:35
I could not get unnest() to work in Redshift. It doesn't appear supported. — Stephen McKain, Nov 30 '16 at 16:00

score 0 · Answer 2 · edited May 23 '17 at 12:13

With the help of some code from @Charles Lee on sequence number generation function in AWS redshift:

First, create a table with the values 0 to 1000. This is required because the generate_series() function cannot operate at the table-level.

with seq_0_9 as (
select 0 as num
union all select 1 as num
union all select 2 as num
union all select 3 as num
union all select 4 as num
union all select 5 as num
union all select 6 as num
union all select 7 as num
union all select 8 as num
union all select 9 as num
)
select a.num + b.num * 10 + c.num * 100 as num
INTO numbers
from seq_0_9 a, seq_0_9 b, seq_0_9 c
order by num

Let's create your input table:

CREATE TABLE one (a text, b text);
INSERT INTO one VALUES ('A', '1|2|3|4');

Then, extract every combination of the first column and the split values in the second column:

CREATE TABLE two AS
SELECT
  a,
  split_part(b, '|', num) AS b
FROM one, numbers
WHERE
  num >0
  AND split_part(b, '|', num) != ''

The result in table two is:

A 1
A 2
A 3
A 4

how to convert a dataset list to a set of rows in redshift

2 Answers2