1

I have data set that I am uploading in to two columns in a Redshift database table:

{A, 1|2|3|4}

Once loaded into a table I need to convert it to a set of rows and insert it into another table:

{A, 1},{A,2},{A,3},{A,4}

I am wondering if there is a specific function in Redshift that I can use or if there is a specific SQL plan anyone has written to perform this type of conversion. Any help is appreciated.

NOTES: The data in the second column can vary in range from 1 data point to 1000 data points.

Stephen McKain
  • 285
  • 2
  • 5
  • 19

2 Answers2

0

EDIT2: Redshift doesn't support Function nor Triggers so the below is only useful if you are using a PostgreSQL database


EDIT: The following function can be called to do the insert and it should do what you need, after adapting it to your schema

CREATE OR REPLACE FUNCTION insert_function(title TEXT, data TEXT)
RETURNS BOOLEAN AS $$
  BEGIN
    INSERT INTO table_one(title, data) VALUES (title, data);
    INSERT INTO table_two(title, data) SELECT title, unnest(string_to_array(data, '|'));
    RETURN TRUE;
  END;
$$ LANGUAGE plpgsql

To do an insert you would do the following:

SELECT insert_function('A', '1|2|3');

You will need to adapt the following Function and Trigger to fit your schema but it should do what you want:

CREATE OR REPLACE FUNCTION insert_function()
RETURNS trigger AS $$
  BEGIN
    INSERT INTO table_two(title, data) SELECT NEW.title, unnest(string_to_array(NEW.data, '|'));
    RETURN NEW;
  END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER insert_trigger AFTER INSERT ON table_one FOR EACH ROW EXECUTE PROCEDURE insert_function();
Winter Faulk
  • 413
  • 4
  • 9
  • 1
    Unfortunately, Amazon Redshift does not support Triggers. See: [Unsupported PostgreSQL Features](http://docs.aws.amazon.com/redshift/latest/dg/c_unsupported-postgresql-features.html) – John Rotenstein Nov 26 '16 at 05:35
  • I could not get unnest() to work in Redshift. It doesn't appear supported. – Stephen McKain Nov 30 '16 at 16:00
0

With the help of some code from @Charles Lee on sequence number generation function in AWS redshift:

First, create a table with the values 0 to 1000. This is required because the generate_series() function cannot operate at the table-level.

with seq_0_9 as (
select 0 as num
union all select 1 as num
union all select 2 as num
union all select 3 as num
union all select 4 as num
union all select 5 as num
union all select 6 as num
union all select 7 as num
union all select 8 as num
union all select 9 as num
)
select a.num + b.num * 10 + c.num * 100 as num
INTO numbers
from seq_0_9 a, seq_0_9 b, seq_0_9 c
order by num

Let's create your input table:

CREATE TABLE one (a text, b text);
INSERT INTO one VALUES ('A', '1|2|3|4');

Then, extract every combination of the first column and the split values in the second column:

CREATE TABLE two AS
SELECT
  a,
  split_part(b, '|', num) AS b
FROM one, numbers
WHERE
  num >0
  AND split_part(b, '|', num) != ''

The result in table two is:

A 1
A 2
A 3
A 4
Community
  • 1
  • 1
John Rotenstein
  • 241,921
  • 22
  • 380
  • 470