Split comma separated values into target table with fixed number of columns

Question

I have a table with a single column in a Postgres 13.1 database. It consists of many rows with comma-separated values - around 20 elements at most.

I want to split the data into multiple columns. But I have only a limited number of columns say 5 and more than 5 CSV values in a single row, so excess values must be shifted to new/next row). How to do this?

Example:

a1, b1, c1
a2, b2, c2, d2, e2, f2
a3, b3, c3, d3, e3, f3, g3, h3, i3, j3
a4
a5, b5, c5
'
'
'

Columns are only 5, so the output would be like:

c1 c2 c3 c4 c5
---------------
a1 b1 c1
a2 b2 c2 d2 e2 
f2
a3 b3 c3 d3 e3
f3 g3 h3 i3 j3
a4
a5 b5 c5
'
'
'

@ErwinBrandstetter Yes, the order in which they are in. There's no logic part in the order. Thanks — Martin2000, Feb 20 '21 at 12:27
@ErwinBrandstetter Hello. Is there any function or a query that puts the values from the right side into the columns? Like a1 value under c5, b1 under c4, and c1 under c3? — Martin2000, Feb 23 '21 at 09:07

Erwin Brandstetter · Accepted Answer · 2021-02-24T12:18:43.373

It is typically bad design to store CSV values in a single column. If at all possible, use an array or a properly normalized design instead.

While stuck with your current situation ...

For known small maximum number of elements

A simple solution without trickery or recursion will do:

SELECT id, 1 AS rnk
     , split_part(csv, ', ', 1) AS c1
     , split_part(csv, ', ', 2) AS c2
     , split_part(csv, ', ', 3) AS c3
     , split_part(csv, ', ', 4) AS c4
     , split_part(csv, ', ', 5) AS c5
FROM   tbl
WHERE  split_part(csv, ', ', 1) <> '' -- skip empty rows

UNION ALL
SELECT id, 2
     , split_part(csv, ', ', 6)
     , split_part(csv, ', ', 7)
     , split_part(csv, ', ', 8)
     , split_part(csv, ', ', 9)
     , split_part(csv, ', ', 10)
FROM   tbl
WHERE  split_part(csv, ', ', 6) <> '' -- skip empty rows

-- three more blocks to cover a maximum "around 20"

ORDER  BY id, rnk;

db<>fiddle here

id being the PK of the original table.
This assumes ', ' as separator, obviously.
You can adapt easily.

Split comma separated column data into additional columns

For unknown number of elements

Various ways. One way use regexp_replace() to replace every fifth separator before unnesting ...

-- for any number of elements
SELECT t.id, c.rnk
     , split_part(c.csv5, ', ', 1) AS c1
     , split_part(c.csv5, ', ', 2) AS c2
     , split_part(c.csv5, ', ', 3) AS c3
     , split_part(c.csv5, ', ', 4) AS c4
     , split_part(c.csv5, ', ', 5) AS c5
FROM   tbl t
     , unnest(string_to_array(regexp_replace(csv, '((?:.*?,){4}.*?),', '\1;', 'g'), '; ')) WITH ORDINALITY c(csv5, rnk)
ORDER  BY t.id, c.rnk;

db<>fiddle here

This assumes that the chosen separator ; never appears in your strings. (Just like , can never appear.)

The regular expression pattern is the key: '((?:.*?,){4}.*?),'

(?:) ... “non-capturing” set of parentheses
() ... “capturing” set of parentheses
*? ... non-greedy quantifier
{4}? ... sequence of exactly 4 matches

The replacement '\1;' contains the back-reference \1.

'g' as fourth function parameter is required for repeated replacement.

Fill from right to left

(Like you added in How to put values starting from the right side into columns?)
Simply count down numbers like:

SELECT t.id, c.rnk
     , split_part(c.csv5, ', ', 5) AS c1
     , split_part(c.csv5, ', ', 4) AS c2
     , split_part(c.csv5, ', ', 3) AS c3
     , split_part(c.csv5, ', ', 2) AS c4
     , split_part(c.csv5, ', ', 1) AS c5
FROM ...

db<>fiddle here

Thank you so much for the help. Question- what does '<>' this means in WHERE clause? Though I am new to this. Is there any option to do it with hard coding like what if in the future there is a row with more than 20 elements? Please help. — Martin2000, Feb 20 '21 at 12:45
@Martin2000: [`<>` is the "not equal" operator](https://www.postgresql.org/docs/current/functions-comparison.html). If the first column is not equal to the empty string (`''`), then the row isn't noise. See: https://stackoverflow.com/a/23767625/939860 — Erwin Brandstetter, Feb 20 '21 at 12:49
@Martin2000: I added solutions for an arbitrary number of elements, and to fill from right to left. — Erwin Brandstetter, Feb 24 '21 at 12:19

score 3 · Answer 2 · answered Feb 20 '21 at 13:47

CREATE UNLOGGED TABLE foo( x TEXT );
\copy foo FROM stdin
a1, b1, c1
a2, b2, c2, d2, e2, f2
a3, b3, c3, d3, e3, f3, g3, h3, i3, j3
a4
a5, b5, c5
\.

From lines to single column...

SELECT (ROW_NUMBER() OVER () - 1)/5 AS r, u FROM (SELECT unnest(string_to_array(x,', ')) u from foo) y;
 r | u
---+----
 0 | a1
 0 | b1
 0 | c1
 0 | a2
 0 | b2
 1 | c2
 1 | d2
...etc

...and back to lines of known length.

SELECT r,array_agg(u) a FROM (
 SELECT (ROW_NUMBER() OVER () - 1)/5 AS r, u FROM (
  SELECT unnest(string_to_array(x,', ')) u from foo) y) y1 
GROUP BY r ORDER BY r;
 r |    a
---+------------------
 0 | {a1,b1,c1,a2,b2}
 1 | {c2,d2,e2,f2,a3}
 2 | {b3,c3,d3,e3,f3}
 3 | {g3,h3,i3,j3,a4}
 4 | {a5,b5,c5}

After this you can insert it into a table using a[] for each column. What to do with the last line is left as an exercise to the reader...

score 1 · Answer 3 · answered Feb 23 '21 at 14:12

Answer to related question: How to put values starting from the right side into columns?

The accepted great answer from @ErwinBrandstetter can be easily adapted to required right-to-left output.

You just need to change to order of the split parts. So you don't return split parts 1-5 and 6-10 but 5-1 and 10-6:

demo:db<>fiddle

SELECT id, 1 AS rnk
     , split_part(csv, ', ', 5) AS c1
     , split_part(csv, ', ', 4) AS c2
     , split_part(csv, ', ', 3) AS c3
     , split_part(csv, ', ', 2) AS c4
     , split_part(csv, ', ', 1) AS c5
FROM   tbl
WHERE  split_part(csv, ', ', 1) <> '' -- skip empty rows

UNION ALL
SELECT id, 2
     , split_part(csv, ', ', 10)
     , split_part(csv, ', ', 9)
     , split_part(csv, ', ', 8)
     , split_part(csv, ', ', 7)
     , split_part(csv, ', ', 6)
FROM   tbl
WHERE  split_part(csv, ', ', 6) <> '' -- skip empty rows

-- more?

ORDER  BY id, rnk;

What do you mean with "strong logic"? Sry, the question is not quite clear to me. — S-Man, Feb 23 '21 at 14:58

score 0 · Answer 4 · answered Feb 20 '21 at 12:21

You need to do this in the whatever backend layer you are using.

First, convert the CSV rows to array of string

Then, use logic something like this to add Values to the database

int row = 0; // database row index - can be used to just have a count

final int MAX_COLUMNS = 5;

for(int i = 0; i<rows.length; i++) {
    // Convert csv row string to array of each value.
    String [] values = rows[i].split(",");

    // Dividing whole row into chunks of size of number of columns

    for(int j = 0; j < (values.length/(MAX_COLUMNS)) + 1; j++) {
       Add Values [MAX_COLUMNS*j,MAX_COLUMNS*j+(MAX_COLUMNS - 1)] to the row [row + j]
      row++;
    }

}

Could you please tell me in Postgres? I am new to this and don't know the syntax you are using or how to convert your syntax into Postgres. — Martin2000, Feb 20 '21 at 16:15

Split comma separated values into target table with fixed number of columns

4 Answers4

For known small maximum number of elements

For unknown number of elements

Fill from right to left

Linked

Related