I would like to split a column that represents a CSV line in PostgreSQL. Fields in this text line are delimited by a pipe; sometimes they are enclosed in quotes and sometimes not. In addition, fields can contain escaped characters.

field1|"field2"|field3|"22 \" lcd \| screen "

Is there a regex to split this column (e.g. using regexp_split_to_array(...))?

Clodoaldo Neto
Roberto G.

1 Answer

This does not use a regexp, but it works:

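-- Split each line of a CSV text into a text[] using Python's csv module.
-- Note: plpythonu needs the PL/Python extension; recent PostgreSQL releases ship only plpython3u.
create or replace function split_csv(
  line text,
  delim_char char(1) = ',',
  quote_char char(1) = '"')
returns setof text[] immutable language plpythonu as $$
  import csv
  # One result row per input line; a backslash escapes the quote and delimiter characters.
  return csv.reader(line.splitlines(), quotechar=quote_char, delimiter=delim_char, skipinitialspace=True, escapechar='\\')
$$;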

select *, x[4] from split_csv('field1|"field2"|field3|"22 \" lcd \| screen "'||E'\n'||'a|b', delim_char := '|') as x;
╔══════════════════════════════════════════════╤════════════════════╗
║                      x                       │         x          ║
╠══════════════════════════════════════════════╪════════════════════╣
║ {field1,field2,field3,"22 \" lcd | screen "} │ 22 " lcd | screen  ║
║ {a,b}                                        │ ░░░░               ║
╚══════════════════════════════════════════════╧════════════════════╝
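The parsing itself is done by Python's standard `csv` module, so the behaviour can be checked outside PostgreSQL. A minimal sketch, assuming Python 3; the helper name `split_csv_lines` is made up for illustration and simply mirrors the reader options used in the function above:

```python
import csv

def split_csv_lines(text, delim_char=",", quote_char='"'):
    """Parse each line of `text` into a list of fields (hypothetical helper)."""
    reader = csv.reader(
        text.splitlines(),
        delimiter=delim_char,
        quotechar=quote_char,
        skipinitialspace=True,
        escapechar="\\",  # a backslash escapes the quote and the delimiter
    )
    return list(reader)

# Raw string so the backslashes reach the parser unchanged.
sample = r'field1|"field2"|field3|"22 \" lcd \| screen "' + "\n" + "a|b"

for row in split_csv_lines(sample, delim_char="|"):
    print(row)
```

The fourth field of the first row comes back with the escaped quote and pipe unescaped, which matches the `x[4]` column in the query result above.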
Peter Krauss
Abelisto
  • Hi, what kind of additional setup does this solution require? Should I only add the Python extension for PostgreSQL, or do I also need to install a specific csv module? – Roberto G. Feb 04 '17 at 11:19
  • @RobertoG. Honestly I am not sure. It works "out of the box" here (Linux+PostgreSQL+Python). Just try. – Abelisto Feb 04 '17 at 14:22
  • @RobertoG. BTW I was sleepy, sorry. Function simplified, more complex example added. Good luck. – Abelisto Feb 04 '17 at 14:47
  • For direct use (reading the CSV from the filesystem), replace `line.splitlines()` with `open(line, 'rb')` and pass a filename as input, e.g. `/tmp/test.csv`. In general the pg server can read only from `/tmp`. – Peter Krauss Mar 21 '17 at 14:34
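
On the file-based variant from the last comment: `open(line, 'rb')` fits Python 2, which is what `plpythonu` has traditionally meant. Under Python 3 the `csv` module expects a text-mode file, opened with `newline=''`. A sketch under that assumption, run outside PostgreSQL; the helper name `read_csv_file` and the temporary demo file are hypothetical:

```python
import csv
import os
import tempfile

def read_csv_file(path, delim_char=",", quote_char='"'):
    """Read a CSV file into a list of rows (hypothetical helper, Python 3)."""
    # Text mode with newline='' is what the csv module documents for Python 3.
    with open(path, newline="") as f:
        reader = csv.reader(
            f,
            delimiter=delim_char,
            quotechar=quote_char,
            skipinitialspace=True,
            escapechar="\\",
        )
        return list(reader)

# Demo: write a small pipe-delimited file, parse it, then clean up.
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False, newline="") as tmp:
    tmp.write('field1|"field2"|field3\n')
    demo_path = tmp.name
try:
    print(read_csv_file(demo_path, delim_char="|"))
finally:
    os.remove(demo_path)
```

Inside a PL/Python function the same file access would also depend on the permissions of the operating-system user that runs the server process.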