0

I have a large body of text and other data that I need to import into Postgres. This text contains all the possible single-byte characters. This means I can't choose ",", ";", "-" or any other single-byte character as a delimiter in my CSV file because it would be confused by the text that contains it.

Is there any way to chose a multibyte character as a delimiter, use multiple characters as a delimiter or use COPY command in some other way to solve this?

Command I'm using:

COPY site_articles(id,url,title,content) FROM '/home/sites/site_articles.csv' DELIMITER '^' CSV;
Sergey
  • 4,702
  • 6
  • 26
  • 32

1 Answers1

1

This means I can't choose ",", ";", "-" or any other single-byte character as a delimiter in my CSV file because it would be confused by the text that contains it.

CSV has an escaping mechanism. Use it. Quote strings that contain the delimiter character ,, and if the quoted string contains the quote character, double the quote character.

e.g. if you want to represent two values Fred "wiggle" Smith and one, two, you'd do so as:

"Fred ""Wiggle"" Smith","one, two"

At time of writing (9.5) copy does not support multi-byte characters as delimiters. You can use 3rd party ETL tools like Pentaho Kettle, though.

Community
  • 1
  • 1
Craig Ringer
  • 307,061
  • 76
  • 688
  • 778