0

I am trying to upload a series of tables (.txt files) into a PostgreSQL database that runs on my Windows 10 desktop. I use psql upload the files. I have successfully uploaded a couple of tables but the largest one (5GB with over 20 million rows) is giving me trouble:

databasename=# \copy table1 FROM 'C:\Users\tablename.txt' DELIMITER ',' CSV HEADER;
ERROR: character with byte sequence 0x9d in encoding "WIN1252" has no equivalent in encoding "UTF8"
CONTEXT: COPY table1, line 581330

I found an answer here which suggested I check the client encoding...

databasename=# SHOW client_encoding;
client_encoding
-----------------
WIN1252
(1 row)

and then change it, which I tried:

databasename=# SET CLIENT_ENCODING TO 'utf8';
SET

I then try the same copy command again and get the following error:

ERROR: invalid byte sequence for encoding "UTF8": 0x92
CONTEXT: COPY table1, line 206051

I've read a little about 0x92 here. It sounds like there is a character in the file which cannot be encoded when I try and perform the \copy command.

Some background:

I was able to upload about 1 million rows into SQL Server 2019 (free version) using the SQL Server Import and Export Wizard. (I stopped the import because it was taking too long.) I was also able to view the file in R using read.csv. Not sure if any of this is helpful. Thank you all in advance.

SupaDupa
  • 91
  • 2
  • 10
  • Do you know what these are really supposed to be? IN 1252, 0x92 is apparently a curly single quote, while 0x9d is apparently nothing. In 437, they seem to be AE ligature, and a Yen symbol. – jjanes Apr 10 '20 at 19:58
  • @jjanes I checked the line referenced in the first error in R. A row contains one column where some of the characters are as such: €Œâ€™Vâ€. – SupaDupa Apr 10 '20 at 20:22

0 Answers0