
I have a .csv file that contains French characters. A sample row looks like this: Antonie Bégarder,12345,6789,France

The file was uploaded to S3. I am using the COPY command to load it from S3 into Redshift, and I get this error:

String contains invalid or unsupported UTF8 codepoints. Bad UTF8 hex sequence: e9 6c 61 (error 4)

After that, I used ACCEPTINVCHARS AS '`' in the COPY command. The data loaded, but it looks like this:

Antonie B`garder

Any solution for this?

Sourav Gupta
  • Specifying just ACCEPTINVCHARS replaces the accented character with the default ? – Sourav Gupta Sep 19 '18 at 13:42
  • Did you try the `ENCODING [AS] file_encoding` option of the `COPY` command? – Red Boy Sep 19 '18 at 13:52
  • Error 4 means the value of the trailing byte in the byte sequence is out of range: the continuation byte must be between 128 and 191 (inclusive). My guess is that your source data is not UTF-8 or has some bad encoding. Can you check using some tool? https://stackoverflow.com/questions/115210/how-to-check-whether-a-file-is-valid-utf-8 – Jon Scott Sep 19 '18 at 14:00
  • @JonScott The error happens only for that particular value. I removed that row and the rest of the data loaded perfectly, so the issue is only with that particular character. – Sourav Gupta Sep 19 '18 at 14:15
  • @JonScott Yes, it worked. The file was not in UTF-8. I saved the same source file as UTF-8 and loaded it again, and it worked. Thanks. – Sourav Gupta Sep 19 '18 at 14:27
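
For anyone hitting the same error, here is a minimal Python sketch of the check and the re-save discussed in the comments. It assumes the original file was saved as ISO-8859-1 (Latin-1), which the thread does not confirm; the byte `e9` in the error message is `é` in that encoding, so it is a plausible guess. The function names `is_valid_utf8` and `to_utf8` are made up for illustration.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def to_utf8(data: bytes, source_encoding: str = "latin-1") -> bytes:
    """Re-encode bytes from source_encoding to UTF-8.

    The default of Latin-1 is an assumption; use the encoding the
    file was actually saved in.
    """
    return data.decode(source_encoding).encode("utf-8")


# Simulate the sample row as it would be stored in a Latin-1 file:
# the accented e becomes the single byte 0xE9.
raw = "Antonie Bégarder,12345,6789,France\n".encode("latin-1")

print(is_valid_utf8(raw))      # the Latin-1 bytes are rejected as UTF-8

fixed = to_utf8(raw)
print(is_valid_utf8(fixed))    # the re-encoded bytes are accepted
print(fixed.decode("utf-8"))   # the row with the accent preserved
```

For a real file, read it in binary mode, pass the bytes through `to_utf8`, write the result to a new file, and upload that file to S3 before running COPY again.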
