I am trying to write data to a StringIO object in Python and then load that data into a Postgres database using psycopg2's copy_from() function.
When I first tried this, copy_from() threw an error:
ERROR: invalid byte sequence for encoding "UTF8": 0xc92
So I followed this question.
I figured out that my Postgres database has UTF8 encoding.
The file/StringIO object I am writing my data into shows its encoding as the following: setgid Non-ISO extended-ASCII English text, with very long lines, with CRLF line terminators
I then tried to encode every string that I write to the intermediate file/StringIO object as UTF-8. To do this, I called .encode(encoding='UTF-8',errors='strict') on every string.
This is the error I got now: UnicodeDecodeError: 'ascii' codec can't decode byte 0x92 in position 47: ordinal not in range(128)
What does it mean? How do I fix it?
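For reference, the same message can be reproduced without any database. As far as I understand, in Python 2.7 calling .encode() on a plain str (a byte string) first decodes it implicitly with the ascii codec, so the sketch below performs that decode step explicitly (which also lets it run on Python 3). The value raw_cell is made up for illustration, and I am assuming my cells arrive as byte strings rather than unicode objects.

```python
# Hypothetical sample value: a byte string containing the non-ASCII byte 0x92
raw_cell = b"it\x92s"

try:
    # This is the step Python 2 performs implicitly before str.encode()
    raw_cell.decode("ascii")
    error_message = None
except UnicodeDecodeError as exc:
    error_message = str(exc)

print(error_message)
```

If this assumption holds, the failure happens in the decode step, before any UTF-8 encoding takes place.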
EDIT: I am using Python 2.7.
Some pieces of my code:
I read from a MySQL database whose data is encoded in UTF-8 according to MySQL Workbench. These are a few lines of the code that writes my data (obtained from the MySQL db) to the StringIO object:
# Populate the table_data variable with rows delimited by \n and columns delimited by \t
row_num = 0
for row in cursor.fetchall():
    # Separate rows in a table by new line delimiter
    if row_num != 0:
        table_data.write("\n")
    col_num = 0
    for cell in row:
        # Separate cells in a row by tab delimiter
        if col_num != 0:
            table_data.write("\t")
        table_data.write(cell.encode(encoding='UTF-8', errors='strict'))
        col_num = col_num + 1
    row_num = row_num + 1
This is the code that writes to the Postgres database from my StringIO object table_data:
cursor = db_connection.cursor()
cursor.copy_from(table_data, <postgres_table_name>)
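One direction I am considering is to decode each cell explicitly and only then encode it as UTF-8. This is a rough sketch, not a confirmed fix. It assumes the bytes coming back from MySQL are really Windows-1252 (where 0x92 is a curly apostrophe), which I have not verified. The helper names cell_to_utf8 and rows_to_buffer and the sample rows are hypothetical. It writes NULLs as \N, which is the default null marker of copy_from(), and it does not escape tabs, newlines or backslashes inside cells.

```python
import io


def cell_to_utf8(cell, source_encoding="cp1252"):
    """Return one cell as UTF-8 bytes; source_encoding is an assumption."""
    if cell is None:
        return b"\\N"  # default NULL marker used by copy_from()
    if isinstance(cell, bytes):
        # Byte strings are decoded explicitly instead of implicitly via ASCII
        text = cell.decode(source_encoding)
    else:
        # Numbers, unicode objects, etc. are converted to text
        text = u"%s" % (cell,)
    return text.encode("utf-8")


def rows_to_buffer(rows, source_encoding="cp1252"):
    """Build a tab-delimited, newline-terminated buffer of UTF-8 bytes."""
    buf = io.BytesIO()
    for row in rows:
        buf.write(b"\t".join(cell_to_utf8(c, source_encoding) for c in row))
        buf.write(b"\n")
    buf.seek(0)  # rewind so a reader starts at the first row
    return buf


# Hypothetical sample rows standing in for cursor.fetchall()
sample_rows = [(b"it\x92s", 7), (None, u"plain")]
table_data = rows_to_buffer(sample_rows)
print(repr(table_data.getvalue()))
```

The resulting table_data could then be passed to cursor.copy_from() in place of my original StringIO object.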