I can't seem to store Unicode characters correctly in Postgres. They are shown as escaped byte sequences, e.g. <C3><A5> instead of å.
My database was created with UTF-8 as the encoding. I have tried storing Unicode strings with psycopg2 like so:
field = myUnicodeString.encode('utf-8')
cursor.execute("INSERT INTO mytable (column1) VALUES (%s)", (field,))
field = myUnicodeString
cursor.execute("INSERT INTO mytable (column1) VALUES (%s)", (field,))
but both alternatives store incorrect characters. Do I need to set a character set for the table as well, or what else could the problem be?
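As a sanity check, assuming the <C3><A5> shown in the output really is the raw UTF-8 byte pair for å, the encoding step can be verified in isolation, before the data ever reaches psycopg2:

```python
# Pure-Python check, no database involved: å (U+00E5) should encode
# to the two UTF-8 bytes C3 A5 -- the same pair the terminal shows.
s = u'å'
encoded = s.encode('utf-8')
print([hex(b) for b in bytearray(encoded)])  # prints ['0xc3', '0xa5']
```

So the bytes handed to the driver should be exactly C3 A5; if the table ends up showing them as two separate characters, something between the client and the server must be re-interpreting them.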
UPDATE 1:
I have discovered that I can't even type non-ASCII characters – like å, ä and ö – in my terminal. I'm on an Ubuntu 12.04 server. Could this in any way be related to the language settings of the server itself?
UPDATE 2:
I am now able to type non-ASCII characters in my terminal during an SSH session. I changed the locale settings and rebooted the server. Moreover, I am able to manually store non-ASCII characters in my UTF-8 database (in psql: INSERT INTO table (column) VALUES ('ö')). The character is displayed correctly in psql.
When I run SELECT convert_to(column, 'utf-8') FROM table with the manually inserted row in the table, the character ö is displayed as \xc383c2b6 in psql.
When I do print repr('ö') in Python, I get '\xc3\xb6'. I'm really trying to understand how to debug this, but I'm not sure what to look for.
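Putting the two observations together in pure Python: \xc3\xb6 is ö encoded once, while \xc383c2b6 is exactly what comes out if those two bytes are read back as Latin-1 text and encoded to UTF-8 a second time. This is only a sketch of a double-encoding hypothesis (e.g. a client_encoding mismatch at insert time), not a confirmed diagnosis:

```python
# ö (U+00F6) encoded once in UTF-8 is the byte pair C3 B6.
once = u'ö'.encode('utf-8')
assert once == b'\xc3\xb6'           # matches repr('ö') in Python

# If those bytes are misread as Latin-1 text ('Ã¶') and encoded
# to UTF-8 again, each byte expands into two bytes:
twice = once.decode('latin-1').encode('utf-8')
assert twice == b'\xc3\x83\xc2\xb6'  # matches the \xc383c2b6 from psql
```

If this hypothesis holds, the stored value is already mojibake, and the place to look would be the client-side encoding in effect when the row was inserted.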