
Hi guys, I am having a problem inserting a UTF-8 encoded Unicode character into my database.

The unicode object that I get from my form is u'AJDUK MARKO\u010d'. The next step is to encode it to UTF-8 with value.encode('utf-8'), which gives me the byte string 'AJDUK MARKO\xc4\x8d'.

I then try to update the database (an insert behaves the same way, by the way):

cur.execute( "UPDATE res_partner set %s = '%s' where id = %s;"%(columns, value, remote_partner_id))

The value is inserted or updated in the database, but it is stored literally as AJDUK MARKO\xc4\x8d, whereas I want AJDUK MARKOČ. The database uses UTF-8 encoding, so that is not the cause.

What am I doing wrong? Surprisingly, I couldn't find anything useful on the forums.

Alastair McCormack
HateCamel

1 Answer


\xc4\x8d is the UTF-8 byte sequence for č (U+010D). It looks like the insert has worked but the result is being displayed as its escaped representation, probably because the whole row is printed as a list. For example:

>>> print "Č"
Č
>>> print ["Č"] # a list with one string
['\xc4\x8c']
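To check that the escapes are only a display form of the same data, here is a minimal sketch that needs no database. The variable names are illustrative and not taken from the question's code.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function

# The value as it arrives from the form: a unicode object ending in U+010D.
text = u"AJDUK MARKO\u010d"

# Encoding produces a byte string; U+010D becomes the two bytes C4 8D.
encoded = text.encode("utf-8")

# Decoding the bytes gives back the original text unchanged.
round_trip = encoded.decode("utf-8")

# repr() shows the bytes with \x escapes, which is the form seen in the question.
print(repr(encoded))
print(len(text), len(encoded))   # 12 characters, 13 bytes
print(round_trip == text)        # True
```

If the stored value decodes back to the original text like this, the data itself is intact and only the way it is displayed differs.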

We need to see more code to confirm this. (It's always a good idea to post as much reproducible code as possible.)

You could decode the result (result.decode("utf-8")), but you should avoid encoding or decoding by hand. Psycopg2 already allows you to send unicode objects, so you can do the following without encoding first:

cur.execute( u"UPDATE res_partner set %s = '%s' where id = %s;" % (columns, value, remote_partner_id))

- note the leading u

Psycopg2 can also return unicode objects by decoding strings automatically:

import psycopg2
import psycopg2.extensions
psycopg2.extensions.register_type(psycopg2.extensions.UNICODE)
psycopg2.extensions.register_type(psycopg2.extensions.UNICODEARRAY)

Edit:

SQL values should be passed as an argument to .execute(). See the big red box at: http://initd.org/psycopg/docs/usage.html#the-problem-with-the-query-parameters

For example, instead of interpolating the values into the string:

# Replace the columns field first.
# Strictly we should use http://initd.org/psycopg/docs/sql.html#module-psycopg2.sql  
sql = u"UPDATE res_partner set {} = %s where id = %s;".format(columns) 
cur.execute(sql, (value, remote_partner_id))
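The same idea extends to several columns at once, which is the case raised in the comments. The `type "u" does not exist` error there most likely comes from a Python 2 tuple such as (u'a', u'b') being formatted straight into the SQL text, so that PostgreSQL sees u'a'. The sketch below builds one %s placeholder per column and leaves all values to .execute(). The helpers `build_update` and `update_partner` and the column names are hypothetical, and the regular-expression check is only a simple stand-in for the stricter psycopg2.sql module mentioned above.

```python
import re

# Hypothetical helpers, not part of psycopg2. Only plain identifiers are accepted.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_update(table, columns):
    """Return UPDATE text with one %s per column plus one for the id."""
    if not columns:
        raise ValueError("at least one column is required")
    for name in [table] + list(columns):
        if not _IDENTIFIER.match(name):
            raise ValueError("unsafe identifier: %r" % (name,))
    assignments = u", ".join(u"{0} = %s".format(c) for c in columns)
    return u"UPDATE {0} SET {1} WHERE id = %s;".format(table, assignments)


def update_partner(cur, partner_id, values):
    """values maps column name to unicode value; cur is a DB-API cursor."""
    columns = sorted(values)  # fixed order so SQL text and parameters line up
    sql = build_update(u"res_partner", columns)
    params = tuple(values[c] for c in columns) + (partner_id,)
    # The values travel separately from the SQL text, never interpolated by hand.
    cur.execute(sql, params)
    return sql, params
```

With a psycopg2 cursor this would be called as, for example, update_partner(cur, remote_partner_id, {u"name": value}); the unicode values are passed as they are, with no manual .encode().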
Alastair McCormack
  • Alastair what do you mean that I am not printing the result correctly? I query the table res_partner with pg_admin and I can see that it was inserted into database with \xc4\x8d and not Č. Thanks for help mate. – HateCamel Jan 05 '16 at 11:46
  • Ah, I see. What happens if you take my advice and do not encode before passing to `cur.execute()`? – Alastair McCormack Jan 05 '16 at 12:19
  • It works if I don't encode and only pass one value, for example name = 'Markoč', but it doesn't work if I send it multiple columns and values in a tuple. Then it says : ProgrammingError: type "u" does not exist. – HateCamel Jan 05 '16 at 12:47
  • Can you update your question with the code that causes that error? – Alastair McCormack Jan 05 '16 at 13:59
  • @HateCamel I also faced that issue; right now I'm thinking about using both `INSERT` and `UPDATE` expressions... Did you find out anything about inserting data into multiple columns? – Pavel Pereverzev Nov 19 '18 at 14:36
  • @PavelPereverzev raise a new question. Link to this one and describe your issue including a full stacktrace. Link to your new question here and I'll try to answer it. – Alastair McCormack Nov 19 '18 at 14:39
  • @AlastairMcCormack posted it already here https://stackoverflow.com/questions/53375955/psycopg2-does-not-insert-unicode-data – Pavel Pereverzev Nov 19 '18 at 14:43