
I'm having a problem storing Unicode text in a database. For reference, 你 means "you".

>>> a = '你'
>>> a
'\xc4\xe3'

The problem is with this code:

 # -*- coding: utf-8 -*-
 import MySQLdb
 db = MySQLdb.Connect(host="127.0.0.1", port=3306, user="root", passwd="root", db="mydata", charset="utf8", use_unicode=True)

 cursor = db.cursor()

 insert = "insert into testing (english,chinese,frequency) values(%s,%s,1) on duplicate KEY UPDATE frequency=frequency+1;"
 a = '你'
 data = ('you', a)
 try:
     cursor.execute(insert, data)
 except:
     print "error"

 db.commit()

This gives me an error, but when I change it to this:

data=('you','你')

it works.

Can anyone help me? I need to use "data=('you',a)" because later I will import a file of Chinese characters.

khheng

1 Answer


Try telling Python to treat the string as Unicode, like this: a = u'你'

If you're not using an interactive prompt, you can accomplish this via the unicode function. An example of one way to load data would be:

fname = 'somefile.txt'
with open(fname, 'r') as f:
    # Note: unicode() with no encoding argument decodes as ASCII
    unicode_data = unicode(f.read())

If this doesn't work, you should be able to find more details in the Python docs: http://docs.python.org/2/howto/unicode.html. You may also find this SO answer helpful: Character reading from file in Python
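If the file is not plain ASCII, one option is to state the encoding when opening it. Below is a minimal sketch, assuming the file is GBK-encoded (the bytes '\xc4\xe3' shown in the question are the GBK/GB2312 encoding of 你, but your real file may differ). The helper name read_text and the temporary demo file are made up for illustration. io.open is available in Python 2.6+ and in Python 3.

```python
# -*- coding: utf-8 -*-
import io
import os
import tempfile


def read_text(path, encoding):
    """Hypothetical helper: read a file and return its contents as Unicode text."""
    with io.open(path, 'r', encoding=encoding) as f:
        return f.read()


# Demo only: create a small file holding the GBK bytes for 你好.
fd, path = tempfile.mkstemp()
os.close(fd)
with io.open(path, 'wb') as f:
    f.write(b'\xc4\xe3\xba\xc3')

# Assumption: the file is GBK-encoded; change this to match your data.
text = read_text(path, 'gbk')
os.remove(path)
```

Here text is a Unicode string of two characters, which is the kind of value you would then pass in the data tuple.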

David Marx
  • Thanks a lot. What if I read the file and want to store the text as u'chinese_character'? – khheng Feb 21 '13 at 18:54
  • I'm not sure what you mean – David Marx Feb 21 '13 at 18:57
  • For example, if a="你好" then "a" will be '\xc4\xe3\xba\xc3'.
    How do I make it u'\xc4\xe3\xba\xc3'?
    – khheng Feb 21 '13 at 18:59
  • Try using the `unicode()` function. I edited my response to assume you are loading data from an external file. – David Marx Feb 21 '13 at 19:33
  • I think the error is here: "chi = unicode(f.read())"
    Error message: UnicodeDecodeError: 'ascii' codec can't decode byte 0xc4 in position 0: ordinal not in range(128)
    – khheng Feb 21 '13 at 19:45
  • Sorry, I'm not a pro at working with unicode. I hope this other SO response helps you: http://stackoverflow.com/questions/147741/character-reading-from-file-in-python – David Marx Feb 21 '13 at 19:50
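
For what it's worth, the UnicodeDecodeError quoted in the comments is what you get when non-ASCII bytes are decoded with the ASCII codec, which is the default used by Python 2's unicode() when no encoding is given. A small sketch of the difference, assuming the bytes are GBK (this matches '\xc4\xe3\xba\xc3' for 你好, but it is an assumption about your data). The snippet is written to run on Python 3; in Python 2 the same .decode('gbk') call works on an ordinary str.

```python
# -*- coding: utf-8 -*-
raw = b'\xc4\xe3\xba\xc3'  # bytes from the comment thread: 你好 in GBK

# Decoding as ASCII fails, because 0xc4 is outside the 0-127 range.
ascii_error = None
try:
    raw.decode('ascii')
except UnicodeDecodeError as e:
    ascii_error = str(e)

# Decoding with the encoding the bytes were written in succeeds.
# Assumption: GBK; use whatever encoding your file really has.
text = raw.decode('gbk')
```

So the thing to aim for is not u'\xc4\xe3\xba\xc3' (the same byte values relabelled as characters) but the decoded text, which has two characters.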