I'm trying to insert scraped Chinese text into a MySQL database from Python with Scrapy, but it seems that either Scrapy or MySQL can't handle Chinese characters with my current method.

def insert_table(datas):
    sql = "INSERT INTO %s (name, uses, time_capt) \
        values('%s', '%s', NOW())" % (SQL_TABLE,
            escape_string(datas['name']),
            escape_string(datas['uses']),
            )
    if cursor.execute(sql):
        print "Inserted"
    else:
        print "Something wrong"

I keep getting this error when inserting into the MySQL database:

exceptions.UnicodeEncodeError: 'ascii' codec can't encode characters in position 1-7: ordinal not in range(128)

datas['name'] contains correctly formatted Chinese characters. If I print the variable within the code, it comes out properly. I've tried calling .encode('utf8') on it before inserting it into MySQL, which makes the error go away, but then it shows up as garbled text in my database. Am I doing something wrong?

Edit: this is my current code:

def insert_table(datas):
    sql = "INSERT INTO %s (name, uses, time_capt) \
        values(%s, %s, NOW())", (SQL_TABLE,
            escape_string(datas['name']),
            escape_string(datas['uses']),
            )
    if cursor.execute(sql):
        print "Inserted"
    else:
        print "Something wrong"
  • Related: [Python & MySql: Unicode and Encoding](http://stackoverflow.com/q/8365660) – Martijn Pieters Dec 07 '13 at 15:53
  • You are trying to use string interpolation to build a query. Don't do that, use SQL parameters instead and have the database adapter do the encoding for you. It is the interpolation here that throws the `UnicodeEncodeError`. – Martijn Pieters Dec 07 '13 at 15:54
  • I'm not sure I understand, the examples in the related link seem to use the same method I use? – user3025245 Dec 07 '13 at 16:48
  • No, the code there uses SQL parameters. – Martijn Pieters Dec 07 '13 at 16:58
  • Sorry if this is really basic, but I'm not sure of the difference. – user3025245 Dec 07 '13 at 17:34
  • You are using string interpolation (`"INSERT ... VALUES ('%s', ...)" % (value1, ...)`). `MySQLdb` uses what *looks* like the same syntax to denote SQL parameters, but there are no quotes: `cursor.execute('INSERT ... VALUES (%s, ...)', (value1, ...))`. – Martijn Pieters Dec 07 '13 at 17:36
  • Note that there is no `%` operator being used anymore; the `execute()` method takes a *second* argument which is a sequence of parameters for the SQL server to quote and insert. – Martijn Pieters Dec 07 '13 at 17:37
  • Have I done it right (new code in original message)? I'm still getting the same error... – user3025245 Dec 08 '13 at 04:28
  • `I've tried adding .encode['utf8'] before adding it into MYSQL, which makes the error go away, but that makes it come out as garbled gibberish within my database.` What is the encoding of your connection and of your database? – Burhan Khalid Dec 08 '13 at 05:23
  • I'm not sure what the connection encoding is, but the column that's going to store the Chinese character entries is utf8_unicode_ci. – user3025245 Dec 08 '13 at 07:21
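
The difference between string interpolation and SQL parameters described in the comments above can be sketched roughly as follows. This is only an illustration under a few assumptions: `cursor` is passed in explicitly and is assumed to be a `MySQLdb` cursor from a connection opened with `charset='utf8'` and `use_unicode=True`, and `my_table` is a hypothetical table name, since the question does not give one.

```python
# -*- coding: utf-8 -*-
# Sketch only: `cursor` is assumed to be a MySQLdb cursor from a connection
# opened with something like MySQLdb.connect(..., charset='utf8', use_unicode=True).

SQL_TABLE = "my_table"  # hypothetical table name; not given in the question


def insert_table(cursor, datas, table=SQL_TABLE):
    # Table names cannot be passed as SQL parameters, so the (trusted,
    # hard-coded) table name is formatted into the statement text.
    # The values stay as bare %s placeholders: no quotes, no escape_string().
    sql = "INSERT INTO {} (name, uses, time_capt) VALUES (%s, %s, NOW())".format(table)
    params = (datas['name'], datas['uses'])
    # Statement and parameters are two separate arguments; no % operator is used.
    return cursor.execute(sql, params)
```

With this shape the unicode values are never formatted into the SQL string by Python itself; quoting and encoding are left to the database adapter, which is the point the comments make.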

1 Answer

I ran into a similar problem when I used `name = u"大学"` as a variable in a Scrapy project. After I added the comment `# -*- coding: utf-8 -*-` at the top of the file, the error went away.
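
A minimal sketch of what that declaration looks like, assuming the source file itself is saved as UTF-8. Under Python 2 (which the `print` statements in the question suggest), a file containing non-ASCII literals needs this declaration on its first or second line (PEP 263). It applies to literals typed into the source file, so it may or may not be what is affecting text scraped at run time.

```python
# -*- coding: utf-8 -*-
# The declaration must be on the first or second line of the file.

name = u"大学"  # a unicode literal containing two Chinese characters

# Encoding explicitly produces UTF-8 bytes (three bytes per character here).
name_utf8 = name.encode("utf-8")
```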