1

I have a query string that runs in side a loop and for every item in the list query gets executed. The list contains strings and I use python string format technique to replace the query with the respective string from the list as iteration progress.

I have unicoded the query along with the string from the list: Here is my unicoded query:

query = ur'''SELECT something FROM some_table WHERE some_name LIKE "{this_name}%"'''

Prior to execute I encode the query string to utf-8

            try:
                formatted_query = query.format(this_name=list_name)
                #encode the query
                encoded_q = formatted_query.encode('utf-8')
                # execute the query
                self.dbCursor.execute(encoded_q)
                row = self.dbCursor.fetchone()
            except Exception, e:
                traceback.print_exc()

But the problem is that sometimes I run into strings from list that has single quote example: foo's. I have already unicoded with utf-8 and I thought doing so I don't have to worry about situations like this. But I am getting sql error since MySQL is not skipping the single quote.

My next attempt was to replace the single quote:

format_string = u"foo's".replace(u"'",u"\'")

But this didn't work either. I also saw that answer to this question is using mysqldb library inbuilt functionality which I am not aware of, so I seek the help from stackoverflow community to solve this problem.

I change the code to reflect the solution suggested in answers but result is the same: Here is the change:

args = [u"{this_name}%".format(this_name=format_name)]
self.dbCursor.execute(query.encode('utf-8'), args)

#error get thrown at this line:

Error:

UnicodeEncodeError: 'latin-1' codec can't encode character u'\u014d' in position 4: ordinal not in range(256)

This is the string that error is complaining and I have checked the type of that string its a unicoded string.

this_name= Sentōkisei type= <type 'unicode'>
Community
  • 1
  • 1
add-semi-colons
  • 18,094
  • 55
  • 145
  • 232

3 Answers3

4

If you call dbCursor.execute with two arguments, your DB adapter will quote the arguments for you. See the DB-API specification for details:

query = u'''SELECT something FROM some_table WHERE some_name LIKE %s'''
args = [u"{this_name}%".format(this_name=list_name)]
self.dbCursor.execute(query, args)

The %s in query is a parameter marker. It will be replaced by a quoted parameter given in args. The correct parameter marker to use depends on your DB adapter. For example, MySQLdb uses %s, while oursql and sqlite3 use ?.

Using parametrized SQL is the recommended way. You really should never have to quote the arguments yourself.


Regarding the error, you post that

this_name= Sentōkisei type= <type 'unicode'>

I am going to assume this means format_name is unicode. Therefore,

args = [u"{this_name}%".format(this_name=format_name)]

will make args a list containing one unicode.

Now we reach the line which is raising an error:

self.dbCursor.execute(query.encode('utf-8'), args)

query is already unicode. If you encode that unicode, then it becomes a str. So query.encode('utf-8') is a str, but args is a list of unicode. I'm not sure why you wanted to encode query, but your DB adapter should be able to take two unicode arguments. So try

self.dbCursor.execute(query, args)

Now, upon re-reading your comments, it appears you've tried this and it also raises the same error:

UnicodeEncodeError: 'latin-1' codec can't encode character u'\u014d' in position 75: ordinal not in range(256)

I'm not sure why the DB adapter is trying to encode the unicode with latin-1 when you want utf-8 instead. The best solution would be to track down where this choice of latin-1 is coming from.

A hacky workaround would be to try encoding the strings yourself:

query = u'''SELECT something FROM some_table WHERE some_name LIKE %s'''.encode('utf-8')
args = [u"{this_name}%".format(this_name=list_name).encode('utf-8')]
self.dbCursor.execute(query, args)

But let me stress I really don't think this is the best way, nor should this be necessary.

unutbu
  • 842,883
  • 184
  • 1,785
  • 1,677
  • so, I don't need to do this: `encoded_q = formatted_query.encode('utf-8')`? – add-semi-colons Apr 27 '13 at 16:31
  • 1
    Regarding Unicode, I think that `"{this_name}%"` needs to be replaced with `u"{this_name}%"`. – Cristian Ciupitu Apr 27 '13 at 16:54
  • I think so because with the suggested solution I get the fallowing: `UnicodeEncodeError: 'ascii' codec can't encode character u'\xf3' in position 1: ordinal not in range(128) because I have to deal with strings like this Hój type= but I check for the type and I have unicoded it prior to this. – add-semi-colons Apr 27 '13 at 16:56
  • @unutbu actually I ran my code with your solution and compared with mine to see where they fail they both fail at the same string: `UnicodeEncodeError: 'latin-1' codec can't encode character u'\u014d' in position 75: ordinal not in range(256) format_strange_user= Sentōkisei type= ` I also did small alteration to your code but that really didn't make any difference I think. I edited the answer to reflect that. – add-semi-colons Apr 28 '13 at 13:35
  • Please post the revised code you are running, and the full traceback error message. It will tell us, among other things, exactly what line is generating the error. – unutbu Apr 28 '13 at 14:03
  • @unbutu nothing has change I added your solution but no improvement `self.dbCursor.execute(user_id_query.encode('utf-8'), args)` this is the line. – add-semi-colons Apr 28 '13 at 14:30
  • @unutbu I have added the error to the question along with the your suggestion. – add-semi-colons Apr 28 '13 at 14:37
  • @find-missing-semicolon: Could you post the `mysql` output of `SHOW CREATE TABLE some_table`? I wonder if the table is set to use the `latin-1` character set. I'm not familiar with the mysql-python DB adapter. You might want to check what is the proper way to configure the adapter to use `utf-8` encoding by default. – unutbu Apr 28 '13 at 15:04
0

I have added an answer to a similar question here, you can take a look at it too!

link :https://stackoverflow.com/a/61042304/8939258

Saurav Panda
  • 558
  • 5
  • 12
0

I give up on escaping it. Instead, I replace the single quote with a wild card which is %

text = "woman's"
text = text.replace("'","%")

It works perfectly in the mysql

Aminah Nuraini
  • 18,120
  • 8
  • 90
  • 108