16

I'm trying to use MySQL Connector/Python from mysql.com with Python 3.

I have tables in UTF-8 coding, and when I fetch the rows, all my chars columns returned like bytearray. This is make some confusion.

How I can fetch directly str?

UPD:

# -*- coding: utf-8 -*-
import mysql.connector
con = mysql.connector.connect( user ="root", db = "vg_site_db", charset = 'utf8' )
cursor = con.cursor()
sql = """select caption from domains
"""
cursor.execute( sql )
row = cursor.fetchone()
while row is not None:
    print( row )
    row = cursor.fetchone()

output:

(bytearray(b'ezsp.ru'),)
(bytearray(b'eazyshop.ru'),)
(bytearray(b'127.0.0.1:8080'),)
(bytearray(b'rmsvet.ru'),)

I want:

('ezsp.ru',)
('eazyshop.ru',)
('127.0.0.1:8080',)
('rmsvet.ru',)

UPD2:

My tables use COLLATE utf8_bin.

Eugene Yarmash
  • 142,882
  • 41
  • 325
  • 378
Gulaev Valentin
  • 575
  • 1
  • 7
  • 19
  • show your python code that reads database content – Ravinder Reddy Dec 19 '14 at 12:24
  • I update post and add the code samples – Gulaev Valentin Dec 19 '14 at 12:45
  • I'm also getting bytearray's from cursor.fetchone() in Python 2.7. Passing them through `str(row[0].decode())` turned them into native strings in both Python 2.7 and 3.4. MySQL [made a change](http://dev.mysql.com/doc/relnotes/connector-python/en/news-2-0-0.html) in Connector version 2. – Bob Stein May 21 '15 at 23:26

6 Answers6

6

Seems like this happens when you use binary collation, at least the same happened to me. To convert the bytearrays to Unicode strings, you can add a custom converter class:

class MyConverter(mysql.connector.conversion.MySQLConverter):

    def row_to_python(self, row, fields):
        row = super(MyConverter, self).row_to_python(row, fields)

        def to_unicode(col):
            if isinstance(col, bytearray):
                return col.decode('utf-8')
            return col

        return[to_unicode(col) for col in row]

sql = mysql.connector.connect(converter_class=MyConverter, host=...)
danmichaelo
  • 1,706
  • 1
  • 25
  • 30
  • Unfortunately this no longer works, because converters now have to inherit from `MySQLConverterBase` and that one doesn't have the `row_to_python` method, unless you copy paste it yourself and ugh. – Tominator Sep 19 '19 at 11:19
  • @Tominator : Sure? It still works for me. Also, `MySQLConverter` extends `MySQLConverterBase` and still has the [`row_to_python` method](https://github.com/mysql/mysql-connector-python/blob/905a75e589c9d0e32fd54b0aaae7ad07be368859/lib/mysql/connector/conversion.py#L378), so I don't immediately see why it shouldn't work. – danmichaelo Sep 19 '19 at 16:31
4

MySQL Connector returns strings (as stored using the CHAR, VARCHAR, and TEXT data types) as bytearrays when respective columns are defined with a binary collation (e.g. utf8_bin). You must call .decode() on values to get Python strings, e.g.:

for row in cursor:
    caption = row[0].decode()

That said, unless you have a specific requirement to use utf8_bin, it's a much better idea to use the utf8mb4 character set with the collation utf8mb4_unicode_ci on the database level. That would solve your problem and allow for full Unicode support. See this and this fore more details.

Eugene Yarmash
  • 142,882
  • 41
  • 325
  • 378
  • Wouldn't you rather recomment `utf8mb_unicode_cs`? As I understand it, `bin` sorts/selects by exact character code, `cs` sorts like a regular user would expect, and `ci` is case insensitive but otherwise like `cs`. – lucidbrot Aug 24 '19 at 14:43
  • 1
    @lucidbrot There's no `utf8mb_unicode_cs` collation, but perhaps you can use [`utf8mb4_0900_as_cs`](http://mysqlserverteam.com/mysql-8-0-1-accent-and-case-sensitive-collations-for-utf8mb4/). As for which one is better, it depends. Generally, capital letters are considered to be identical to their corresponding lower case letters for the purposes of [alphabetical ordering](https://en.wikipedia.org/wiki/Alphabetical_order). – Eugene Yarmash Aug 24 '19 at 18:51
1

Adding mysql-connector-python==8.0.17 to requirements.txt resolved this issue for me.

amcleod83
  • 107
  • 1
  • 4
0

Although the most voted answer (by @danmichaelo) certainly works, I wanted to provide my version that solves a major "but" that is already pointed out by @Tominator; custom converter classes must now inherit MySQLConverterBase and not MySQLConverter. The reason you do not want to inherit MySQLConverter (even if it inherits MySQLConverterBase as @danmichaelo points out) is that it will invoke the corresponding converter in the MySQLConverter class on each returned value. This will implement a logic you may not want.

To avoid the above, you have two choices. First, you could create a higher-level function that will get the data and alter the rows after they are retrieved.

def get_data_from_db(cursor, sql_query):
    cursor.execute(sql)
    row = cursor.fetchone()
    while row is not None:
        row_to_return = row.decode('utf-8') if isinstance(row, bytearray) else row
        row = cursor.fetchone()
    
    return row

If you still want to use the custom converter class method, then you should inherit the MySQLConverterBase as suggested in the documentation (https://dev.mysql.com/doc/connector-python/en/connector-python-connectargs.html this is valid until mysql-connector-python==8.0.26, see below) and then you could extend the MySQLConverterBase.to_python method.

class MyConverter(mysql.connector.conversion.MySQLConverterBase):
    def to_python(self, vtype, value):
        """Convert MySQL data type to Python"""
        if isinstance(value, bytearray):
            return value.decode('utf-8')

        super().to_python(vtype, value)

P.S. The class MyConverter can be used to implement custom converters by creating functions with names as in the MySQLConverter class (find the class here: https://github.com/mysql/mysql-connector-python/blob/master/lib/mysql/connector/conversion.py). For example, I was looking to convert TINYINT to bool and added a method named MyConverter._TINY_to_python(self, value, desc=None)

-- Update mysql-connector-python==8.0.27 --

In the 8.0.27 version, if you create a converter class that inherits the MySQLConverterBase you will probably get an error saying "expected a bytes-like object, str found". It is unclear to me why this happens but my answer above on creating the custom converter seems to not hold anymore. Instead, one should inherit the MySQLConverter class now:

class MyConverter(mysql.connector.conversion.MySQLConverter):
    def to_python(self, vtype, value):
        """Convert MySQL data type to Python"""
        if isinstance(value, bytearray):
            return value.decode('utf-8')

        super().to_python(vtype, value)
faremal
  • 50
  • 2
  • 6
-1

I don't think that you can get the cursor to return strings. The MySQL Connector Documentation says that they chose to return bytearrays so that they only have to maintain one codebase for both Python2 and Python3:

With the use of “raw” cursors, the returned values is of the bytearray type. This is necessary for having both Python 2 and 3 return the same data.

I addressed this issue using a list comprehension to decode each bytearray in the row:

for row in cursor:
    type_fixed_row = tuple([el.decode('utf-8') if type(el) is bytearray else el for el in row])
    print( type_fixed_row )
LSE
  • 73
  • 4
  • 2
    *raw* is not enabled by default, so I find this puzzling. The [documentation](https://dev.mysql.com/doc/connector-python/en/connector-python-connectargs.html#idm140343529540800) says "By default, strings coming from MySQL are returned as Python Unicode literals." – danmichaelo May 07 '17 at 19:55
-2

An easy way to solve this issue is to make sure that you are retrieving 'strings' from your MySQL table. To do so, you just have to add a CAST in your query as follows:

 # -*- coding: utf-8 -*-
import mysql.connector
con = mysql.connector.connect( user ="root", db = "vg_site_db", charset = 'utf8' )
cursor = con.cursor()
sql = "select CAST(caption as CHAR(50)) from domains"
cursor.execute( sql )
row = cursor.fetchone()
while row is not None:
    print( row )
    row = cursor.fetchone()

This should work for you.