4

I am loading data from one table into pandas and then inserting that data into new table. However, instead of normal string value I am seeing bytearray.

bytearray(b'TM16B0I8') it should be TM16B0I8

What am I doing wrong here?

My code:

engine_str = 'mysql+mysqlconnector://user:pass@localhost/db'
engine = sqlalchemy.create_engine(engine_str, echo=False, encoding='utf-8')
connection = engine.connect()

th_df = pd.read_sql('select ticket_id, history_date', con=connection)

for row in th_df.to_dict(orient="records"):
    var_ticket_id = row['ticket_id']
    var_history_date = row['history_date']

    query = 'INSERT INTO new_table(ticket_id, history_date)....'
desertnaut
  • 57,590
  • 26
  • 140
  • 166
r0xette
  • 898
  • 3
  • 11
  • 24
  • where do you see the bytearray? anywhere related to the code above? – Jean-François Fabre Dec 01 '16 at 22:26
  • So when I am printing `th_df['ticket_id']`, instead of giving me a string `'TM16A0JY'` it is giving me this array `[77, 83, 90, 45, 48, 50, 53, 52, 57, 56]` and after the insert when I looked into DB it is showing me `bytearray(b'TM16A0JY')`. Interestingly for integer IDs it is not showing bytearray and also inserting a integer value in db. `4567`. – r0xette Dec 01 '16 at 22:31

4 Answers4

10

For some reason the Python MySql connector only returns bytearrys, (more info in (How return str from mysql using mysql.connector?) but you can decode them into unicode strings with

var_ticket_id = row['ticket_id'].decode()
var_history_date = row['history_date'].decode()
Community
  • 1
  • 1
maxymoo
  • 35,286
  • 11
  • 92
  • 119
6

Make sure you are using the right collation, and encoding. I happen to use UTF8MB4_BIN for one of my website db tables. Changed it to utf8mb4_general_ci, and it did the trick.

desertnaut
  • 57,590
  • 26
  • 140
  • 166
Yongju Lee
  • 241
  • 4
  • 5
  • This is perfect - I too had a similar problem that was solved by changing the collation from latin1_bin to COLLATE latin1_general_ci . Thank you @Yongju Lee – Yogesh Devi May 29 '20 at 13:09
1

Producing a bytearray is now the expected behaviour.

It changed with mysql-connector-python 8.0.24 (2021-04-20). According to the v8.0.24 release notes, "Binary columns were returned as strings instead of 'bytes' or 'bytearray'" behaviour was a bug that was fixed in that release.

So producing a Python binaryarray is the correct behaviour, if the database column is a binary type (e.g. binary or varbinary). Previously, it produced a Python string, but now it produces a binaryarray.

So either change the data type in the database to a non-binary data type, or convert the binaryarray to a string in your code. If the column is nullable, you'll have to check for that first; since attempting to invoke decode() method on None would produce an error. You'll also have to be sure the bytes represent a valid string, in the character encoding being used for the decoding/conversion.

Hoylen
  • 16,076
  • 5
  • 30
  • 16
0

Much easier...

How to return str from MySQL using mysql.connector?

Adding mysql-connector-python==8.0.17 to requirements.txt resolved this issue for me

"pip install mysql-connector-python" from terminal

nagmjgg
  • 11