0

I have some code which I'm finally getting around to updating to Python 3.

It runs in Docker using ubuntu:20.04 as the base image.

I am using SQLAlchemy query .all() method method sqlalchemy.orm.Query.all()

The code works with Python 2 however, it fails on Python 3 with the following error:

UnicodeDecodeError: 'ascii' codec can't decode byte 0xe7 in position 1: ordinal not in range(128)

I am struggling to understand why the code is trying to decode to ascii.

I've set the locale in the Dockerfile and running locale.getlocale() on the line above the SQLAlchemy line prints ('en_US', 'UTF-8')

The SQLAlchemy connection url specifies utf8:

sqlalchemy.url='mysql+pymysql:{server}&charset=utf8&binary_prefix=true

I've read all the other similar questions but still cannot get this working.

Update:

I've tracked it down to one column:

from sqlalchemy import PickleType


class Schedule(OrgRefMixin, DeclarativeBase):
    __tablename__ = 'flights'
    ...
        routing = Column(PickleType())
    ...

This is a blob column in the database.

Traceback (most recent call last):
  File "/opt/venv/lib/python3.8/site-packages/flask/app.py", line 2446, in wsgi_app
    response = self.full_dispatch_request()
  File "/opt/venv/lib/python3.8/site-packages/flask/app.py", line 1951, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/opt/venv/lib/python3.8/site-packages/flask/app.py", line 1820, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/opt/venv/lib/python3.8/site-packages/flask/_compat.py", line 39, in reraise
    raise value
  File "/opt/venv/lib/python3.8/site-packages/flask/app.py", line 1949, in full_dispatch_request
    rv = self.dispatch_request()
  File "/opt/venv/lib/python3.8/site-packages/flask/app.py", line 1935, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/code/flights/flights/json.py", line 182, in view_wrapper
    return view_func(*args, **kwargs)
  File "/code/flights/flights/blueprints/flights.py", line 62, in get_flights
    flights = Flights.get(filters, **valid_args)
  File "/code/flights/flights/resources/flights.py", line 254, in get
    response = query.all()
  File "/opt/venv/lib/python3.8/site-packages/sqlalchemy/orm/query.py", line 2643, in all
    return list(self)
  File "/opt/venv/lib/python3.8/site-packages/sqlalchemy/orm/loading.py", line 90, in instances
    util.raise_from_cause(err)
  File "/opt/venv/lib/python3.8/site-packages/sqlalchemy/util/compat.py", line 202, in raise_from_cause
    reraise(type(exception), exception, tb=exc_tb, cause=cause)
  File "/opt/venv/lib/python3.8/site-packages/sqlalchemy/util/compat.py", line 186, in reraise
    raise value
  File "/opt/venv/lib/python3.8/site-packages/sqlalchemy/orm/loading.py", line 77, in instances
    rows = [keyed_tuple([proc(row) for proc in process])
  File "/opt/venv/lib/python3.8/site-packages/sqlalchemy/orm/loading.py", line 77, in <listcomp>
    rows = [keyed_tuple([proc(row) for proc in process])
  File "/opt/venv/lib/python3.8/site-packages/sqlalchemy/orm/loading.py", line 77, in <listcomp>
    rows = [keyed_tuple([proc(row) for proc in process])
  File "/opt/venv/lib/python3.8/site-packages/sqlalchemy/engine/result.py", line 93, in __getitem__
    return processor(self._row[index])
  File "/opt/venv/lib/python3.8/site-packages/sqlalchemy/sql/sqltypes.py", line 1478, in process
    return loads(value)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe7 in position 1: ordinal not in range(128)
user7692855
  • 1,582
  • 5
  • 19
  • 39
  • A complete stack trace might be helpful. Also, what versions of SQLAlchemy and PyMySQL are you using? – Gord Thompson Mar 30 '23 at 13:24
  • Thanks, I've added full stack trace. I'm using sqlalchemy==1.1.3 and pymysql==0.9.2 – user7692855 Mar 30 '23 at 13:29
  • Wow, those versions are *really* old. Are you hitting a similarly old version of MySQL server? – Gord Thompson Mar 30 '23 at 13:58
  • I understand they're very old and I will be updating them. However, I'm not sure if it's related/causing this issue. – user7692855 Mar 30 '23 at 14:21
  • FWIW, I am unable to reproduce your issue using SQLAlchemy 1.1.3 and PyMySQL 0.9.2 hitting a MySQL 5.7.32 server. Can you create a [mcve] that uses just SQLAlchemy and PyMySQL (without Flask)? – Gord Thompson Mar 30 '23 at 14:28
  • A possibly related answer [here](https://stackoverflow.com/a/60195845/2144390) mentions checking if `$LANG=C` – Gord Thompson Mar 30 '23 at 16:12
  • I'm still trying to get a minimal reproducible example, and while debugging I've tracked it down to one specific model class. Where a column which is `Collation: utf8_general_ci` `varchar(30)` in the database is down as a Unicode column type in the python class definition. I'm guessing this is related / incorrect? – user7692855 Mar 31 '23 at 12:35
  • No, that shouldn't be a problem. `sql.sqltypes.String` and `sql.sqltypes.Unicode` are used to specify column types for databases that differentiate between single-byte character sets (SBCSs) and Unicode. For example, MS SQL Server uses `varchar()` (`'literal example'`) and `nvarchar()` (`N'literal example'`) respectively. With other databases like MySQL and PostgreSQL we can use either `String` or `Unicode` because they both correspond to `varchar()`. – Gord Thompson Mar 31 '23 at 18:20
  • Thats a pity, was really hoping I was onto some there. If I run the raw sql command directly through sqlalchemy it works so it's something in the model classes that is causing the issue. – user7692855 Apr 02 '23 at 11:28
  • [This code](https://gist.github.com/gordthompson/e9da0be84d7d3d0ed4d302b60560d218) works for me on my Ubuntu 20.04 workstation using SQLAlchemy 1.1.3 and PyMySQL 0.9.2 hitting a MySQL 5.7.32 server. Does it fail for you> – Gord Thompson Apr 03 '23 at 14:32
  • That code works for me, so it's definitely something to do with the my python models/SQLAlchemy code. Strangely, if I change `_rrule = Column("rrule", PickleType())` to `_rrule = Column("rrule", Text())` the decode error does not occur. – user7692855 Apr 03 '23 at 17:41
  • 1
    You might be encountering an issue (hinted at [here](https://stackoverflow.com/a/11305924/2144390)) where objects pickled by Python_2 are not 100% compatible with un-pickling in Python_3. You could try retrieving the raw bytes with something like `bin_data = engine.execute("SELECT routing FROM flights WHERE id = 1").scalar()` and then see if you can un-pickle it manually with `thing = pickle.loads(bin_data)`. – Gord Thompson Apr 03 '23 at 19:39
  • @GordThompson I'd say you're right. I'll try some digging around tomorrow and see if I can come up with a custom type so I can still use the ORM functionality of sqlalchemy. Appreciate all the help – user7692855 Apr 03 '23 at 21:26
  • 1
    Possibly relevant: https://stackoverflow.com/q/28218466/2144390 – Gord Thompson Apr 03 '23 at 22:49

1 Answers1

0

when Python tries to decode a byte string to Unicode using the ASCII codec, but the byte the input string contains non-ASCII characters such a error happens. it seems that one of the rows being loaded contains non-ASCII characters. My suspect is explicitly set the encoding used by PickleType to 'bytes'. something like this:

routing = Column(PickleType(pickler=lambda v: pickle.dumps(v, protocol=pickle.HIGHEST_PROTOCOL), 
                                    unpickler=lambda v: pickle.loads(v), 
                                    encoding='bytes'))
Phoenix
  • 1,343
  • 8
  • 10