27

I have a caching problem when I use sqlalchemy.

I use sqlalchemy to insert data into a MySQL database. Another application then processes this data and updates it directly.

But sqlalchemy always returns the old data rather than the updated data. I think sqlalchemy cached my request ... so ... how should I disable it?

Hannele
Zeyi Fan

7 Answers

54

The usual cause for people thinking there's a "cache" at play, besides the usual SQLAlchemy identity map which is local to a transaction, is that they are observing the effects of transaction isolation. SQLAlchemy's session works by default in a transactional mode, meaning it waits until session.commit() is called in order to persist data to the database. During this time, other transactions in progress elsewhere will not see this data.

However, due to the isolated nature of transactions, there's an extra twist. Those other transactions in progress will not only not see your transaction's data until it is committed, they also can't see it in some cases until they are committed or rolled back also (which is the same effect your close() is having here). A transaction with an average degree of isolation will hold onto the state that it has loaded thus far, and keep giving you that same state local to the transaction even though the real data has changed - this is called repeatable reads in transaction isolation parlance.

http://en.wikipedia.org/wiki/Isolation_%28database_systems%29
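The identity-map half of this can be reproduced in a few lines. The following is only a sketch under stated assumptions: it uses a temporary SQLite file instead of MySQL so that it is self-contained, and the `User` model, table and values are hypothetical. On MySQL/InnoDB, whose default isolation level is REPEATABLE READ, the same `rollback()` or `commit()` call also ends the database-side snapshot, which SQLite does not demonstrate here.

```python
import os
import tempfile

from sqlalchemy import Column, Integer, String, create_engine, text
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class User(Base):  # hypothetical model, for illustration only
    __tablename__ = "users"
    id = Column(Integer, primary_key=True)
    name = Column(String(50))


# A file-based SQLite database, so two separate connections can be used.
db_path = os.path.join(tempfile.mkdtemp(), "demo.db")
engine = create_engine("sqlite:///" + db_path)
Base.metadata.create_all(engine)

with Session(engine) as setup:
    setup.add(User(id=1, name="old"))
    setup.commit()

# The long-lived session loads the row once.
session = Session(engine)
user = session.get(User, 1)
seen_first = user.name

# Another "application" updates the row on its own connection and commits.
with engine.begin() as conn:
    conn.execute(text("UPDATE users SET name = 'new' WHERE id = 1"))

# Still inside the same session transaction: the identity map hands back
# the already-loaded object, so the old value is returned.
seen_before_rollback = session.get(User, 1).name

# Ending the transaction expires every loaded object; the next access reloads it.
session.rollback()
seen_after_rollback = session.get(User, 1).name

session.close()
print(seen_first, seen_before_rollback, seen_after_rollback)
```

The point of the sketch is that nothing needs to be "disabled": ending the session's transaction is what lets it see rows committed elsewhere.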

zzzeek
  • 3
    "SQLAlchemy's session works by default in a transactional mode" --- can you show us a way to stop the default please? I don't want explanations, just 1 line of code to disable transactions completely. Especially for stupid SELECT calls. – est Dec 08 '20 at 08:08
  • 3
    Actually THERE IS caching in SQLAlchemy (at least, now in 2021). I faced this problem with the `session.execute` command. You can find information about caching here (search for the "cached since" string on the page) https://github.com/sqlalchemy/sqlalchemy/blob/master/doc/build/core/connections.rst – Anar Salimkhanov Jun 10 '21 at 13:40
  • 1
    @AnarSalimkhanov Mind though, that the caching you are referring to is only a *statement compilation cache*. From your linked doc: it *"is caching the **SQL string that is passed to the database only**, and **not the data** returned by a query. It is in no way a data cache and does not impact the results returned for a particular SQL statement nor does it imply any memory use linked to fetching of result rows."* – amain Nov 24 '21 at 07:49
  • @amain Hmm... Interesting. Because I really had a problem with caching. Though the DB was updated, I used to get old RESPONSE data, until I disabled it. Now I can't test it, because it was in one of my old projects, and I don't remember where it was ) – Anar Salimkhanov Nov 24 '21 at 12:30
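On the request above for a way to turn the transactional behaviour off: one option that SQLAlchemy 1.4 and later provide is the `AUTOCOMMIT` isolation level, set through `execution_options()`. This is a sketch under assumptions, not a recommendation. It runs against a temporary SQLite file so it is self-contained, and the table and values are hypothetical. SQLite would also return the new value here without `AUTOCOMMIT`; the setting matters on servers such as MySQL, where a plain connection keeps its snapshot until the transaction ends.

```python
import os
import tempfile

from sqlalchemy import create_engine, text

db_path = os.path.join(tempfile.mkdtemp(), "autocommit_demo.db")
engine = create_engine("sqlite:///" + db_path)

# Hypothetical table and data, for illustration only.
with engine.begin() as conn:
    conn.execute(text("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)"))
    conn.execute(text("INSERT INTO users (id, name) VALUES (1, 'old')"))

# A reader connection in driver-level autocommit mode: no transaction is
# held open between statements, so every SELECT sees the latest committed data.
reader = engine.connect().execution_options(isolation_level="AUTOCOMMIT")
query = text("SELECT name FROM users WHERE id = 1")

before = reader.execute(query).scalar_one()

# Another connection commits a change while the reader stays open.
with engine.begin() as writer:
    writer.execute(text("UPDATE users SET name = 'new' WHERE id = 1"))

after = reader.execute(query).scalar_one()
reader.close()
print(before, after)
```

The same option can be applied engine-wide with `engine.execution_options(isolation_level="AUTOCOMMIT")`, at the cost of losing multi-statement atomicity on that engine.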
24

This issue has been really frustrating for me, but I have finally figured it out.

I have a Flask/SQLAlchemy Application running alongside an older PHP site. The PHP site would write to the database and SQLAlchemy would not be aware of any changes.

I tried the sessionmaker setting autoflush=True without success. I also tried db_session.flush(), db_session.expire_all(), and db_session.commit() before querying, and none of them worked. The data was still stale.

Finally I came across this section of the SQLAlchemy docs: http://docs.sqlalchemy.org/en/latest/dialects/postgresql.html#transaction-isolation-level

Setting the isolation_level worked great. Now my Flask app is "talking" to the PHP app. Here's the code:

from sqlalchemy import create_engine

engine = create_engine(
    "postgresql+pg8000://scott:tiger@localhost/test",
    isolation_level="READ UNCOMMITTED"
)

When the SQLAlchemy engine is started with the "READ UNCOMMITTED" isolation_level, it will perform "dirty reads", which means it will read uncommitted changes directly from the database.

Hope this helps


Here is a possible solution, courtesy of AaronD in the comments:

from flask_sqlalchemy import SQLAlchemy  # the old flask.ext.sqlalchemy path was removed in Flask 1.0

class UnlockedAlchemy(SQLAlchemy):
    def apply_driver_hacks(self, app, info, options):
        # Default to READ COMMITTED unless the caller chose a level explicitly.
        if "isolation_level" not in options:
            options["isolation_level"] = "READ COMMITTED"
        return super(UnlockedAlchemy, self).apply_driver_hacks(app, info, options)
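A configuration-only alternative may also be possible. As far as I know, Flask-SQLAlchemy 2.4 and later read a `SQLALCHEMY_ENGINE_OPTIONS` dictionary and pass it to `create_engine()`, so the isolation level can be set without subclassing. The snippet below is a sketch under that assumption; the `Config` class name and the connection URL are hypothetical placeholders.

```python
class Config:
    # Hypothetical connection URL; replace it with your own.
    SQLALCHEMY_DATABASE_URI = "mysql+pymysql://user:password@localhost/mydb"

    # Keyword arguments forwarded to sqlalchemy.create_engine()
    # (assumes Flask-SQLAlchemy 2.4 or later).
    SQLALCHEMY_ENGINE_OPTIONS = {"isolation_level": "READ COMMITTED"}
```

It would typically be loaded with `app.config.from_object(Config)` before the `SQLAlchemy` extension is initialised.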
Nick Woodhams
  • 1
    If you are using Flask-SQLAlchemy, you can subclass `flask.ext.sqlalchemy.SQLAlchemy` and override the `apply_driver_hacks` function to set the isolation level, while still keeping all of the Flask integration. Also, probably isolation level `READ COMMITTED` is sufficient providing both applications are committing their writes after they make them and not waiting for a long time. That way you don't have to worry about dirty reads - it just gives you a fresh DB snapshot every time you read. – Aaron D Sep 16 '15 at 05:09
  • @AaronD Could you post your code to subclass `flask.ext.sqlalchemy.SQLAlchemy` as you mentioned? – Nick Woodhams Mar 08 '16 at 20:05
  • 1
    I just have this in my code: `class UnlockedAlchemy(SQLAlchemy): def apply_driver_hacks(self, app, info, options): if not "isolation_level" in options: options["isolation_level"] = "READ COMMITTED" return super(UnlockedAlchemy, self).apply_driver_hacks(app, info, options)` – Aaron D Mar 09 '16 at 14:33
  • 1
    Lifesaver! I am using `engine_from_config` to read the sqlalchemy configuration from file and I simply added: `sqlalchemy.isolation_level = READ UNCOMMITTED` to my config file and external changes are now properly reflected in my app :-) – ozbob Feb 04 '17 at 11:19
  • 2
    This does not make sense. If the transaction to update the database is properly committed (by the php site), why you need to set the isolation level to "READ UNCOMMITTED"? It's more like a problem on how your PHP site is updating the database. – Alan May 09 '18 at 17:54
  • Adding `isolation_level="READ UNCOMMITTED"` worked great for me – Jorge Irún Jan 03 '23 at 18:25
  • Just tried it and was given an error saying it's an invalid value for Postgres, and the valid values are: AUTOCOMMIT, READ COMMITTED, REPEATABLE READ, SERIALIZABLE – odigity Jul 25 '23 at 17:42
4

In addition to zzzeek's excellent answer:

I had a similar issue. I solved the problem by using short-lived sessions.

from contextlib import closing

with closing(new_session()) as sess:
    ...  # do your stuff

I used a fresh session per task, task group or request (in the case of a web app). That solved the "caching" problem for me.
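For readers wondering what `new_session` might be, here is a minimal sketch that assumes it is a `sessionmaker` factory. The `Task` model, the in-memory SQLite URL and the helper functions are hypothetical and only there to make the example runnable.

```python
from contextlib import closing

from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()


class Task(Base):  # hypothetical model, for illustration only
    __tablename__ = "tasks"
    id = Column(Integer, primary_key=True)
    status = Column(String(20))


engine = create_engine("sqlite://")  # in-memory database for the demo
Base.metadata.create_all(engine)

# Assumed definition of the factory used in the answer above.
new_session = sessionmaker(bind=engine)


def set_status(task_id, status):
    """Write in one short-lived session and commit before it is closed."""
    with closing(new_session()) as sess:
        task = sess.get(Task, task_id)
        if task is None:
            task = Task(id=task_id)
            sess.add(task)
        task.status = status
        sess.commit()


def read_status(task_id):
    """Read in a brand-new session, so no previously loaded state is reused."""
    with closing(new_session()) as sess:
        task = sess.get(Task, task_id)
        return task.status if task is not None else None


set_status(1, "queued")
first = read_status(1)
set_status(1, "done")
second = read_status(1)
print(first, second)
```

Because every unit of work gets its own session and transaction, there is no long-lived identity map or snapshot left to go stale.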

This material was very useful for me:

When do I construct a Session, when do I commit it, and when do I close it

Jakub M.
  • 1
    The link above is going to the docs for session. The title implies it should be pointing here: http://docs.sqlalchemy.org/en/rel_0_8/orm/session.html#session-faq-whentocreate – mozey May 29 '15 at 09:13
  • http://docs.sqlalchemy.org/en/latest/orm/session_state_management.html#when-to-expire-or-refresh – phyatt Jun 04 '18 at 18:28
3

This was happening in my Flask application, and my solution was to expire all objects in the session after every request.

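from flask.signals import request_finished

def expire_session(sender, response, **extra):
    # Mark every loaded object as stale so its next access reloads from the database.
    app.db.session.expire_all()

# Run expire_session after each request handled by flask_app.
request_finished.connect(expire_session, flask_app)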

Worked like a charm.
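What `expire_all()` does can be seen outside Flask as well. This sketch assumes a temporary SQLite file and a hypothetical `Item` model. Note that on a database running at REPEATABLE READ, expiring objects without also ending the transaction may still reload the old snapshot.

```python
import os
import tempfile

from sqlalchemy import Column, Integer, String, create_engine, text
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Item(Base):  # hypothetical model, for illustration only
    __tablename__ = "items"
    id = Column(Integer, primary_key=True)
    label = Column(String(50))


db_path = os.path.join(tempfile.mkdtemp(), "expire_demo.db")
engine = create_engine("sqlite:///" + db_path)
Base.metadata.create_all(engine)

with Session(engine) as setup:
    setup.add(Item(id=1, label="before"))
    setup.commit()

session = Session(engine)
item = session.get(Item, 1)

# An external writer changes the row on a separate connection.
with engine.begin() as conn:
    conn.execute(text("UPDATE items SET label = 'after' WHERE id = 1"))

stale = item.label      # served from the already-loaded object, no SQL is emitted
session.expire_all()    # discard the loaded attribute values
fresh = item.label      # attribute access now triggers a new SELECT

session.close()
print(stale, fresh)
```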

smottt
egafni
2

I tried session.commit() and session.flush(); neither worked for me.

After going through sqlalchemy source code, I found the solution to disable caching.
Setting query_cache_size=0 in create_engine worked.

create_engine(connection_string, convert_unicode=True, echo=True, query_cache_size=0)
Akshay Bande
  • It's worth noting that the question and the other answers discuss apparent _data caching_, where retrieved data doesn't match the latest data in the database. `query_cache_size` controls the size of SQLAlchemy's [cache of recently generated SQL queries as strings](https://docs.sqlalchemy.org/en/14/core/connections.html#sql-caching). It has no effect on query results, apart from potentially making them slower. It would of course affect memory usage. – snakecharmerb Nov 30 '22 at 08:58
  • Actually, despite wise conversations above, it is the direct answer on question title. And [Source code](https://github.com/sqlalchemy/sqlalchemy/blob/main/lib/sqlalchemy/engine/create.py#L502) comments say: `Set to zero to disable caching`. And it works: [experiment with disabled caching](https://disk.yandex.ru/i/1FSWfWD8BkfQ_A) [experiment with default caching](https://disk.yandex.ru/i/KSJu5PsGrTICRA) – Dmitry Dec 23 '22 at 21:36
  • @Dmitry no, that setting only disables _statement_ caching, it has nothing to do with the caching of _results_. – snakecharmerb Aug 14 '23 at 09:21
-1

First, there is no cache in SQLAlchemy. Depending on how you fetch data from the DB, run a test after the database has been updated by the other application and see whether you get the new data.

(1) use Connection:
from sqlalchemy import text

connection = engine.connect()
result = connection.execute(text("select username from users"))
for row in result:
    print("username:", row.username)
connection.close()
(2) use Engine ...
(3) use MetaData...

Please follow the steps in: http://docs.sqlalchemy.org/en/latest/core/connections.html

Another possible reason is that your MySQL DB was not updated permanently. Restart the MySQL service and check again.

wuliang
-4

As far as I know, SQLAlchemy does not store caches, so you need to look at the logging output.

Voislav Sauca