2

I'm learning SQLAlchemy and I want to make sure that I've understood the backref parameter in relationship correctly.

For example

from app import db

class User(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    username = db.Column(db.String(64), unique=True)

    posts = db.relationship('Post', backref='author', lazy=True)


class Post(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    body = db.Column(db.String(140))

    user_id = db.Column(db.Integer, db.ForeignKey('user.id'))

Say I have a User object j = models.User.query.get(1). My question is, is there any difference between the following things?

  • j.posts
  • Post.query.filter_by(author=j).all()
  • Post.query.with_parent(j).all()
  • Post.query.with_parent(j, property='posts').all()
  • Post.query.with_parent(j, property=User.posts).all()

The results returned are same, but I don't know whether the SQL statements executed are identical.

What I've tried

The SQLAlchemy docs says:

with_parent(instance, property=None, from_entity=None)

...the given property can be None, in which case a search is performed against this Query object’s target mapper.

So the last three statements seem same, but I don't really understand what does this Query object’s target mapper refer to. Is it Post in this case, for this query is performed on Post?

nalzok
  • 14,965
  • 21
  • 72
  • 139

2 Answers2

7

Even if the generated SQL statements are identical, the commands you enlisted may have a different impact on your application, e.g. j.posts will cache (memoize, do not confuse with Werkzeug caching) results you have got, while others will fetch them every single time.

If you remove .all() from your queries you can simply print them:

query = Post.query.filter_by(author=j)
print(query)

Which would result in:

SELECT post.id AS post_id, post.body AS post_body, post.user_id AS post_user_id 
FROM post 
WHERE ? = post.user_id

Using .all() is essentially like getting [m for m in query]).

The trick with query-printing will not work for j.posts which will return something like:

> print(j.posts)
> [Post(...), Post(..)]

Still, you can see all the silently emitted queries using built-in sqlalchemy loggers. See the following code:

from flask import Flask
from flask_sqlalchemy import SQLAlchemy
from sqlalchemy.engine import Engine
from sqlalchemy import event
import logging


app = Flask(__name__)
app.config['SQLALCHEMY_DATABASE_URI'] = 'sqlite:////tmp/tests.db'
db = SQLAlchemy(app)


logging.basicConfig()
logger = logging.getLogger('sqlalchemy.engine')


class User(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    username = db.Column(db.String(64), unique=True)

    posts = db.relationship('Post', backref='author', lazy=True)


class Post(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    body = db.Column(db.String(140))

    user_id = db.Column(db.Integer, db.ForeignKey('user.id'))

db.drop_all()
db.create_all()
user = User(username='test', posts=[Post(body='some body')])
db.session.add(user)
db.session.commit()

# start logging
logger.setLevel(logging.DEBUG)
j = User.query.get(1)

queries = {
    "j.posts",
    "Post.query.filter_by(author=j)",
    "Post.query.with_parent(j)",
    "Post.query.with_parent(j, property='posts')",
    "Post.query.with_parent(j, property=User.posts)",
}

def test_queries():
    for name in queries:
        print('\n=======')
        print('Executing %s:' % name)
        query = eval(name)
        print(query)

test_queries()  # you should see j.posts query here
print('Second test')
test_queries()  # but not here

Getting back to your question: yes, the emitted SQL queries are identical.

In Query object’s target mapper, Query object's target refers to Post in your example. Decoupling this, when you declare Post class, inheriting from db.Model, for SQLAlchemy it is like creating an object Post and mapping the properties of this object to columns of specially created table.

Underneath there is an instance of Mapper class, which is responsible for the mapping for every single model that you create (learn more about mapping here: Types of Mappings). You can simply get this mapper calling class_mapper on your model or object_mapper on an instance of your model:

from sqlalchemy.orm import object_mapper, class_mapper, 
from sqlalchemy.orm.mapper import Mapper
assert object_mapper(j) is class_mapper(User)
assert type(class_mapper(User)) is Mapper

The Mapper has all the necessary information about the columns and relations you have in your model. When calling Post.query.with_parent(j) this information is used to find a property (i.e. relationship) relating Post and User objects, so in your case to populate 'property' with User.posts.

krassowski
  • 13,598
  • 4
  • 60
  • 92
  • This answer is awesome, but still my humble mind has a few questions: [*If you care to answer them I will give you a (large) bounty!*] **1.** shouldn't _query object’s target mapper_ be a `Mapper`? Why is `Post`, an object, the "target mapper", instead of something like `class_mapper(Post)`? **2.** You said "`j.posts` will cache results you have got", so what if the database is updated after I call `j.posts` for the first time? Won't this cause inconsistency? **3.** What exactly does "caching" mean? Extracting data from the database and store it in Python objects? – nalzok Jul 17 '17 at 12:28
  • **4.** According to [the docs](http://flask-sqlalchemy.pocoo.org/2.2/models/#one-to-many-relationships), lazy loading means _load the data as necessary_, but how much is necessary? Take `j.posts` for example, what is being cached, for `lazy=True`, `lazy=False` and `lazy=subquery`respectively? [*Yeah I know I have many many questions, so I will give you a big bounty, maybe 150, if you can answer them!*] – nalzok Jul 17 '17 at 12:34
  • 1. You are right that's what I wanted to write 2, 3 and 4: Will "cache" them until any session changing operation is performed (e.g. `db.session.commit()` or `db.session.rollback()`). It does exactly as you guessed extracts the data and stores them in Python objects (those are instrumented objects to be precise); 4. If you use eager loading, your relationship will be populated with first call e.g. with `j = User.query.get(1)`, not when you access `j.posts` for the first time. – krassowski Jul 17 '17 at 12:40
  • Question 1, 2 and 3 have been solved! _4. If you use eager loading, your relationship will be populated with first call e.g. with `j = User.query.get(1)`, not when you access `j.posts` for the first time._ But when I changed the `lazy=True` in the code in your answer to `lazy=False` and run it, I can still see some SQL statements when `test_queries()` is first executed... – nalzok Jul 17 '17 at 12:46
  • It works as expected if you use `j = db.session.query(User).filter(User.id==1).one()`. Seemingly using `.get()` prevents eager loading, contrary to what I thought. – krassowski Jul 17 '17 at 13:01
  • This is quite counter-intuitive, and IMO worth opening another question for :/ Anyway, thanks for teaching me many thing today! But unfortunately this question is _currently_ too new for me to start a bounty on. I will start one and give it to you after the system permits me to do so, [which would take at least 2+1 days](https://stackoverflow.com/help/bounty). PS: [Printing messages with color](https://stackoverflow.com/questions/287871/print-in-terminal-with-colors-using-python) can make the debug output much clearer. – nalzok Jul 17 '17 at 13:10
  • 1
    Thanks. I agree the strange get() behavior is a separate question. Personally I almost always use full queries rather than get() as those are more potent. PS: The get() behavior may be not intended and related to this bug: https://bitbucket.org/zzzeek/sqlalchemy/issues/1763/eager-loader-options-parameters-for – krassowski Jul 17 '17 at 13:46
  • It's really nice to see it's probably a bug. Also, why is full queries more potent than `get()`s? I thought `get()` is essentially a shortcut of the corresponding full query! – nalzok Jul 17 '17 at 14:00
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/149392/discussion-between-krassowski-and-sun-qingyao). – krassowski Jul 17 '17 at 14:05
1

To see the queries you can run your python script with -i and then run each query individually and it will print out the SQL code it runs. Example:

main.py:

import sqlalchemy
from sqlalchemy import create_engine, Column, Integer, String, Sequence
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker
import os

engine = create_engine('sqlite:///:memory:', echo=True)

Base = declarative_base()

class User(Base):
    __tablename__ = 'users'

    id = Column(Integer, Sequence('user_id_seq'), primary_key=True)
    name = Column(String(50))
    fullname = Column(String(50))
    password = Column(String(12))

    def __repr__(self):
        return "< User(name={}, fullname={}, password={} )>".format(self.name, self.fullname, self.password)

Base.metadata.create_all(engine)

ed_user= User(name='ed', fullname='Ed Jones', password='edpassword')
Session = sessionmaker(bind=engine, autoflush=False)
session = Session()
session.add(ed_user)
session.add_all([
    User(name='wendy', fullname='Wendy Williams', password='foobar'),
    User(name='mary', fullname='Mary Contraty', password='xxg527'),
    User(name='fred', fullname='Fred Flinstone', password='blah')
])
session.commit()
os.system('clear')

Now you run it with python -i main.py, type: session.query(User).filter_by(name='ed').first() and you will see the SQL generated. After running all of your tests I concluded that they are all identical. With this method you can test any query and see if there is any difference.

p.s. I added the os.system('clear') to remove all the unnecessary output from creating the database and some other stuff.

kemis
  • 4,404
  • 6
  • 27
  • 40
  • I'm using FLASK-SQLAlchemy so I'm not quite familiar with SQLAlchemy. Can you make the code a complete, runnable one? For example, adding the import statements. – nalzok Jul 17 '17 at 10:06
  • @SunQingyao Edited, sorry for forgetting the imports :) – kemis Jul 17 '17 at 10:14
  • Experiment done, result: **all of these 5 queries are identical.** You may want to include the result in your answer, so that I can accept it :) [Here is the code](https://pastebin.com/8iExFNyT), for future reference. – nalzok Jul 17 '17 at 11:10
  • @SunQingyao Done. If you have anything to add you can freely edit the answer. :) – kemis Jul 17 '17 at 11:21