2

I'm using declarative_base() and trying to figure out how to enforce a lazy inner join when using a many-to-many relationship.

I have tables in MySQL and InnoDB with Foreign key constraints defined.

users: userid, name
permissions: permissionid, name
users_permissions: userid, permissionid

I'm using metadata.reflect() to load my database.


class User_Perm(Base):
    __table__ = Base.metadata.tables['users_permissions']

class User(Base):
    __table__ = Base.metadata.tables['users']

    permissions = orm.relationship('Permission',
            secondary=User_Perm.__table__,
            order_by='Perm.name',
            innerjoin=True,
            lazy=True,
            )

class Permission(Base):
    __table__ = Base.metadata.tables['permissions']

Whenever I select

u = Session().query(User).filter(User.name == 'myuser').first()
u.permissions

The query received by MySQL server is:

SELECT permissions.permissionid AS permissions_permissionid,
     permissions.name AS permissions_name
FROM permissions, users_permissions 
WHERE ? = users_permissions.userid
    AND permissions.permissionid = users_permissions.permissionid
ORDER BY permissions.name

As we can see, the FROM permissions, users_permissions is not an inner join. Can I force this without the need to use lazy=False because if I do so, the cascading effect will be to load too much information as permissions are also in relationship with another table (not mentioned in example) and users is also related to other tables (again, not mentioned in example). I would like to use the same template for relationship for all my classes.

EDIT: CONTEXT I'm trying to replicate all the SQL queries to match those from the current system. I'm trying to migrate from oursql to sqlalchemy orm.

Danosaure
  • 3,578
  • 4
  • 26
  • 41
  • 1
    if you need a character-per-character match between two applications, one using hand-written queries and the other an ORM, you will be disappointed. The SQLAlchemy ORM can certainly reproduce queries that are structurally extremely similar, but some features such as relationship lazy loading by their nature don't have a space for the query to be tailored fully. As mentioned, you'll want to compare MySQL's EXPLAIN output to the INNER JOIN vs. implicit join above to verify equivalent performance. – zzzeek May 06 '12 at 15:03

1 Answers1

4

MySQL is doing an inner join; that is, the query you've shown is one implementation of an inner join.

From the docs:

innerjoin=False – when True, joined eager loads will use an inner join to join against related tables instead of an outer join. The purpose of this option is strictly one of performance, as inner joins generally perform better than outer joins. This flag can be set to True when the relationship references an object via many-to-one using local foreign keys that are not nullable, or when the reference is one-to-one or a collection that is guaranteed to have one or at least one entry.

The key point is inner join vs. outer join. The syntax is irrelevant.

Note that your query can be converted to a query using the inner join syntactic form which has the exact same meaning:

SELECT permissions.permissionid AS permissions_permissionid,
     permissions.name AS permissions_name
FROM permissions
INNER JOIN users_permissions 
    ON permissions.permissionid = users_permissions.permissionid
WHERE ? = users_permissions.userid
ORDER BY permissions.name

MySQL will usually generate an identical execution plan for the two forms.

Community
  • 1
  • 1
Matt Fenwick
  • 48,199
  • 22
  • 128
  • 192
  • You are saying that behind the scene, I am guaranteed that MySQL will treat this type of queries like an inner join? I understand the resultset to be the same, but didn't think optimizer will always guaranty it. – Danosaure May 04 '12 at 14:24
  • 1
    @Danosaure I'm saying that your query is **always logically equivalent** to a query written using `inner join` -- that is, it will return the *exact* same data. In addition, the execution plan that MySQL generates is usually identical, as can be verified with `explain`. – Matt Fenwick May 04 '12 at 15:45
  • @Danosaure I may have slightly misunderstood your issue. Could you clarify? Is it not enough that the queries are logically equivalent? – Matt Fenwick May 04 '12 at 15:59
  • My concern was the "usually". What I understand from your answer is that sqlalchemy gives this select statement because it knows it uses MySQL and that MySQL will "optimize" the query for the execution plan? If so, wouldn't it have been just simpler to always use SQL-92 join syntax? – Danosaure May 04 '12 at 17:45
  • @Danosaure why does that bother you? I assume it's because it's just easier for sqlalchemy to generate the query string. – Matt Fenwick May 04 '12 at 18:40
  • @Danosaure I said "usually" because I don't know if the execution plan is guaranteed to always be identical. In my personal experience, however, I've never seen differing plans for such corresponding queries. – Matt Fenwick May 04 '12 at 18:41
  • It bothers me because `innerjoin=True` only applies to eager loaded, not to non-eager (`inner join` syntax too). But as you mentioned, execution plan is "usually" the same, so I guess, I'm worried for nothing. I'm transferring a running app from oursql to sqlalchemy ORM, and my first way to validate the understanding is trying to get the same SQL queries. Thanks for your help, I'll leave this few days to see if other options are available before accepting the answer. – Danosaure May 04 '12 at 20:38
  • 1
    lazyloading doesn't use a join first because in the vast majority of cases it's loading from exactly one table. in the less common case of a "secondary" table being present, it uses a natural join which keeps things simple. there should be no significant optimization differences. running EXPLAIN is your best bet. – zzzeek May 06 '12 at 15:07