0

I have a, somewhat odd, query that gets me all the items in a parent table that have no matches in its corresponding child table.

If possible, id like to turn it into an SQLAlchemy query. But I have no idea how. I can do basic gets and filters, but this one is beyond my experience so far. Any help you folks might give would be greatly appreciated.

class customerTranslations(Base):
    """parent table. holds customer names"""
    __tablename__ = 'customer_translation'

    id = Column(Integer, primary_key=True)

class customerEmails(Base):
    """child table. hold emails for customers in translation table"""
    __tablename__ = 'customer_emails'

    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey('customer_translation.id'))

I want to build:

SELECT * FROM customer_translation 
WHERE id NOT IN (SELECT parent_id FROM customer_emails)
Martijn Pieters
  • 1,048,767
  • 296
  • 4,058
  • 3,343
user3723688
  • 47
  • 1
  • 10

1 Answers1

1

You have a subquery, so create one first:

all_emails_stmnt = session.query(customerEmails.parent_id).subquery()

and then you can use that to filter your other table:

translations_with_no_email = session.query(customerTranslations).filter(
    ~customerTranslations.id.in_(all_emails_stmnt))

This produces the same SQL (but with all the column names expanded, rather than using *, the ORM then can create your objects):

>>> all_emails_stmnt = session.query(customerEmails.parent_id).subquery()
>>> print(all_emails_stmnt)
SELECT customer_emails.parent_id
FROM customer_emails
>>> translations_with_no_email = session.query(customerTranslations).filter(
...     ~customerTranslations.id.in_(all_emails_stmnt))
>>> print(translations_with_no_email)
SELECT customer_translation.id AS customer_translation_id
FROM customer_translation
WHERE customer_translation.id NOT IN (SELECT customer_emails.parent_id
FROM customer_emails)

You could also use NOT EXISTS:

from sqlalchemy.sql import exists

has_no_email_stmnt = ~exists().where(customerTranslations.id == customerEmails.parent_id)
translations_with_no_email = session.query(customerTranslations).filter(has_no_email_stmnt)

or, if you have a a backreference on the customerTranslations class pointing to emails, named emails, use .any() on the relationship and invert:

 session.query(customerTranslations).filter(
    ~customerTranslations.emails.any())

Back in 2010 NOT EXISTS was a little slower on MySQL but you may want to re-assess if that is still the case.

Martijn Pieters
  • 1,048,767
  • 296
  • 4,058
  • 3,343
  • Is there any way to write it using this format? `customerTranslations.query.filter(etc etc)` – user3723688 Mar 17 '18 at 19:36
  • Yes, just move from `session.query(Class).filter(...)` to `Class.query.filter(...)` – Martijn Pieters Mar 17 '18 at 19:48
  • Amazing. Thanks again. One more expansion if I could. How would I would I go about finding matches with a specific number of hits? For example - psuedocode: `SELECT * FROM customerTranslations WHERE (there are 3 matches in the customerEmails table)` – user3723688 Mar 17 '18 at 19:57
  • @user3723688: for the subquery, use `db.session.query(customerEmails.parent_id).subquery()` (where `db` is the `SQLAlchemy()` class from the Flask-SQLAlchemy library). – Martijn Pieters Mar 17 '18 at 19:58
  • @user3723688: To get a minimum number of email rows you'd have to use `EXISTS (...)` or `WHERE id IN (...)` on a subselect with grouping on `parent_id` and `HAVING COUNT(*) > 2`. – Martijn Pieters Mar 17 '18 at 20:01
  • @user3723688: you really should look over the [ORM tutorial](http://docs.sqlalchemy.org/en/latest/orm/tutorial.html), which teaches how to create such queries. – Martijn Pieters Mar 17 '18 at 20:02
  • `three_or_more_emails = session.query(customerEmails).filter(customerEmails.parent_id == customerTranslations.id).group_by(customerEmails.id).having(func.count(customerEmails.id) >= 3).exists()` and `session.query(customerTranslations).filter(three_or_more_emails)` produces `SELECT customer_translation.id AS customer_translation_id FROM customer_translation WHERE EXISTS (SELECT 1 FROM customer_emails WHERE customer_emails.parent_id = customer_translation.id GROUP BY customer_emails.id HAVING count(customer_emails.id) >= 3)`. – Martijn Pieters Mar 17 '18 at 22:15