2

I have the following Models/Tables in falsk-sqlalchemy.:

class MainTerm(db.Model):
    __tablename__ = 'main_term'
    id = db.Column(db.Integer, primary_key=True)
    text = db.Column(db.String)


terms_in_gzt = db.Table('terms_in_gzt',
    db.Column('gzt_term_id', db.Integer, db.ForeignKey('gzt_term.id')),
    db.Column('gazetteer_id', db.Integer, db.ForeignKey('gazetteer.id')),
)

class Gazetteer(db.Model):
    __tablename__ = 'gazetteer'
    id = db.Column(db.Integer, primary_key=True)
    name = db.Column(db.String)
    terms = db.relationship('GztTerm', secondary=terms_in_gzt, backref='gazetteers', lazy='dynamic')

class GztTerm(db.Model):
    __tablename__ = 'gzt_term'
    id = db.Column(db.Integer, primary_key=True)
    text = db.Column(db.String)

I'm currently running sqlite (sqlite version '3.32.3'), but it is later intended to use postgresql. So a solution for postgres would be more important than for sqlite.

And I have this join query that queries the text column of GztTerm entries that are associated with Gazetteer.id == 3:

joinq = db.session.query(GztTerm.text).join(terms_in_gzt).join(Gazetteer).filter(Gazetteer.id == 3)
joinq.all()

With the following result:

[('IBM'), ('Lund'), ('Bluetooth')]

When I now call:

binary_expression = MainTerm.text.contains(joinq)
res = db.session.query(MainTerm).filter(binary_expression).all()

Then I get only results with the first element "IBM". The results with "Bluetooth" or "Lund" are missing, even though they are in the database. When I change the query, I always get only results for the first element of the joinq.

So my question is:
What do I need to change, that my BinaryExpression MainTerm.text.contains(joinq) considers all the elements of the joinq?

Thanks in advance!


Edit:

The following code for constructing the binary_expression archives the desired behaviour:

from sqlalchemy import or_
binary_expression = or_(MainTerm.text.contains(text) for text in joinq)

However, this feels more like a hack and is probably not that efficient.
Is there some way to archive the same behaviour without loop only using sqlalchemy?

MBT
  • 21,733
  • 19
  • 84
  • 102
  • What DB are you using? – Ilja Everilä Oct 09 '20 at 10:33
  • Print out the query itself, you might spot a difference in the two ways it's generating the SQL code – Peter Oct 09 '20 at 10:35
  • @IljaEverilä I'm using sqlite version '3.32.3'. Could this be different for other DBs? – MBT Oct 09 '20 at 12:41
  • @Peter what exactly do you mean by print out? – MBT Oct 09 '20 at 12:42
  • As in `query = db.session.query(MainTerm).filter(binary_expression); result = query.all()`. You can print the query object to see the raw SQL behind it. – Peter Oct 09 '20 at 13:05
  • @Peter thanks for your reply. I have done that. *(too keep it readable I did not post the data here)* `result` only contains entries with `IBM`, the following rows from `joinq` are ignored. No matter how I change the query, only the first element of `joinq` will be considered by the `contains()` expression. – MBT Oct 09 '20 at 13:17
  • @IljaEverilä I tested with postgres. There it does not work att all and I get a `CardinalityViolation: more than one row returned by a subquery used as an expression`. Is this a general limitation for `contains`? The `in_` statement works fine even with postgres. Is there some way to work around this limitation? *(it is intended to later use postgres over sqlite)* – MBT Oct 09 '20 at 13:21
  • Ah, I get the issue now you posted the edit. All contains does is something around the lines of `select from where LIKE "%%"`. Perhaps use something like this as inspiration on how to get it working, you may need to set up a subquery - https://stackoverflow.com/a/30889878/2403000
    – Peter Oct 09 '20 at 14:16
  • In case of PostgreSQL you could use `LIKE ANY`: https://stackoverflow.com/a/42573552/2681632 – Ilja Everilä Oct 09 '20 at 17:21
  • @IljaEverilä Thanks for the suggestion! I tried but without luck, I can construct the expression: `binary_expression = MainTerm.text.contains(any_(joinq))` but when running the final query I get `ProgrammingError: (psycopg2.errors.SyntaxError) syntax error at or near "ANY" LINE 3: WHERE (main_term.text LIKE '%' || (ANY (SELECT gzt_term.text...` where `ANY` is highlighted here. `.contains()` seems to have problems with `any_`, running the same expression with `like` or `ilike` instead works perfectly fine. *(even though `contains` seems to be a variant of `like`)* – MBT Oct 09 '20 at 18:49
  • @IljaEverilä I ended up adding a `column_property('%'+text+'%')` and using `like` instead. This is good enough for me. Thanks a lot for pointing me in the right direction! – MBT Oct 09 '20 at 19:13

1 Answers1

2

I ended up adding a column_property to GztTerm:

from sqlalchemy.orm import column_property

class GztTerm(db.Model):
    __tablename__ = 'gzt_term'
    id = db.Column(db.Integer, primary_key=True)
    text = db.Column(db.String)
    text_like = column_property('%'+text+'%')

This enables me to use like instead of contains (which itself is based on like) together with any_ (Thanks to Ilja Everilä for this idea!).

The final query looks like this:

from sqlalchemy import any_
joinq  = db.session.query(GztTerm.text_like).join(terms_in_gzt).join(Gazetteer).filter(Gazetteer.id == 3)
binary_expression = MainTerm.text.like(any_(joinq))
res = db.session.query(MainTerm).filter(binary_expression).all()

To get the complement notlike and all_ is used for the binary expression:

from sqlalchemy import all_
binary_expression = MainTerm.text.notlike(all_(joinq))

Hopefully this is helpful to someone!

MBT
  • 21,733
  • 19
  • 84
  • 102