22

We're trying to follow the PEP8 guidelines for formatting our Python code and staying below 80 characters per line.

Our SQLAlchemy lines are particularly troublesome, having lots of chained methods and tons of complex parameters, logic, and nested functions.

Are there any particular best practices for formatting Python SQLAlchemy with the constraints of PEP8?

The closest answer I've found is here, but the code I'm dealing with is far, far more complicated.

Community
  • 1
  • 1
Dustin Kirkland
  • 5,323
  • 3
  • 36
  • 34

4 Answers4

40

Came here hoping for a better solution, but I think I prefer the parentheses wrapping style:

subkeyword = (
    Session.query(
        Subkeyword.subkeyword_id, 
        Subkeyword.subkeyword_word
    )
    .filter_by(subkeyword_company_id=self.e_company_id)
    .filter_by(subkeyword_word=subkeyword_word)
    .filter_by(subkeyword_active=True)
    .one()
)

This is nice and clear, and avoids the dreaded backslash.

Nick Sloan
  • 921
  • 1
  • 9
  • 12
9

pep-8 discourages backslashes but for SQLAlchemy code I can't help but think they're the most readable, as you can keep each generative function at the start of its own line. If there's many arguments inside of parenthesis I'll break them out on individual lines too.

subkeyword = Session.query(
                  Subkeyword.subkeyword_id, 
                  Subkeyword.subkeyword_word
             ).\
               filter_by(subkeyword_company_id=self.e_company_id).\
               filter_by(subkeyword_word=subkeyword_word).\
               filter_by(subkeyword_active=True).\
               one()

it of course doesn't matter how complicated the code is, the indentation pattern can be carried on for any amount of code, however in Python we want to avoid excessive nesting. Usually with Query the nesting would occur because you're composing many subqueries together. So definitely construct the subqueries ahead of time:

subq = Session.query(
                Bat.id, 
                func.foo(Bat.x, Bat.y).label('foo')
               ).\
                filter(Bat.id==Bar.name).\
                correlate(Bar).\
                subquery()

subq2 = Session.query(Foo.id, Foo.bar).\
                filter_by(flag>5).\
                subquery()

result = Session.query(
                  subq.c.id,
                  subq.c.foo,
                  subq2.c.bar
                ).\
                join(subq2, 
                     and_(
                      subq.c.id > subq2.c.foo, 
                      subq.bar == subq2.id
                     )
                ).\
                order_by(subq.c.id, subq2.c.bar)

I'd welcome other opinions on the backslash thing.

zzzeek
  • 72,307
  • 23
  • 193
  • 185
  • 2
    nice backslashes usage in pocoo team style guide http://www.pocoo.org/internal/styleguide/ – estin Feb 28 '12 at 12:04
4

I'm a frequent user of the backslashes in a similar way to what zzzeek indicated in his answer. PEP8 is just a guideline, don't lose sleep over it when you violate it!

However, I also frequently use the type of formatting below, where I've stolen zzzeek's first example, mildly tweaked it, and reformatted:

q = Session.query(
    Subkeyword.subkeyword_id, 
    Subkeyword.subkeyword_word,
)
q = q.filter_by(subkeyword_company_id=self.e_company_id)  # first filter
q = q.filter_by(subkeyword_word=subkeyword_word)  # 2nd filter
q = q.filter_by(subkeyword_active=True)

if filter_by_foo:
    q = q.filter(Subkeyword.foo == True)

# Run the query (I usually wrap in a try block)...
subkeyword = q.one()

The repeated reassignment to q seems kind of nasty at first, but I've gotten over it. The performance impact is effectively nil. A big advantage with this way is that you can mix in both trailing comments and comment lines to document your queries (as I've done with the useless additions above). Chaining lines with backslashes limits you here.

This way of formatting is particularly clean when formulating massive queries with tonnes of logic-triggered modifications, embedded scalar selects, etc.

As another example, I have a fairly large (> 150 lines) CTE query I'm generating in SQLAlchemy that has a lot of mixed logic, aliasing, and labeling (which is essential for readability of the generated query) that mixes both methods. A seriously reduced (and mangled) version of it starts something like below:

cte_init = session.\
    query(
        child1.foo.label("child1_foo"),
        sa.literal(1).label("indent"),  # can comment on non-slashed lines
        child2.bar.label("child2bar"),
        #comments between non-slashed lines ok, too 
        sa.func.MAX(toplevel.baz).label("max_baz"),
    ).\
    select_from(top_level).\
    join(child1,
         child1.id == toplevel.fk_child1_id).\
    join(child2.
         child2.id == toplevel.fk_child2.id).\
    filter(top_level.name == "bogus").\
    cte(name = "cte", recursive = True)

if(use_filter_x):
    cte_init = cte_init.filter_by(x = "whatever")

# etc (no, the above doesn't make any sense)...

In general, if you make sure to lead your lines off with the new operations (like many common SQL formatting schemes do), it stays quite readable. Don't be afraid of newlines within brackets, either.

Andy Baker
  • 21,158
  • 12
  • 58
  • 71
Russ
  • 10,835
  • 12
  • 42
  • 57
  • 1
    This also has the added benefit that it makes debugging with `pdb` a *lot* easier! Becasue each filter is on it's own statement! – exhuma Feb 24 '14 at 10:20
1

Yeah, these are going to be nasty no matter what you do, so to the extent that you can split these constructs up into shorter lines, definitely do so.

When you can't, you can probably get rid of all those backslashes by putting the entire RHS in a parenthesis. Python will then parse the multiline constructs properly without the backslashes, but it's also hard so say whether that's better or not. In cases like these, I think you just have to use your best judgement, hold your nose and plunge in.

  • 1
    what are the chances of pep8 being clarified regarding backslashes (literally, "These should be used in preference to using a backslash for line continuation." is the only mention) ? If they're discouraged, why does Python even have them ? – zzzeek Feb 28 '12 at 19:45
  • @zzzeek: Plenty of discouraged things exist in Python, like `map`, `reduce` and `filter`. The parens wrapping style is the best, and particularly, you don't have to move the `\\` around when you edit. – Tom Swirly Aug 04 '21 at 17:47