1

Working with Pandas, I have to rewrite queries implemented as a dict:

query = {"height": 175}

The key is the attribute for the query and the value could be a scalar or iterable. In the first part I check if the value is not NaN and scalar. If this condition holds I write the query expression with the == symbol, but else if the value is Iterable I would need to write the expression with the in keyword. This is the actual code that I need to fix in order to work also with Iterables.

import numpy as np
from collections import Iterable


def query_dict_to_expr(query: dict) -> str:
    expr = " and ".join(["{} == {}"
                        .format(k, v) for k, v in query.items()
                         if (not np.isnan(v)
                             and np.isscalar(v))
                         else "{} in @v".format(k) if isinstance(v, Iterable)
                         ]
                        )
    return expr

but I got invalid syntax in correspondence with the else statement.

cezar
  • 11,616
  • 6
  • 48
  • 84
Juan Carlos
  • 367
  • 2
  • 8
  • 16
  • Possible duplicate of [if/else in Python's list comprehension?](https://stackoverflow.com/questions/4260280/if-else-in-pythons-list-comprehension) – Right leg Aug 29 '17 at 08:19
  • 1
    `else "{} in @v".format(k) if isinstance(v, Iterable)`. Why the second `if` there? – 10101010 Aug 29 '17 at 08:20
  • 2
    If you can't work out how to write a comprehension, then write it out as a loop, that will do no harm unless performance is absolutely critical and will be more readable for you no doubt – Chris_Rands Aug 29 '17 at 08:22
  • to check if the value is Iterable. It has to be an else if statement. – Juan Carlos Aug 29 '17 at 08:23
  • You can't put `if` again after the `else`. The list comprehension can't get it is a nested if/else clause. – cezar Aug 29 '17 at 08:24
  • @Juan not sure if it helps, but you may be able to re-use something from an [old answer](https://stackoverflow.com/questions/45711055/backlash-error-pandas-filter-dataframe-using-dynamic-query-string/45711693#45711693) of mine... (swapping the == with `in` depending on type) - so you're always using the `@` syntax... – Jon Clements Aug 29 '17 at 08:26

2 Answers2

1

If I understand correctly, you don't need to check the type:

In [47]: query
Out[47]: {'height': 175, 'lst_col': [1, 2, 3]}

In [48]: ' and '.join(['{} == {}'.format(k,v) for k,v in query.items()])
Out[48]: 'height == 175 and lst_col == [1, 2, 3]'

Demo:

In [53]: df = pd.DataFrame(np.random.randint(5, size=(5,3)), columns=list('abc'))

In [54]: df
Out[54]:
   a  b  c
0  0  0  3
1  4  2  4
2  2  2  3
3  0  1  0
4  0  4  1

In [55]: query = {"a": 0, 'b':[0,4]}

In [56]: q = ' and '.join(['{} == {}'.format(k,v) for k,v in query.items()])

In [57]: q
Out[57]: 'a == 0 and b == [0, 4]'

In [58]: df.query(q)
Out[58]:
   a  b  c
0  0  0  3
4  0  4  1
MaxU - stand with Ukraine
  • 205,989
  • 36
  • 386
  • 419
1

You misplaces the if/else in the comprehension. If you put the if after the for, like f(x) for x in iterable if g(x), this will filter the elements of the iterable (and can not be combined with an else). Instead, you want to keep all the elements, i.e. use f(x) for x in iterable where f(x) just happens to be a ternary expression, i.e. in the form a(x) if c(x) else b(x).

Instead, try like this (simplified non-numpy example):

>>> query = {"foo": 42, "bar": [1,2,3]}  
>>> " and ".join(["{} == {}".format(k, v)
                  if not isinstance(v, list)
                  else "{} in {}".format(k, v)
                  for k, v in query.items()])
'foo == 42 and bar in [1, 2, 3]'
tobias_k
  • 81,265
  • 12
  • 120
  • 179