Suppose I get a string from the front end, such as

str='(A==1) & (B==\'A\') & (C>sin(2))'

This is the simplest form; the string could be much more complex.

I would like to apply the condition when filtering a DataFrame, such as

import pandas as pd

data = {'A': [1, 2, 3, 4],
        'B': ['A', 'B', 'C', 'D'],
        'C': [0.1, 0.2, 0.3, 0.4]}
df = pd.DataFrame(data)
df_test = df[eval(str)]

To make this work, I have to find the variables A, B, C in the string and replace them with df.A, df.B, df.C.

I've tried the following method:

import ast
names = [node.id for node in ast.walk(ast.parse(str)) if isinstance(node, ast.Name)]
print(names)

but it returns ['C', 'A', 'B', 'sin'] in which 'sin' is not required.

I also tried pyparsing, but still could not figure out how to define the pattern for a variable name.

I would much appreciate any advice on how to find and replace the variable names in the string.

  • You need to make your own definition of what you want and don't want; Python is not telepathic. How would it know `sin` is not required? From Python's perspective, `sin` is a variable (one that would, if you `from math import sin`, contain an object of type `builtin_function_or_method`). You might filter by e.g. "variable names that are all uppercase", or "variable names that are in the set of names of `df` columns". – Amadan Nov 15 '18 at 03:14
  • Thanks. If sin is a 'variable' for Python, is there any way to differentiate normal variable and function variable? – enzhong Wang Nov 15 '18 at 05:32
  • [How do I detect whether a Python variable is a function?](https://stackoverflow.com/questions/624926/how-do-i-detect-whether-a-python-variable-is-a-function) – Amadan Nov 15 '18 at 05:45
  • Don't define the variable `str` (or `int` or `float` or `bool` or `dict` or `list` or `tuple` or...) – PaulMcG Dec 04 '19 at 14:45

1 Answer

You can use an ast.NodeTransformer to make the replacements:

import ast
s = '(A==1) & (B==\'A\') & (C>sin(2))'
data = {'A': [1, 2, 3, 4], 'B': ['A', 'B', 'C', 'D'], 'C': [0.1, 0.2, 0.3, 0.4]}
class toDf(ast.NodeTransformer):
    def visit_Name(self, node):
        # Rewrite only names that are columns in the data, so `sin` is left alone
        if node.id in data:
            return ast.Attribute(value=ast.Name(id='df', ctx=ast.Load()), attr=node.id, ctx=node.ctx)
        return node

# ast.unparse requires Python 3.9 or later
new_s = ast.unparse(toDf().visit(ast.parse(s)))
print(new_s)

Output:

(df.A == 1) & (df.B == 'A') & (df.C > sin(2))
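
If the column names are not known in advance, one possible alternative to checking against `data` is to treat any name that is used as the function of a call as "not a column". The following is only a sketch under that assumption; `column_names` is a hypothetical helper, not part of `ast` or pandas, and a name that is both called and used as a plain variable would be excluded.

```python
import ast

def column_names(expr):
    """Return the names in expr that are never used as the function of a call."""
    tree = ast.parse(expr, mode="eval")
    # Names that appear directly as the callee, e.g. `sin` in `sin(2)`
    called = {
        node.func.id
        for node in ast.walk(tree)
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
    }
    # Every name that appears anywhere in the expression
    names = {node.id for node in ast.walk(tree) if isinstance(node, ast.Name)}
    return sorted(names - called)

expr = "(A==1) & (B=='A') & (C>sin(2))"
print(column_names(expr))  # ['A', 'B', 'C']
```

The resulting list could then replace the `node.id in data` check in the transformer above.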
Ajax1234