0

Objective: I am looking to define a function which takes a single argument - a dictionary of column names and values - and returns a list of matching criteria from a Pandas data frame

Details: I am looking to programmatically generate the following string

data[(mydf.anchor_name == 'ing') & (mydf.sales_qty ==8)]

Generates an example pandas dataframe:

import pandas as pd
mydf = pd.DataFrame({'sales_qty' : pd.Series([4,8,10]),
                    'distance' : pd.Series([454.75,477.25,242.12]),
                    'signature' : pd.Series(['ab','cd','ab']),
                    'anchor_name' : pd.Series(['tec','ing','pol']),
                    'station_list' : pd.Series([['t1','t2','t3'],
                    ['4','t2','t3'],['t3','t2','t4']])
                    })

I have been trying to work with this code:

data = mydf
params = {"anchor_name": 'ing', "sales_qty": 8}
filters = ["{}".format(k) for k in params]
t = tuple(params.values())
data += "[df."+ " ) & (df.".join(t).join(filters)+")]"
Student
  • 1,197
  • 4
  • 22
  • 39
  • Kind of related: https://stackoverflow.com/questions/45711055/backlash-error-pandas-filter-dataframe-using-dynamic-query-string – Jon Clements Mar 04 '18 at 19:07

1 Answers1

1

I think that is query

data.query(' and '.join(['{}=="{}"'.format(*tup) for tup in list(zip(filters,t))]))
Out[180]: 
  anchor_name  distance  sales_qty signature station_list
1         ing    477.25          8        cd  [4, t2, t3]
BENY
  • 317,841
  • 20
  • 164
  • 234