3

I have a dict and a dataframe like the examples v and df below. I want to search through the items in df and return the item that has the maximum number of field values in common with the values in the dict. In this case it would be item 3. I was thinking maybe using apply with a lambda function, or transposing the df. I just can't quiet get my head around it. If anyone has a slick way to do this or any tips they're greatly appreciated.

input:

v={'size':1,'color':red}

df:

item size color
2    2    red
3    1    red

Output:
3
jpp
  • 159,742
  • 34
  • 281
  • 339
user3476463
  • 3,967
  • 22
  • 57
  • 117

2 Answers2

2

Create one line DataFrame and merge with original:

a = pd.DataFrame(v, index=[0]).merge(df)['item']
print (a)
0    3
Name: item, dtype: int64

Another solution with query, but if strings values of dict is necessary add another ":

v1 = {k: '"{}"'.format(v) if isinstance(v, str) else v for k, v in v.items()}
print (v1)
{'size': 1, 'color': '"red"'}

df = df.query(' & '.join(['{}=={}'.format(i,j) for i, j in v1.items()]))['item']
print (df)
1    3
Name: item, dtype: int64

In output are possible 3 ways - Series with more values, one value or empty, so helper function was created:

def get_val(v):
    x = pd.DataFrame(v, index=[0]).merge(df)['item']
    if x.empty:
        return 'Not found'
    elif len(x) == 1:
        return x.values[0]
    else:
        return x.values.tolist()
print (get_val({'size':1,'color':'red'}))
3

print (get_val({'size':10,'color':'red'}))
Not found

print (get_val({'color':'red'}))
[2, 3]
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252
2

An alternative solution is to work with dictionaries instead of dataframes:

v = {'size': 1, 'color': 'red'}

match_count = {}

fields = df.columns[1:]

for k, value in df.to_dict(orient='index').items():
    match_count[value['item']] = sum(value[i] == v[i] for i in fields & v.keys())

Result

print(match_count)
# {2: 1, 3: 2}

res = max(match_count.items(), key=lambda x: x[1])

print(res)
# (3, 2)
jpp
  • 159,742
  • 34
  • 281
  • 339