Say I have a pandas dataframe and a dictionary as defined below:
import numpy as np
import pandas as pd
df = pd.DataFrame({"c1": np.array(['a', 'a', 'b', 'b', 'a']), "c2": np.array([1, 2, 2, 2, 2])})
  c1  c2
0  a   1
1  a   2
2  b   2
3  b   2
4  a   2
to_keep = {'a':[1],'b':[2,3]}
{'a': [1], 'b': [2, 3]}
I want to keep the rows where c1 is a key of to_keep and c2 is one of the values listed for that key. In other words, I want to get the following dataframe:
  c1  c2
0  a   1
2  b   2
3  b   2
I have tried many things, like df[(df["c1"] in to_keep.keys) and df["c2"] in to_keep["c1"]], but I cannot pass the correct argument to the to_keep dict to get the appropriate list of values for each row. I have also thought of making a list of all possible combinations of c1 and c2, but that may be inefficient given the size of my dataset.
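For reference, the combinations idea could be sketched roughly as follows. This is only a minimal sketch, not a tuned solution: the names pairs and result are hypothetical, and it assumes to_keep is small enough to flatten into a lookup table with one row per allowed (c1, c2) pair, which is then inner-merged with df.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"c1": np.array(["a", "a", "b", "b", "a"]),
                   "c2": np.array([1, 2, 2, 2, 2])})
to_keep = {"a": [1], "b": [2, 3]}

# Flatten the dict into one (c1, c2) row per allowed combination.
# "pairs" is a hypothetical helper name, not part of pandas.
pairs = pd.DataFrame(
    [(key, value) for key, values in to_keep.items() for value in values],
    columns=["c1", "c2"],
)

# An inner merge keeps only the rows of df whose (c1, c2) pair is allowed.
# reset_index/set_index carries the original row labels through the merge.
result = (
    df.reset_index()
      .merge(pairs, on=["c1", "c2"], how="inner")
      .set_index("index")
      .rename_axis(None)
      .sort_index()
)
print(result)
```

The lookup table only has as many rows as there are values in to_keep in total, so its size depends on the dict rather than on the length of df.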
Any suggestions?