I have a DataFrame with a column that contains dictionaries. I need to compare the first two values inside each dict and, if they are equal, collect the entire row. I can't show any code of my own because I really don't know how to approach this, but here is a small example of my DataFrame to make the situation clearer.
import pandas as pd
test = pd.DataFrame({'one': ['hello', 'there', 'every', 'body'],
                     'two': ['a', 'b', 'c', 'd'],
                     'dict': [{'composition': 12, 'process': 4, 'pathology': 4},
                              {'food': 9, 'composition': 9, 'process': 6, 'other_meds': 3},
                              {'process': 2},
                              {'composition': 6, 'other_meds': 6, 'pathology': 2, 'process': 1}]})
test
So the data looks like this:
   one    two  dict
0  hello  a    {'composition': 12, 'process': 4, 'pathology': 4}
1  there  b    {'food': 9, 'composition': 9, 'process': 6, 'other_meds': 3}
2  every  c    {'process': 2}
3  body   d    {'composition': 6, 'other_meds': 6, 'pathology': 2, 'process': 1}
My goal is to collect the rows with index 1 and 3 into a new DataFrame, because the first two values of their dicts are the same: 'food': 9, 'composition': 9 and 'composition': 6, 'other_meds': 6. The row with index 0 also contains equal values, but it is not of interest because they are not in the first and second positions.
I know that loc and iloc are used to select rows, but I don't know how to express a condition on a dictionary. Please help!
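
To make the condition concrete, here is a minimal sketch of what I imagine it could look like. It is not a confirmed solution. It assumes the dicts keep their insertion order (guaranteed from Python 3.7 on), and first_two_equal, mask, and result are just names I made up for illustration.

```python
import pandas as pd

test = pd.DataFrame({'one': ['hello', 'there', 'every', 'body'],
                     'two': ['a', 'b', 'c', 'd'],
                     'dict': [{'composition': 12, 'process': 4, 'pathology': 4},
                              {'food': 9, 'composition': 9, 'process': 6, 'other_meds': 3},
                              {'process': 2},
                              {'composition': 6, 'other_meds': 6, 'pathology': 2, 'process': 1}]})


def first_two_equal(d):
    # Hypothetical helper: True only when the dict has at least two
    # entries and its first two values (in insertion order) are equal.
    values = list(d.values())
    return len(values) >= 2 and values[0] == values[1]


# Build a boolean mask row by row, then select the matching rows with loc.
mask = test['dict'].apply(first_two_equal)
result = test.loc[mask]
print(result)
```

With the example data, this should keep only the rows with index 1 and 3. The row with the single-entry dict is skipped by the length check instead of raising an error.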