I have a massive dataframe. The dataframe has column patient.drug. This column contains list of dictionaries as its elements. I want to filter out all the rows that contain 'NIFEDIPINE' word in patient.drug column.
The dataframe is very large. Here is a sample of it.
patient.drug
0 [{'drugcharacterization': '1', 'medicinalproduct': 'PANDOL'}]
1 [{'drugcharacterization': '2', 'medicinalproduct': 'NIFEDIPINE'}]
2 [{'drugcharacterization': '3', 'medicinalproduct': 'SIMVASTATIN'}]
3 [{'drugcharacterization': '4', 'medicinalproduct': 'NIFEDIPINE'}]
so far, I have tried
df[df['patient.drug'].str.contains('NIFEDIPINE')]
but it is giving me an error.
raise KeyError(f"None of [{key}] are in the [{axis_name}]")
KeyError: "None of [Float64Index([nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,\n ...\n nan, nan, nan, nan, nan, nan, nan, nan, nan, nan],\n dtype='float64', length=12000)] are in the [columns]"
I have also tried using in
operator and iterating over rows.
lst=[]
for i in range(len(df)):
if 'NIFEDIPINE' in df.loc[i, "patirnt.drug"]:
lst.append(i)
print(lst)
Which is also causing an error. What should I do to get it right?