I have a pandas DataFrame with about 30,000 records and would like to find all the values in a specific column that occur fewer than 10 times in total. The DataFrame contains clinical trial data, and the column I need to filter and update holds the disease for each trial. Some diseases appear in numerous clinical trials, so I first need to find all the diseases that appear fewer than 10 times and then, for those diseases, change the text to a new string called 'other'. The result then needs to be written back to that same column.
This is the code I have come up with, but JupyterLab seems to freeze when I try to run it:
df_diseases = df.groupby(['Diseases']).filter(lambda x: x['Diseases'].count() < 10).apply(lambda x: x.replace(x,'other'))
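For illustration, here is a minimal sketch of the result I am after, on a small made-up frame. The data below is hypothetical; only the `Diseases` column name comes from my DataFrame. It assumes counting with `value_counts` and assigning through a boolean mask, which may or may not be the best way to do this:

```python
import pandas as pd

# Toy stand-in for the real trial data; the values are invented for illustration.
df = pd.DataFrame(
    {"Diseases": ["asthma"] * 12 + ["gout"] * 3 + ["flu"] * 10 + ["rare"]}
)

# Map each row to the total number of times its disease occurs in the column.
counts = df["Diseases"].map(df["Diseases"].value_counts())

# Overwrite diseases seen fewer than 10 times, updating the same column in place.
df.loc[counts < 10, "Diseases"] = "other"

print(df["Diseases"].value_counts())
```

Here 'gout' and 'rare' would be merged into 'other', while 'asthma' and 'flu' stay unchanged because they occur at least 10 times.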