So I'm trying to take a dataframe like this (for example):
ID | reason_for_rejection
--------------------------
1 | invalid insurance
2 | behavior issues
3 | not enough money
4 | no space in hospital
5 | anger issues
...
and, using a hand-written mapping (for example {financial: [invalid insurance, not enough money], patient problems: [behavior issues, anger issues], ...}), create a new column containing the mapped values, turning it into:
ID | reason_for_rejection | reason_for_rejection_grouped
---------------------------------------------------------------
1 | invalid insurance | financial
2 | behavior issues | patient problems
3 | not enough money | financial
4 | no space in hospital | occupancy
5 | anger issues | patient problems
...
So while the 'reason_for_rejection' column will have a lot of unique values, I want a mapping that collapses those unique values into 7 or 8 unique values in 'reason_for_rejection_grouped'.
I considered using a dictionary here, but each key would be a value in 'reason_for_rejection_grouped' and its values would be a list of values from 'reason_for_rejection'. For every row I'd then have to find the key from one of its values, which means searching through the lists, and that seems computationally expensive (I have a really big dataset to look at).
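To make the dictionary idea concrete, here is a minimal sketch of what I have in mind, assuming pandas. The 'occupancy' entry is my guess at how the elided part of the mapping continues (it matches the desired output above), and the names `groups` and `reason_to_group` are just illustrative. It flips the dictionary once up front so that each row becomes a single dict lookup through `Series.map`, though I'm not sure this is the best approach:

```python
import pandas as pd

# Example data copied from the table above.
df = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5],
    "reason_for_rejection": [
        "invalid insurance",
        "behavior issues",
        "not enough money",
        "no space in hospital",
        "anger issues",
    ],
})

# Hand-written mapping: group -> list of raw reasons.
# ASSUMPTION: the "occupancy" entry is inferred from the desired output;
# the real mapping would have 7 or 8 groups.
groups = {
    "financial": ["invalid insurance", "not enough money"],
    "patient problems": ["behavior issues", "anger issues"],
    "occupancy": ["no space in hospital"],
}

# Invert the mapping once: raw reason -> group.
reason_to_group = {
    reason: group
    for group, reasons in groups.items()
    for reason in reasons
}

# Look up every row's reason in the inverted dict.
df["reason_for_rejection_grouped"] = df["reason_for_rejection"].map(reason_to_group)
print(df)
```

Any reason that is missing from the mapping would come out as NaN here, so those rows would still need to be handled separately.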
Any guidance or suggestions would be super helpful!