Each PatientID must have exactly one "Admit" record and exactly one "Discharge" record, nothing more and nothing less.
In this table, for example, PatientID 152096 needs to go.
PatientID | EventType |
---|---|
25173 | Admit |
25173 | Discharge |
25174 | Admit |
25174 | Discharge |
152096 | Admit |
152096 | Admit |
I got to this point by using:
dfGrouped.groupby('PatientID').filter(lambda x: len(x) == 2)
I'm wondering whether I should first combine each PatientID's records into a single row and then check, or just run the check at this point.
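For reference, here is a minimal sketch of the "just check at this point" option, without combining rows first. It assumes the frame has exactly the two columns shown in the table and that EventType has no missing values. The sample data and the helper name `has_one_admit_and_one_discharge` are made up for illustration. The idea is to filter on the sorted list of event types per group rather than on the row count alone, since a count of 2 is also true for a patient with two "Admit" rows.

```python
import pandas as pd

# Sample data mirroring the table in the question (an assumption for illustration)
dfGrouped = pd.DataFrame({
    "PatientID": [25173, 25173, 25174, 25174, 152096, 152096],
    "EventType": ["Admit", "Discharge", "Admit", "Discharge", "Admit", "Admit"],
})

def has_one_admit_and_one_discharge(group):
    # True only when the group holds exactly one "Admit" and one "Discharge"
    return sorted(group["EventType"]) == ["Admit", "Discharge"]

# Keep only the patients whose group passes the check
valid = dfGrouped.groupby("PatientID").filter(has_one_admit_and_one_discharge)
print(valid)
```

With the sample data, PatientID 152096 is dropped because its two rows are both "Admit", while the other two patients are kept.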