0

I have the following dataframe

id code code_date   medication medication_date
1  A     2017-05-18 Y          2017-05-18
1  A     2017-05-25 V          2017-05-18
1  Y     2017-07-18 D          2017-05-18
2  C     2017-08-18 C          2017-05-18
2  C     2017-09-18 Y          2017-05-18
2  Y     2017-03-18 O          2017-05-18

I would like to select the rows where where the earliest code takes place. In the above example, for patient 1 we have code A repeated while for patient 2 C is repeated. I would like to remove those rows of the repeated code and the latter date. Also, note that I do not care about the medication or medication_date but they should be in the new dataframe:

id code code_date   medication medication_date
1  A     2017-05-18 Y          2017-05-18
1  Y     2017-07-18 D          2017-05-18
2  C     2017-08-18 C          2017-05-18
2  Y     2017-03-18 O          2017-05-18

So far I have tried:

df.groupby(["id", "code", "code_date"]).nth(0).reset_index()

But I don't get the right answer. Any suggestions are more than welcome.

Economist_Ayahuasca
  • 1,648
  • 24
  • 33

0 Answers0