How do you drop certain rows per group in pandas?

Question

I have a DataFrame and would like to drop certain rows within each group. Here is the data:

import pandas as pd
data = {'GROUP': ['A','A','A','B','B','B','B','C','C','C','C','C'],
        'DATE': ['202101','202102','202103','201907','201908','201909',
                 '201910','202003','202004','202005','202006','202007']}
df = pd.DataFrame(data, columns=['GROUP','DATE'])

   GROUP    DATE
0      A  202101
1      A  202102
2      A  202103
3      B  201907
4      B  201908
5      B  201909
6      B  201910
7      C  202003
8      C  202004
9      C  202005
10     C  202006
11     C  202007

I would like to drop all the rows after the second date in each group. In other words, I would like to produce the following:

  GROUP    DATE
0     A  202101
1     A  202102
3     B  201907
4     B  201908
7     C  202003
8     C  202004

score 1 · Answer 1 · edited Sep 03 '21 at 07:17

Use GroupBy.head, which keeps the first n rows of each group along with their original index:

df.groupby('GROUP').head(2)

OUTPUT

  GROUP    DATE
0     A  202101
1     A  202102
3     B  201907
4     B  201908
7     C  202003
8     C  202004
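Note that head(2) takes the first two rows of each group in whatever order they currently appear. A minimal sketch, assuming the rows may not already be in date order (the sample data is, so the sort changes nothing here), and showing a boolean-mask alternative built on cumcount; the variable names first_two, mask and same_rows are only illustrative:

```python
import pandas as pd

data = {
    'GROUP': ['A', 'A', 'A', 'B', 'B', 'B', 'B', 'C', 'C', 'C', 'C', 'C'],
    'DATE': ['202101', '202102', '202103', '201907', '201908', '201909',
             '201910', '202003', '202004', '202005', '202006', '202007'],
}
df = pd.DataFrame(data, columns=['GROUP', 'DATE'])

# Sort first so that "first two rows" really means "two earliest dates".
# YYYYMM strings sort chronologically, so no date parsing is needed here.
first_two = df.sort_values(['GROUP', 'DATE']).groupby('GROUP').head(2)

# On data that is already in date order, the same rows can be selected
# with a mask: cumcount() numbers the rows within each group from 0.
mask = df.groupby('GROUP').cumcount() < 2
same_rows = df[mask]
```

Both results keep the original index labels (0, 1, 3, 4, 7, 8), which matches the output requested in the question.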

edited Sep 03 '21 at 07:17

ThePyGuy

answered Sep 03 '21 at 07:15

jezrael

score 0 · Answer 2 · answered Sep 03 '21 at 07:14

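Group the DataFrame by GROUP and apply a function that slices the first two values of each group. Note that the final reset_index() renumbers the rows, so the original index labels are not preserved.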

>>> df.groupby(['GROUP'])['DATE'].apply(lambda x: x.iloc[:2]).droplevel(-1).reset_index()

  GROUP    DATE
0     A  202101
1     A  202102
2     B  201907
3     B  201908
4     C  202003
5     C  202004
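If a renumbered index like the one above is what you want, a shorter route is probably to combine GroupBy.head with reset_index(drop=True). This is a sketch rather than part of the original answer, and the name renumbered is only illustrative:

```python
import pandas as pd

data = {
    'GROUP': ['A', 'A', 'A', 'B', 'B', 'B', 'B', 'C', 'C', 'C', 'C', 'C'],
    'DATE': ['202101', '202102', '202103', '201907', '201908', '201909',
             '201910', '202003', '202004', '202005', '202006', '202007'],
}
df = pd.DataFrame(data, columns=['GROUP', 'DATE'])

# Keep the first two rows per group, then discard the old index labels
# so the result is numbered 0..5 instead of 0, 1, 3, 4, 7, 8.
renumbered = df.groupby('GROUP').head(2).reset_index(drop=True)
```

Unlike selecting only the DATE column before apply, this keeps every column of the original frame, which matters once the DataFrame has more than two columns.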
