How to delete the first line of each group in pandas

Question

I have a DataFrame like this:

   id  values
0   1       3
1   1       6
2   1       3
3   2       7
4   2       6
5   2       3
6   2       9

I want to delete the first row of each group, grouped by id. The result should look like this:

   id  values
1   1       6
2   1       3
4   2       6
5   2       3
6   2       9

I tried df = df.groupby('id').agg(lambda x: x[1:]), but it doesn't work.

Can someone help me? Thanks in advance.

Does this answer your question? [Python: Pandas - Delete the first row by group](https://stackoverflow.com/questions/31226142/python-pandas-delete-the-first-row-by-group) — RichieV, Sep 03 '20 at 04:10

jezrael · Accepted Answer · 2018-05-23T10:42:20.600

5

Use apply with iloc:

df = df.groupby('id', group_keys=False).apply(lambda x: x.iloc[1:])
# this also works here, but I'm not sure it works in general
# df = df.groupby('id', group_keys=False).apply(lambda x: x[1:])
print(df)
   id  values
1   1       6
2   1       3
4   2       6
5   2       3
6   2       9
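
For comparison, here is a minimal self-contained sketch that gets the same rows without `apply`. It uses `groupby(...).cumcount()`, which is a swapped-in technique and not part of the answer above; the frame is rebuilt from the question's data, and `position_in_group` is just an illustrative name.

```python
import pandas as pd

# Data copied from the question
df = pd.DataFrame({
    "id": [1, 1, 1, 2, 2, 2, 2],
    "values": [3, 6, 3, 7, 6, 3, 9],
})

# cumcount() numbers the rows inside each group 0, 1, 2, ...
position_in_group = df.groupby("id").cumcount()

# Keep every row that is not the first of its group
result = df[position_in_group > 0]
print(result)
```

Changing the condition to `position_in_group >= n` should drop the first n rows of each group instead of only the first.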

Or use duplicated with boolean indexing, which is True for every row whose id has already appeared:

df = df[df['id'].duplicated()]
print(df)
   id  values
1   1       6
2   1       3
4   2       6
5   2       3
6   2       9

Detail:

print(df['id'].duplicated())
0    False
1     True
2     True
3    False
4     True
5     True
6     True
Name: id, dtype: bool
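
Because `duplicated()` looks at the whole column and, with its default `keep='first'`, marks only the first occurrence of each value as False, the rows of a group do not have to be adjacent. A small sketch on a hypothetical frame (`shuffled` is made up for illustration, it is not the question's data):

```python
import pandas as pd

# Hypothetical frame where the rows of the two ids are interleaved
shuffled = pd.DataFrame({
    "id": [1, 2, 1, 2, 1],
    "values": [3, 7, 6, 6, 3],
})

# False for the first occurrence of each id, True for every later one
mask = shuffled["id"].duplicated()

# Boolean indexing keeps only the rows where the mask is True
result = shuffled[mask]
print(result)
```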

edited May 23 '18 at 10:42

answered May 23 '18 at 10:33

jezrael


What is group_keys? – Pyd May 23 '18 at 10:34
@pyd - good question - it avoids a MultiIndex in the output of `groupby.apply` – jezrael May 23 '18 at 10:35
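
To make the effect of `group_keys` concrete, here is a rough sketch comparing the index with and without it. As an assumption made only for this illustration, it selects the `values` column before calling `apply`, so the result is a Series rather than the full frame used in the answer.

```python
import pandas as pd

# Data copied from the question
df = pd.DataFrame({
    "id": [1, 1, 1, 2, 2, 2, 2],
    "values": [3, 6, 3, 7, 6, 3, 9],
})

# group_keys=True: the group label is added as an extra outer index level
with_keys = df.groupby("id", group_keys=True)["values"].apply(lambda s: s.iloc[1:])

# group_keys=False: only the original row labels remain
without_keys = df.groupby("id", group_keys=False)["values"].apply(lambda s: s.iloc[1:])

print(with_keys.index.nlevels)
print(without_keys.index.nlevels)
```

With `group_keys=True` the index has two levels (id, original row label); with `group_keys=False` it keeps the single original level, which is what the expected output in the question shows.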

score 1 · Answer 2 · answered May 23 '18 at 10:40


Another approach:

df.loc[~df.index.isin(df.drop_duplicates(subset='id').index)]
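
A short sketch of how this line works, run on the question's data (`first_rows` is an illustrative name). It assumes the index labels are unique; with repeated labels, `isin` would also drop other rows sharing a label with a first row.

```python
import pandas as pd

# Data copied from the question
df = pd.DataFrame({
    "id": [1, 1, 1, 2, 2, 2, 2],
    "values": [3, 6, 3, 7, 6, 3, 9],
})

# drop_duplicates keeps the first row per id; take the labels of those rows
first_rows = df.drop_duplicates(subset="id").index

# Keep every row whose label is not one of the first-row labels
result = df.loc[~df.index.isin(first_rows)]
print(result)
```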


zipa

