How to reset a DataFrame's indexes for all groups in one step?

Question

I've tried to split my DataFrame into groups:

import pandas as pd

df = pd.DataFrame({'A' : ['foo', 'bar', 'foo', 'bar',
                       'foo', 'bar', 'foo', 'foo'],
                   'B' : ['1', '2', '3', '4',
                       '5', '6', '7', '8'],
                   })

grouped = df.groupby('A')

I get 2 groups

     A  B
0  foo  1
2  foo  3
4  foo  5
6  foo  7
7  foo  8

     A  B
1  bar  2
3  bar  4
5  bar  6

Now I want to reset the index of each group separately:

print(grouped.get_group('foo').reset_index(drop=True))
print(grouped.get_group('bar').reset_index(drop=True))

Finally I get the result

     A  B
0  foo  1
1  foo  3
2  foo  5
3  foo  7
4  foo  8

     A  B
0  bar  2
1  bar  4
2  bar  6

Is there a better way to do this? (For example, by automatically calling some method on each group.)

I don't think so; I want the index reset for each group. (post updated) – Meloun Mar 14 '14 at 14:47
Do you really need to reset the index on each group (can't it be a sub-index of the original DataFrame)? If not, why not? – Andy Hayden Mar 14 '14 at 16:25
Adding to @AndyHayden, would you simply like to slice your group rows by integer position? If so, you could use `.iloc`. For instance, `grouped.get_group('foo').iloc[0:3]` would return the first three rows of 'foo' while maintaining the original indexing. – Greg Mar 14 '14 at 16:57

Andy Hayden · Answer 1 · 2014-03-14T16:33:55.137

40

Pass in as_index=False to the groupby, then you don't need to reset_index to make the groupby-d columns columns again:

In [11]: grouped = df.groupby('A', as_index=False)

In [12]: grouped.get_group('foo')
Out[12]:
     A  B
0  foo  1
2  foo  3
4  foo  5
6  foo  7
7  foo  8

Note: As pointed out (and seen in the above example), the index above is not [0, 1, 2, ...]. I claim that this will never matter in practice. If it does, you're going to have to jump through some strange hoops: it's going to be more verbose, less readable and less efficient...
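To make the note concrete, here is a minimal sketch (not part of the original answer) of what `as_index` does and does not change. The variable names are hypothetical, and the behaviour is assumed to match reasonably recent pandas releases.

```python
import pandas as pd

df = pd.DataFrame({'A': ['foo', 'bar', 'foo', 'bar', 'foo', 'bar', 'foo', 'foo'],
                   'B': ['1', '2', '3', '4', '5', '6', '7', '8']})

# get_group returns the rows with their original labels, whatever as_index is
foo = df.groupby('A', as_index=False).get_group('foo')
foo_labels = list(foo.index)

# as_index decides where the group key ends up in an aggregated result
counts_indexed = df.groupby('A', as_index=True).count()   # 'A' becomes the index
counts_flat = df.groupby('A', as_index=False).count()     # 'A' stays a column

# a 0, 1, 2, ... index for a single group still needs an explicit reset
foo_reset = foo.reset_index(drop=True)
```

So `as_index=False` keeps the key as a column after an aggregation, but it does not renumber the rows of a group.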

edited Mar 14 '14 at 16:33

answered Mar 14 '14 at 15:56

Andy Hayden


`as_index` doesn't do anything for `get_group`; try `df.groupby('A', as_index=True).get_group('foo').index`; it returns the original DataFrame index (at least on `0.13.1`) – behzad.nouri Mar 14 '14 at 16:08
I initially thought something like this would work too, but the output indexing is different than what he is looking for. – Greg Mar 14 '14 at 16:20
@Greg That's a good point, however it seems unlikely that this will matter.. presumably what matters is that the grouped by columns are in columns again. – Andy Hayden Mar 14 '14 at 16:24
@behzad.nouri can't think of a time when this would ever be a problem / there would ever be a reason to care about the distinction. – Andy Hayden Mar 14 '14 at 16:26
@behzad.nouri which is to say, **it does do something** it ensures the groupedby columns are not in the index but are columns. – Andy Hayden Mar 14 '14 at 16:38

score 16 · Answer 2 · answered Feb 21 '19 at 07:47

df = pd.DataFrame({'A' : ['foo', 'bar', 'foo', 'bar',
                       'foo', 'bar', 'foo', 'foo'],
                   'B' : ['1', '2', '3', '4',
                       '5', '6', '7', '8'],
                   })
grouped = df.groupby('A', as_index=False)

We get two groups.

grouped_index = grouped.apply(lambda x: x.reset_index(drop=True)).reset_index()

This adds two new columns, level_0 and level_1, and resets the index:


   level_0  level_1    A  B
0        0        0  bar  2
1        0        1  bar  4
2        0        2  bar  6
3        1        0  foo  1
4        1        1  foo  3
5        1        2  foo  5
6        1        3  foo  7
7        1        4  foo  8

result = grouped_index.drop('level_0', axis=1).set_index('level_1')

This creates an index within each group of "A":

          A     B
level_1     
0        bar    2
1        bar    4
2        bar    6
0        foo    1
1        foo    3
2        foo    5
3        foo    7
4        foo    8
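As a possible alternative to the `apply` route above (a sketch, not taken from this answer), `groupby(...).cumcount()` numbers the rows inside each group directly. The column name `pos` is a hypothetical choice for illustration.

```python
import pandas as pd

df = pd.DataFrame({'A': ['foo', 'bar', 'foo', 'bar', 'foo', 'bar', 'foo', 'foo'],
                   'B': ['1', '2', '3', '4', '5', '6', '7', '8']})

# cumcount numbers the rows of each group 0, 1, 2, ... in their original order
pos = df.groupby('A').cumcount()

# a stable sort keeps rows in original order within each group;
# the per-group counter then becomes the index
result = (df.assign(pos=pos)
            .sort_values('A', kind='mergesort')
            .set_index('pos'))
```

This should give the same rows and the same repeating 0, 1, 2, ... index as the table above, only with the index named `pos` instead of `level_1`.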

score 4 · Answer 3 · edited Jan 23 '19 at 17:05


df = df.groupby('A').apply(lambda x: x.reset_index(drop=True)).drop('A', axis=1).reset_index()
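Since the one-liner comes without explanation, here is a rough step-by-step sketch of the shape it appears to aim for: one row per original row, with the group key in column A and a per-group counter in column level_1. It uses explicit iteration and `pd.concat` rather than `apply`, which is my own substitution, and the names `pieces` and `out` are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({'A': ['foo', 'bar', 'foo', 'bar', 'foo', 'bar', 'foo', 'foo'],
                   'B': ['1', '2', '3', '4', '5', '6', '7', '8']})

# one frame per group, key column removed, rows renumbered from 0
pieces = {name: g.drop(columns='A').reset_index(drop=True)
          for name, g in df.groupby('A')}

# concat on a dict stacks the pieces under a two-level index (key, position);
# reset_index turns both levels into columns named level_0 and level_1
out = pd.concat(pieces).reset_index().rename(columns={'level_0': 'A'})
```

Here `out` has the columns A, level_1 and B, where level_1 restarts at 0 for each value of A.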

edited Jan 23 '19 at 17:05

Zoe


answered Jan 23 '19 at 07:20

Songhua Hu


Welcome to Stack Overflow. Along with the code, please add a few lines explaining what you intend to do. – Ubercool Jan 23 '19 at 07:27

score 1 · Answer 4 · answered Mar 14 '14 at 14:54


Something like this would work:

# dict.iteritems() is Python 2 only; items() also works on Python 3
for group, index in grouped.indices.items():
    grouped.indices[group] = range(len(index))

You could probably make it less verbose if you wanted to.
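A less invasive variant, sketched here as an assumption about what is wanted rather than as part of the answer, is to leave `grouped.indices` alone and build a dict of renumbered frames by iterating over the groups. The name `reset_groups` is hypothetical.

```python
import pandas as pd

df = pd.DataFrame({'A': ['foo', 'bar', 'foo', 'bar', 'foo', 'bar', 'foo', 'foo'],
                   'B': ['1', '2', '3', '4', '5', '6', '7', '8']})

# iterating a groupby yields (key, sub-frame) pairs; each sub-frame gets
# a fresh 0, 1, 2, ... index and the groupby object itself is not modified
reset_groups = {name: group.reset_index(drop=True)
                for name, group in df.groupby('A')}

foo = reset_groups['foo']
bar = reset_groups['bar']
```

This resets every group in one statement and leaves both `df` and the groupby internals untouched.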


Greg


I would be **wary** of modifying indices like this, it's used behind the scenes in other groupby methods so potentially this will break stuff if you're reusing the groupby. (Kinda clever though..) – Andy Hayden Mar 14 '14 at 16:37

score -3 · Answer 5 · answered Dec 18 '18 at 14:31


Isn't this just `grouped = grouped.apply(lambda x: x.reset_index())`?
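One detail worth checking with a plain `reset_index()` is that it keeps the old labels as a new column named `index`, whereas `drop=True` discards them. A small sketch on a single group (the names `kept` and `dropped` are hypothetical):

```python
import pandas as pd

df = pd.DataFrame({'A': ['foo', 'bar', 'foo', 'bar', 'foo', 'bar', 'foo', 'foo'],
                   'B': ['1', '2', '3', '4', '5', '6', '7', '8']})

foo = df[df['A'] == 'foo']

# without drop, the old labels survive as an extra 'index' column
kept = foo.reset_index()

# with drop=True, only the original columns remain
dropped = foo.reset_index(drop=True)
```

The output shown in the question has no `index` column, so `drop=True` appears to be the closer match.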


BAC83


This is a comment rather than an answer. – Shayan Shafiq Mar 05 '20 at 10:56
