548

I have a dataframe from which I removed some rows. As a result, its index now looks something like [1,5,6,10,11], and I would like to reset it to [0,1,2,3,4]. How can I do that?


The following seems to work:

df = df.reset_index()
del df['index']

The following does not work:

df = df.reindex()
Scott Boston
Roman

3 Answers

1098

DataFrame.reset_index is what you're looking for. If you don't want the old index saved as a column, pass drop=True:

df = df.reset_index(drop=True)

If you'd rather modify the dataframe in place than reassign it:

df.reset_index(drop=True, inplace=True)
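
As a rough illustration of what drop=True changes, here is a minimal sketch. The sample frame and the names with_column and dropped are made up for this example and do not come from the question.

```python
import pandas as pd

# Hypothetical frame whose index has gaps, as if rows had been removed.
df = pd.DataFrame({"a": [10, 20, 30, 40, 50]}, index=[1, 5, 6, 10, 11])

# Without drop=True the old labels are kept as a new column named 'index'.
with_column = df.reset_index()
print(list(with_column.columns))  # ['index', 'a']

# With drop=True the old labels are discarded.
dropped = df.reset_index(drop=True)
print(list(dropped.index))  # [0, 1, 2, 3, 4]

# Neither call modified df, because inplace was not set.
print(list(df.index))  # [1, 5, 6, 10, 11]
```

This is also why the question's version needs the extra del df['index'] step after a plain reset_index().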
Shubham Sharma
mkln
73

Another solution is to assign a RangeIndex or a range to the index:

df.index = pd.RangeIndex(len(df.index))

df.index = range(len(df.index))
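
A small sketch, using a toy frame invented here, suggesting that the direct assignment produces the same labels as reset_index(drop=True). The difference is that it changes df itself rather than returning a new frame. The names expected and df2 are hypothetical.

```python
import pandas as pd

# Hypothetical frame with a non-default index.
df = pd.DataFrame({"a": [8, 7, 6]}, index=[7, 8, 9])
df2 = df.copy()

# Reference result: a new frame, df itself is left untouched.
expected = df.reset_index(drop=True)

# Direct assignment mutates df in place.
df.index = pd.RangeIndex(len(df.index))
same_labels = df.index.equals(expected.index)
print(same_labels)  # True

# A plain Python range is accepted as well.
df2.index = range(len(df2.index))
print(list(df2.index))  # [0, 1, 2]
```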

Assigning the index directly is faster than reset_index(drop=True) in these timings:

import pandas as pd

df = pd.DataFrame({'a': [8, 7], 'c': [2, 4]}, index=[7, 8])
df = pd.concat([df] * 10000)
print(df.head())

In [298]: %timeit df1 = df.reset_index(drop=True)
The slowest run took 7.26 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 3: 105 µs per loop

In [299]: %timeit df.index = pd.RangeIndex(len(df.index))
The slowest run took 15.05 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 7.84 µs per loop

In [300]: %timeit df.index = range(len(df.index))
The slowest run took 7.10 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 14.2 µs per loop
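
For anyone who wants to rerun the comparison outside IPython, the following is one possible sketch based on the standard-library timeit module. The helper function names are invented for this example, the in-place variant is added only for comparison, and no particular numbers are implied, since results depend on the machine and the pandas version. As in the %timeit runs, the assigning variants mutate df, so later calls start from an index that has already been reset.

```python
import timeit

import pandas as pd

# Same setup as in the answer: 20,000 rows with repeated labels 7 and 8.
df = pd.DataFrame({"a": [8, 7], "c": [2, 4]}, index=[7, 8])
df = pd.concat([df] * 10000)


def via_reset_index():
    # Returns a new frame; df is not modified.
    return df.reset_index(drop=True)


def via_reset_index_inplace():
    # Modifies df instead of returning a new frame.
    df.reset_index(drop=True, inplace=True)


def via_range_index():
    df.index = pd.RangeIndex(len(df.index))


def via_range():
    df.index = range(len(df.index))


candidates = (via_reset_index, via_reset_index_inplace, via_range_index, via_range)
calls = 200  # calls per measurement, kept small so the script finishes quickly

# Best of three measurements, reported as seconds per call.
timings = {
    func.__name__: min(timeit.repeat(func, number=calls, repeat=3)) / calls
    for func in candidates
}

for name, seconds in timings.items():
    print(f"{name}: {seconds * 1e6:.1f} µs per call")
```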
jezrael
  • 2
    @Outcast Source - The fastest is `len(df.index)`, 381ns vs `df.shape` 1.17us. Or is something missing? – jezrael Jan 03 '18 at 05:15
  • 1
    This is an elegant solution to reset the index. Thank you! I found out that if you try to convert an hdf5 object to pandas.DataFrame object, you have to reset the index before you can edit certain sections of the DataFrame. – troymyname00 Jun 16 '19 at 12:38
  • Does the timing change much if you do `df.reset_index(drop=True, inplace=True)` to avoid the copy? – Cole Mar 27 '22 at 13:54
23
data1.reset_index(inplace=True)
rsc
user10692571