39

How do I go about resetting the index of my dataframe columns to 0,1,2,3,4?

(How come doing df.reset_index() doesn't reset the column index?)

>>> data = data.drop(data.columns[[1,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19]], axis=1)
>>> data = data.drop(data.index[[0,1]],axis = 0)
>>> print(data.head())
             0         2    3    4    20
2  500292014600       .00  .00  .00  NaN
3  500292014600    100.00  .00  .00  NaN
4  500292014600  11202.00  .00  .00  NaN
>>> data = data.reset_index(drop = True)
>>> print(data.head())
              0         2    3    4    20
 0  500292014600       .00  .00  .00  NaN
 1  500292014600    100.00  .00  .00  NaN
 2  500292014600  11202.00  .00  .00  NaN
smci
MrClean

7 Answers

47

Warning: This method has serious potential side effects and should probably not be used - see comments!

Try the following:

df = df.T.reset_index(drop=True).T
Thomas
Pablo Fonseca
    This sounds like a hack; ideally the `reset_index` method could be applied to the column index as well. – ClementWalter Mar 08 '19 at 22:18
    That's a nice way around the problem but is there a direct way to reset both row and column index? – Prashant Kumar Jan 13 '21 at 13:10
    Another downside that is not mentioned is that transposing messes up the dtypes. For example `df = pd.DataFrame({'A': [1,2,3], 'B': list('ABC')})` will have int64/object, but `df.T.T` will have all objects. Also, transposing can be expensive. – mozway Sep 07 '22 at 08:52
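To illustrate the dtype concern raised in the comments, here is a minimal sketch that compares the double transpose with assigning new column labels directly. The variable names `via_transpose` and `via_relabel` are made up for this example; the sample frame is the one from the comment above.

```python
import pandas as pd

# Mixed-dtype frame taken from the comment above
df = pd.DataFrame({'A': [1, 2, 3], 'B': list('ABC')})

# Double transpose: the intermediate transposed frame has to hold ints and
# strings in the same columns, so every column comes back as object
via_transpose = df.T.reset_index(drop=True).T

# Direct relabeling: only the labels change, the data is left alone
via_relabel = df.copy()
via_relabel.columns = range(via_relabel.columns.size)

print(via_transpose.dtypes)
print(via_relabel.dtypes)
```

Both results have the columns 0 and 1, but only the relabeled frame keeps the integer dtype of the first column.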
42

Try replacing the column names:

>>> import numpy as np
>>> import pandas as pd

>>> my_data = [[500292014600, .00, .00, .00, np.nan],
              [500292014600, 100.00, .00, .00, np.nan], 
              [500292014600, 11202.00, .00, .00, np.nan]]
>>> df = pd.DataFrame(my_data, columns=[0,2,3,4,20])
>>> df
              0        2    3    4  20
0  500292014600      0.0  0.0  0.0 NaN
1  500292014600    100.0  0.0  0.0 NaN
2  500292014600  11202.0  0.0  0.0 NaN

>>> df.columns = range(df.columns.size)
>>> df
              0        1    2    3   4
0  500292014600      0.0  0.0  0.0 NaN
1  500292014600    100.0  0.0  0.0 NaN
2  500292014600  11202.0  0.0  0.0 NaN
wisbucky
Patrick Nieto
4

In pandas, "index" essentially means the row index. As you can see in your output, the row index is reset after drop and reset_index().

Columns have to be relabeled instead; you can do something like

data.columns = [ 0,1,2,3,4]
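As a small sketch of the distinction (the sample values below are invented, only the gapped column labels mimic the question), `reset_index(drop=True)` renumbers the rows and leaves the column labels alone, so the columns need a separate assignment:

```python
import pandas as pd

# Hypothetical frame whose column labels have gaps, as in the question
data = pd.DataFrame([[1, 2, 3], [4, 5, 6], [7, 8, 9]], columns=[0, 2, 20])
data = data.drop(data.index[[0]], axis=0)      # row labels are now 1, 2

after_reset = data.reset_index(drop=True)      # rows become 0, 1
print(list(after_reset.columns))               # column labels are unchanged

relabeled = after_reset.copy()
relabeled.columns = range(relabeled.shape[1])  # columns become 0, 1, 2
print(list(relabeled.columns))
```

Using `range(relabeled.shape[1])` rather than a hard-coded list avoids having to know the number of columns in advance.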
Vaishali
3

None of the existing answers can easily be used in a pipeline/method chain (except the double transpose, which in my opinion wastes computation and has the downside of messing up the dtypes).

One can use set_axis:

df.set_axis(range(df.shape[1]), axis=1)

Used with pipe for method chaining:

df = (pd.DataFrame('x', columns=list('ABCD'), index=range(2))
        .pipe(lambda d: d.set_axis(range(d.shape[1]), axis=1))
     )

output:

   0  1  2  3
0  x  x  x  x
1  x  x  x  x
mozway
1

If you have numpy imported with import numpy as np

simply set the columns to a zero-based index with data.columns = np.arange(data.shape[1]) (assign the array itself, not a list containing it)
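A minimal sketch of why the array should be assigned directly rather than wrapped in a list. The names `flat` and `wrapped` and the sample values are hypothetical; as far as I can tell, pandas treats a list of arrays as the levels of a MultiIndex.

```python
import numpy as np
import pandas as pd

# Hypothetical one-row frame with gapped column labels
data = pd.DataFrame([[1.0, 2.0, 3.0]], columns=[0, 2, 20])

# Assigning the array directly gives a flat column index 0, 1, 2
flat = data.copy()
flat.columns = np.arange(flat.shape[1])

# Wrapping the array in a list is read as "one array per level",
# which produces a one-level MultiIndex instead of a flat index
wrapped = data.copy()
wrapped.columns = [np.arange(wrapped.shape[1])]

print(type(flat.columns).__name__)
print(type(wrapped.columns).__name__)
```

With the wrapped version, a lookup such as `wrapped[0]` goes through MultiIndex selection rather than a plain label lookup, which is rarely what is wanted here.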

1

Pure Python Implementation

We enumerate the columns of the dataframe to get (position, name) pairs, then map the function reversed over each pair to turn it into (name, position). Lastly, we build a dictionary from those pairs and pass it as the columns parameter of the DataFrame method rename.

columns = dict(map(reversed, enumerate(df.columns)))
df = df.rename(columns=columns)
df.head()

Results:

              0        1    2    3   4
0  500292014600      0.0  0.0  0.0 NaN
1  500292014600    100.0  0.0  0.0 NaN
2  500292014600  11202.0  0.0  0.0 NaN
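One caveat worth checking before relying on the dictionary approach: it maps names to positions, so it assumes the column names are unique. The sketch below uses an invented frame with a repeated name to show what happens otherwise, and contrasts it with positional relabeling via `set_axis`.

```python
import pandas as pd

# Hypothetical frame with a duplicated column name
df = pd.DataFrame([[1, 2, 3]], columns=['a', 'a', 'b'])

# The later 'a' overwrites the earlier one in the dictionary
mapping = dict(map(reversed, enumerate(df.columns)))
print(mapping)

# Both 'a' columns are renamed to the same label
renamed = df.rename(columns=mapping)
print(list(renamed.columns))

# Relabeling by position is unaffected by duplicate names
positional = df.set_axis(range(df.shape[1]), axis=1)
print(list(positional.columns))
```

For the frame in the question the labels are unique after the drop, so the rename approach gives 0 through 4 as shown above.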
Jeff Hernandez
1

How come when I use df.reset_index the index of my columns is not reset?

The column "index" is really more of a column title. Someone may be using those labels as meaningful titles. For example, perhaps they represent "Trial 1", "Trial 2", etc., so you wouldn't want pandas to renumber them automatically and lose that meaning.

How do I go about resetting this index to 0,1,2,3,4?

To reset the column indexes:

df.columns = range(df.columns.size)
wisbucky