77

The index that I have in the dataframe (with 30 rows) is of the form:

Int64Index([171, 174, 173, 172, 199, …, 175, 200])

The index is not strictly increasing because the data frame is the output of a sort().

I want to add a column which is the series:

[1, 2, 3, 4, 5, …, 30]

How should I go about doing that?

Trenton McKinney
Navneet

4 Answers

170

How about:

df['new_col'] = range(1, len(df) + 1)
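
As a minimal sketch of that line in context, assuming a small made-up frame (the `score` column and its values are hypothetical, only the out-of-order index mirrors the question). It sorts with `sort_values`, since `DataFrame.sort()` is no longer available in current pandas:

```python
import pandas as pd

# Hypothetical data standing in for the 30-row frame from the question.
df = pd.DataFrame({"score": [3, 1, 2]}, index=[171, 172, 173])
df = df.sort_values("score")  # the index is now 172, 173, 171

# A range is assigned by position, so the index labels play no role here.
df["new_col"] = range(1, len(df) + 1)
```

Because the assignment is positional, the row labelled 171 (last after sorting) ends up with `new_col` equal to 3.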

Alternatively if you want the index to be the ranks and store the original index as a column:

df = df.reset_index()
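
A short sketch of what `reset_index()` leaves behind, again on hypothetical data. Note that the new index starts at 0, so the last line (an optional addition, not part of the answer above) shifts it to start at 1:

```python
import pandas as pd

# Hypothetical frame whose index is already out of order.
df = pd.DataFrame({"score": [1, 2, 3]}, index=[172, 173, 171])

df = df.reset_index()     # old labels move into a column named "index"
df.index = df.index + 1   # optional: make the positions 1-based
```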
Chang She
  • 8
    This answer got me halfway to where I wanted since I already had an index that I wanted replaced. In such a case you can complement with: `df = df.reset_index(drop=True)` – javabeangrinder Nov 03 '16 at 14:14
  • 2
    Using `np.arange` instead of native `range`, like `df['new_col'] = np.arange(1, df.shape[0] + 1)` should speed up the runtime, especially when dealing with large datasets. – panzerpower Oct 12 '20 at 04:24
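
For reference, the `np.arange` variant from the comment above could look like the sketch below (hypothetical data). It produces the same column as `range`; the speed claim is the commenter's and is not measured here:

```python
import numpy as np
import pandas as pd

# Hypothetical frame; only the shape matters for this example.
df = pd.DataFrame({"score": [1, 2, 3]}, index=[172, 173, 171])

# df.shape[0] is the row count, the same value as len(df).
df["new_col"] = np.arange(1, df.shape[0] + 1)
```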
107

I stumbled on this question while trying to do the same thing (I think). Here is how I did it:

df['index_col'] = df.index

You can then sort on the new index column, if you like.
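
A small sketch of this approach, assuming a made-up `score` column. Note that it copies the existing labels (171, 174, ...) into a column rather than producing 1 to 30:

```python
import pandas as pd

# Hypothetical frame with an out-of-order index, as in the question.
df = pd.DataFrame({"score": [3, 1, 2]}, index=[171, 174, 173])

df["index_col"] = df.index          # copy the labels into a regular column
df = df.sort_values("index_col")    # rows are now ordered 171, 173, 174
```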

23

How about this:

import pandas as pd

# Int64Index was removed in pandas 2.0; pd.Index infers an integer index.
idx = pd.Index([171, 174, 173])
df = pd.DataFrame(index=idx, data=[1, 2, 3])
print(df)

It gives me:

     0
171  1
174  2
173  3

Is this what you are looking for?

Rudy Matela
nitin
  • Almost. So, in sum, I need to create another data frame which contains the rank/position of the row. And then, I need to join these. – Navneet Aug 28 '12 at 23:23
  • Yes, you can add this df to your existing dataframe by using df.combine_first(df2) – nitin Aug 29 '12 at 00:00
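
A rough sketch of the `combine_first` suggestion, assuming two hypothetical frames that share the same index: `df` holds the data and `df2` holds the rank. The result uses the union of both indexes and columns, and its row order may differ from `df`, so the example looks values up by label:

```python
import pandas as pd

# Hypothetical frames; "value" and "rank" are made-up column names.
df = pd.DataFrame({"value": [10, 20, 30]}, index=[171, 174, 173])
df2 = pd.DataFrame({"rank": range(1, len(df) + 1)}, index=df.index)

# Union of rows and columns; gaps in df are filled from df2.
combined = df.combine_first(df2)
rank_of_174 = combined.loc[174, "rank"]
```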
9

The way to do that would be this:

Resetting the index:

df.reset_index(drop=True, inplace=True)

Sorting an index:

df.sort_index(inplace=True)

Setting a new index from a column:

df.set_index('column_name', inplace=True)

Setting a new index from a range:

df.index = range(1, 31, 1)  # a range starting at 1 and ending at 30, with a step size of 1

Sorting a dataframe based on column value:

df.sort_values(by='column_name', inplace=True)

Reassigning the variable works as well:

df = df.reset_index(drop=True)
df = df.sort_index()
df = df.set_index('column_name')
df.index = range(1, 31, 1)  # a range starting at 1 and ending at 30, with a step size of 1
df = df.sort_values(by='column_name')
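
Putting two of these steps together for the original question, a minimal sketch on hypothetical data. It sizes the range with `len(df)` instead of hard-coding 31, which is a substitution and not part of the recipe above:

```python
import pandas as pd

# Hypothetical frame; 'column_name' is the placeholder used above.
df = pd.DataFrame({"column_name": [30, 10, 20]}, index=[171, 174, 173])

df = df.sort_values(by="column_name")  # index becomes 174, 173, 171
df.index = range(1, len(df) + 1)       # replace it with 1-based positions
```

This replaces the index rather than adding a column; assigning the same range to a new column instead keeps the original labels.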
XiB
  • I don't think you answered: `I want to add a column which is the [sort order] series`, i.e. set a column to the index. – flywire Jun 23 '23 at 12:42