How to add a column with values 1 to len(df) to a dataframe

Question

The index that I have in the dataframe (with 30 rows) is of the form:

Int64Index([171, 174, 173, 172, 199, …, 175, 200])

The index is not strictly increasing because the data frame is the output of a sort().

I want to add a column which is the series:

[1, 2, 3, 4, 5, …, 30]

How should I go about doing that?

score 170 · Answer 1 · answered Aug 29 '12 at 03:13

How about:

df['new_col'] = range(1, len(df) + 1)

Alternatively, if you want the index to be the ranks and store the original index as a column:

df = df.reset_index()
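
A minimal sketch of both options, assuming a small hypothetical frame whose index is out of order after a sort (the column name `value` and the data are made up for illustration):

```python
import pandas as pd

# Hypothetical data: the index is out of order, as it would be after a sort.
df = pd.DataFrame({"value": [10, 20, 30, 40]}, index=[171, 174, 173, 172])

# Option 1: add a 1-based position column; the original index is left untouched.
df["new_col"] = range(1, len(df) + 1)

# Option 2: move the old index into a column named "index"
# and replace it with a fresh 0-based RangeIndex.
df_reset = df.reset_index()
```

Note that `reset_index()` numbers the rows from 0, so the new index is the rank minus one.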

Chang She


8

This answer got me halfway to where I wanted since I already had an index that I wanted replaced. In such a case you can complement with: `df = df.reset_index(drop=True)` – javabeangrinder Nov 03 '16 at 14:14
2

Using `np.arange` instead of native `range`, like `df['new_col'] = np.arange(1, df.shape[0] + 1)` should speed up the runtime, especially when dealing with large datasets. – panzerpower Oct 12 '20 at 04:24
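
For reference, the `np.arange` variant from the comment above could look like this. It is only a sketch on a hypothetical three-row frame; any speed difference is not measured here and would depend on the size of the data.

```python
import numpy as np
import pandas as pd

# Hypothetical data with an unordered index.
df = pd.DataFrame({"value": [10, 20, 30]}, index=[171, 174, 173])

# np.arange excludes the stop value, hence the + 1.
df["new_col"] = np.arange(1, df.shape[0] + 1)
```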

score 107 · Answer 2 · answered Oct 13 '13 at 18:57

I stumbled on this question while trying to do the same thing (I think). Here is how I did it:

df['index_col'] = df.index

You can then sort on the new index column, if you like.

2

No, that would be unsorted. – pacholik Jul 20 '15 at 13:59
2

more dynamic `df[df.index.name] = df.index` – citynorman Sep 28 '21 at 02:36
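
A short sketch of what this answer and the `df.index.name` variant do, using hypothetical data. The index name `row_id` is an assumption added for illustration; an unnamed index has `df.index.name` equal to `None`, so the variant needs a name to be set first.

```python
import pandas as pd

# Hypothetical data with an unordered index.
df = pd.DataFrame({"value": [10, 20, 30]}, index=[171, 174, 173])

# Copy the index labels into an ordinary column.
df["index_col"] = df.index

# The "more dynamic" variant reuses the index name as the column name.
df.index.name = "row_id"  # hypothetical name
df[df.index.name] = df.index
```

Both columns hold the original labels (171, 174, 173), not the positions 1 to len(df), which is the point of the first comment.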

score 23 · Accepted Answer · edited Jul 17 '19 at 19:47

How about this:

import pandas as pd

# Int64Index was removed in pandas 2.0; pd.Index infers int64 from the values.
idx = pd.Index([171, 174, 173])
df = pd.DataFrame(index=idx, data=[1, 2, 3])
print(df)

It gives me:

Is this what you are looking for?

Rudy Matela


answered Aug 28 '12 at 23:13

nitin


Almost. So, in sum, I need to create another data frame which contains the rank/position of the row. And then, I need to join these. – Navneet Aug 28 '12 at 23:23
Yes, you can combine this df with your existing dataframe by using df.combine_first(df2) – nitin Aug 29 '12 at 00:00
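
One way to sketch the "build a rank frame, then join it" idea from these comments. This uses `DataFrame.join` rather than `combine_first`, which is an assumption about what is wanted: a left join on the index keeps the rows in their existing order. The names `ranks` and `rank` are hypothetical.

```python
import pandas as pd

# Hypothetical data with an unordered index.
df = pd.DataFrame({"value": [10, 20, 30]}, index=[171, 174, 173])

# A second frame holding the 1-based position of each row, keyed by the same index.
ranks = pd.DataFrame({"rank": list(range(1, len(df) + 1))}, index=df.index)

# Left join on the index; row order of df is preserved.
joined = df.join(ranks)
```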

score 9 · Answer 4 · answered Oct 20 '21 at 19:21

The way to do that would be this:

Resetting the index:

df.reset_index(drop=True, inplace=True)

Sorting an index:

df.sort_index(inplace=True)

Setting a new index from a column:

df.set_index('column_name', inplace=True)

Setting a new index from a range:

df.index = range(1, 31, 1)  # a range starting at 1 and ending at 30, with a step size of 1

Sorting a dataframe based on column value:

df.sort_values(by='column_name', inplace=True)

Reassigning the variable works as well:

df = df.reset_index(drop=True)
df = df.sort_index()
df = df.set_index('column_name')
df.index = range(1, 31, 1)  # a range starting at 1 and ending at 30, with a step size of 1
df = df.sort_values(by='column_name')
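
As a sketch, two of these steps can be combined without hard-coding the 30 rows by deriving the range from `len(df)`. The frame and the column name `value` are hypothetical.

```python
import pandas as pd

# Hypothetical data with an unordered index.
df = pd.DataFrame({"value": [30, 10, 20]}, index=[171, 174, 173])

# Sort by a column, then replace the index with 1..len(df).
df = df.sort_values(by="value")
df.index = range(1, len(df) + 1)
```

Assigning to `df.index` discards the original labels, so copy them into a column first if they are still needed.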

I don't think you answered: `I want to add a column which is the [sort order] series` ie set a column to the index. — flywire, Jun 23 '23 at 12:42
