Skip first x rows when setting a column value in a DataFrame with a non-integer index?

Question

Let's say I want to set a column's value in a DataFrame.

It works when I have a standard integer index:

df.loc[14:, 'avg_gain'] = 5

but when I have a DatetimeIndex:

df.set_index(keys=['ts'], inplace=True)

(or another non-integer Index), it yields

TypeError: cannot do slice indexing on <class 'pandas.core.indexes.datetimes.DatetimeIndex'> with these indexers [14] of <class 'int'>

So how can I skip the first x rows when assigning new values to a DataFrame whose index is not the default integer one?

Love it.. Works! Just make sure that the column `avg_gain` is already in the `df`. Otherwise `get_loc` can not find it. Would you like to copy it to an answer @Erfan? Otherwise I will summarize it. — gies0r, Feb 13 '20 at 22:23

score 2 · Accepted Answer · answered Feb 13 '20 at 22:25

Use DataFrame.iloc, which is position-based indexing. DataFrame.loc is label-based indexing, so it does not recognize 14: as a row position when your index is a DatetimeIndex:

df.iloc[14:, df.columns.get_loc('avg_gain')] = 5

Or with loc:

df.loc[df.index[14:], 'avg_gain'] = 5
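A minimal, self-contained sketch of both variants. The sample data is made up for illustration: a hypothetical frame with 20 daily timestamps in `ts` and an `avg_gain` column that already exists.

```python
import pandas as pd

# Hypothetical sample data: 20 daily timestamps and a pre-existing avg_gain column.
df = pd.DataFrame({
    "ts": pd.date_range("2020-01-01", periods=20, freq="D"),
    "avg_gain": [0.0] * 20,
})
df = df.set_index("ts")

# Position-based: skip the first 14 rows and address the column by its integer position.
by_position = df.copy()
by_position.iloc[14:, by_position.columns.get_loc("avg_gain")] = 5

# Label-based: translate positions 14.. into index labels first, then use loc.
by_label = df.copy()
by_label.loc[by_label.index[14:], "avg_gain"] = 5

print(by_position.equals(by_label))
```

With a unique index, both variants should leave the first 14 rows untouched and set the remaining 6 rows to 5.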

Note: Index.get_loc raises a KeyError if the column does not exist, so make sure the column exists before using the iloc variant.
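One way to guard against a missing column, sketched under assumptions: the frame and the `close` column below are hypothetical, and NaN is just one possible placeholder for the skipped rows.

```python
import pandas as pd

# Hypothetical frame that does not have an avg_gain column yet.
df = pd.DataFrame(
    {"close": [1.0, 2.0, 3.0, 4.0]},
    index=pd.date_range("2020-01-01", periods=4, freq="D"),
)

try:
    df.columns.get_loc("avg_gain")
    column_found = True
except KeyError:
    # get_loc cannot find a column that has not been created.
    column_found = False

# Create the column first (NaN as a placeholder), then assign by position.
if not column_found:
    df["avg_gain"] = float("nan")
df.iloc[2:, df.columns.get_loc("avg_gain")] = 5
```

Here the first two rows keep the NaN placeholder and the rows from position 2 onward are set to 5.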

Erfan

With `.loc`: `np.arange(len(df)) >= 14` for the rows will be safer in case of duplicated indices. – ALollz Feb 14 '20 at 15:40
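A small sketch of that boolean-mask variant, assuming a hypothetical frame whose index contains a repeated timestamp. The mask is built from row positions only, so rows that share a label with a later row are not selected by accident.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: the label 2020-01-02 appears twice in the index.
idx = pd.to_datetime(["2020-01-01", "2020-01-02", "2020-01-02", "2020-01-03"])
df = pd.DataFrame({"avg_gain": [0.0, 0.0, 0.0, 0.0]}, index=idx)

skip = 2  # number of leading rows to leave untouched
mask = np.arange(len(df)) >= skip  # True for every row from position `skip` onward
df.loc[mask, "avg_gain"] = 5

print(df)
```

Only the rows at positions 2 and 3 are changed, even though position 1 carries the same label as position 2.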

score 0 · Answer 2 · answered Feb 13 '20 at 22:34

You are getting this error because the .loc[] property is primarily label based. That means the 14: you entered is interpreted as an index label, not as an integer position, and an integer is not a valid label for a DatetimeIndex. If you set the index to a column that contains strings (or timestamps), you have to slice with labels of that type instead.

import pandas as pd
df = pd.DataFrame({'A': [1,2,3,4,5,6],
                   'B': ['a','b','c','d','e','f']})

    A   B
0   1   a
1   2   b
2   3   c
3   4   d
4   5   e
5   6   f

Then you can use .loc with integer labels (the end label is included in the slice):

df.loc[:1]

    A   B
0   1   a
1   2   b

Set the index to a column containing strings, then slice with string labels:

df = df.set_index('B')

    A
B   
a   1
b   2
c   3
d   4
e   5
f   6

df.loc[:'b']

    A
B   
a   1
b   2

You can also use the .iloc property which is primarily integer position based:

df.iloc[:2,:]

    A
B   
a   1
b   2
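The same label-based idea carries over to a DatetimeIndex like the one in the question: instead of an integer, slice with a date label. A minimal sketch, assuming hypothetical daily data starting on 2020-01-01 and a sorted index:

```python
import pandas as pd

# Hypothetical daily data; 2020-01-15 is the row at position 14.
df = pd.DataFrame(
    {"avg_gain": [0.0] * 20},
    index=pd.date_range("2020-01-01", periods=20, freq="D"),
)

# Label-based slice with a date string: everything from 2020-01-15 onward.
df.loc["2020-01-15":, "avg_gain"] = 5

print(df.tail(8))
```

This only skips "the first x rows" if you know which timestamp the x-th row carries; when you only know the count, the position-based .iloc form is the more direct choice.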
