Let's say I want to set a column's value in a DataFrame.

It works when I have a standard integer index:
df.loc[14:, 'avg_gain'] = 5

but when I have a DatetimeIndex:

df.set_index(keys=['ts'], inplace=True)

(or any other non-integer index), it raises:

TypeError: cannot do slice indexing on <class 'pandas.core.indexes.datetimes.DatetimeIndex'> with these indexers [14] of <class 'int'>

So how can I skip the first x rows when assigning new values to a DataFrame whose index is not the default integer index?

gies0r
  • Love it.. Works! Just make sure that the column `avg_gain` is already in the `df`. Otherwise `get_loc` can not find it. Would you like to copy it to an answer @Erfan? Otherwise I will summarize it. – gies0r Feb 13 '20 at 22:23

2 Answers

Use DataFrame.iloc, which indexes by position. DataFrame.loc indexes by label, so it cannot interpret the integer slice 14: when the index is a DatetimeIndex:

df.iloc[14:, df.columns.get_loc('avg_gain')] = 5

Or with loc, by passing the labels found at those positions:

df.loc[df.index[14:], 'avg_gain'] = 5

Note: Index.get_loc raises a KeyError if the column does not exist, so make sure the column is already present in the DataFrame.
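A minimal sketch of both variants on a DatetimeIndex. The sample data (20 daily timestamps and an `avg_gain` column initialised to 0.0) is made up for illustration; only the two assignment lines come from the answer above.

```python
import pandas as pd

# Hypothetical sample data: 20 daily rows with an existing 'avg_gain' column.
df = pd.DataFrame({
    "ts": pd.date_range("2020-01-01", periods=20, freq="D"),
    "avg_gain": 0.0,
})
df = df.set_index("ts")

# Position-based: skip the first 14 rows and address the column by position.
by_iloc = df.copy()
by_iloc.iloc[14:, by_iloc.columns.get_loc("avg_gain")] = 5

# Label-based: translate positions 14.. into the labels stored in the index.
by_loc = df.copy()
by_loc.loc[by_loc.index[14:], "avg_gain"] = 5

print(by_iloc.equals(by_loc))
```

With a unique index, as in this sample, both variants touch exactly the same rows, so the choice is mostly a matter of taste.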

Erfan
  • With `.loc`: `np.arange(len(df)) >= 14` for the rows will be safer in case of duplicated indices. – ALollz Feb 14 '20 at 15:40
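The point about duplicated indices can be illustrated with a small sketch. The four-row frame below is invented for the purpose, with the first timestamp repeated at position 2; the cut-off of 2 stands in for the 14 used in the question.

```python
import numpy as np
import pandas as pd

# Hypothetical data: the first timestamp appears again at position 2.
idx = pd.to_datetime(["2020-01-01", "2020-01-02", "2020-01-01", "2020-01-03"])
df = pd.DataFrame({"avg_gain": 0.0}, index=idx)

# Label lookup: every row carrying one of the labels is matched,
# including the duplicate at position 0 that was meant to be skipped.
via_labels = df.copy()
via_labels.loc[via_labels.index[2:], "avg_gain"] = 5

# Boolean mask built from positions: only rows 2 and 3 are selected.
via_mask = df.copy()
via_mask.loc[np.arange(len(via_mask)) >= 2, "avg_gain"] = 5

print(via_labels["avg_gain"].tolist())
print(via_mask["avg_gain"].tolist())
```

The positional mask (or plain `.iloc`) does not depend on the labels being unique, which is why it is the safer choice here.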

You are getting this error because the .loc[] property is primarily label based. That means the 14: you entered is interpreted as an index label, not as an integer position, and a DatetimeIndex has no integer labels to slice on. If you set the index to a column that contains strings, you have to slice with string labels instead.

import pandas as pd
df = pd.DataFrame({'A': [1,2,3,4,5,6],
                   'B': ['a','b','c','d','e','f']})

    A   B
0   1   a
1   2   b
2   3   c
3   4   d
4   5   e
5   6   f

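With the default integer index the labels are the integers themselves, so .loc accepts them (note that a label slice includes its end point):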

df.loc[:1]

    A   B
0   1   a
1   2   b

Now set the index to the column B, which contains strings:

df = df.set_index('B')

    A
B   
a   1
b   2
c   3
d   4
e   5
f   6

df.loc[:'b']

    A
B   
a   1
b   2

You can also use the .iloc property, which is primarily integer-position based and, unlike .loc, excludes the end of the slice:

df.iloc[:2, :]

    A
B   
a   1
b   2
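Applied to the DatetimeIndex from the question, the same idea means slicing .loc with a date label instead of an integer. The sketch below assumes a sorted daily index starting on 2020-01-01, so that the label "2020-01-15" happens to coincide with position 14; the data is made up for illustration.

```python
import pandas as pd

# Hypothetical sample data: 20 daily rows, 'avg_gain' initialised to 0.0.
df = pd.DataFrame(
    {"avg_gain": 0.0},
    index=pd.date_range("2020-01-01", periods=20, freq="D", name="ts"),
)

# Label slice on a DatetimeIndex: the start label is included.
df.loc["2020-01-15":, "avg_gain"] = 5

print(df["avg_gain"].iloc[12:16])
```

This only matches "skip the first 14 rows" when you know which timestamp sits at that position; if you think in row counts rather than dates, .iloc remains the more direct tool.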
Dimitris Thomas