I am trying to set the value of a column in a dataframe by selecting via an index.

myindex = (df['city']==old_name) & (df['dt'] >= startDate) & (df['dt'] < endDate)
new_name = 'Boston2'
df['proxyCity'].ix[myindex ] = new_name

In the above, I want to assign the value Boston2 to the proxyCity column for the rows that satisfy the condition in myindex. Instead, I get this warning:

C:\Users\blah\Anaconda3\lib\site-packages\pandas\core\indexing.py:132: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  self._setitem_with_indexer(indexer, value)

What is the correct way to do this without introducing the problems outlined in the documentation?

The answer to the linked question, "using pandas to select rows conditional on multiple equivalencies", seems to do it the same way I implemented it.

I'm not sure what the right way to do this is.

codingknob

2 Answers

To avoid double indexing, change

df['proxyCity'].ix[myindex ] = new_name

to

df.loc[myindex, 'proxyCity'] = new_name

It's more efficient (fewer __getitem__ function calls) and, in most cases, eliminates the SettingWithCopyWarning.
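As a minimal sketch of the single-call form, the example below builds a small DataFrame and applies the assignment. The sample rows and dates are made up for illustration; only the column names and variable names come from the question. It also avoids .ix entirely, which was deprecated in pandas 0.20 and removed in pandas 1.0.

```python
import pandas as pd

# Hypothetical sample data; only the column names follow the question.
df = pd.DataFrame({
    "city": ["Boston", "Boston", "Chicago", "Boston"],
    "dt": pd.to_datetime(["2014-01-01", "2014-02-01", "2014-02-01", "2014-03-01"]),
})
df["proxyCity"] = df["city"]

old_name = "Boston"
new_name = "Boston2"
startDate = pd.Timestamp("2014-01-15")  # assumed bounds, for illustration
endDate = pd.Timestamp("2014-03-01")

# Boolean mask: rows for old_name with startDate <= dt < endDate.
myindex = (df["city"] == old_name) & (df["dt"] >= startDate) & (df["dt"] < endDate)

# A single .loc call selects rows and column together, so the assignment
# is applied to df itself rather than to an intermediate object.
df.loc[myindex, "proxyCity"] = new_name

print(df)
```

With this data only the second row satisfies all three conditions, so it is the only row whose proxyCity changes.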

Note, however, that if df is a sub-DataFrame of another DataFrame, a SettingWithCopyWarning can be emitted even when using df.loc[...] = new_name. Here, Pandas is warning that modifying df will not affect the other DataFrame. If you do not intend to modify the other DataFrame, the SettingWithCopyWarning can be safely ignored. For ways to silence the SettingWithCopyWarning see this post.
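One way to make that intent explicit is to take a copy when creating the sub-DataFrame. The sketch below assumes a hypothetical parent frame named parent; whether the warning would appear without .copy() depends on the pandas version and its copy-on-write settings, so the example only demonstrates that the parent is left unchanged.

```python
import pandas as pd

# Hypothetical parent frame, for illustration only.
parent = pd.DataFrame({
    "city": ["Boston", "Chicago", "Boston"],
    "proxyCity": ["Boston", "Chicago", "Boston"],
})

# .copy() makes the sub-DataFrame independent of parent.
df = parent[parent["city"] == "Boston"].copy()

# Assign on the copy with a single .loc call.
df.loc[df["city"] == "Boston", "proxyCity"] = "Boston2"

print(df["proxyCity"].tolist())      # values in the modified copy
print(parent["proxyCity"].tolist())  # parent is not affected
```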

unutbu

You can use Series.mask:

mask = (df.city == old_name) & (df.dt >= startDate) & (df.dt < endDate)
new_name = 'Boston2'
df.loc[:, 'proxyCity'] = df.proxyCity.mask(mask, new_name)
# Or
df.proxyCity.mask(mask, new_name, inplace=True)
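A self-contained sketch of the Series.mask approach follows, using made-up sample rows and dates; only the column and variable names come from the answer. It assigns the result back to the column rather than using inplace=True, since an in-place call on df.proxyCity operates on an intermediate Series and may not update df, depending on the pandas version and its copy-on-write settings.

```python
import pandas as pd

# Hypothetical sample data; only the column names follow the answer.
df = pd.DataFrame({
    "city": ["Boston", "Boston", "Chicago"],
    "dt": pd.to_datetime(["2014-01-01", "2014-02-01", "2014-02-01"]),
})
df["proxyCity"] = df["city"]

old_name = "Boston"
new_name = "Boston2"
startDate = pd.Timestamp("2014-01-15")  # assumed bounds, for illustration
endDate = pd.Timestamp("2014-03-01")

mask = (df["city"] == old_name) & (df["dt"] >= startDate) & (df["dt"] < endDate)

# Series.mask replaces values where the condition is True and keeps the rest.
df["proxyCity"] = df["proxyCity"].mask(mask, new_name)

print(df)
```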
Ninjakannon
jrovegno