
I'm trying to change values in my DataFrame after merging it with another DataFrame, and I'm running into a warning that did not appear before the merge.

I am indexing and changing values in my DataFrame with:

df.iloc[0]['column'] = 1

Subsequently I've joined (left outer join) along both indexes using merge (I realize left.join(right) would work too). After this when I perform the same value assignment using iloc, I receive the following warning:

__main__:1: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy

Reading the linked documentation didn't clear this up for me. Am I slicing incorrectly with iloc? (Keep in mind that my code requires position-based slicing.)

I notice that df.ix[0,'column'] = 1 works, and similarly based on this page I can reference the column location with df.columns.get_loc('column') but on the surface this seems unnecessarily convoluted.

What's the difference between these methods under the hood, and what about merging causes the previous method (df.iloc[0]['column']) to break?

mpny1
  • See answer here https://stackoverflow.com/questions/25489875/pandas-settingwithcopywarning-when-using-loc – Serendipity Dec 04 '17 at 17:10
  • I have a similar problem. I did see that answer, but no joy: my intention is to update the values in place, not to create a copy of the DataFrame and leave the original as is. Also, my .iloc positions come from grp_kfold.split, so I only have the integer positions and cannot do df.A = x – ntg Nov 07 '18 at 15:34

2 Answers


The expression df.iloc[0]['column'] = 1 is chained indexing: it first selects a row with df.iloc[0] and then assigns into that intermediate result. This should be avoided, and it is what triggers the SettingWithCopyWarning you are seeing. The pandas docs are a bit complicated, but the section on SettingWithCopyWarning and chained indexing explains under the hood why this is unreliable.
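To make "chained" concrete, here is a minimal sketch of what the expression breaks down into. The frame and its values are made up for illustration, and whether the intermediate row is a view or a copy depends on the pandas version and the column dtypes, so treat the comments as a description of the risk rather than a guarantee.

```python
import pandas as pd

# Hypothetical frame with mixed dtypes, as is typical after a merge.
df = pd.DataFrame({"column": [0, 0], "other": ["a", "b"]})

# df.iloc[0]['column'] = 1 is really two separate calls:
row = df.iloc[0]     # step 1: returns a Series that may be a copy of the row
row["column"] = 1    # step 2: assigns into that Series, possibly not into df

# A single indexing call hands the whole assignment to pandas at once,
# so there is no intermediate object to get lost.
df.iloc[0, df.columns.get_loc("column")] = 1
```

Only the last line is guaranteed to write into df itself.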

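Instead, use a single indexing call such as df.loc[0, 'column'] = 1. Note that .loc is label-based, so this targets the first row only when that row's index label is 0. If you need purely positional access, use df.iloc[0, df.columns.get_loc('column')] = 1.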

.loc is for "Access a group of rows and columns by label(s) or a boolean array."

.iloc is for "Purely integer-location based indexing for selection by position."
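The difference between the two matters as soon as the index labels stop matching the row positions, which can easily happen after a merge. A small sketch, using a made-up frame with a deliberately shuffled index:

```python
import pandas as pd

# Hypothetical frame whose index labels do not match row positions.
df = pd.DataFrame(
    {"column": [10, 20, 30], "other": ["a", "b", "c"]},
    index=[2, 0, 1],
)

# Label-based: sets the row whose index label is 0 (the second row here).
df.loc[0, "column"] = 1

# Position-based: sets the first row, whatever its label is.
df.iloc[0, df.columns.get_loc("column")] = 99
```

After both assignments the first row (label 2) holds 99 and the second row (label 0) holds 1.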

Ed_N

It sucks, but the best solution I've come up with so far for updating a DataFrame column by position is to find the integer position of the column first, then use .iloc for everything:

column_i_loc = np.where(df.columns == 'column')[0][0]
df.iloc[0, column_i_loc] = 1
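The same idea extends to a whole array of row positions, for example the integer indices returned by a cross-validation splitter. The helper name set_by_position and the sample data below are hypothetical, and the sketch uses df.columns.get_loc in place of np.where, assuming the column names are unique.

```python
import numpy as np
import pandas as pd


def set_by_position(df, row_positions, column, value):
    """Assign value to the given row positions of one named column, in place."""
    # Integer position of the column; assumes column names are unique.
    col_pos = df.columns.get_loc(column)
    df.iloc[row_positions, col_pos] = value


# Made-up frame with a non-numeric index, so only positions can be used.
df = pd.DataFrame({"column": [0, 0, 0, 0], "other": list("abcd")}, index=list("wxyz"))

# Stand-in for the positions a splitter would hand back.
test_positions = np.array([1, 3])
set_by_position(df, test_positions, "column", 1)
```

Because rows and column go through one .iloc call, the assignment lands in the original frame and no SettingWithCopyWarning is raised.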

Note that you could also disable the warning, but really, don't.

Also, if you see this warning when you were not trying to update the original DataFrame, then you forgot to make an explicit copy of the subset, and you may end up with a nasty bug.
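For that second case, a short sketch of what "make a copy" looks like, again with made-up data: calling .copy() on the subset makes it an independent frame, so edits to it are unambiguous and never reach the original.

```python
import pandas as pd

df = pd.DataFrame({"column": [1, 2, 3], "other": [4, 5, 6]})

# Explicit copy: subset is now its own DataFrame, detached from df.
subset = df[df["column"] > 1].copy()

# Positional assignment on the copy; df is left untouched.
subset.iloc[0, subset.columns.get_loc("other")] = 0
```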

ntg