
When I run this code

df = raw.copy() # making a copy of dataframe raw
df['new col'] = ''
for i in range(len(df)):
    df['new col'].loc[i] = 'some thing'

I got this warning (warning 1):

SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  self._setitem_single_block(indexer, value, name)

Based on user aaossa's suggestion, I changed the code as follows:

df = raw
df['new col'] = ''
for i in range(len(df)):
    df.loc[i, 'new col'] = 'some thing'

I then get a similar SettingWithCopyWarning (warning 2) with an added tip:

<ipython-input-74-75e3db22bde6>:2: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df['new col'] = ''
SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  self._setitem_single_column(loc, value, pi)

The added tip in warning 2, "Try using .loc[row_indexer,col_indexer] = value instead", came too late: I had already implemented it (df.loc[i, 'new col'] = 'some thing').

My questions are:

  1. Why wasn't that tip included in warning 1, where it was relevant? It would have solved the problem at hand and helped me avoid the novice mistake of df = raw in the second attempt.
  2. Why was that tip added in warning 2 when I had already implemented it (df.loc[i, 'new col'] = 'some thing')?
  3. What is pi in the opaque line self._setitem_single_column(loc, value, pi)? What does it mean?
Nemo

1 Answer


The warning seems to refer to the operation inside your for loop. Try this instead:

df.loc[i, 'new col'] = 'some thing'

This statement has the intended effect: it changes the value in the dataframe, and it does so in a single indexing call on the dataframe itself. When you split the operation (df['new col'].loc[i]) you're assigning into an intermediate object (df['new col']) that pandas may return as either a view or a copy, and that's the reason for the warning.
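To make the single-call form concrete, here is a minimal sketch. The toy frame is hypothetical (it stands in for your data), and it assumes a default integer index so that the labels 0..len(df)-1 actually exist:

```python
import pandas as pd

# Hypothetical toy frame standing in for the asker's data.
df = pd.DataFrame({'a': [1, 2, 3, 4]})
df['new col'] = ''

# One indexing call: the row label and the column label go to .loc together,
# so the assignment targets df itself rather than an intermediate Series.
for i in range(len(df)):
    df.loc[i, 'new col'] = 'some thing'

print(df['new col'].tolist())
```

Every row of 'new col' ends up holding 'some thing', and no extra columns are created.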

EDIT: You should do both: copy your dataframe and use .loc.

test2 = real_df.copy()
for i in range(len(test2)):
    test2.loc[i, 'b'] = 5
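Here is a self-contained sketch of the copy-plus-.loc pattern. It assumes, as a guess about your setup, that raw is itself a filtered slice of a larger frame; the names big and label are made up for the illustration:

```python
import pandas as pd

# Hypothetical data: `raw` is itself a filtered slice of a larger frame.
big = pd.DataFrame({'a': [1, 2, 3, 4]})
raw = big[big['a'] > 1]

# An explicit copy, so later writes cannot reach `raw` or `big`.
df = raw.copy()
df['new col'] = ''

# Iterate over the actual index labels. After filtering they are 1, 2, 3,
# so range(len(df)) would address label 0, which does not exist here.
for label in df.index:
    df.loc[label, 'new col'] = 'some thing'

print(df)
```

Since df is an independent copy, adding the column and writing through .loc should not touch raw. If your frame has a default integer index, range(len(df)) works as well.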
aaossa
  • Thanks, but your code will create 4 more columns named `(i, b)`, as shown by this toy example: ```test = pd.DataFrame({'a': [1, 2, 3, 4]}) test['b'] = '' for i in range(len(test)): test[i, 'b'] = 5``` – Nemo Mar 31 '22 at 03:13
  • You're right @Nemo , I forgot to add `.loc`. I've updated my answer – aaossa Mar 31 '22 at 05:01
  • Your answer solved my original problem but I changed the scope of the OP. Appreciate your help though. – Nemo Apr 01 '22 at 11:08