
I am trying to fill missing values in a column of a pandas DataFrame, but I get a warning about the method. When I switch to the suggested approach, I still get the same warning. What can I do to not get yelled at by pandas?

First attempt:

df['column1'].fillna(inplace=True, value='Low') 

First response:

/usr/local/bin/python3.5/site-packages/pandas/core/indexing.py:3295: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

Second attempt:

df.loc[df['column1'].isnull(), 'column1'] = 'Low'

Second response:

/usr/local/bin/python3.5/site-packages/pandas/core/generic.py:476: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame
mikebmassey
  • Have you tried using `.replace()`? Basically: `import numpy as np; df.replace({'column1': {np.nan: 'Low'}}, inplace=True)`? – Abdou Jan 09 '17 at 02:49
  • How did you get `df`? You may be getting the warning because `df` itself was created by slicing some other DataFrame. – BrenBarn Jan 09 '17 at 02:49
  • @BrenBarn I built `df` from a subset of `big_df`: `df = big_df[big_df['columnA'] == 'X']` – mikebmassey Jan 09 '17 at 02:50
  • @mikebmassey: That is probably why you're getting the warning. If you don't care whether `big_df` is affected, you can likely ignore the warning. You could also make a copy (`df = big_df[big_df['columnA'] == 'X'].copy()`) so that pandas will know `df` is no longer "linked" to `big_df`. – BrenBarn Jan 09 '17 at 02:54
  • @Abdou Not liking .replace() either. `df.replace({'column1': {np.nan: 'Low'}}, inplace = True) ^ SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead` – mikebmassey Jan 09 '17 at 02:54
  • @BrenBarn Yup. That did it. I have never had a need to .copy(), but will start. Thanks. Post this as the answer and I will accept it. – mikebmassey Jan 09 '17 at 02:57
  • I think @BrenBarn is right. `df` must be referencing the bigger dataframe still. – Abdou Jan 09 '17 at 02:57
  • @BrenBarn I was looking everywhere to solve this mystery. I'm glad I came across your comment. Thanks. – ba_ul Sep 05 '17 at 17:31

1 Answer


Based on your comments, it sounds like df itself was created by slicing a larger big_df. This is the reason for the warning. Any time you take a slice of a DataFrame and then try to modify it, you may get this warning, because pandas cannot tell whether modifying df will also affect the big_df it was taken from.

If you don't care about the effect on big_df you can likely ignore the warning. The other solution is to use .copy() when creating df to ensure that you get an "unencumbered" DataFrame that is not linked to the original larger one.
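As a rough sketch of the .copy() approach, using the column names from the question: the sample data below is made up for illustration, and assigning the result of fillna back to the column is just one way to do the fill, not the only one.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the asker's big_df; the values are invented.
big_df = pd.DataFrame({
    "columnA": ["X", "Y", "X", "X"],
    "column1": ["High", np.nan, np.nan, "Medium"],
})

# .copy() makes df an independent DataFrame rather than a slice of big_df,
# so later writes to df are unambiguous.
df = big_df[big_df["columnA"] == "X"].copy()

# Assign the filled column back instead of calling fillna(inplace=True)
# on the selected column.
df["column1"] = df["column1"].fillna("Low")

print(df["column1"].tolist())
print(big_df["column1"].isna().sum())
```

With the copy in place, the `.loc` form from the second attempt (`df.loc[df['column1'].isnull(), 'column1'] = 'Low'`) should also run without the warning, and big_df keeps its original missing values.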

BrenBarn