My DataFrame has multiple columns containing NaN values. I want to replace the NaN values in only one specific column (MarkDown1) with 0.

The statement I wrote is:

    data1.loc[:, ['MarkDown1']] = data1.loc[:, ['MarkDown1']].fillna(0)

My statement is raising a warning:

    C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\indexing.py:965: SettingWithCopyWarning: 
    A value is trying to be set on a copy of a slice from a DataFrame.
    Try using .loc[row_indexer,col_indexer] = value instead

    See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
    self.obj[item] = s

Please do not mark my question as a duplicate: I have read the documentation and previous questions and tried to implement their suggestions. The suggestion was to use .loc, and my code above already uses .loc, but I still get the warning. Please suggest the correct syntax to eliminate it.

Pavan

1 Answer

The source of your problem is that you probably created data1 as a view (a slice) of another DataFrame.

The result is that:

  • data1 is a separate DataFrame,
  • but it shares its data buffer with another (source) DataFrame.

Example:

  1. I created a DataFrame named df with the following content:

       Xxx  MarkDown1
    0    1       10.0
    1    2       20.0
    2    3        NaN
    3    4       30.0
    4    5       40.0
    
  2. Then I created data1 as a subset of df:

    data1 = df[df.Xxx < 5]
    

    data1 now contains (or rather presents) the first 4 rows of df.

  3. When I executed your statement on this data1, the warning you quoted was displayed.

To avoid the warning, create data1 as a separate DataFrame with its own data buffer:

    data1 = df[df.Xxx < 5].copy()

This time, when you run your statement, no warning is raised.
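
For reference, here is a minimal end-to-end sketch of the fix. The sample values are simply the ones from the example table above, and the plain column assignment is my own assumption of a simpler spelling of your .loc statement, not something taken from your code:

```python
import numpy as np
import pandas as pd

# Sample data mirroring the example table in this answer.
df = pd.DataFrame({
    "Xxx": [1, 2, 3, 4, 5],
    "MarkDown1": [10.0, 20.0, np.nan, 30.0, 40.0],
})

# Take the subset as an independent copy instead of a slice of df.
data1 = df[df.Xxx < 5].copy()

# Replace NaN in the MarkDown1 column only.
data1["MarkDown1"] = data1["MarkDown1"].fillna(0)

print(data1)
```

Because data1 is a copy, the NaN in df itself is left untouched. If you actually want df changed, run fillna on df directly.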

Valdi_Bo