
I am trying to filter a pandas DataFrame:

df = pd.read_csv('ml_data.csv', dtype=str)

def df_filter(df):
    #df = df.copy()

    df.replace('(not set)', '(none)', inplace=True) #comment this and warning will disappear!!!
    df = df[df['device_browser'] != '(none)'] #comment this and warning will disappear!!!

    def browser_filter(s): 
        return ''.join([c for c in s if c.isalpha()])
    df['device_browser'] = df['device_browser'].apply(browser_filter)

    return df

df = df_filter(df)

And I receive this warning:


/tmp/ipykernel_2185/1710484338.py:11: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df['device_browser'] = df['device_browser'].apply(browser_filter)

But if I uncomment

#df = df.copy() 

OR comment out

df.replace('(not set)', '(none)', inplace=True) 

OR comment out

df = df[df['device_browser'] != '(none)']

OR do not wrap the filtering in the df_filter function,

the warning disappears! WHY?

I danced around the fire and beat the tambourine...

Ars ML
  • Does this answer your question? [How to deal with SettingWithCopyWarning in Pandas](https://stackoverflow.com/questions/20625582/how-to-deal-with-settingwithcopywarning-in-pandas) – rpanai Mar 06 '23 at 11:26
  • No. Why does df = df.copy() remove the warning? And what about the other cases? – Ars ML Mar 06 '23 at 11:41
  • I use this to filter out warnings: import warnings; warnings.filterwarnings('ignore'). Nothing stops you from running the code; it is only a message. – Tornike Kharitonishvili Mar 06 '23 at 11:45
  • No, I don't want to suppress the warning, I want to investigate it! – Ars ML Mar 06 '23 at 13:46

1 Answer


Because df.copy() creates a deep copy of your DataFrame: as the documentation shows, deep=True is the default.

So if you create a deep copy of your base DataFrame, the warning disappears.
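
To illustrate what a deep copy means here, the following minimal sketch uses a made-up two-row frame (the values and variable names are assumptions, not the asker's data). It shows that a change made to the result of df.copy() does not reach the original.

```python
import pandas as pd

# Hypothetical sample data, not taken from ml_data.csv.
original = pd.DataFrame({'device_browser': ['Chrome', 'Safari']})

# deep=True is the default, so the copy owns its own data.
deep = original.copy()
deep.loc[0, 'device_browser'] = 'Firefox'

# The original keeps its first value; only the copy was changed.
print(original['device_browser'].tolist())
print(deep['device_browser'].tolist())
```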

But if you don't, df.replace('(not set)', '(none)', inplace=True) modifies the DataFrame you passed in, not a copy of it.
Then df = df[df['device_browser'] != '(none)'] produces a filtered result that pandas still treats as possibly being a copy of a slice of that original, and assigning to df['device_browser'] on it is what raises the warning. So if you remove one of the two lines, it is logical that you no longer get the warning.

I invite you to check the difference between a shallow and a deep copy in this Stack Overflow question.
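
One way to apply this advice, sketched under assumptions: the frame below is invented sample data, keep_letters and the variable names are illustrative, and the function returns a new frame instead of filtering in place. It calls replace without inplace=True, takes an explicit .copy() after the boolean filter, and records warnings to check whether a SettingWithCopyWarning is emitted.

```python
import warnings

import pandas as pd


def keep_letters(s):
    # Keep only alphabetic characters, as browser_filter does in the question.
    return ''.join(c for c in s if c.isalpha())


def df_filter(df):
    # Without inplace=True, replace returns a new frame; the caller's is untouched.
    cleaned = df.replace('(not set)', '(none)')
    # Explicit copy after the boolean filter makes the result independent.
    cleaned = cleaned[cleaned['device_browser'] != '(none)'].copy()
    cleaned['device_browser'] = cleaned['device_browser'].apply(keep_letters)
    return cleaned


# Hypothetical sample data standing in for ml_data.csv.
raw = pd.DataFrame(
    {'device_browser': ['Chrome 110', '(not set)', 'Safari-16', '(none)']}
)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter('always')
    result = df_filter(raw)

# Match by class name so the check does not depend on where pandas defines it.
copy_warnings = [
    w for w in caught if w.category.__name__ == 'SettingWithCopyWarning'
]

print(result['device_browser'].tolist())
print(len(copy_warnings))
```

The explicit .copy() costs one extra allocation of the filtered rows, but it makes the result independent of the input, so the later column assignment is unambiguous.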

Adrien Riaux
  • OK, but why does this warning appear only when the code is wrapped in the df_filter function? – Ars ML Mar 06 '23 at 13:45
  • My intention was to do all filtering inplace, without any copies - deep or shallow. – Ars ML Mar 06 '23 at 13:51
  • Yes, you can, but it is good practice to copy the DataFrame before modifying it anyway. So maybe you can start by making a copy and then apply your filters. Otherwise, I am sorry, I don't have a better answer. – Adrien Riaux Mar 06 '23 at 16:10