
I am trying to write a function to deseasonalize any pandas DataFrame. It always works, but I still get "SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame". Is this not the correct way to set a value?

def deseasonalize(df):
    # Function to deseasonalize any pandas dataframe
    df.loc[df.index.month==1]=df.loc[df.index.month==1]-df.loc[df.index.month==1].mean()
    df.loc[df.index.month==2]=df.loc[df.index.month==2]-df.loc[df.index.month==2].mean()
    df.loc[df.index.month==3]=df.loc[df.index.month==3]-df.loc[df.index.month==3].mean()
    df.loc[df.index.month==4]=df.loc[df.index.month==4]-df.loc[df.index.month==4].mean()
    df.loc[df.index.month==5]=df.loc[df.index.month==5]-df.loc[df.index.month==5].mean()
    df.loc[df.index.month==6]=df.loc[df.index.month==6]-df.loc[df.index.month==6].mean()
    df.loc[df.index.month==7]=df.loc[df.index.month==7]-df.loc[df.index.month==7].mean()
    df.loc[df.index.month==8]=df.loc[df.index.month==8]-df.loc[df.index.month==8].mean()
    df.loc[df.index.month==9]=df.loc[df.index.month==9]-df.loc[df.index.month==9].mean()
    df.loc[df.index.month==10]=df.loc[df.index.month==10]-df.loc[df.index.month==10].mean()
    df.loc[df.index.month==11]=df.loc[df.index.month==11]-df.loc[df.index.month==11].mean()
    df.loc[df.index.month==12]=df.loc[df.index.month==12]-df.loc[df.index.month==12].mean()
    return df
peter_wx
  • Does this answer your question? [How to deal with SettingWithCopyWarning in Pandas](https://stackoverflow.com/questions/20625582/how-to-deal-with-settingwithcopywarning-in-pandas) – Amin S Jan 26 '23 at 16:24
  • I think adding `df = df.copy()` at the beginning of the function might solve the issue. In a function, you want to modify a copy of the argument, not the original. It is only when the result is returned that the user should choose whether or not to overwrite the original: `df = deseasonalize(df)`. – Florian Fasmeyer Jan 26 '23 at 17:56
  • I know your question was about the `SettingWithCopyWarning`, but your approach is somewhat of an anti-pattern because you're hard-coding the months, and it's also inefficient to repeatedly slice your df using `.loc`. Instead, you can use built-in methods like `groupby` and `transform`, which are optimized for dataframes, to accomplish the same task. See [this Q&A](https://stackoverflow.com/questions/60716814/pandas-subtracting-after-groupby-mean) for a pretty elegant (and also efficient) solution. – Derek O Jan 27 '23 at 01:42
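
The `groupby`/`transform` idea from the comments could look roughly like the sketch below. It assumes the frame has a `DatetimeIndex` and only numeric columns; the sample data (`raw`, `temp`) is made up for illustration. Because the function builds and returns a new frame instead of assigning back into `df`, it should also sidestep the `SettingWithCopyWarning`, though I have not checked this against the asker's actual data.

```python
import numpy as np
import pandas as pd


def deseasonalize(df):
    """Subtract each calendar month's mean from every row in that month.

    Assumes df has a DatetimeIndex and numeric columns.
    """
    # transform("mean") returns a frame with the same index as df, where
    # each row holds the mean of that row's calendar month (all years pooled).
    monthly_mean = df.groupby(df.index.month).transform("mean")
    # The subtraction creates a new DataFrame; df itself is left untouched.
    return df - monthly_mean


# Hypothetical example data: two years of daily values.
idx = pd.date_range("2021-01-01", "2022-12-31", freq="D")
raw = pd.DataFrame({"temp": np.arange(len(idx), dtype=float)}, index=idx)

anomalies = deseasonalize(raw)
```

Since the original frame is not modified, the caller decides whether to overwrite it, e.g. `raw = deseasonalize(raw)`.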
