I am trying to adjust a dataframe by appending columns and changing values but get the well known warning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
I've changed the code but keep getting the error. Am I doing it correctly and do I need to suppress the warning (if so, how can I do it on the specific line)?
The code:
def append_columns(df: pd.DataFrame) -> pd.DataFrame:
"""Create additional columns based on existing information in DataFrame"""
for col in TIMEWINDOWS:
df.loc[:, col + "_time"] = df[col].dt.time # warning here
df["da_datetime"] = pd.to_datetime(df["da_time"], format="%Y-%m-%dT%H:%M:%S").dt.tz_convert(config.TIME_ZONE) # warning here
df["da_time"] = (
df["da_datetime"] - df["da_datetime"].dt.normalize() # warning here
)
df["ud"] = pd.to_datetime(df["psb_time"], format="%Y-%m-%dT%H:%M:%SZ").dt.dayofweek # warning here
df["ud"] = df["ud"].astype(int) # warning here
df["cd"] = df["ud"] # warning here
df.loc[df["psb_time"].dt.hour < 6, "cd"] -= 1
df["cd"] %= 7 # warning here
df["cd"] = df["cd"].astype(int) # warning here
return df
if __name__ == "__main__":
df = pd.read_csv(...)
df = df.pipe(...).pipe(append_columns).pipe(...)
On all of the lines I've tried replacing df[col] with df.loc[:, col] (the preferred method according to: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy) but I keep getting the warning.
Am I doing it correctly? If so: can/do I need to supress these warnings on a per line basis? Does it matter? (I'm overwriting anyways)
I've obviously read:
- https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
- How to deal with SettingWithCopyWarning in Pandas
and I think I understand but can't get rid of these warnings..