SettingWithCopyWarning with Pandas
I keep encountering this warning even after reading much of the documentation, including the page linked from the warning itself: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
and the questions below:
How to deal with SettingWithCopyWarning in Pandas
Convert Pandas Column to DateTime
I am learning Python and am having a very hard time getting this warning to go away. I have also been unable to reproduce the issue with a smaller fake dataset; my failed attempt is below.
I do not get the SettingWithCopyWarning in this small example, but every time I run the same code on my full dataframe (with 30K simulated VINs and vehicle data), I do. I have read about chained indexing and understand that it is problematic. Unfortunately, I don't understand when chained indexing actually causes a problem. That is, when do you get a view versus a copy, and which of the notations in the examples below are actually chained indexing? Thanks for any advice on this frustrating topic.
import pandas as pd
vin_dat = pd.DataFrame({'vin' : [1, 2, 3, 4, 5],
                        'purchase_date' : ["2020-03-26", "2021-04-05", "2021-12-17", "2021-12-18", "2022-01-30"],
                        'nvlw_end_date' : ["2023-03-26", "2024-04-05", "2024-12-17", "2024-12-18", "2025-01-30"] })
vin_dat.loc[:, ("purchase_date", "nvlw_end_date")] = vin_dat.loc[:, ("purchase_date", "nvlw_end_date")].copy().apply(pd.to_datetime)
# The line above gives: DeprecationWarning: In a future version, `df.iloc[:, i] = newvals` will attempt to set the values inplace instead of always setting a new array.
vin_dat[["purchase_date", "nvlw_end_date"]] = vin_dat[["purchase_date", "nvlw_end_date"]].apply(pd.to_datetime)
# The line above runs without a warning on this sample, but gives me a SettingWithCopyWarning on my larger dataset
vin_dat['purchase_date'] = vin_dat['purchase_date'].apply(pd.to_datetime)
# The line above runs without a warning on this sample, but gives me a SettingWithCopyWarning on my larger dataset
vin_dat['nvlw_end_date'] = pd.to_datetime(vin_dat['nvlw_end_date'])
# The line above runs without a warning on this sample, but gives me a SettingWithCopyWarning on my larger dataset
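For comparison, here is a minimal sketch of one difference between the sample and the real data that might matter. It assumes, and this is only a guess, that the full dataframe was itself created by filtering or slicing another dataframe before the date conversion. The names full, subset, subset_copy and the helper collect_warning_names are hypothetical and exist only for this illustration. Whether the warning shows up at all appears to depend on the pandas version, since versions with Copy-on-Write enabled may not emit it.

```python
import warnings

import pandas as pd

# Hypothetical stand-in for the larger source data.
full = pd.DataFrame({
    "vin": [1, 2, 3, 4, 5],
    "purchase_date": ["2020-03-26", "2021-04-05", "2021-12-17",
                      "2021-12-18", "2022-01-30"],
})


def collect_warning_names(frame):
    """Convert purchase_date on `frame` and return the names of any warnings emitted."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        frame["purchase_date"] = pd.to_datetime(frame["purchase_date"])
    return [w.category.__name__ for w in caught]


# Assumed upstream step: the working frame is a filtered subset of another frame.
subset = full[full["vin"] > 2]
names_without_copy = collect_warning_names(subset)

# Same filter, followed by an explicit copy so the result is independent of `full`.
subset_copy = full[full["vin"] > 2].copy()
names_with_copy = collect_warning_names(subset_copy)

# The first list may contain 'SettingWithCopyWarning', depending on the pandas version.
print(names_without_copy)
print(names_with_copy)
```

If the real data went through a step like the filter above, that would explain why the same assignment is silent on a freshly constructed dataframe but warns on the larger one.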