SettingWithCopyWarning with Pandas

I have continued to encounter this warning after reading much of the documentation, including the warning link: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy

and the questions below:

How to deal with SettingWithCopyWarning in Pandas

Convert Pandas Column to DateTime

I am learning Python and am having a very hard time getting this warning to go away. I have also been unable to replicate the issue with smaller fake data; my failed attempt to reproduce it is below.

I do not get the SettingWithCopyWarning in this small example, but every time I run the same code on my full dataframe (with 30K simulated VINs and vehicle data), I get the SettingWithCopyWarning. I have read about chained indexing and understand that it is problematic. Unfortunately, I don't understand when chained indexing actually causes a problem. When do you get a view vs. a copy, and which of the notations in the examples below are actually chained indexing? Thanks for any advice on this frustrating topic.

import pandas as pd

vin_dat = pd.DataFrame({'vin' : [1, 2, 3, 4, 5],
    'purchase_date' : ["2020-03-26", "2021-04-05", "2021-12-17", "2021-12-18", "2022-01-30"],
    'nvlw_end_date' : ["2023-03-26", "2024-04-05", "2024-12-17", "2024-12-18", "2025-01-30"] })

vin_dat.loc[:, ("purchase_date", "nvlw_end_date")] = vin_dat.loc[:, ("purchase_date", "nvlw_end_date")].copy().apply(pd.to_datetime)
# DeprecationWarning: In a future version, `df.iloc[:, i] = newvals` will attempt to set the values inplace instead of always setting a new array.

vin_dat[["purchase_date", "nvlw_end_date"]] = vin_dat[["purchase_date", "nvlw_end_date"]].apply(pd.to_datetime)
# This works without an error on this sample, but gives me a SettingWithCopyWarning on my larger dataset

vin_dat['purchase_date'] = vin_dat['purchase_date'].apply(pd.to_datetime) 
# This works without an error on this sample, but gives me a SettingWithCopyWarning on my larger dataset

vin_dat['nvlw_end_date'] = pd.to_datetime(vin_dat['nvlw_end_date'])
# This works without an error on this sample, but gives me a SettingWithCopyWarning on my larger dataset
mle
  • How large is your larger dataset? Is it the same dtypes? – Learning is a mess May 24 '23 at 07:47
  • My larger dataset is 30,000 VINs and I just made the VINs integers instead of long strings, so the data types should be the same. If you are interested in looking at the Git repository where I simulate the data, it is here: https://github.com/emilysheen/reliabilityAnalysis. I simulated the data myself and then did some plots and now I'm trying to work through some statistical modeling. I think despite the warning the data is right, but I'm very frustrated I keep encountering the warning trying to follow good coding form. – mle May 24 '23 at 19:39

1 Answer

Not really an answer but I need to use this space to share a screenshot. I have not been able to reproduce the warnings with the data file you pointed me towards, see:

[Screenshot omitted]

Are we using the same data? The same pandas version?
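In case it helps narrow things down, here is a minimal sketch of the one situation I know of where the same assignment warns on one frame and not on another. It is a guess about your script, not something I found in it: the `full` frame, the `vin > 2` filter, and the helper `setting_with_copy_warnings` are all made up for illustration. The idea is that a frame obtained by filtering or slicing another frame can be flagged by pandas as a possible copy, and a later column assignment on it then triggers the warning, while the same frame taken with an explicit `.copy()` does not.

```python
import warnings

import pandas as pd

# Worth comparing between our two environments.
print(pd.__version__)

# Hypothetical stand-in for the full dataset (values are made up).
full = pd.DataFrame({
    "vin": [1, 2, 3, 4, 5],
    "purchase_date": ["2020-03-26", "2021-04-05", "2021-12-17",
                      "2021-12-18", "2022-01-30"],
})


def setting_with_copy_warnings(frame):
    """Assign a converted column to `frame`; return SettingWithCopyWarning messages."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        frame["purchase_date"] = pd.to_datetime(frame["purchase_date"])
    # Match on the class name so this also runs on pandas versions
    # that no longer emit this warning.
    return [str(w.message) for w in caught
            if w.category.__name__ == "SettingWithCopyWarning"]


# Derived by filtering: pandas may track this as a possible copy of `full`.
derived = full[full["vin"] > 2]
derived_warnings = setting_with_copy_warnings(derived)

# Same filter with an explicit .copy(): an independent frame.
independent = full[full["vin"] > 2].copy()
independent_warnings = setting_with_copy_warnings(independent)

print(len(derived_warnings), len(independent_warnings))
```

On pandas versions that still emit the warning I would expect the first count to be nonzero and the second to be zero; with Copy-on-Write enabled, both should be zero. As far as I can tell, none of the four statements in your question is chained indexing by itself, since each is a single assignment on `vin_dat`. If the warning appears on the full data, I would look at how `vin_dat` is created there, for example whether it comes from filtering or slicing another DataFrame.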

Learning is a mess