I know that a value should not be set on a view of a pandas DataFrame, and I'm not doing that, but I'm still getting the warning about it (SettingWithCopyWarning). I have a function like this:

def do_something(df):
    # id(df) is xxx240
    idx = get_skip_idx(df)  # another function that returns a boolean series
    if any(idx):
        df = df[~idx]
    # id(df) is xxx744, df is now a local variable which is a copy of the input argument
    assert not df._is_view  # This doesn't fail, I'm not having a view
    df['date_fixed'] = pd.to_datetime(df['old_date'].str[:10], format='%Y-%m-%d')
    # I'm getting the warning here which doesn't make any sense to me

I'm using pandas 1.4.1. This looks like a bug to me, but I wanted to confirm that I'm not missing anything before filing a ticket.

BigBen
anishtain4

1 Answer

My understanding is that _is_view is not a reliable check here: it can return False while pandas still treats the result of df[~idx] as derived from the original dataframe, which is what triggers the warning.

One workaround is to replace df[~idx] with df[~idx].copy():

import pandas as pd

df = pd.DataFrame(
    {
        "value": [1, 2, 3],
        "old_date": ["2022-04-20 abcd", "2022-04-21 efgh", "2022-04-22 ijkl"],
    }
)


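def do_something(df, idx):
    if any(idx):
        # .copy() makes the filtered frame an explicit, independent object
        df = df[~idx].copy()
    # As in the question, the column is added whether or not rows were skipped
    df["date_fixed"] = pd.to_datetime(df["old_date"].str[:10], format="%Y-%m-%d")
    return df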


print(do_something(df, pd.Series({0: True, 1: False, 2: False})))
# No warning, output:
#    value         old_date date_fixed
# 1      2  2022-04-21 efgh 2022-04-21
# 2      3  2022-04-22 ijkl 2022-04-22
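
If you would rather not assign into the filtered frame at all, a possible alternative is to build the new column with DataFrame.assign, which returns a new dataframe. This is only a sketch: do_something_assign and skip are hypothetical names I made up for illustration, and as far as I know assign works on a copy internally, so it should not raise the warning either.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "value": [1, 2, 3],
        "old_date": ["2022-04-20 abcd", "2022-04-21 efgh", "2022-04-22 ijkl"],
    }
)


def do_something_assign(df, idx):
    # Hypothetical variant of do_something: filter first, then let assign()
    # return a new frame with the extra column instead of assigning in place.
    filtered = df[~idx]
    return filtered.assign(
        date_fixed=pd.to_datetime(filtered["old_date"].str[:10], format="%Y-%m-%d")
    )


# Hypothetical boolean mask marking the rows to skip
skip = pd.Series([True, False, False])
result = do_something_assign(df, skip)
print(result)
print("date_fixed" in df.columns)  # the input dataframe is left untouched
```

The caller's dataframe is never modified with this approach, at the cost of always creating a new object.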
Laurent