I have a df which contains order data:
date        order_id  cost  category
2021-07-12  10        50    A
2021-07-12  10        57    B
2021-08-15  15        76    C
2022-01-11  5         67    C
In reality I have about 40 columns. I get an error when I try to run:
df.date = pd.to_datetime(df.date)
InvalidIndexError: Reindexing only valid with uniquely valued Index objects
I've read many answers and most of them say it has to do with NaN
values or duplicate columns. So I checked:
sum(df.date.isna())
# 0
len(df.columns) == len(set(df.columns))
# True
So I don't have NaN values and I have no duplicate columns. The other strange thing is that if I close VSCode and restart everything, the error sometimes does not appear for the same dataset. Is this some kind of bug? This time I restarted everything multiple times, but the error persists.
I checked this answer and others, including the first answer, and tried df = df.reset_index()
which did not help. So I am stuck and unable to find the reason for this error, which seems really strange.
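Since the error message talks about uniquely valued Index objects, here is a minimal sketch of how both the row index and the column labels can be checked for duplicates. The small frame below is a hypothetical stand-in for the real 40-column data, and the variable names are only illustrative.

```python
import pandas as pd

# Hypothetical sample standing in for the real 40-column frame
df = pd.DataFrame({
    "date": ["2021-07-12", "2021-07-12", "2021-08-15", "2022-01-11"],
    "order_id": [10, 10, 15, 5],
    "cost": [50, 57, 76, 67],
    "category": ["A", "B", "C", "C"],
})

# True when no row label appears more than once
index_unique = df.index.is_unique

# True when no column label appears more than once
columns_unique = df.columns.is_unique

# Names of any column labels that are repeated (empty list if none)
dup_columns = df.columns[df.columns.duplicated()].tolist()

print(index_unique, columns_unique, dup_columns)
```

If both flags are True on the real data, duplicate labels in df itself are probably not the cause.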
Update
I also tried df['date'] = pd.to_datetime(df['date'])
which raises the same error.
I checked the following:
{type(x) for x in df.date}
# {datetime.date, pandas._libs.tslibs.timestamps.Timestamp, str}
So the column holds a mix of three different types, which could be the problem.
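A minimal sketch of how the mixed types could be counted and then normalized one value at a time. It assumes the column only holds str, datetime.date and Timestamp values, as the type check above suggests; the sample frame is hypothetical, and this is an idea to try rather than a confirmed fix for the error.

```python
import datetime as dt
import pandas as pd

# Hypothetical sample reproducing the three types found in the column
df = pd.DataFrame({
    "date": [
        dt.date(2021, 7, 12),        # datetime.date
        pd.Timestamp("2021-07-12"),  # pandas Timestamp
        "2021-08-15",                # str
        "2022-01-11",                # str
    ],
    "order_id": [10, 10, 15, 5],
})

# Count how many values of each Python type the column holds
type_counts = df["date"].map(type).value_counts()
print(type_counts)

# Convert every value on its own to a Timestamp, so the column ends up
# with a single type, then let pandas settle on a datetime dtype
normalized = df["date"].map(lambda value: pd.Timestamp(value))
df["date"] = pd.to_datetime(normalized)

print(df["date"].dtype)
```

pd.to_datetime also accepts a cache argument, so calling it with cache=False on the original column may be worth trying as well, though whether that avoids this particular error is an assumption.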