I have a df which contains order data:

date        order_id        cost        category
2021-07-12  10              50          A
2021-07-12  10              57          B
2021-08-15  15              76          C
2022-01-11  5               67          C

In reality I have about 40 columns. I get an error when I try to run:

df.date = pd.to_datetime(df.date)

InvalidIndexError: Reindexing only valid with uniquely valued Index objects

I've read many answers and most of them say it has to do with NaN values or duplicate columns. So I checked:

sum(df.date.isna())
# 0

len(df.columns) == len(set(df.columns))
# True

So I have no NaN values and no duplicate columns. The other strange thing is that if I close VSCode and restart everything, the error sometimes does not appear for the same dataset. Is this some kind of bug? This time I restarted everything multiple times, but the error persists.

I checked this answer and others, and tried df = df.reset_index(), which did not help. So I am stuck and unable to find the reason for this error, which seems really strange.

Update

I tried df['date'] = pd.to_datetime(df['date']) which returns the same error.

I checked the following:

{type(x) for x in df.date}

# {datetime.date, pandas._libs.tslibs.timestamps.Timestamp, str}

So the column holds a mix of types, which could be the problem.
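To see how the types are distributed, the set above can be turned into a count per type. This is only a sketch: the small frame below is a hypothetical stand-in built to reproduce the three types reported above, not the real 40-column data.

```python
import datetime
import pandas as pd

# Hypothetical sample that mixes the three types found in the real column.
df = pd.DataFrame({
    "date": [
        datetime.date(2021, 7, 12),
        pd.Timestamp("2021-07-12"),
        "2021-08-15",
        "2022-01-11",
    ],
    "order_id": [10, 10, 15, 5],
})

# Count how many values of each Python type the column holds.
type_counts = df["date"].map(lambda x: type(x).__name__).value_counts()
print(type_counts)
```

A column that holds more than one type here is stored with the object dtype, so the values are not parsed as one uniform kind of input.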

Jonas Palačionis

1 Answer

The column mixes datetime.date, Timestamp and str values, so you can convert all values to strings before parsing:

df['date'] = pd.to_datetime(df['date'].astype(str))
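
A possible variation, not the method above: after astype(str) the strings may not all share one format (a Timestamp becomes '2021-07-12 00:00:00' while a date becomes '2021-07-12'), and newer pandas versions can be stricter about that. The sketch below converts each value on its own with pd.Timestamp instead. The sample frame is hypothetical and only mirrors the mixed types from the question.

```python
import datetime
import pandas as pd

# Hypothetical sample with the same mix of types as in the question.
df = pd.DataFrame({
    "date": [
        datetime.date(2021, 7, 12),
        pd.Timestamp("2021-07-12"),
        "2021-08-15",
        "2022-01-11",
    ],
})

# Convert every value individually, so no shared string format is needed.
df["date"] = pd.to_datetime(df["date"].map(lambda x: pd.Timestamp(x)))
print(df["date"].dtype)
```

This is slower than a vectorized parse on large frames, so it is mainly useful when the column really does contain several types.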
jezrael