I get an error when trying to look up a value in a pandas DataFrame by timestamp. My df has a timestamp index.

My timestamp is :

time = datetime.datetime.fromtimestamp(sub_data_2[0, itime])
print(time)
2021-06-29 09:53:08.805039

My df index looks like this :

print(df.index)
DatetimeIndex(['2021-06-30 08:45:43', '2021-06-30 08:45:45',
               '2021-06-30 08:45:46', '2021-06-30 08:45:47',
               '2021-06-30 08:45:48', '2021-06-30 08:45:50',
               '2021-06-30 08:45:51', '2021-06-30 08:45:52',
               '2021-06-30 08:45:53', '2021-06-30 08:45:54',
               ...
               '2021-06-28 16:34:22', '2021-06-28 16:34:23',
               '2021-06-28 16:34:24', '2021-06-28 16:34:25',
               '2021-06-28 16:34:26', '2021-06-28 16:34:27',
               '2021-06-28 16:34:28', '2021-06-28 16:34:29',
               '2021-06-28 16:34:30', '2021-06-28 16:34:31'],
              dtype='datetime64[ns]', name='T', length=54143, freq=None)

Using the index.get_loc function :

index = df.index.get_loc(time, method='nearest')

The error is :

pandas.errors.InvalidIndexError: Reindexing only valid with uniquely valued Index objects

I understand this error can arise when concatenating DataFrames with conflicting indexes, but that is not the case here. Any ideas?

3 Answers

1

You have to remove duplicates from the index first. This is already answered here - answer
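As a minimal sketch of what "remove duplicates from the index" can look like, here is a toy frame (the data and the `value` column are made up for illustration, not taken from the question). It checks `index.is_unique` and then keeps only the first row for each timestamp:

```python
import pandas as pd

# Hypothetical toy data: the timestamp 08:45:45 appears twice in the index.
idx = pd.DatetimeIndex(
    ["2021-06-30 08:45:43", "2021-06-30 08:45:45", "2021-06-30 08:45:45"],
    name="T",
)
df = pd.DataFrame({"value": [1.0, 2.0, 3.0]}, index=idx)

# A non-unique index is what triggers the InvalidIndexError on nearest lookups.
print(df.index.is_unique)

# Keep the first row for each timestamp and discard the later repeats.
deduped = df[~df.index.duplicated(keep="first")]
print(deduped.index.is_unique)
```

Whether `keep="first"` is the right choice depends on your data; `keep="last"` or aggregating the repeated rows may suit you better.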

1

You can find the duplicated index like this:

df[df.index.duplicated(keep=False)]
Andreas
  • @KarlMontalban, please elaborate. This should not solve it, it should help you find the duplicates which are responsible for the error. – Andreas Sep 06 '21 at 11:29
  • In the end I had to remove the duplicates and sort my index. Thanks for your help! – Karl Montalban Sep 07 '21 at 12:58
  • @KarlMontalban, glad I could help. If the answer was helpful, please consider upvoting and accepting it. It helps future readers and earns me a few points, and you get some too for accepting the answer. – Andreas Sep 07 '21 at 13:56
0

I had two problems: duplicate timestamps in the index, and the index was not sorted.

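df = df.sort_index()
# drop_duplicates() compares column values; the error is about repeated index labels
df = df[~df.index.duplicated(keep='first')]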
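For completeness, a rough end-to-end sketch on toy data (the frame, the `value` column and the names `clean`, `pos` and `nearest_row` are made up for illustration). It deduplicates on the index with `index.duplicated`, since repeated index labels are what the error is about, then sorts and does the nearest lookup. As far as I know the `method` argument of `get_loc` was deprecated in pandas 1.4 and removed in 2.0, so the sketch uses `get_indexer` instead, which takes a list-like target:

```python
import datetime

import pandas as pd

# Hypothetical toy data: unsorted, with one repeated timestamp.
idx = pd.DatetimeIndex(
    [
        "2021-06-30 08:45:43",
        "2021-06-28 16:34:22",
        "2021-06-29 09:53:10",
        "2021-06-29 09:53:10",
    ],
    name="T",
)
df = pd.DataFrame({"value": [10, 20, 30, 40]}, index=idx)

# Nearest lookups need a unique, monotonic index: drop repeats, then sort.
clean = df[~df.index.duplicated(keep="first")].sort_index()

time = datetime.datetime(2021, 6, 29, 9, 53, 8, 805039)

# get_indexer returns an array of integer positions, one per target value.
pos = clean.index.get_indexer([pd.Timestamp(time)], method="nearest")[0]
nearest_row = clean.iloc[pos]
print(pos, nearest_row["value"])
```

On pandas versions older than 2.0, `clean.index.get_loc(time, method='nearest')` should also work once the index is unique and sorted.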