
In Python, I am trying to remove the rows of a dataframe whose dates are earlier than the earliest date in another dataframe, but the comparison does not work.

Here are my two dataframes. print(MeteoCH.head()) gives:

                       TempAvg  TempMin  TempMax
Date                                                                      
2021-11-15 00:00:00      4.4      4.3      4.5     
2021-11-15 01:00:00      4.3      4.3      4.3     
2021-11-15 02:00:00      4.1      4.1      4.2     
2021-11-15 03:00:00      4.0      3.8      4.1    
2021-11-15 04:00:00      3.6      3.4      3.8    

And print(PicoLog.head()) gives:

                           Temp1   Temp2   Temp3   
Date                                                                        
2021-11-15 18:34:18+01:00  21.268  21.671  21.190     
2021-11-15 18:34:20+01:00  21.266  21.673  21.194     
2021-11-15 18:34:22+01:00  21.270  21.680  21.194     
2021-11-15 18:34:24+01:00  21.263  21.673  21.180    
2021-11-15 18:34:26+01:00  21.262  21.672  21.185

If I execute the following command:

MeteoCH.drop(MeteoCH[MeteoCH.index < PicoLog.index.min()], inplace=True)

it raises the following error:

TypeError: Invalid comparison between dtype=datetime64[ns] and Timestamp

Why, and how can I solve it?

I tried to "convert" it somehow, but that did not work.

Can someone help me, please?

Elfo2285

1 Answer

It is simpler to keep the rows you want, either by filtering with greater-or-equal or by inverting the < mask:

MeteoCH[MeteoCH.index >= PicoLog.index.min()]

MeteoCH[~(MeteoCH.index < PicoLog.index.min())]

Your drop-based approach also works if you pass the matching index labels (MeteoCH.index[...]) instead of the filtered DataFrame, but in my opinion it is overcomplicated:

MeteoCH.drop(MeteoCH.index[MeteoCH.index < PicoLog.index.min()], inplace=True)
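
As a minimal sketch of how these three forms relate, the snippet below builds a small hypothetical frame (`df` and `cutoff` are illustrative names, not from the question) with a tz-naive index and checks that all three give the same rows. It assumes the index contains no NaT values.

```python
import pandas as pd

# Hypothetical hourly frame with a tz-naive DatetimeIndex.
df = pd.DataFrame(
    {"TempAvg": [4.4, 4.3, 4.1, 4.0]},
    index=pd.DatetimeIndex(
        [
            "2021-11-15 00:00",
            "2021-11-15 01:00",
            "2021-11-15 02:00",
            "2021-11-15 03:00",
        ],
        name="Date",
    ),
)
cutoff = pd.Timestamp("2021-11-15 02:00")  # tz-naive, like the index

# 1) Keep rows at or after the cutoff.
kept_ge = df[df.index >= cutoff]

# 2) Same selection, written as the inverse of "earlier than cutoff".
kept_inverted = df[~(df.index < cutoff)]

# 3) Drop the labels that are earlier than the cutoff (returns a new frame).
dropped = df.drop(df.index[df.index < cutoff])

print(kept_ge.equals(kept_inverted), kept_ge.equals(dropped))
```

Note that `drop` expects index labels, which is why the boolean mask is applied to `df.index` rather than to the frame itself.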

EDIT:

The original problem was the timezone offset on PicoLog.index. The solution is DatetimeIndex.tz_localize(None), which removes the timezone and keeps the local wall-clock time:

PicoLog.index = PicoLog.index.tz_localize(None)
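
The fix can be sketched end to end as follows. The frames `meteo` and `pico` are small hypothetical stand-ins for MeteoCH and PicoLog, with made-up values. The sketch assumes the timestamps in MeteoCH are already in the same local time as PicoLog.

```python
import pandas as pd

# Hypothetical stand-in for MeteoCH: tz-naive index.
meteo = pd.DataFrame(
    {"TempAvg": [4.4, 3.9, 5.1, 4.8]},
    index=pd.DatetimeIndex(
        [
            "2021-11-15 17:00",
            "2021-11-15 18:00",
            "2021-11-15 19:00",
            "2021-11-15 20:00",
        ],
        name="Date",
    ),
)

# Hypothetical stand-in for PicoLog: tz-aware index with a fixed +01:00 offset.
pico = pd.DataFrame(
    {"Temp1": [21.268, 21.266]},
    index=pd.DatetimeIndex(
        ["2021-11-15 18:34:18+01:00", "2021-11-15 18:34:20+01:00"],
        name="Date",
    ),
)

print(meteo.index.dtype)  # tz-naive
print(pico.index.dtype)   # tz-aware

# Comparing a tz-naive index with a tz-aware Timestamp is what fails
# in the question (observed there on pandas 1.3.3).
try:
    meteo.index < pico.index.min()
except TypeError as exc:
    print("TypeError:", exc)

# Drop the timezone but keep the local wall-clock time, then filter.
pico.index = pico.index.tz_localize(None)
filtered = meteo[meteo.index >= pico.index.min()]
print(filtered)
```

If MeteoCH were recorded in UTC rather than local time, `tz_convert(None)` might be the better call, since it converts to UTC before removing the timezone.
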
jezrael
  • Hi, thanks for your answer, but it does not work. If I type `MeteoCH[MeteoCH.index >= PicoLog.index.min()]`, I still have `TypeError: Invalid comparison between dtype=datetime64[ns] and Timestamp` – Elfo2285 Nov 24 '21 at 09:10
  • @Elfo2285 - What is your pandas version? – jezrael Nov 24 '21 at 09:11
  • ```$ pip show pandas Name: pandas Version: 1.3.3``` – Elfo2285 Nov 24 '21 at 09:13
  • @Elfo2285 - What is `print (MeteoCH.index.dtype)` and `print (PicoLog.index.dtype)` ? – jezrael Nov 24 '21 at 09:16
  • `print(MeteoCH.index.dtype)` --> `datetime64[ns]` `print(PicoLog.index.dtype)` --> `datetime64[ns, pytz.FixedOffset(60)]` – Elfo2285 Nov 24 '21 at 09:18
  • By the way, I should remove this FixedOffset. I do not know why it appears when I read my Excel file... – Elfo2285 Nov 24 '21 at 09:19
  • @Elfo2285 - it should help. Try `PicoLog.index = PicoLog.index.tz_localize(None)` (like mentioned [here](https://stackoverflow.com/questions/49198068/how-to-remove-timezone-from-a-timestamp-column-in-a-pandas-dataframe)) – jezrael Nov 24 '21 at 09:21