0

I have a pandas DataFrame which includes a datetime column and I want to filter the data frame between the current hour and 10 hours ago. I have tried different ways to do it but still I cannot handle it. Because when I want to use pandas, the column type is Series and I can't use timedelta to compare them. If I use a for loop to compare the column as a string to my time interval, it is not efficient.

The table is like this:

enter image description here

And I want to filter the 'dateTime' column between current time and 10 hours ago, then filter based on 'weeks' > 80.

I have tried these codes as well But they have not worked:

filter_criteria = main_table['dateTime'].sub(today).abs().apply(lambda x: x.hours <= 10)
main_table.loc[filter_criteria]

This returns an error:

TypeError: unsupported operand type(s) for -: 'str' and 'datetime.datetime'

Similarly this code has the same problem:

main_table.loc[main_table['dateTime'] >= (datetime.datetime.today() - pd.DateOffset(hours=10))]

And:

main_table[(pd.to_datetime('today') - main_table['dateTime'] ).dt.hours.le(10)]

In all of the code above main_table is the name of my data frame.
How can I filter them?

mkrieger1
  • 19,194
  • 5
  • 54
  • 65
V.Nouri
  • 207
  • 3
  • 15
  • have you converted your datetime column to actual datetime object? you can do it using `to_datetime()` – Babak Fi Foo Aug 26 '21 at 07:28
  • what should I import to do it? ` import datetime check = datetime.to_datetime(main_table['dateTime']) print(check - - timedelta(hours=10)) ` – V.Nouri Aug 26 '21 at 07:34
  • it is one of the pandas methods. you can see how it is done in here: https://www.geeksforgeeks.org/convert-the-column-type-from-string-to-datetime-format-in-pandas-dataframe/ – Babak Fi Foo Aug 26 '21 at 07:39
  • I have tried this: check = pd.to_datetime(main_table['dateTime']) But I still do not know how I can filter the dataframe between two hours! – V.Nouri Aug 26 '21 at 07:45
  • I have print the type of check to be sure about the results, unfortunately it is still pandas series: – V.Nouri Aug 26 '21 at 07:49

1 Answers1

0

First you need to make sure that your datatype in datetime column is correct. you can check it by using:

main_table.info()

If it is not datetime (i.e, object) convert it:

# use proper formatting if this line does not work
main_table['dateTime'] = pd.to_datetime(main_table['dateTime']) 

Then you need to find the datetime object of ten hour before current time (ref):

from datetime import datetime, timedelta
date_time_ten_before = datetime.now() - timedelta(hours = 10)

All it remains is to filter the column:

main_table_10 = main_table[main_table['dateTime'] >= date_time_ten_before]
Babak Fi Foo
  • 926
  • 7
  • 17