I have a DataFrame df_holiday and I want to select the row that contains the next date after today's date, in order to extract the holiday name.
+----------+---------------------+
| date | name |
+----------+---------------------+
| 01012019 | Neujahr |
| 06012019 | Heilige Drei Könige |
| 19042019 | Karfreitag |
| 22042019 | Ostermontag |
| 01052019 | Tag der Arbeit |
| 10062019 | Pfingstmontag |
+----------+---------------------+
If I do type(df_holiday['date'][0])
it outputs str
Now I want to convert the column into datetime format using this:
import datetime
df_holiday['date'] = df_holiday['date'].apply(lambda x: datetime.datetime.strptime(x, '%d%m%Y'))
First question: the documentation says strptime returns a datetime, but in my case I get a pandas Timestamp. Why?
type(df_holiday['date'][0])
pandas._libs.tslibs.timestamps.Timestamp
print(repr(df_holiday['date'][0]))
Timestamp('2019-01-01 00:00:00')
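For reference, here is a minimal sketch of what seems to be going on, assuming pandas stores the parsed values in a datetime64 column and hands back single elements as pd.Timestamp, which is a subclass of datetime.datetime. The small frame df_demo is a hypothetical stand-in for df_holiday, and pd.to_datetime is used as a vectorised alternative to the apply/strptime call.

```python
import datetime
import pandas as pd

# Hypothetical stand-in for df_holiday with two of the rows shown above.
df_demo = pd.DataFrame({
    "date": ["01012019", "06012019"],
    "name": ["Neujahr", "Heilige Drei Könige"],
})

# Parse the whole column at once; the result is a datetime64 column.
df_demo["date"] = pd.to_datetime(df_demo["date"], format="%d%m%Y")

first = df_demo["date"].iloc[0]

# A Timestamp is also a datetime.datetime, so this check passes.
is_datetime = isinstance(first, datetime.datetime)

# Explicit conversions to the standard-library types, if they are needed.
as_plain = first.to_pydatetime()  # datetime.datetime
as_date = first.date()            # datetime.date
```

So the values are still datetimes in the standard-library sense; pandas just wraps them in its own subclass when they are stored in a column.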
I found a post by dawg that suggested the following routine:
min([d for d in df_holiday['date'] if d> datetime.date.today()], key=lambda s:
datetime.datetime.strptime(s, "%d%m%Y").date()-datetime.date.today())
Since the values in the date column are timestamps I got a
TypeError: Cannot compare type 'Timestamp' with type 'date'
I think this would work if the date column could be converted to plain datetime objects rather than Timestamps. How can I achieve this? Or is there a better solution for my case?
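One direction that might work without converting anything back, sketched under the assumption that the date column is already datetime64: compare against a pd.Timestamp instead of datetime.date, then take the smallest remaining date. The helper next_holiday, its today parameter, and the frame df_demo are hypothetical names used only for illustration.

```python
import pandas as pd

def next_holiday(df, today=None):
    """Return the row with the earliest date strictly after `today`, or None."""
    # Default to the current day at midnight; a fixed value can be passed for testing.
    today = pd.Timestamp.today().normalize() if today is None else pd.Timestamp(today)
    # Timestamp-to-Timestamp comparison avoids the TypeError raised with datetime.date.
    upcoming = df.loc[df["date"] > today]
    if upcoming.empty:
        return None
    # idxmin gives the index label of the earliest upcoming date.
    return upcoming.loc[upcoming["date"].idxmin()]

# Hypothetical stand-in for df_holiday with a few of the rows shown above.
df_demo = pd.DataFrame({
    "date": pd.to_datetime(
        ["01012019", "06012019", "19042019", "22042019"], format="%d%m%Y"
    ),
    "name": ["Neujahr", "Heilige Drei Könige", "Karfreitag", "Ostermontag"],
})

row = next_holiday(df_demo, today="2019-01-10")
holiday_name = row["name"]
```

With today fixed to 10 January 2019, the selected row in this sketch is the Karfreitag one; calling next_holiday(df_demo) without the second argument uses the actual current date.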