I have a created a dataframe by reading several excel files using pandas.read_excel and by concatenating them with pandas.concat. let us say the dataframe is named mydata and the first two columns are datetime64[ns] and the rest float64 types. These correspond to date (e.g 2022-05-31 without information on the hour) and time (e.g. 22:00:00 with no information on the date)
I am trying to use the first two columns of this dataframe into the datetime module to combine them and I have a very hard time
So my code is import pandas import datetime
#Reading the excel files
mydata=pandas.DataFrame() #empty data frame
for file in fpath.glob('*.xlsx'):
tempdata=pandas.read_excel(file, skiprows=[1])
mydata = pandas.concat([mydata, tempdata], axis=0)
#Convert the hour strings into datetime
mydata['Hour'] = pandas.to_datetime(mydata['Hour'], format='%H:%M:%S')
# This does not work
datetime.datetime.combine(mydata['Date'] ,mydata['Hour'] )
TypeError: combine() argument 1 must be datetime.date, not Series
# This does not work either
date_series=mydata.Date.array
time_series=mydata.Hour.array
datetime.datetime.combine(date_series.date ,time_series.time )
TypeError: combine() argument 1 must be datetime.date, not numpy.ndarray
# Extracting lists wont help either. They are not datetime anymore
date_list=mydata.Date.to_list()
time_list=mydata.Hour.to_list()
# This will work but it is slow and bad coding (as far as I am concerned)
date_series=mydata.Date.array
time_series=mydata.Hour.array
for i in range(0, len(date_series)): # making the conversion one by one
mydata.loc[i,'Date']=datetime.datetime.combine(date_series.date[i] ,time_series.time[i] )
The question is whether there is a better way to do this? Or to reformulate: How to extract a datetime object from a dataframe. So far I have only managed to extract series, arrays and lists