I have the following pandas dataset containing information about flights
YEAR int64
MONTH int64
DAY int64
DAY_OF_WEEK int64
AIRLINE object
FLIGHT_NUMBER int64
TAIL_NUMBER object
DEPARTURE_TIME object
ARRIVAL_TIME object
ELAPSED_TIME float64
CANCELLED int64
dtype: object
and I want to calculate the time difference in minutes between DEPARTURE_TIME and ARRIVAL_TIME columns. The two columns have 'datetime.time' dtypes and when I try to subtract them using
(f.DEPARTURE_TIME-f.ARRIVAL_TIME)
I get the following error
TypeError: unsupported operand type(s) for -: 'datetime.time' and 'datetime.time'
How can I get the time difference in minutes between the two columns? I want to use this difference to subtract from EPASED_TIME column in a later step so I want to be a float.
Edit:
The original data can be found here
The steps I followed to create the dataframe are:
import pandas as pd
# read dataset
flights = pd.read_csv('flights.csv',
dtype = {'DEPARTURE_TIME': object, 'ARRIVAL_TIME': object},
low_memory=False)
# keep only columns of interest
F = flights[['YEAR', 'MONTH', 'DAY',
'DAY_OF_WEEK', 'AIRLINE',
'FLIGHT_NUMBER', 'TAIL_NUMBER',
'DEPARTURE_TIME', 'ARRIVAL_TIME',
'ELAPSED_TIME', 'CANCELLED']]
# drop rows contain na
f = F.dropna()
# format columns to datetime
f['DEPARTURE_TIME'] = pd.to_datetime(f['DEPARTURE_TIME'],format='%H%M', errors='coerce').dt.time
f['ARRIVAL_TIME'] = pd.to_datetime(f['ARRIVAL_TIME'], format='%H%M', errors = 'coerce').dt.time
# calculate difference
(f.DEPARTURE_TIME-f.ARRIVAL_TIME)