I have a dataframe with two columns, each one formed by a set of dates. I want to compute the difference between dates and return the the number of days. However, the process (described above) is very slow. Does anyone knows how to accelerate the process? This code is being used in a big file and speed is important.
dfx = pd.DataFrame([[datetime(2014,1,1), datetime(2014,1,10)],[datetime(2014,1,1), datetime(2015,1,10)],[datetime(2013,1,1), datetime(2014,1,12)]], columns = ['x', 'y'])
dfx['diffx'] = dfx['y']-dfx['x']
dfx['diff'] = dfx['diffx'].apply(lambda x: x.days)
dfx
Final goal: