I have 2 pandas dataframes, both of them having the same columns but different line numbers depending on missing rows, one of the columns is the Date
having the following format 29/09/2020 13.22.57
the day month year omitted sometimes below for simplicity and irrelevance
The Dates might be an exact match in df
as df_2
or there might be an acceptable delay of a threshold that we preset, in this case it's 2s.
Sample data of df['Date']
:
13.24.19
13.24.35
13.25.07
13.25.23
13.26.00
13.26.13
13.26.54
Sample data of df_2['Date']
:
13.24.19
13.24.35
13.25.23
13.26.13
13.26.38
expected
df['Date']:
13.22.57
13.23.13
13.23.44
13.24.02
13.24.19
13.24.35
0
13.25.23
0
13.26.13
13.26.38
df_2['Date']:
13.24.19
13.24.35
13.25.07
13.25.23
13.26.00
13.26.13
0
13.26.54
the increment can happen for either df
or df_2
depends which has a bigger time for the missing column and at the end both should have the same number of rows as the rows that are not matching will have a 0 value now and an increment will happen to the one under.
Dataframes:
d = {'Date': ['13.24.19', '13.24.35','13.25.07', '13.25.23','13.26.00', '13.26.13','13.26.54'], 'col2': [1, 2, 3, 4, 5, 6, 7]}
df = pd.DataFrame(data=d)
df['Date'] = pd.to_datetime(df['Date'], format='%H.%M.%S')
d2 = {'Date': ['13.24.19', '13.24.35','13.25.23', '13.26.13','13.26.38'], 'col2': [1, 2, 3, 4, 5]}
df_2 = pd.DataFrame(data=d2)
df_2['Date'] = pd.to_datetime(df_2['Date'], format='%H.%M.%S')