I am new to python, and need some help with a question I am having regarding the date time function.
I have df_a
which has a column titled time
, and I am trying to create a new column id
in this df_a
.
I want the id
column to be determined by whether or not the time is contained within a range of times on df_b
columns between "date" and "date_new", for example the first row has a date of "2019-01-07 20:52:41" and "date_new" of "2019-01-07 21:07:41" (a 15 minute time interval), I would like the index for this row, to appear as my id in df_a
for when the time is "2019-01-07 20:56:30" (i.e. with id=0) and so on for all the rows in df_a
,
This question is similar, but cannot figure out how to make it work with mine as I keep getting
python assign value to pandas df if falls between range of dates in another df
s = pd.Series(df_b['id'].values,pd.IntervalIndex.from_arrays(df_b['date'],df_b['date_new']))
df_a['id']=df_a['time'].map(s)
ValueError: cannot handle non-unique indices
one caveat is that the ranges in df_b are not always unique, meaning some of the intervals contain the same periods of time, in these cases its fine if it uses the id of the first time period in df_b that it falls in, additionally there are over 200 rows in df_b, and 2000 in df_a, so it will take to long to define each time period in a for-loop type format, unless there is an easier way to do it than defining each, thank you in advance for all of your help! if this could use any clarification please let me know!
df_a
time id
2019-01-07 22:02:56 NaN
2019-01-07 21:57:12 NaN
2019-01-08 09:35:30 NaN
df_b
date date_new id
2019-01-07 21:50:56 2019-01-07 22:05:56 0
2019-01-08 09:30:30 2019-01-08 09:45:30 1
Expected Result
df_a
time id
2019-01-07 22:02:56 0
2019-01-07 21:57:12 0
2019-01-08 09:35:30 1