I have some (very large) list of tuples that come from a database containing an id, start_time, and end_time
I also have a list of times in a regular interval and ordered (these are all datetime objects).
I basically need to loop through these times and find all the tuples in which the time falls within their range.
I'm wondering what the most efficient way to do this is. The first idea that comes to mind is like this (pseudo-code):
for time in times:
for tuple in tuples:
if tuple.start_time <= time <= tuple.end_time:
# add tuple to some_other_list
if tuple.end_time < time
# remove tuple from tuples
The reason I'd do that is to iterate over a smaller and smaller list, hoping to cut some time there, however I am open to completely different approaches as well. I suppose another idea would be to just query the db with the given time each iteration but I'd imagine the latency there would far outweigh just having the full dataset in memory and working with it that way.
So for example, I would have a list of tuples where each tuple would look like:
[('783', datetime.datetime(2017, 12, 31, 20, 49, 28), datetime.datetime(2017, 12, 31, 23, 49, 28)), ('5274', datetime.datetime(2017, 12, 31, 20, 49, 45), datetime.datetime(2018, 1, 1, 0, 0)), ('757', datetime.datetime(2017, 12, 31, 20, 50, 25), datetime.datetime(2018, 1, 1, 1, 50, 25)), ('5600', datetime.datetime(2017, 12, 31, 20, 50, 59), datetime.datetime(2017, 12, 31, 23, 39)), ('5176', datetime.datetime(2017, 12, 31, 20, 51, 23), datetime.datetime(2018, 1, 1, 1, 51, 23)), ('5323', datetime.datetime(2017, 12, 31, 20, 52, 39), datetime.datetime(2018, 1, 1, 0, 0)), ('464', datetime.datetime(2017, 12, 31, 20, 52, 41), datetime.datetime(2018, 1, 1, 0, 52, 41))]
And the list of times would be stored in a generator basically using this answer, so looping through them would produce something like this:
2017-12-15 00:00:00
2017-12-22 00:00:00
2017-12-29 00:00:00
2018-01-05 00:00:00
2018-01-12 00:00:00
2018-01-19 00:00:00
And the actual output I'm fairly agnostic to, it would just be some dictionary of
{'2017-12-15 00:00:00': [list of matching ids], '2017-12-22 00:00:00': [list of matching ids], ...}
Any ideas or advice would be appreciated!