I am looking for a good data structure to contain a list of tuples with (hash, timestamp)
values. Basically, I want to use it in the following way:
- Data comes in, check to see if it's already present in the data structure (hash equality, not timestamp).
- If it is, update the timestamp to "now".
- If not, add it to the set with timestamp "now".
Periodically, I want to remove and return a list of tuples that are older than a specific timestamp (I need to update various other elements when they 'expire'). The timestamp does not have to be anything specific (it can be a Unix timestamp, a Python datetime object, or some other easy-to-compare hash/string).
I am using this to receive incoming data, update it if it's already present and purge data older than X seconds/minutes.
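To make the intended behaviour concrete, here is a minimal sketch of the interface I have in mind. It assumes Python 3 and a clock that never goes backwards, so an insertion-ordered dict stays sorted oldest to newest. The names `ExpiringSet`, `touch` and `purge_older_than` are hypothetical, and this is only one possible approach, not necessarily the best one.

```python
import time
from collections import OrderedDict


class ExpiringSet:
    """Tracks hashes with a last-seen timestamp, oldest entry first.

    Assumption: the clock is non-decreasing, so moving a touched key to
    the end keeps the dict ordered by timestamp.
    """

    def __init__(self, clock=time.time):
        self._clock = clock          # injectable so tests can fake time
        self._seen = OrderedDict()   # hash -> last-seen timestamp

    def touch(self, key):
        """Record `key` as seen now; return True if it was new."""
        is_new = key not in self._seen
        self._seen[key] = self._clock()
        # Move to the end so the dict stays ordered oldest -> newest.
        self._seen.move_to_end(key)
        return is_new

    def purge_older_than(self, cutoff):
        """Remove and return (hash, timestamp) pairs with timestamp < cutoff."""
        expired = []
        while self._seen:
            key, ts = next(iter(self._seen.items()))  # peek at the oldest entry
            if ts >= cutoff:
                break
            self._seen.popitem(last=False)
            expired.append((key, ts))
        return expired
```

Usage would be something like `s.touch(h)` for every incoming item and `s.purge_older_than(time.time() - 60)` on a timer. In this sketch both the update and each individual expiry are constant-time dict operations.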
Multiple data structures can be a valid suggestion as well (I originally went with a priority queue + set, but a priority queue is less-than-optimal for constantly updating values).
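For illustration, a priority queue combined with a lookup table could be sketched roughly as below, using lazy deletion: an update pushes a fresh heap entry and leaves the old one behind as stale. This is a hedged sketch rather than the exact code I used; it swaps the set for a dict so the latest timestamp can be checked, the names `HeapExpiry`, `touch` and `purge_older_than` are hypothetical, and it assumes the hashes are mutually comparable (e.g. strings) so that heap ties can be broken.

```python
import heapq


class HeapExpiry:
    """Heap of (timestamp, hash) entries plus a dict of latest timestamps."""

    def __init__(self):
        self._latest = {}   # hash -> most recent timestamp
        self._heap = []     # (timestamp, hash) entries, some possibly stale

    def touch(self, key, now):
        """Record `key` as seen at `now`; return True if it was new."""
        is_new = key not in self._latest
        self._latest[key] = now
        # The previous heap entry for `key` (if any) is not removed;
        # it becomes stale and is skipped during purging.
        heapq.heappush(self._heap, (now, key))
        return is_new

    def purge_older_than(self, cutoff):
        """Remove and return (hash, timestamp) pairs with timestamp < cutoff."""
        expired = []
        while self._heap and self._heap[0][0] < cutoff:
            ts, key = heapq.heappop(self._heap)
            # Only entries matching the latest timestamp are live.
            if self._latest.get(key) == ts:
                del self._latest[key]
                expired.append((key, ts))
        return expired
```

The drawback shows up under frequent updates: every `touch` of an existing key adds another heap entry, so the heap grows with the number of updates rather than the number of distinct hashes until the stale entries are popped.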
Other approaches to achieve the same thing are welcome as well. The end goal is to track when elements a) are new to the system, b) already exist in the system, and c) expire.