
I'm wondering if there is a way to throttle events in pure Python. Other tools usually have a way to look at a specific event, say:

{"id":111,"message":"hello","host":"example"}

and there might be another event that comes right after or within a time interval that looks like:

 {"id":112,"message":"hello","host":"example"}

Given these two events, you can typically compare two keys. In this example, the message key and the host key have the same values in both events. Because the two events agree on those keys, you can treat them as one event. This is how I would like to go about "throttling".

Is there a good way to do this in Python? I'm thinking about using a database or Redis, but I'm unsure what the best approach is. The id of each event is unique.

Thanks!

EDIT: I have actually already accomplished this in Logstash, but I was looking for a way to do it in pure Python.

pm1391

1 Answer

How about something like this? I'm assuming these are dictionaries you're looking at (that's what they look like).

First, let `keys` be the set of keys you want to compare, and suppose `D` and `E` are the two events you're looking at.

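from copy import copy

# keys: set of key names to compare; D and E: the two event dicts.
# found becomes True as soon as D and E agree on at least two keys.
found = False
for k in keys:
    if D[k] == E[k]:
        # k matches, so check whether any other key matches as well
        keys2 = copy(keys)
        keys2.remove(k)
        for other in keys2:
            if D[other] == E[other]:
                found = True
                break
    if found:
        break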

There's probably a better "Python way" to do this but it's been a while and I don't have time to look it up. At least this works on examples I've tried.

EDIT: Notice that if you only have those two keys to compare, you can drop both the `keys` set and the loops, and just test the two keys directly.
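As a minimal sketch of that direct test, assuming both events always contain the keys being compared: the helper name `same_event` and its default key tuple are made up for illustration. Note that it requires every listed key to match, whereas the loop above stops once any two keys match; with exactly two keys the two checks agree.

```python
def same_event(d, e, keys=("message", "host")):
    """Return True if both events agree on every key in `keys`.

    Hypothetical helper: assumes each key is present in both dicts.
    """
    return all(d[k] == e[k] for k in keys)


a = {"id": 111, "message": "hello", "host": "example"}
b = {"id": 112, "message": "hello", "host": "example"}

print(same_event(a, b))  # True: ids differ, but message and host match
```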

John Perry
  • Hmm, interesting. What about storing results over time, say for 5 minutes? I'm guessing I'd add a new id for the matching events, add it to a dictionary, and then do a lookup for any future events that might match? – pm1391 May 17 '18 at 15:16
  • If you don't care about the ids but want to filter by certain characteristics (which is what it sounds like), you can [remove the id](https://stackoverflow.com/questions/11277432/how-to-remove-a-key-from-a-python-dictionary#11277439) and any other keys you don't want, convert what remains of the dictionary to a `tuple`, then add it to a set `S`. Adding to a set automatically "throttles" repeated data, because a set keeps only one copy of each value. If you want to hold on to the full events, you can store them in a second list, appending only when `S` doesn't already contain the tuple (test with `in`). – John Perry May 17 '18 at 15:34
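One possible way to combine the tuple idea with the 5-minute window from the comments is to keep a dictionary from each tuple to the time it was last let through, instead of a plain set. This is only a sketch: the class name `Throttler`, the method `allow`, and the parameters `keys`, `window`, and `clock` are hypothetical, and it assumes the compared values are hashable and every event contains the compared keys.

```python
import time


class Throttler:
    """Suppress events whose selected keys repeat within `window` seconds."""

    def __init__(self, keys=("message", "host"), window=300.0, clock=time.monotonic):
        self.keys = keys
        self.window = window
        self.clock = clock      # injectable so the window can be tested
        self.last_seen = {}     # signature tuple -> time it was last let through

    def allow(self, event):
        """Return True if the event should pass, False if it is throttled."""
        # The signature ignores the unique id and keeps only the chosen keys.
        signature = tuple(event[k] for k in self.keys)
        now = self.clock()
        seen = self.last_seen.get(signature)
        if seen is not None and now - seen < self.window:
            return False
        self.last_seen[signature] = now
        return True


throttler = Throttler()
print(throttler.allow({"id": 111, "message": "hello", "host": "example"}))  # True
print(throttler.allow({"id": 112, "message": "hello", "host": "example"}))  # False
```

The dictionary grows with every distinct signature, so a long-running process would probably want to prune entries older than the window. Sharing state across processes is where something like Redis could come in, but that is beyond this sketch.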