I am gathering network traffic from a small number of very chatty air con devices using Python. I see two types of message - one contains only the IP address and device name, and the other contains the IP address and the values I want to record - fan speed, temperature etc.

The devices generate messages far too often (every 2-3 seconds) to be useful, and I don't want to generate hundreds of pointless database inserts. My idea is to parse each message and, if I have not seen the device before, create a data structure with the IP and name. If I have seen it before, I would like to update that structure with the missing values. Once every minute or so, I would like to iterate over the data structure and update a database with the current values.

I thought an array of objects might be the right way to go, but can't find simple examples; for each instance of a device, a dictionary looks like a sensible structure, but should I group these in an array?

Apologies if this is simplistic - I have not coded in many years, and have enjoyed figuring out how to capture network traffic and parse it using regular expression matching... but the huge range of data structures in Python is overwhelming! What is a simple data structure that would let me easily query "does this device exist", and either create a new one, or update the existing one?

Ian Lowe

3 Answers

This usage of defaultdict comes from this answer, and results in a dictionary that will create the same kind of dictionary for missing keys when you attempt to access them (a normal dict would raise a KeyError). I'm assuming you might see multiple devices coming from the same IP but no single device will show up with multiple IPs; if the latter is not the case, this won't work exactly as desired.

from pprint import pprint
from collections import defaultdict


NestedDict = lambda: defaultdict(NestedDict)


# Catch and parse messages; contrived example
messages = (
    {"ip": "1.2.3.4", "device": "BlackBerry", "values": {"temp": 99}},
    {"ip": "1.2.3.4", "device": "Android", "values": {"fan_speed": 2}},
    {"ip": "1.2.3.4", "device": "BlackBerry", "values": {"temp": 80, "fan_speed": 2}},
    {"ip": "9.2.3.9", "device": "MacBook"},
    {"ip": "9.2.3.9", "device": "Buick", "values": {"tire_pressure": 35}},
)

devices_by_ip = NestedDict()

for message in messages:
    devices_by_ip[message["ip"]][message["device"]].update(message.get("values", {}))

pprint(devices_by_ip)

Edited to add: If the message that contains values doesn't include the device name, it's a bit different, and we have to assume only one device per IP (i.e. none of these messages are coming from different devices inside a separate network).

messages = (
    {"ip": "1.2.3.4", "device": "BlackBerry"},
    {"ip": "1.2.3.4", "values": {"temp": 99}},
    {"ip": "2.2.3.4", "values": {"fan_speed": 2}},
    {"ip": "2.2.3.4", "device": "Android"},
    {"ip": "1.2.3.4", "values": {"temp": 80, "fan_speed": 2}},
)

devices_by_ip = NestedDict()

for message in messages:
    if "device" in message:
        devices_by_ip[message["ip"]]["device"] = message["device"]
    else:
        devices_by_ip[message["ip"]].update(message.get("values", {}))

pprint(devices_by_ip)
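
One case the loop above does not cover is a device whose IP changes, for example under DHCP. A minimal sketch of one way to handle it, assuming device names are unique and that a name message arrives before the value messages for a new IP: keep a small IP-to-name map and store the readings under the name. The names `ip_to_name` and `records` and the sample messages are made up for illustration.

```python
# Hypothetical sample traffic: "Lounge" moves from 1.2.3.4 to 1.2.3.9.
messages = (
    {"ip": "1.2.3.4", "device": "Lounge"},
    {"ip": "1.2.3.4", "values": {"temp": 21}},
    {"ip": "1.2.3.9", "device": "Lounge"},
    {"ip": "1.2.3.9", "values": {"fan_speed": 2}},
)

ip_to_name = {}  # most recent device name announced from each IP
records = {}     # device name -> latest values

for message in messages:
    ip = message["ip"]
    if "device" in message:
        # Name message: remember which device currently owns this IP.
        ip_to_name[ip] = message["device"]
        records.setdefault(message["device"], {})
    else:
        # Value message: look the device up by its current IP.
        name = ip_to_name.get(ip)
        if name is not None:  # ignore values from IPs not yet identified
            records[name].update(message.get("values", {}))

print(records)
```

Values that arrive from an IP before its name message are dropped in this sketch; whether that is acceptable depends on how often the devices announce their names.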
kungphu
  • Thanks - this looks like it would do the job, and I may revisit it in the future - it's a little intimidating though. I may see a device whose IP changes (DHCP), and I'm not sure how to handle that. – Ian Lowe Oct 15 '18 at 21:13

I would store a dictionary with IPs (or IP + device name if the IP alone isn't unique) as keys and a second dictionary holding the other details (fan speed, temperature) as values. This gives you constant-time lookup on average by IP, which is important here as you'll be performing many lookups and updates. The data structure would look something like this:

device_messages = {
  '192.168.4.3': {'device name': 'Cisco FW', 'fan_speed': 5, 'temperature': 56},
  '192.168.6.1': {'device name': 'NSX', 'fan_speed': 10, 'temperature': 90},
  '192.168.1.9': {'device name': 'Windows XP', 'fan_speed': 18, 'temperature': 600}
}

So if a new device comes in you can perform your lookup with:

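if new_device_ip in device_messages:  # a dict membership test is faster than searching a list
    # update with missing information
    device_messages[new_device_ip]['fan_speed'] = new_fan_speed
    # and so on
else:
    # first sighting: create the record (new_device_name is whatever your parser extracted)
    device_messages[new_device_ip] = {'device name': new_device_name}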

Using a list of objects and then searching for an entry with a certain IP will be much less efficient (O(n) as opposed to O(1) that you get with a dictionary).
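
To tie this back to the once-a-minute database update in the question, here is a rough sketch, not a definitive implementation. It assumes SQLite via the standard `sqlite3` module, and the table name `readings`, its columns, the helpers `record` and `flush`, and the 60-second interval are all invented for the example.

```python
import sqlite3
import time

FLUSH_INTERVAL = 60  # seconds between database writes (assumed value)

devices = {}  # ip -> {"name": ..., "fan_speed": ..., "temperature": ...}


def record(ip, **fields):
    """Create the record for ip on first sight, then merge in new fields."""
    devices.setdefault(ip, {}).update(fields)


def flush(conn):
    """Write the latest values for every known device in one transaction."""
    rows = [
        (ip, d.get("name"), d.get("fan_speed"), d.get("temperature"))
        for ip, d in devices.items()
    ]
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR REPLACE INTO readings VALUES (?, ?, ?, ?)", rows
        )


conn = sqlite3.connect(":memory:")  # in-memory database, for the demo only
conn.execute(
    "CREATE TABLE readings ("
    "ip TEXT PRIMARY KEY, name TEXT, fan_speed INTEGER, temperature REAL)"
)

# Three parsed messages from the same device; only the latest values survive.
record("192.168.4.3", name="Lounge")
record("192.168.4.3", fan_speed=5, temperature=21.5)
record("192.168.4.3", fan_speed=3)

# In the real capture loop this check would run after each message.
last_flush = time.monotonic() - FLUSH_INTERVAL  # pretend a minute has passed
if time.monotonic() - last_flush >= FLUSH_INTERVAL:
    flush(conn)
    last_flush = time.monotonic()

stored = conn.execute("SELECT * FROM readings").fetchall()
print(stored)
```

However many messages arrive in a minute, each device costs one row write per flush.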

slider
  • Thanks a lot for this - It seems like a great solution - I am off to play with this idea and see how it works.. – Ian Lowe Oct 15 '18 at 19:43
  • Yep, had a play with the code, and this does exactly as I need. capturing data nicely. thanks! – Ian Lowe Oct 15 '18 at 21:15

Python dictionaries can be keyed by a tuple, so you can have the ip and name component as a tuple and use that to key into a dictionary that stores further information about the device.
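
As a small illustration of the tuple-key idea, assuming every parsed message carries both the IP and the name (the sample key and values are invented):

```python
devices = {}

# The (ip, name) tuple is hashable, so it can be used directly as a key.
key = ("192.168.4.3", "Lounge")

# setdefault creates the inner dict on first sight and returns it afterwards.
devices.setdefault(key, {}).update({"fan_speed": 2})
devices.setdefault(key, {}).update({"temp": 21.5})

print(key in devices)  # membership test answers "does this device exist?"
print(devices[key])
```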

You can write a Device class:

class Device(object):
    def __init__(self, name, ip='0.0.0.0', fanspeed=0, temp=0.0):
        self.name = name
        self.ip = ip
        self.fanspeed = fanspeed
        self.temp = temp

Then you can have a dictionary that is keyed by the ip:

devices = {}

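The first time you get a message from a device, create a default object in the dictionary. Guard the assignment so that later messages do not overwrite what you have already stored:

if 'some_ip' not in devices:
    devices['some_ip'] = Device('some_name', ip='some_ip')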

And when it comes time to update the device with properties, you find it in the dictionary and add some values to the object.

device = devices.get('some_ip')
if device is not None:
    # set properties of device here, for example:
    device.fanspeed = 3
    device.temp = 21.5

As for writing them to a database, you simply iterate the dictionary like so:

for device in devices.values():
    # Update database with device info; print stands in for the real insert
    print(device.ip, device.name, device.fanspeed, device.temp)
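
Putting the pieces of this answer together, a compact sketch of the create-or-update step might look like the following. It swaps the hand-written class for a `dataclass` (Python 3.7+), which is my substitution rather than part of the answer above, and the helper name `upsert` and the "unknown" placeholder name are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Device:
    name: str
    ip: str = "0.0.0.0"
    fanspeed: int = 0
    temp: float = 0.0


devices = {}  # ip -> Device


def upsert(ip, name=None, **values):
    """Create the Device for ip if it is new, then apply any new values."""
    device = devices.get(ip)
    if device is None:
        # First sighting; the name may not be known yet.
        device = devices[ip] = Device(name or "unknown", ip)
    elif name is not None:
        device.name = name
    for field, value in values.items():
        setattr(device, field, value)
    return device


upsert("10.0.0.5", name="Lounge")            # name-only message
upsert("10.0.0.5", fanspeed=2, temp=21.5)    # values-only message

for device in devices.values():
    print(device)
```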
smac89