Efficiently detect overlapping networks

Question

I know how to detect overlapping networks. There are two ways to do this: by using netaddr's "IPNetwork in IPNetwork" or ipaddress's "overlaps" method. The question is how to find overlapping networks in array in most efficient way.

At the moment, I use this code:

networks = {
    IPNetwork('10.1.0.0/24'): 'net2',
    IPNetwork('10.0.0.0/8'): 'net1',
    IPNetwork('10.0.0.0/16'): 'net3',
    IPNetwork('10.0.0.0/24'): 'net4',
    IPNetwork('192.168.0.0/16'): 'net5',
    IPNetwork('192.168.0.0/23'): 'net6',
    IPNetwork('192.168.1.0/24'): 'net7',
    IPNetwork('172.16.0.0/12'): 'net8'
}

net = sorted([key for key in networks.keys()])
for pi in range(0, len(net)-1):
    for si in range(pi+1, len(net)):
        if net[si] in net[pi]:
            print(f'{net[si]} ({networks[net[si]]}) overlaps with '
                  f'{net[pi]} ({networks[net[pi]]})')

It does the job, but uses many iterations (e.g. for 8 entries there are 7+6+5+... = 28 comparisons) and I'm looking for a more efficient way.

Maybe you could create a dict like `{"10.1.0": ["10.1.0.0/24"], "10.0.0": ["10.0.0.0/8", "10.0.0.0/16", ...], ...}` to reduce the number of ranges to compare? — tobias_k, May 03 '19 at 08:59
Looks like [this one](https://netaddr.readthedocs.io/en/latest/index.html) — Marco Bonelli, May 03 '19 at 09:04
Possible duplicate of [Check if two CIDR addresses intersect?](https://stackoverflow.com/questions/17206679/check-if-two-cidr-addresses-intersect) — Marco Bonelli, May 03 '19 at 09:14
@MarcoBonelli Not a duplicate IMHO. That other question is about checking how two addresses overlap, but OP seems to know that. He's looking for an efficient way (better than O(n²)) to do that for a lot of addresses. — tobias_k, May 03 '19 at 10:28
@tobias_k, I use Python's netaddr library. And yes, I know how to detect overlapping, I'm seeking for effective way to do this with lot of networks. — Volodymyr Litovka, May 04 '19 at 18:43

score 1 · Accepted Answer · answered May 03 '19 at 09:53

This will handle all of your cases, but it may not find EVERY duplicate, e.g. given (a, a', a'', a''', b) it will not show that (a''' overlaps a'). If you're really interested in the primary supersets, then this code can be simplified


from netaddr.ip import IPNetwork
from collections import namedtuple

Network = namedtuple("Network", "name ip")

networks = {
    IPNetwork('10.1.0.0/24'): 'net2',
    IPNetwork('10.0.0.0/8'): 'net1',
    IPNetwork('10.0.0.0/16'): 'net3',
    IPNetwork('10.0.0.0/24'): 'net4',
    IPNetwork('192.168.0.0/16'): 'net5',
    IPNetwork('192.168.0.0/23'): 'net6',
    IPNetwork('192.168.1.0/24'): 'net7',
    IPNetwork('172.16.0.0/12'): 'net8'
}

print("original")
net = sorted([key for key in networks.keys()])
for pi in range(0, len(net)-1):
    for si in range(pi+1, len(net)):
        if net[si] in net[pi]:
            print(f'{net[si]} ({networks[net[si]]}) overlaps with '
                  f'{net[pi]} ({networks[net[pi]]})')

nets = sorted([Network(v, k) for k, v in networks.items()], key=lambda a: a.ip)

print()
print("faster")
aa = None
first = True
for a, b in zip(nets[0:-1], nets[1:]):
  if aa is None:
    aa = a
  if b.ip in aa.ip:
    print(f'{b.ip} ({b.name}) overlaps with {aa.ip} ({aa.name})')
    if not first and aa != a and b.ip in a.ip: 
      # it's already a subset of an earlier one...
      # only if you care about secondary overlaps
      print(f'{b.ip} ({b.name}) overlaps with {a.ip} ({a.name})')
    # aa = a
  elif b.ip in a.ip:
    print(f'{b.ip} ({b.name}) overlaps with {a.ip} ({a.name})')
    aa = a
  else:
    # print(f'! {b.ip} ({b.name}) overlaps with {aa.ip} ({aa.name})')
    aa = None
  first = False

Unfortunately, your code also misses some overlaps. You compare current (b.ip) to first (aa.ip) and previous (a.ip) entries and, thus, will miss entries between them. Try with this set of networks: '10.0.0.0/8', '10.0.0.0/23', '10.0.0.0/24', '10.0.0.0/25' and '10.0.1.0/25' - in this case overlaps of two last networks with second one will be missed. But anyway thanks for try. — Volodymyr Litovka, May 04 '19 at 22:57
Actually, your code is about 4 times faster than my in OP. I think I will reconsider requirements - it doesn't make sense to find all overlaps of one network if there is at least one. — Volodymyr Litovka, May 05 '19 at 07:10

score 1 · Answer 2 · answered Jan 26 '22 at 14:29

I recently had to solve this problem in firewalld.

You can achieve O(n) detection by presorting the networks ahead of time. Python sorts networks first by network bits (ip address) and then by netmask. This plus address normalization done by IPv4Network means that overlapping entries will be adjacent in the sorted list.

The detection is then done by comparing adjacent items:

from ipaddress import IPv4Network

networks = {
    IPv4Network('10.1.0.0/24'): 'net2',
    IPv4Network('10.0.0.0/8'): 'net1',
    IPv4Network('10.0.0.0/16'): 'net3',
    IPv4Network('10.0.0.0/24'): 'net4',
    IPv4Network('192.168.0.0/16'): 'net5',
    IPv4Network('192.168.0.0/23'): 'net6',
    IPv4Network('192.168.1.0/24'): 'net7',
    IPv4Network('172.16.0.0/12'): 'net8'
}

networks_list = sorted(networks.keys())

prev_network = networks_list.pop(0)
for current_network in networks_list:
    if prev_network.overlaps(current_network):
        print("Network {} overlaps {}".format(prev_network, current_network))
    prev_network = current_network

Results:

Network 10.0.0.0/8 overlaps 10.0.0.0/16
Network 10.0.0.0/16 overlaps 10.0.0.0/24
Network 192.168.0.0/16 overlaps 192.168.0.0/23
Network 192.168.0.0/23 overlaps 192.168.1.0/24

Efficiently detect overlapping networks

2 Answers2