
I'm trying to write some code that can quickly return a properly compacted IPv6 address. I've tried...

socket.inet_ntop(socket.AF_INET6, socket.inet_pton(socket.AF_INET6, address))
ipaddress.IPv6Address(address)
IPy.IP(address)

...listed from fastest to slowest at IPv6 compaction. The first is the fastest (~3.6 seconds per 65,565 IP addresses), the second takes more than twice as long (~8.4 seconds per 65,565 IP addresses), and the last takes almost twice as long as the second (~14.4 seconds per 65,565 IP addresses).

So, I set out to create my own...

import re
from ipaddress import IPv6Address

IPaddlist = [
    '2001:db8:00:0:0:0:cafe:1111',
    '2001:db8::a:1:2:3:4',
    '2001:0DB8:AAAA:0000:0000:0000:0000:000C',
    '2001:db8::1:0:0:0:4',
    '2001:4958:5555::4b3:ffff',
  ]

for addr in IPaddlist:
  address = ":".join('' if i=='0000' else i.lstrip('0') for i in addr.split(':'))
  address2 = (re.sub(r'(:)\1+', r'\1\1', address).lower())
  print(address2)
  print(IPv6Address(addr))
  print('\n')

It returns:

2001:db8::cafe:1111
2001:db8::cafe:1111

2001:db8::a:1:2:3:4
2001:db8:0:a:1:2:3:4

2001:db8:aaaa::c
2001:db8:aaaa::c

2001:db8::1::4
2001:db8:0:1::4

2001:4958:5555::4b3:ffff
2001:4958:5555::4b3:ffff

The first line of each entry is my code, the second is the correct compaction, using ipaddress.IPv6Address.

As you can see, I'm close, but you know what they say about 'close'...

Anyone have any pointers? I seem to have hit a roadblock.

PyNewbie
  • The problem is that `::` may appear only once in an address, and it should replace the longest sequence of zero groups. A simple regex *cannot* handle this contextual information. However, you could check after the fact: see if `::` appears at least twice and, if so, determine which occurrence is the correct one and replace the others with `:0:`. – Bakuriu Oct 09 '16 at 07:50
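The rule that comment points at, that `::` may stand in for only one run of zero groups (the longest), can be sketched in pure Python without a regex. This is only an illustration: `compact_ipv6` is a hypothetical helper name, it assumes the input is already a valid address written as eight groups or with a single `::`, it does no validation, and it ignores embedded IPv4 notation. It is not expected to beat the socket functions for speed.

```python
def compact_ipv6(addr):
    """Compact an IPv6 address string (hypothetical helper, no validation)."""
    # Expand an existing '::' so that we always work with all 8 groups.
    if '::' in addr:
        head, _, tail = addr.partition('::')
        head_groups = head.split(':') if head else []
        tail_groups = tail.split(':') if tail else []
        missing = 8 - len(head_groups) - len(tail_groups)
        groups = head_groups + ['0'] * missing + tail_groups
    else:
        groups = addr.split(':')

    # Lowercase, strip leading zeros, but keep a single '0' for zero groups.
    groups = [g.lower().lstrip('0') or '0' for g in groups]

    # Find the longest run of zero groups; the first run wins on ties.
    best_start, best_len = -1, 0
    run_start, run_len = -1, 0
    for i, g in enumerate(groups):
        if g == '0':
            if run_len == 0:
                run_start = i
            run_len += 1
            if run_len > best_len:
                best_start, best_len = run_start, run_len
        else:
            run_len = 0

    # A single zero group is written as '0', not '::'.
    if best_len < 2:
        return ':'.join(groups)

    head = ':'.join(groups[:best_start])
    tail = ':'.join(groups[best_start + best_len:])
    return head + '::' + tail


for addr in ('2001:db8::a:1:2:3:4', '2001:db8::1:0:0:0:4'):
    print(compact_ipv6(addr))
```

For the two addresses the original loop got wrong, this prints `2001:db8:0:a:1:2:3:4` and `2001:db8:0:1::4`, matching the `ipaddress.IPv6Address` output shown in the question.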

1 Answer


Just use the socket functions. The first line of code in your question is roughly 8 times faster than your string manipulation in the timings below:

from socket import inet_ntop, inet_pton, AF_INET6
def compact1(addr, inet_ntop=inet_ntop, inet_pton=inet_pton, AF_INET6=AF_INET6):
    return inet_ntop(AF_INET6, inet_pton(AF_INET6, addr))

from ipaddress import IPv6Address
def compact2(addr, IPv6Address=IPv6Address):
    # Returns an IPv6Address object; str() of it gives the compacted text.
    return IPv6Address(addr)

import re
def compact3(addr, sub=re.sub):
    address = ":".join('' if i=='0000' else i.lstrip('0') for i in addr.split(':'))
    return sub(r'(:)\1+', r'\1\1', address).lower()

And now let's %timeit:

In[9]: ips = [':'.join('{:x}'.format(random.randint(0, 2**16 - 1)) for i in range(8)) for _ in range(65565)]

In[10]: %timeit for ip in ips: compact1(ip)
10 loops, best of 3: 52.9 ms per loop

In[11]: %timeit for ip in ips: compact2(ip)
1 loop, best of 3: 715 ms per loop

In[12]: %timeit for ip in ips: compact3(ip)
1 loop, best of 3: 411 ms per loop
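`%timeit` is an IPython magic. Outside IPython, a similar comparison can be sketched with the standard-library `timeit` module, which ships with Python and needs no installation. The function names, the fixed seed, and the smaller sample size of 1000 addresses are choices made for this illustration only, and the numbers it prints will differ from the ones above.

```python
import random
import timeit
from ipaddress import IPv6Address
from socket import inet_ntop, inet_pton, AF_INET6


def compact_socket(addr):
    # Round-trip through the packed 16-byte form.
    return inet_ntop(AF_INET6, inet_pton(AF_INET6, addr))


def compact_ipaddress(addr):
    # str() so that both functions return text.
    return str(IPv6Address(addr))


# Same generator as in the session above, with a smaller sample.
random.seed(0)
ips = [':'.join('{:x}'.format(random.randint(0, 2**16 - 1)) for _ in range(8))
       for _ in range(1000)]


def best_time(func, repeat=3):
    # Run the whole list once per repetition and keep the fastest run.
    return min(timeit.repeat(lambda: [func(ip) for ip in ips],
                             number=1, repeat=repeat))


timings = {f.__name__: best_time(f) for f in (compact_socket, compact_ipaddress)}
for name, seconds in timings.items():
    print('{}: {:.1f} ms'.format(name, seconds * 1000))
```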
skovorodkin
  • Thanks for timing it, Skovorodkin. For some reason, the timing module won't install on my Windows version of Python, so I'm forced to rely upon creating my own timing code, which I was doing by looking at how fast it worked through the IP addresses, starting at '::', and looking at each octet it completed. I was hoping to turn this into a .pyd file if I could get it fast enough. Part of the problem is I'm currently forced to use ipaddress.IPv6Address because my code is just using a loop and integer addition to a base IP address ('::'), rather than generating full IPv6 addresses. – PyNewbie Oct 09 '16 at 16:11
  • In going to code that attempts to generate a full IPv6 address, it also generates the zeros in each octet, which I was trying to strip out. Is there code to generate compliant compacted IPv6 addresses that is fast? My IPv4 code can search for the MD5 hashes of all IPv4 addresses in 25 minutes, 25 seconds... the IPv6 code is 8 times slower (due to using ipaddress.IPv6Address), which is a problem, given how huge the IPv6 address space is. – PyNewbie Oct 09 '16 at 16:19
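For the integer-addition loop described in these comments, one possible sketch is to skip the text form entirely: convert each integer to its 16-byte big-endian form with `int.to_bytes` and hand that to `inet_ntop`. The helper name `int_to_compact_ipv6` and the `2001:db8::` base are assumptions made for the example. One caveat to check on your own platform: some `inet_ntop` implementations print addresses whose first 96 bits are zero in embedded-IPv4 notation (for example `::0.1.0.0`), which may not match what `ipaddress.IPv6Address` prints for the same value.

```python
from ipaddress import IPv6Address
from socket import inet_ntop, AF_INET6


def int_to_compact_ipv6(n):
    """Format a 128-bit integer as compacted IPv6 text (hypothetical helper)."""
    # 16 bytes, most significant byte first, is the packed form inet_ntop expects.
    return inet_ntop(AF_INET6, n.to_bytes(16, 'big'))


# Count upward from a base address using plain integer addition.
base = int(IPv6Address('2001:db8::'))
first_three = [int_to_compact_ipv6(base + offset) for offset in range(3)]
print(first_three)
```

The base is converted to an integer once, so the per-address work is one `to_bytes` call and one `inet_ntop` call, with no `IPv6Address` object created inside the loop.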