
I have an IP address R (e.g. "255.255.255.0") in string form that I'm hashing, then taking the first 4 bytes of that hash. I then want to convert that result to a binary string:

import hashlib
from binascii import unhexlify

def H(R):
    h = hashlib.sha256(R.encode('utf-8'))
    return unhexlify(h.hexdigest())[0:4]

I tried doing the following, but I only get 30 bits instead of 32 (I remove the first 2 chars of the string because it's the 0b prefix):

bin(struct.unpack('!I', H(R))[0])[2:]

How can I do this correctly? The result of H(R) looks something like b',\xc3Z\xfb'. I've tried the methods in "Convert bytes to bits in python", but none of them work with the format I'm converting from.

  • What I have: 4 bytes from the hash of a 32-bit IP address string, e.g. b',\xc3Z\xfb'
  • What I'm trying to get: the 32-bit binary representation as a string, e.g. '10101010101010101010101010101010'
Zero Piraeus
user1045890
  • You ask how to do "that" correctly. Can you describe what "that" is exactly? In English, what is the goal of this code? What's the surrounding use case? Can you give example inputs and expected outputs? – John Zwinck Nov 11 '18 at 07:11
  • @JohnZwinck does the edit clarify? – user1045890 Nov 11 '18 at 07:16
  • Did you know that an IPv4 address is 4 bytes? Why bother hashing it then taking the first 4 bytes? You could just use the binary value directly. – John Zwinck Nov 11 '18 at 08:34
  • @JohnZwinck in fairness, the idea here may be to anonymize IP addresses for logging purposes (and getting the binary representation could be to test that the anonymization scheme is sufficiently "random"). – Zero Piraeus Nov 11 '18 at 08:47

2 Answers


I think this should do the job:

import hashlib
import binascii

def H(R):
    h = hashlib.sha256(R.encode('utf-8'))
    return binascii.unhexlify(h.hexdigest())[0:4]

def binstr(x: bytes) -> str:
    s = ""
    for char in x:
        ch = bin(char)[2:] # 0b101 -> 101
        s += "0" * (8-len(ch)) + ch # 101 -> 00000101
    return s

print(binstr(H("127.0.0.1"))) # 00010010110010100001011110110100
print(binstr(H("255.255.255.255"))) # 11110100010101000110001010111111
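
The same per-byte padding can also be written more compactly with a format spec. This is only a minimal sketch; the helper name binstr_compact is hypothetical and not part of the code above:

```python
def binstr_compact(x: bytes) -> str:
    # Iterating over bytes yields ints in 0..255;
    # the '08b' spec pads each one to 8 binary digits.
    return "".join(format(byte, "08b") for byte in x)

# The example bytes from the question
print(binstr_compact(b",\xc3Z\xfb"))  # 00101100110000110101101011111011
```

It should give the same output as binstr for any bytes input, since both pad every byte to 8 digits before joining.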
2

bin() gives you a binary representation of an integer. The particular integer you're asking for a binary representation of in this case is the result of struct.unpack('!I', b',\xc3Z\xfb')[0], which happens to be 751000315:

>>> struct.unpack('!I', b',\xc3Z\xfb')[0]
751000315

The binary representation of 751000315 that bin() gives you is 0b101100110000110101101011111011, which is correct:

>>> bin(751000315)
'0b101100110000110101101011111011'
>>> 0b101100110000110101101011111011
751000315

It has thirty digits (plus the 0b prefix) because that's how many digits are necessary to represent that integer. If the result of struct.unpack('!I', H(R))[0] had been, say, the integer 38 (for example, if R were '247.69.16.15'), the binary representation bin() gave you would be 0b100110, which is even shorter.
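
To see this directly, you can compare the number of digits bin() produces with the integer's bit length. A small sketch using the two values discussed above:

```python
for n in (751000315, 38):
    digits = bin(n)[2:]  # strip the '0b' prefix
    # bin() emits exactly n.bit_length() digits, with no leading zeros
    print(n, len(digits), n.bit_length())
```

The two counts agree for each value (30 and 6 respectively), which is why the string comes out shorter than 32 characters whenever the top bits are zero.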

bin() can't guess that you want leading zeroes, and it certainly can't guess how many. What you need to do is format your integer, like this:

>>> '{:032b}'.format(struct.unpack('!I', b',\xc3Z\xfb')[0])
'00101100110000110101101011111011'

… or, in the extreme example I gave above:

>>> '{:032b}'.format(struct.unpack('!I', H('247.69.16.15'))[0])
'00000000000000000000000000100110'
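
As an aside, the struct.unpack step can probably be skipped, since int.from_bytes converts bytes to an integer directly. A minimal sketch, assuming Python 3; the helper name to_bits is hypothetical:

```python
def to_bits(data: bytes) -> str:
    # Interpret the bytes as one big-endian unsigned integer
    # (the '!I' struct format is big-endian as well).
    n = int.from_bytes(data, "big")
    # Pad to 8 binary digits per input byte so leading zeros are kept.
    return format(n, "0{}b".format(8 * len(data)))

print(to_bits(b",\xc3Z\xfb"))  # 00101100110000110101101011111011
```

Deriving the width from len(data) means the same function also works if you later keep more or fewer than 4 bytes of the hash.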
Zero Piraeus