So I've been learning Python3 for the last month or so. I'm going through "Black Hat Python" by Justin Seitz right now, but all of the code is in Python2. Most of the code so far was easy to convert to Python3, but I ran across a hexdump function in a tcp_proxy program that has me stumped. Below is the Python2 code from the book.

def hexdump(src, length=16):
    result = []
    digits = 4 if isinstance(src, unicode) else 2

    for i in xrange(0, len(src), length):
        s = src[i:i+length]
        hexa = b' '.join(["%0*X" % (digits, ord(x)) for x in s])
        text = b''.join([x if 0x20 <= ord(x) < 0x7F else b'.' for x in s])
        result.append(b"%04X %-*s %s" % (i, length*(digits + 1), hexa, text))

    print b'\n'.join(result)

I have a few questions that I haven't been able to find online. Why would digits need to be unpacked if it's a single int? Would the equivalent of "%0*X" % (digits, ord(x)) be "{0:X}".format(*digits, ord(x))? Why are there two arguments for this? I noticed that there was an additional argument in result.append() as well. Any help would be greatly appreciated.

Joel B.
  • You can use the `%`-formatting in Python 3 as well. See docs at https://docs.python.org/3/library/stdtypes.html#printf-style-string-formatting . Your main problem is mixing up `bytes` literals (starting with `b'`) with strings (without b). – Michael Butscher Oct 21 '17 at 23:52
  • In the interests of a [mcve], what is the value of `src`? I get a NameError on 'unicode'. Also, if you're running this in Python 3, you need parentheses for the print function. This link might be relevant: https://stackoverflow.com/questions/19877306/nameerror-global-name-unicode-is-not-defined-in-python-3 – Kenny Ostrom Oct 21 '17 at 23:54
  • In Python 3 all strings _are_ Unicode by default, so the `digits = 4 if isinstance(src, unicode) else 2` becomes meaningless (even if it "worked"). Unicode characters can require different numbers of bytes to represent, depending on which encoding is used. – martineau Oct 21 '17 at 23:58
  • This [answer](https://stackoverflow.com/a/14067233/355230) to another question might help (although it's written for Python 2). – martineau Oct 22 '17 at 00:07
  • Src is the data that was received from the remote socket object. The hexdump function is supposed to output the packet details with both the hexadecimal values and ASCII-printable characters. So if the digits variable isn't required, does that mean I don't have to use the asterisk in the string formatters? – Joel B. Oct 22 '17 at 18:13

2 Answers

This worked for me. I changed `xrange` to `range` and removed the byte-string literals. The `isinstance` check isn't really necessary, but 2to3 suggested it.

def hexdump(src, length=16):
    # Expects bytes (e.g. data from socket.recv()); iterating bytes yields ints
    result = []
    digits = 4 if isinstance(src, str) else 2
    for i in range(0, len(src), length):
        s = src[i:i+length]
        # Format only the current chunk s, not the whole of src
        hexa = " ".join(map("{0:0>2X}".format, s))
        text = "".join([chr(x) if 0x20 <= x < 0x7F else "." for x in s])
        result.append("%04X   %-*s   %s" % (i, length*(digits + 1), hexa, text))
    return "\n".join(result)
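
As for the `*` the question asks about: in printf-style formatting it is not unpacking. It means "take the field width from the next argument", which is why each `*` adds one extra item to the tuple. The snippet below is only a small sketch of that behaviour; the names `digits`, `value`, and `padded` are made up for illustration, and the `str.format` and f-string lines are one possible equivalent rather than the only way to write it.

```python
digits = 2      # field width, as in the hexdump function
value = 0x0A    # sample byte value, chosen for illustration

# '*' consumes the width (digits) first, then the value to format
old_style = "%0*X" % (digits, value)

# str.format needs a nested replacement field to supply the width
new_style = "{:0{}X}".format(value, digits)

# The same idea written as an f-string
f_string = f"{value:0{digits}X}"

print(old_style, new_style, f_string)

# "%-*s" works the same way: left-justify the string in a field whose
# width comes from the argument just before it
padded = "%-*s|" % (6, "AB")
print(repr(padded))
```

So `"{0:X}".format(*digits, ord(x))` would not work: `digits` is an int and cannot be unpacked, and the width has to go inside the format spec instead.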
Doug
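def hexdump(src, length=16):
    # src is expected to be bytes, e.g. data received from a socket
    result = []
    digits = 2  # two hex digits per byte

    for i in range(0, len(src), length):
        s = src[i:i + length]
        # Iterating over bytes yields ints in Python 3, so ord() is not needed
        hexa = " ".join(["%0*X" % (digits, x) for x in s])
        text = "".join([chr(x) if 0x20 <= x < 0x7F else "." for x in s])
        result.append("%04X   %-*s   %s" % (i, length * (digits + 1), hexa, text))

    print("\n".join(result))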

This is what you need.

The function from the book does some unnecessary things that only make it confusing. Not to bash on the book, but the code in it is slightly bad.
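
To see why the `unicode` check and the `ord()` calls can go away when the input is `bytes`, here is a minimal sketch. The payload `data` is invented for illustration, and it assumes the proxy hands the function raw `bytes` from the socket, as described in the comments on the question.

```python
data = b"GET / HTTP/1.1\r\n"   # sample payload, made up for illustration

# Indexing (or iterating over) bytes yields ints in Python 3 ...
first = data[0]
# ... while slicing still yields bytes
chunk = data[:3]

# So each item can be formatted as hex directly, without ord()
hex_part = " ".join(f"{b:02X}" for b in chunk)

# Printable ASCII is kept; everything else becomes a dot
text_part = "".join(chr(b) if 0x20 <= b < 0x7F else "." for b in data)

print(first, chunk, hex_part, text_part)
```

If the data ever arrives as `str` instead, it would need to be encoded to `bytes` first (or handled with `ord()`), since iterating over a `str` yields one-character strings rather than ints.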

Nenoj