0

I have a file containing little-endian bytes. I want to read N bytes, specify the endianness, and convert them to a decimal integer using Python (any version). How do I do this correctly?

warchantua

3 Answers

4

In Python 3 you can use something like this:

int.from_bytes(byte_string, byteorder='little')
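
For the file case in the question, a minimal sketch might look like the following. The helper name `read_uint` is hypothetical, and `io.BytesIO` is used only as a stand-in for a file opened in binary mode. `int.from_bytes` also accepts `signed=True` if the bytes encode a two's-complement value.

```python
import io

def read_uint(stream, n, byteorder="little"):
    """Read exactly n bytes from a binary stream and return them as an unsigned int."""
    chunk = stream.read(n)
    if len(chunk) != n:
        raise ValueError("expected %d bytes, got %d" % (n, len(chunk)))
    return int.from_bytes(chunk, byteorder=byteorder)

# io.BytesIO stands in for open(path, 'rb') here (an assumption for the demo).
stream = io.BytesIO(b"\x01\x02\x03\x04\xff\xff")
value = read_uint(stream, 4, "little")
print(value)       # 67305985
print(hex(value))  # 0x4030201
```

Passing `"big"` instead of `"little"` reads the same bytes in the opposite order.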
ham
2

As Harshad Mulmuley's answer shows, this is easy in Python 3 using the int.from_bytes method. In Python 2, it's a little trickier.

The struct module is designed to handle standard C data types. It won't handle arbitrary-length integers (Python 2 long integers), as these are not native to C. But you can convert them using a simple for loop. I expect this to be significantly slower than the Python 3 way, since a Python for loop is slower than looping at C speed, as int.from_bytes (probably) does.

from binascii import hexlify

def int_from_bytes_LE(s):
    total = 0
    for c in reversed(s):
        total = (total << 8) + ord(c)
    return total

# Test

data = (
    (b'\x01\x02\x03\x04', 0x04030201),
    (b'\x01\x02\x03\x04\x05\x06\x07\x08', 0x0807060504030201),
    (b'\x01\x23\x45\x67\x89\xab\xcd\xef\x01\x23\x45\x67\x89\xab\xcd\xef', 
        0xefcdab8967452301efcdab8967452301),
)

for s, u in data:
    print hexlify(s), u, int_from_bytes_LE(s)
    #print(hexlify(s), u, int.from_bytes(s, 'little'))

output

01020304 67305985 67305985
0102030405060708 578437695752307201 578437695752307201
0123456789abcdef0123456789abcdef 318753391026855559389420636404904698625 318753391026855559389420636404904698625

(I put that Python 3 print call in there so you can easily verify that my function gives the same result as int.from_bytes).

If your data is really large and you don't want to waste RAM reversing your byte string you can do it this way:

def int_from_bytes_LE(s):
    m = 1
    total = 0
    for c in s:
        total += m * ord(c)
        m <<= 8
    return total

Of course, that uses some RAM for m, but it won't be as much as the RAM used for reversing the input string.
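
If the same code has to run on both Python 2 and Python 3, one possible variant iterates over a bytearray, which yields integers on both versions, so ord is not needed. This is only a sketch: the function name is hypothetical, and note that bytearray(data) makes a copy of the input.

```python
def int_from_bytes_le(data):
    """Interpret data (a byte string) as a little-endian unsigned integer."""
    total = 0
    shift = 0
    # bytearray yields ints on both Python 2 and Python 3.
    for byte in bytearray(data):
        total |= byte << shift  # place each byte at its own 8-bit position
        shift += 8
    return total

print(int_from_bytes_le(b"\x01\x02\x03\x04"))  # 67305985
```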

PM 2Ring
0

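Using Python 3 (or 2), you can also do this with the struct module, provided N matches one of the fixed sizes struct supports (1, 2, 4, or 8 bytes for integers).

import struct

with open('blob.dat', 'rb') as f:
    data = f.read(n)  # n must match the format's size, e.g. 4 for 'i'

Now unpack using the appropriate format string. struct.unpack always returns a tuple, so take the first element. For example, for a big-endian signed 32-bit int (use "<i" for little-endian):

num = struct.unpack(">i", data)[0]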
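
Since the question is about little-endian data, a sketch using the `<` prefix may be more directly useful. The sample bytes below are made up for illustration; the format codes (`I` for an unsigned 32-bit int, `H` for an unsigned 16-bit int) are standard struct codes.

```python
import struct

data = b"\x01\x02\x03\x04"  # sample bytes, assumed for the demo

# unpack returns a tuple even for a single field, hence the (x,) unpacking.
(le_value,) = struct.unpack("<I", data)  # little-endian unsigned 32-bit
(be_value,) = struct.unpack(">I", data)  # big-endian unsigned 32-bit
print(hex(le_value))  # 0x4030201
print(hex(be_value))  # 0x1020304

# Several fields can be unpacked at once: two little-endian 16-bit values.
low, high = struct.unpack("<HH", data)
print(hex(low), hex(high))  # 0x201 0x403
```

struct.unpack raises struct.error if the length of the data does not match the format, so this approach only fits when N is known and fixed.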
juanpa.arrivillaga