1

I have a byte stream with 3 bytes that I have to cast into an unsigned int 32. In C I would just use bzero to zero out the memory region of my integer and then copy the memory from the stream, something like

char *stream = ...;
Uint32 myInt;
bzero(&myInt, 4);           /* zero all four bytes of the integer */
memcpy(&myInt, stream, 3);  /* copy the three bytes from the stream */

Question:

How would I do this in Python?

Cheers

ezdazuzena

3 Answers

3

Even in C you should not use memcpy for binary deserialization, because it ignores the binary representation: endianness, signedness and the like. Instead, explicitly decode binary values from a fixed representation in the stream to the platform's native representation.

In Python, you'd read the bytes from the stream, pad them according to the endianness, and then unpack them with struct:

import struct

chunk = fileobj.read(3)
# pad to 4 bytes and decode as a little-endian, unsigned 32-bit integer
chunk += b'\x00'
number = struct.unpack('<I', chunk)[0]  # unpack returns a tuple

fileobj is a file-like object representing the opened stream, e.g. a file opened with open(filename, 'rb').
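Where the zero byte goes depends on the byte order: at the end for little-endian, at the front for big-endian. A minimal sketch of both cases, assuming Python 3; the helper name read_uint24 is made up for illustration, and io.BytesIO stands in for the real stream:

```python
import io
import struct

def read_uint24(fileobj, byteorder="little"):
    """Read 3 bytes from a binary stream and return them as an unsigned int."""
    chunk = fileobj.read(3)
    if len(chunk) != 3:
        raise ValueError("expected 3 bytes, got %d" % len(chunk))
    if byteorder == "little":
        # Least significant byte first: the zero pad goes at the end.
        return struct.unpack("<I", chunk + b"\x00")[0]
    # Most significant byte first: the zero pad goes at the front.
    return struct.unpack(">I", b"\x00" + chunk)[0]

stream = io.BytesIO(b"\x01\x02\x03")
little = read_uint24(stream)       # 0x030201
stream.seek(0)
big = read_uint24(stream, "big")   # 0x010203
```

Checking the length before unpacking turns a short read into an explicit error instead of a silently wrong value.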

  • The questioner's code with `memcpy` is fine for deserialising a network-order 32 bit unsigned integer from a stream. You deal with endianness *after* you read a value, that's what `ntohl` is for. Granted the same trick wouldn't work for a signed value if either end is non-2's-complement. – Steve Jessop Aug 01 '12 at 08:49
  • Thanks, that's actually what I was looking for. – ezdazuzena Aug 01 '12 at 08:49
1

Have you tried:

myInt = (ord(myStr[0]) * 256 + ord(myStr[1])) * 256 + ord(myStr[2])

The following program:

myStr = "abc"
myInt = ord(myStr[0]) * 65536 + ord(myStr[1]) * 256 + ord(myStr[2])
print(myInt)

outputs 6382179 as expected:

'a' (97) * 65536 = 6356992
'b' (98) *   256 =   25088
'c' (99) *     1 =      99
                   -------
                   6382179

This is, of course, assuming the strings are in big-endian format. For little-endian, where the most significant octet is the third in the string, you can just reverse the string indexes (2, 1, 0 rather than 0, 1, 2).
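If Python 3 is available, the built-in int.from_bytes does the same arithmetic for either byte order. A short sketch, assuming the three bytes are already in a bytes object (the variable names here are illustrative):

```python
data = b"abc"

# Big-endian: the first byte is the most significant.
big = int.from_bytes(data, byteorder="big")        # 6382179, as above

# Little-endian: the third byte is the most significant.
little = int.from_bytes(data, byteorder="little")

# The same big-endian arithmetic by hand. Indexing a bytes object in
# Python 3 already yields integers, so ord() is not needed.
manual_big = (data[0] * 256 + data[1]) * 256 + data[2]
```

No padding is required here, because int.from_bytes accepts any number of bytes.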

paxdiablo
0

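As @paxdiablo says, ord is the solution. Here is a version using <<, which will look more familiar if you come from C. Note that it treats the first character as the least significant byte, i.e. little-endian:

sum(ord(c) << (8 * i) for i, c in enumerate(s))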
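For the big-endian order assumed in the previous answer, the same shift idea works on the reversed sequence. A minimal sketch, assuming Python 3 and a bytes input; both function names are made up for illustration:

```python
def from_little_endian(data):
    """First byte is the least significant one."""
    return sum(b << (8 * i) for i, b in enumerate(bytearray(data)))

def from_big_endian(data):
    """First byte is the most significant one."""
    return sum(b << (8 * i) for i, b in enumerate(reversed(bytearray(data))))

value_le = from_little_endian(b"\x01\x02\x03")  # 0x030201
value_be = from_big_endian(b"\x01\x02\x03")     # 0x010203
```

Wrapping the input in bytearray makes iteration yield integers, so ord is not needed.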
Emmanuel