
I'm trying to convert int numbers (>=0 && <2^32 of course) in Python into a 4 byte unsigned int representation and back.

As I understand the docs, the sizes given for struct.pack are a standard, but the size is not guaranteed. How can I make sure that I get exactly 4 bytes?

One way I found using ctypes:

byte_repr=bytes(ctypes.c_uint32(data))

Is this the most pythonic one there is? And what would be the way back (for this or any other solution)?

Cookie
  • Unless you're serializing objects, if you care about how many bytes something is in Python, then you've already thrown "pythonic" out the window. – Paul M. Apr 05 '20 at 16:47
  • @user10987432 Well, that's just not true. If, for example, he's sending something over the network at a specified size, it is not "... throw(ing) pythonic out the window" – maor10 Apr 05 '20 at 16:50
  • @maor10 If you're sending something over the network, that is "serializing an object" by definition. – Paul M. Apr 05 '20 at 16:54
  • Well, anything in Python is "an object", so that's a bit of a moot point. He's specifically asking about a number. Converting a number to a binary representation is not "unpythonic", and it's a bit ridiculous to say that – maor10 Apr 05 '20 at 16:55
  • What do you mean by "*the size is not guaranteed*" (for *struct*)? – CristiFati Apr 05 '20 at 23:21
  • @CristiFati I just read the paragraph again and found the statement "The ‘Standard size’ column refers to the size of the packed value in bytes when using standard size; that is, when the format string starts with one of '<', '>', '!' or '='. When using native size, the size of the packed value is platform-dependent." From this I actually just read "the size of the packed value is platform-dependent" so maybe this is already the solution – Cookie Apr 06 '20 at 09:13
  • @user10987432 I'm "serializing" into a fixed-size representation indeed. I have to squeeze my data into a QR code. You are right that usually it should not matter, but for some applications it does – Cookie Apr 06 '20 at 09:16

2 Answers


The types int and bytes have the methods you need for that.

Note that I am calling from_bytes on the class int (it is a class method), but it can also be called on an int instance:

>>> a = 2**32-1
>>> a.to_bytes(4, 'little')
b'\xff\xff\xff\xff'
>>> b = a.to_bytes(4, 'little')
>>> c = int.from_bytes(b, 'little')
>>> c
4294967295
>>> a
4294967295
>>>
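
The round trip above could be wrapped in two small helpers. This is only a sketch: the names u32_to_bytes and u32_from_bytes are made up for illustration, and the explicit length check is an addition, since int.from_bytes itself accepts a buffer of any length.

```python
def u32_to_bytes(value: int, byteorder: str = "little") -> bytes:
    """Encode an integer in [0, 2**32) as exactly 4 bytes."""
    # int.to_bytes raises OverflowError if the value is negative
    # or does not fit into 4 bytes.
    return value.to_bytes(4, byteorder)


def u32_from_bytes(data: bytes, byteorder: str = "little") -> int:
    """Decode exactly 4 bytes back into an unsigned integer."""
    if len(data) != 4:
        # int.from_bytes would accept any length, so check explicitly.
        raise ValueError(f"expected 4 bytes, got {len(data)}")
    return int.from_bytes(data, byteorder)


# Round trip with the largest value that fits.
assert u32_from_bytes(u32_to_bytes(2**32 - 1)) == 2**32 - 1
```

Passing the same byteorder to both helpers is what makes the round trip independent of the machine the code runs on.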
progmatico

Given the mentioned interval, you're talking about unsigned ints.
[Python 3.Docs]: struct - Interpret bytes as packed binary data works fine. When the format string starts with one of '<', '>', '!' or '=', standard sizes are used, so "I" is always exactly 4 bytes, regardless of the platform.
Only native mode (no prefix, or '@') depends on the compiler used to build Python: there the size is sizeof(unsigned int), which is 4 in the vast majority of environments but could differ on an exotic platform.

>>> import struct
>>>
>>> bo = "<"  # byte order: little endian
>>>
>>> ui_max = 0xFFFFFFFF
>>>
>>> ui_max
4294967295
>>> buf = struct.pack(bo + "I", ui_max)
>>> buf, len(buf)
(b'\xff\xff\xff\xff', 4)
>>>
>>> ui0 = struct.unpack(bo + "I", buf)[0]
>>> ui0
4294967295
>>>
>>> i0 = struct.unpack(bo + "i", buf)[0]  # signed int
>>> i0
-1
>>> struct.pack(bo + "I", 0)
b'\x00\x00\x00\x00'
>>>
>>> struct.pack(bo + "I", ui_max + 1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
struct.error: argument out of range
>>>
>>> struct.unpack(bo + "I", b"1234")
(875770417,)
>>>
>>> struct.unpack(bo + "I", b"123")  # 3 bytes buffer
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
struct.error: unpack requires a buffer of 4 bytes
>>>
>>> struct.unpack(bo + "I", b"12345")  # 5 bytes buffer
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
struct.error: unpack requires a buffer of 4 bytes
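
To check the size guarantee directly, struct.calcsize reports how many bytes a format occupies. The sketch below also precompiles the format with struct.Struct; the names U32_LE, pack_u32 and unpack_u32 are hypothetical, chosen only for this example.

```python
import struct

# With a byte-order prefix, struct uses standard sizes: "I" is always 4 bytes.
for prefix in ("<", ">", "!", "="):
    assert struct.calcsize(prefix + "I") == 4

# Without a prefix (native mode), the size is whatever the C compiler
# that built Python uses for unsigned int.
native_size = struct.calcsize("I")

# Precompiled little-endian unsigned 32-bit format, reusable across calls.
U32_LE = struct.Struct("<I")


def pack_u32(value: int) -> bytes:
    # Raises struct.error if value is outside [0, 2**32).
    return U32_LE.pack(value)


def unpack_u32(data: bytes) -> int:
    # Raises struct.error unless data is exactly 4 bytes long.
    return U32_LE.unpack(data)[0]
```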

Related (remotely): [SO]: Maximum and minimum value of C types integers from Python.

[Python 3.Docs]: ctypes - A foreign function library for Python variant:

>>> # Continuation of previous snippet
>>> import ctypes as ct
>>>
>>> ct_ui_max = ct.c_uint32(ui_max)
>>>
>>> ct_ui_max
c_ulong(4294967295)
>>>
>>> buf = bytes(ct_ui_max)
>>> buf, len(buf)
(b'\xff\xff\xff\xff', 4)
>>>
>>> ct.c_uint32(ui_max + 1)
c_ulong(0)
>>>
>>> ct.c_uint32.from_buffer_copy(buf)
c_ulong(4294967295)
>>> ct.c_uint32.from_buffer_copy(buf + b"\x00")
c_ulong(4294967295)
>>> ct.c_uint32.from_buffer_copy(b"\x00" + buf)  # 0xFFFFFF00 (little endian)
c_ulong(4294967040)
>>>
>>> ct.c_uint32.from_buffer_copy(buf[:-1])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: Buffer size too small (3 instead of at least 4 bytes)
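
For the way back with ctypes, the .value attribute turns the c_uint32 into a plain Python int. A minimal sketch follows; note that ctypes lays the bytes out in the machine's native byte order, which the last assertion checks against sys.byteorder. The value 0xDEADBEEF is just an arbitrary test number.

```python
import ctypes
import sys

value = 0xDEADBEEF

# Forward: 4 bytes in native byte order.
buf = bytes(ctypes.c_uint32(value))

# Back: rebuild the c_uint32 from the buffer and read it as a Python int.
back = ctypes.c_uint32.from_buffer_copy(buf).value

assert len(buf) == 4
assert back == value
# The ctypes bytes match int.to_bytes when given the machine's byte order.
assert buf == value.to_bytes(4, sys.byteorder)
```

Because the byte order here is whatever the machine uses, data that must be read on a different machine is safer with an explicit byte order, as in the struct and int.to_bytes variants.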

Note: @progmatico's answer is simpler and more straightforward, as it doesn't require importing any module and relies only on builtins ([Python 3.Docs]: Built-in Types - Additional Methods on Integer Types). As a side note, sys.byteorder could be passed as the byte order argument to get the machine's native order.

CristiFati