22

I'm working on a program where I store some data in an integer and process it bitwise. For example, I might receive the number 48, which I will process bit-by-bit. In general the endianness of integers depends on the machine representation of integers, but does Python do anything to guarantee that the ints will always be little-endian? Or do I need to check endianness like I would in C and then write separate code for the two cases?

I ask because my code runs on a Sun machine and, although the one it's running on now uses Intel processors, I might have to switch to a machine with Sun processors in the future, which I know are big-endian.

Gordon Seidoh Worley

4 Answers

28

Python's int has the same endianness as the processor it runs on. The struct module lets you convert byte blobs to ints (and vice versa, and some other data types too) in either native, little-endian, or big-endian ways, depending on the format string you choose: start the format with '@' or no endianness character to use native endianness (and native sizes and alignment -- everything else uses standard sizes), '=' for native byte order with standard sizes, '<' for little-endian, '>' or '!' for big-endian.
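As a rough illustration of the prefixes described above, here is a minimal sketch using only the standard struct module. The value 0x01020304 is an arbitrary choice made for this example, and the `bytes.hex()` calls assume Python 3.

```python
import struct
import sys

value = 0x01020304  # arbitrary example value, not from the original post

little = struct.pack('<I', value)   # explicit little-endian, standard size (4 bytes)
big = struct.pack('>I', value)      # explicit big-endian, standard size
network = struct.pack('!I', value)  # network order, which is big-endian
native = struct.pack('=I', value)   # native byte order, standard size

print(little.hex())  # 04030201
print(big.hex())     # 01020304

# '=' follows the host, so compare against sys.byteorder instead of hard-coding.
assert native == (little if sys.byteorder == 'little' else big)

# Unpacking with the same prefix round-trips on any host.
assert struct.unpack('<I', little)[0] == value
assert struct.unpack('>I', big)[0] == value
```

Choosing an explicit '<' or '>' prefix is what makes the byte layout independent of the machine the code runs on.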

This is byte-by-byte, not bit-by-bit; not sure exactly what you mean by bit-by-bit processing in this context, but I assume it can be accommodated similarly.

For fast "bulk" processing in simple cases, consider also the array module -- the fromstring and tostring methods (named frombytes and tobytes in Python 3) can operate on large numbers of bytes speedily, and the byteswap method can get you the "other" endianness (native to non-native or vice versa), again rapidly and for a large number of items (the whole array).
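A small sketch of the array approach, assuming Python 3, where the byte conversion methods are called frombytes and tobytes rather than the fromstring and tostring names mentioned above. The sample values are made up for illustration.

```python
from array import array

# 'H' is an unsigned short; its item size is platform-defined
# (2 bytes on common platforms).
values = array('H', [1, 2, 0x1234])

swapped = array('H', values)  # copy, so the original stays untouched
swapped.byteswap()            # reverse the bytes of every item in place
print(list(swapped))

# Swapping twice restores the original items.
back = array('H', swapped)
back.byteswap()
assert back == values

# Round-trip through raw bytes in native order.
raw = values.tobytes()
restored = array('H')
restored.frombytes(raw)
assert restored == values
```

Note that byteswap works on whole items, so it only makes sense when the typecode matches the width of the fields in the data.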

Alex Martelli
  • What about binary notation, though? `0b10000001101` will this give the number 1037 on all systems and Python versions? – Niklas R Jan 29 '17 at 11:45
  • 1
    @NiklasR Yes. [The specification](https://docs.python.org/3/reference/expressions.html#literals) doesn't explicitly mention endianness, but it does make it clear that binary literals work the same as any other integer literal, and that it yields an object with the given *value*, rather than the given *representation*. – Ray Dec 07 '21 at 21:32
22

If you need to process your data 'bitwise' then the bitstring module might be of help to you. It can also deal with endianness between platforms.

The struct module is the best standard method of dealing with endianness between platforms. For example, this packs and unpacks the integers 1, 2, 3 as two 'shorts' and one 'long' using native endianness. The output shown is from a big-endian machine where these are 2 and 4 bytes; a native 'long' is 8 bytes on many 64-bit platforms, so the result will differ there:

>>> from struct import pack, unpack
>>> pack('hhl', 1, 2, 3)
b'\x00\x01\x00\x02\x00\x00\x00\x03'
>>> unpack('hhl', b'\x00\x01\x00\x02\x00\x00\x00\x03')
(1, 2, 3)

To check the endianness of the platform programmatically you can use

>>> import sys
>>> sys.byteorder

which will be either "big" or "little".
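One way sys.byteorder might be used in practice is as the byte-order argument to int.to_bytes and int.from_bytes. This is a sketch assuming Python 3, where those methods exist; the 4-byte width is an arbitrary choice, and 48 is the number from the question.

```python
import sys

n = 48

# to_bytes takes an explicit byte order; passing sys.byteorder
# requests whatever the host uses.
native = n.to_bytes(4, sys.byteorder)
big = n.to_bytes(4, 'big')
little = n.to_bytes(4, 'little')

print(sys.byteorder, native.hex())

# Reading back with the matching order recovers the value on any host.
assert int.from_bytes(big, 'big') == n
assert int.from_bytes(little, 'little') == n
assert int.from_bytes(native, sys.byteorder) == n
```

The integer 48 is the same value in all three cases; only the serialized bytes differ.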

Scott Griffiths
  • I've seen a lot of explanations like this (also kudos for sys.byteorder, I did not know that), but I have to ask: say I have some unknown file that I want to read. How can I know whether some const chars are short or long and/or big- or little-endian? – Danilo May 08 '17 at 00:08
  • 1
    @Danilo: In general you can't tell. To reverse engineer an unknown file format you could look at the data and guess what size/endianness made the most sense. To illustrate, if you unpack my example with the wrong endianness you get (256, 512, 50331648) instead of (1, 2, 3) which is a reasonable clue you've got it wrong... – Scott Griffiths May 08 '17 at 15:18
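The numbers in that last comment can be reproduced with explicit byte-order prefixes. This sketch assumes the data was written big-endian with standard sizes (2-byte shorts, 4-byte long), matching the bytes shown in the answer.

```python
import struct

# Big-endian encoding of two shorts and one long: 00 01 00 02 00 00 00 03
data = struct.pack('>hhl', 1, 2, 3)

right = struct.unpack('>hhl', data)  # same byte order as the writer
wrong = struct.unpack('<hhl', data)  # opposite byte order

print(right)  # (1, 2, 3)
print(wrong)  # (256, 512, 50331648)
```

Small values turning into large round-looking ones such as 256 or 50331648 (0x0100 and 0x03000000) are the kind of clue the comment refers to.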
4

The following snippet will tell you if your system default is little-endian (otherwise it is big-endian):

import struct
little_endian = (struct.unpack('<I', struct.pack('=I', 1))[0] == 1)

Note, however, this will not affect the behavior of bitwise operators: 1<<1 is equal to 2 regardless of the default endianness of your system.
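To make that concrete for the number from the question, here is a minimal sketch of bit-by-bit processing with shifts and masks. The helper name `bits_lsb_first` and the 8-bit width are assumptions made for this example.

```python
def bits_lsb_first(n, width=8):
    """Return the lowest `width` bits of n, least significant bit first."""
    return [(n >> i) & 1 for i in range(width)]

# 48 is 0b00110000, so only bits 4 and 5 are set.
print(bits_lsb_first(48))  # [0, 0, 0, 0, 1, 1, 0, 0]

# Shifts are defined on the integer's value, not on its memory layout.
assert 1 << 1 == 2
assert bin(48) == '0b110000'
```

Nothing here touches bytes in memory, so the result does not depend on sys.byteorder.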

augurar
3

Check when?

When doing bitwise operations, the int you get out will have the same endianness as the ints you put in. You don't need to check that. You only need to care about this when converting to/from sequences of bytes, in both languages, afaik.

In Python you use the struct module for this, most commonly struct.pack() and struct.unpack().

Lennart Regebro
  • 2
    It matters because I do things in my code like this: if (a >> 2 & 1) ... elif (b >> 3 & 1) ... but on bigendian I'd have to write if (a << 2 & 1) ... – Gordon Seidoh Worley Sep 09 '09 at 14:43
  • 1
    @Gordon: I don't think that's right. Is there some confusion here between byte-wise big and little endianness and bit-wise big and little endianness? If `a` is an integer then you probably don't have to worry about its endianness, it's only a question of how you created it from raw byte data. – Scott Griffiths Sep 09 '09 at 14:54
  • 2
    @Gordon: No you would not. Big/little-endian does not change the order of bits, but the order of *bytes*. The shift operations handle this, both in Python and C (as they in fact both use the processor's shift operations). – Lennart Regebro Sep 09 '09 at 15:11
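The point made in these comments can be checked with a short sketch, assuming Python 3's int.from_bytes. The two-byte field and its value are hypothetical; the same integer is built from both byte layouts and then tested with the shift-and-mask pattern from the question.

```python
raw_big = bytes([0x00, 0x0C])     # hypothetical 2-byte field stored big-endian
raw_little = bytes([0x0C, 0x00])  # the same field stored little-endian

a = int.from_bytes(raw_big, 'big')
b = int.from_bytes(raw_little, 'little')

# Once decoded with the right byte order, both are the same integer.
assert a == b == 12

# 12 is 0b1100. `a >> 2 & 1` parses as `(a >> 2) & 1` because
# shifts bind tighter than & in Python.
print(a >> 2 & 1)  # 1
print(a >> 3 & 1)  # 1
print(a >> 4 & 1)  # 0
```

Byte order only mattered in the from_bytes step; the shifts are identical for both sources.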