Endianness refers to how multi-byte values are stored in memory, sent between devices or stored on disk. "Big-Endian" values are stored with their most-significant byte first, and "Little-Endian" values are stored with their least-significant byte first. Other byte-orders are possible but very uncommon, and cannot be described this way.
Endianness is the organization and ordering of byte values in multi-byte words. There are two main forms of endianness: big-endian and little-endian.
Big endian (BE) means that the most significant byte is stored first (at the lowest address). It is similar to writing a decimal number in reading order, with the most significant digit first.
Little endian (LE) means that the least significant byte is stored first (at the lowest address). The bytes are stored in the reverse of the big-endian order.
There are other forms of byte orderings, but they are rare. They may also be called mixed-endian.
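As a minimal sketch of the two orderings (the byte values below are arbitrary and chosen only for illustration), the same four bytes can be read under each byte order with Python's built-in int.from_bytes:

```python
# The same four bytes, interpreted under each byte order.
raw = bytes([0x12, 0x34, 0x56, 0x78])

as_big = int.from_bytes(raw, byteorder="big")        # first byte is most significant
as_little = int.from_bytes(raw, byteorder="little")  # first byte is least significant

print(hex(as_big))     # 0x12345678
print(hex(as_little))  # 0x78563412

# Converting between the two orders is a plain byte reversal.
swapped = raw[::-1]
assert int.from_bytes(swapped, byteorder="little") == as_big
```

Note that only the order of whole bytes changes; the bits inside each byte are not reversed.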
Usage of endianness
When we talk about endianness, we usually mean the endianness of an instruction set architecture (or CPU) or the endianness of a file. The endianness of an architecture or CPU is how the processor orders the bytes of a multi-byte word in memory.
- Motorola 68000 is a big-endian architecture. It stores multi-byte words in big-endian ordering.
- Intel processors and the x86 architecture are little-endian.
- MIPS can run in either big-endian or little-endian mode, and the mode is selectable. MIPS is a bi-endian architecture.
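To find out which byte order the current machine uses, one possible check in Python is shown below. This is only a sketch: sys.byteorder is the direct answer, and the struct-based cross-check (with the hypothetical variable names host_order and detected) is included to show what the flag actually means.

```python
import struct
import sys

# sys.byteorder reports the byte order of the machine running the interpreter.
host_order = sys.byteorder  # "little" or "big"

# Cross-check: pack the 16-bit value 1 in native byte order and inspect the
# first byte. On a little-endian host the least significant byte (01) is first.
first_byte = struct.pack("=H", 1)[0]
detected = "little" if first_byte == 1 else "big"

print(host_order, detected)
```

On an x86 machine both values would be expected to be "little".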
The endianness of a file indicates how the bytes of a multi-byte word are ordered in that file (this applies to both binary and text files). Sometimes the endianness of a file is indicated by a byte-order mark (BOM) placed at the very start of the file.
- A big-endian UTF-16 text file with a BOM begins with the two bytes FE FF and stores every two-byte code unit (each surrogate of a surrogate pair counts as one code unit) in big-endian order.
- A little-endian UTF-16 text file with a BOM begins with the two bytes FF FE and stores every two-byte code unit in little-endian order.
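The BOM check described above can be sketched as follows. The function name utf16_byte_order is hypothetical, and the sketch assumes the data is already known to be UTF-16; it does not try to distinguish a UTF-32 little-endian BOM (FF FE 00 00), which starts with the same two bytes.

```python
import codecs

def utf16_byte_order(data: bytes) -> str:
    """Return 'big', 'little', or 'unknown' based on a leading UTF-16 BOM."""
    if data.startswith(codecs.BOM_UTF16_BE):  # FE FF
        return "big"
    if data.startswith(codecs.BOM_UTF16_LE):  # FF FE
        return "little"
    return "unknown"  # no BOM: the byte order must be known some other way

print(utf16_byte_order(b"\xfe\xff\x00\x41"))  # big
print(utf16_byte_order(b"\xff\xfe\x41\x00"))  # little
print(utf16_byte_order(b"\x00\x41"))          # unknown
```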
Examples of endianness
A 32-bit signed int value, 123456789, is stored as four bytes in two's complement format.
- In big endian, the value is stored as 07 5B CD 15 in hexadecimal notation.
- In little endian, the value is stored as 15 CD 5B 07 in hexadecimal notation.
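The byte layouts above can be reproduced with Python's struct module. This is a small sketch, assuming Python 3.8 or later for the separator argument of bytes.hex; the value is the 32-bit integer whose big-endian bytes are 07 5B CD 15 (123456789 in decimal).

```python
import struct

value = 123456789  # 0x075BCD15

# ">i" packs a signed 32-bit int in big-endian order, "<i" in little-endian order.
big = struct.pack(">i", value)
little = struct.pack("<i", value)

print(big.hex(" ").upper())     # 07 5B CD 15
print(little.hex(" ").upper())  # 15 CD 5B 07

# Reading the bytes back with the matching format recovers the original value.
assert struct.unpack(">i", big)[0] == value
assert struct.unpack("<i", little)[0] == value
```

Unpacking with the wrong format does not raise an error; it silently yields a different number, which is why the byte order of a binary format has to be documented.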
A UTF-16 text file with a BOM contains these characters: A 汉🆗
- The BOM character has the value U+FEFF. The emoji has the Unicode value U+1F197 and is expressed as one surrogate pair (two surrogates), U+D83C U+DD97.
- In big endian, the characters are stored as FEFF 0041 0020 6C49 D83C DD97
- In little endian, they are stored as FFFE 4100 2000 496C 3CD8 97DD
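The two byte sequences can be checked with Python's UTF-16 codecs. This is a sketch for the text made of "A", a space, U+6C49 and U+1F197; the helper code_units is a hypothetical name used only to format the output in two-byte groups. The last lines show how the surrogate pair follows from the code point.

```python
import codecs

text = "A \u6c49\U0001f197"  # 'A', space, U+6C49, U+1F197

# The "utf-16-be" and "utf-16-le" codecs do not write a BOM, so prepend one.
big = codecs.BOM_UTF16_BE + text.encode("utf-16-be")
little = codecs.BOM_UTF16_LE + text.encode("utf-16-le")

def code_units(data: bytes) -> str:
    """Format bytes as space-separated two-byte groups in uppercase hex."""
    return " ".join(data[i:i + 2].hex().upper() for i in range(0, len(data), 2))

print(code_units(big))     # FEFF 0041 0020 6C49 D83C DD97
print(code_units(little))  # FFFE 4100 2000 496C 3CD8 97DD

# Surrogate pair for a code point above U+FFFF: subtract 0x10000, then split
# the remaining 20 bits into a high half and a low half of 10 bits each.
offset = 0x1F197 - 0x10000
high = 0xD800 + (offset >> 10)   # 0xD83C
low = 0xDC00 + (offset & 0x3FF)  # 0xDD97
print(hex(high), hex(low))
```

The surrogate values themselves do not depend on endianness; only the order of the two bytes inside each 16-bit code unit changes.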
Read More
Related tags:
- cpu-architecture, computer-architecture, file, byte
- unicode, utf-16, byte-order-mark: Character encodings that can be endian-dependent
External links: