
I'm trying to convert integers into 2 bytes, but I'm seeing some conflicting answers online between:

a[0] = (byte)(theInt & 0xFF);
a[1] = (byte)((theInt >>> 8) & 0xFF);

and

a[0] = (byte)((theInt >>> 8) & 0xFF);
a[1] = (byte)(theInt & 0xFF);

The first seems to be the more common answer (How do I split an integer into 2 byte binary?).

However, for me personally, the second seems to be working better. If I set theInt = 10000 (0x2710 in hex), I get the desired {27, 10} (the two bytes, printed in hex). With the first method I get the reverse, {10, 27}.

So is there any risk in me going against the popular answer and using the second method? Am I missing something? Thanks

1 Answer


The concept of how to sequence the bytes of a multibyte numeric value (such as the 16-bit value you're converting here) is called 'endianness'.

Little Endian is the term for 'send the least significant data first'; that'd be your first snippet (send '10' first, then '27').

Big Endian is the term for 'send the most significant data first'; that'd be your second snippet.
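Here's a quick sketch showing both orderings side by side, using your own two snippets (the helper names toLittleEndian/toBigEndian are mine, purely for illustration):

public class EndianDemo {
    // First snippet: least significant byte first (Little Endian).
    static byte[] toLittleEndian(int theInt) {
        byte[] a = new byte[2];
        a[0] = (byte) (theInt & 0xFF);
        a[1] = (byte) ((theInt >>> 8) & 0xFF);
        return a;
    }

    // Second snippet: most significant byte first (Big Endian).
    static byte[] toBigEndian(int theInt) {
        byte[] a = new byte[2];
        a[0] = (byte) ((theInt >>> 8) & 0xFF);
        a[1] = (byte) (theInt & 0xFF);
        return a;
    }

    public static void main(String[] args) {
        int theInt = 10000; // 0x2710 in hex
        byte[] le = toLittleEndian(theInt);
        byte[] be = toBigEndian(theInt);
        System.out.printf("LE: {%02X, %02X}%n", le[0], le[1]); // LE: {10, 27}
        System.out.printf("BE: {%02X, %02X}%n", be[0], be[1]); // BE: {27, 10}
    }
}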

You'd think Big Endian is the sensible one (it matches how we humans write numbers, and how we think about bits within bytes too; 128 is, in bits, '10000000' - the most significant bit is written first, after all), that Little Endian is insane, and wonder why the concept of LE even exists.

The primary reason is that the intel CPU architecture is Little Endian, and that is a very popular CPU architecture. If you write a 32-bit int with value '1' to some memory address on an intel chip, and then read it back out byte for byte, you get, in order: '1', '0', '0', and '0': Little Endian - the least significant byte is stored first. These days, with pipelines, microarchitecture, and who knows what else, asking an intel processor to write it out in BE form is probably not really slower, but it is more machine code, and in the past it certainly was significantly slower. Thus, if you were trying to squeeze max performance out of 2 machines talking to each other over a really fast pipe, and both machines had intel chips, it'd go faster to send in little endian: that way both CPUs are just copying bytes, whereas sending in BE would require the sending chip to swap bytes 1/4 and 2/3 around for every int it sends, and the receiving chip to apply the same conversion, wasting cycles.
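You can observe that byte layout from Java itself - a minimal sketch, assuming you run it on a little-endian machine (`ByteOrder.nativeOrder()` reports what the underlying CPU prefers):

import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class NativeOrderDemo {
    public static void main(String[] args) {
        // On an intel (x86/x64) box this prints LITTLE_ENDIAN.
        System.out.println(ByteOrder.nativeOrder());

        // Write the 32-bit int '1' in little endian order, then read
        // the individual bytes back out.
        ByteBuffer buf = ByteBuffer.allocate(4).order(ByteOrder.LITTLE_ENDIAN);
        buf.putInt(1);
        for (byte b : buf.array()) {
            System.out.print(b + " "); // prints: 1 0 0 0
        }
        System.out.println();
    }
}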

Thus, from time to time, you find a protocol defined to be LE. That's... short-sighted in this world of varied hardware, where you're just as likely to end up with both sender and receiver being e.g. an ARM chip, or worse, with 10 chips involved (a bunch of packet-inspecting routers in between), possibly all of them BE. But now you know why LE as a concept exists.

Because of this modern age of varied hardware, and because many other CPU architectures are big endian (or can run as either), almost all network protocols are defined to be Big Endian - 'network byte order' means big endian. Java is generally Big Endian as well (most APIs, such as ByteBuffer and co, let you pick which endianness you want, but where that choice is not available, or where defaults are concerned, it's big endian). Formats like UTF-16 also default to big-endian (UTF-8 is byte-oriented, so byte order doesn't even apply to it). When in doubt, Big Endian is far more likely to be the intended ordering than LE.
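For instance, a freshly allocated `ByteBuffer` is big-endian until you explicitly switch it - a small sketch:

import java.nio.ByteBuffer;

public class DefaultOrderDemo {
    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(4);
        System.out.println(buf.order()); // BIG_ENDIAN - java's default

        buf.putInt(10000); // 0x2710
        for (byte b : buf.array()) {
            System.out.printf("%02X ", b); // prints: 00 00 27 10
        }
        System.out.println();
    }
}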

The ARM chips that run android devices are bi-endian - they can be configured to run in either mode.

Thus: Just use Big Endian (second snippet).
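Putting that together, a minimal big-endian round trip for your 2-byte case (the helper names are mine, not part of any standard API):

public class TwoByteBE {
    // int -> 2 bytes, most significant byte first (your second snippet).
    static byte[] intToBytes(int theInt) {
        return new byte[] {
            (byte) ((theInt >>> 8) & 0xFF),
            (byte) (theInt & 0xFF),
        };
    }

    // 2 bytes -> int; the & 0xFF here undoes java's sign extension of bytes.
    static int bytesToInt(byte[] a) {
        return ((a[0] & 0xFF) << 8) | (a[1] & 0xFF);
    }

    public static void main(String[] args) {
        System.out.println(bytesToInt(intToBytes(10000))); // prints: 10000
    }
}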

That just leaves one mystery: why is the accepted answer to your linked question the 'weird' one (Little Endian), and why does it have that many upvotes even though it doesn't highlight this? The question even specifically asks for Big Endian (it describes BE without using the term of art, but it describes it nonetheless).

I don't know. It's a stupid answer with a checkbox and 68 votes. Mysterious.

I did my part, and downvoted it.

rzwitserloot
    Hint: you could always come in and suggest an edit for another answer. – GhostCat Dec 02 '20 at 16:30
  • @GhostCat My answer predates, so I'm not sure who you're trying to give a hint to. – rzwitserloot Dec 02 '20 at 16:32
  • The OP and many others were looking to split the integer into bytes. That was the hard part. The endianness is easy to figure out. So, the "stupid" answer did the job :-) – Tarik Dec 02 '20 at 16:33
  • Thanks for the reply. Do I need the 0xFF beside theInt by the way? I've seen some answers leave that part out as well –  Dec 02 '20 at 18:35
  • `x & 0xFF` is java-ese for: convert this byte to an int, interpreting the byte as unsigned (vs `(int) x`, or even just `x` and relying on auto-widening, which converts to int but interprets it as signed). One turns the byte 0xFF into -1, the other into 255. It's not actually needed here, though - you're going from int to byte, not byte to int. – rzwitserloot Dec 02 '20 at 21:23
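A quick sketch of the difference that last comment describes, using nothing beyond plain Java:

public class SignDemo {
    public static void main(String[] args) {
        byte x = (byte) 0xFF;
        System.out.println(x);        // -1  (auto-widening: signed)
        System.out.println(x & 0xFF); // 255 (masked: unsigned)

        // Going the other way (int -> byte), the mask is redundant:
        // the cast throws away the upper 24 bits regardless.
        System.out.println((byte) (10000 & 0xFF)); // 16 (0x10)
        System.out.println((byte) 10000);          // 16 (0x10)
    }
}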