As you can see from the several answers so far, there are multiple approaches, and some perhaps surprising subtleties.
"Mathematical" approach. You separate the bytes using shifting and masking (or, equivalently, division and remainder), and recombine them similarly. This is "option 1" in Felix Palmen's answer. This approach has the advantage that it is completely independent of "endianness" issues. It has the complication that it's subject to some sign-extension and implementation-definedness issues. It's safest if you use an unsigned type for both the composite `int` and the byte-separated parts of the equation. If you use signed types, you'll typically need extra casts and/or masks. (But with that said, this is the approach I prefer.)
"Memory" approach. You use pointers, or a `union`, to directly access the bytes making up an `int`. This is "option 2" in Felix Palmen's answer. The very significant issue here is byte order, or "endianness". Also, depending on how you implement it, you may run afoul of the "strict aliasing" rule.
If you use the "mathematical" approach, make sure you test it on values that both do and don't have the high bit of the various bytes set. For example, for 16 bits, a complete set of tests might include the values `0x0101`, `0x0180`, `0x8001`, and `0x8080`. If you don't write the code correctly (if you implement it using signed types, or if you leave out some of the otherwise necessary masks), you will typically find extra `0xff`'s creeping into the reconstructed result, corrupting the transmission. (Also, you might want to think about writing a formal unit test, so that you can maximize the likelihood that the code will be re-tested, and any latent bugs detected, if/when it's ported to a machine which makes different implementation choices which affect it.)
If you do want to transmit signed values, you will have a few additional complications. In particular, if you reconstruct your 16-bit integer on a machine where type `int` is bigger than 16 bits, you may have to explicitly sign-extend it to preserve its value. Again, comprehensive testing should ensure that you've adequately addressed these complications (at least on the platforms where you've tested your code so far :-) ).
Going back to the test values I suggested (`0x0101`, `0x0180`, `0x8001`, and `0x8080`), if you're transmitting unsigned integers, these correspond to 257, 384, 32769, and 32896. If you're transmitting signed integers, they correspond to 257, 384, -32767, and -32640. And if on the other end you get values like -255 or 65281 (which both correspond to hexadecimal `0xff01`), or if you get 32896 when you expected -32640, it indicates that you need to go back and be more careful with your signed/unsigned usage, with your masking, and/or with your explicit sign extension.
Finally, if you use the "memory" approach, and if your sending and receiving code runs on machines of different byte orders, you'll find the bytes swapped: `0x0102` will turn into `0x0201`. There are various ways to solve this, but it can be quite a nuisance. (This is why, as I said, I usually prefer the "mathematical" approach, so I can just sidestep the byte order problem.)