Setting an unsigned short in an unsigned char array

Question

I have the following typedefs

typedef unsigned char   BYTE;  

typedef unsigned short  WORD;

Now, I have an array that looks like this

BYTE redundantMessage[6];

and a field which looks like this

WORD vehicleSpeedToWord = static_cast<WORD>(redundantVelocity);

I would like to set the third and fourth bytes of this message to the value of vehicleSpeedToWord. Will this do so:

redundantMessage[3] = vehicleSpeedToWord;

Will the third byte of redundantMessage automatically be overwritten?

Unless those types are defined by the OS (like in Windows) then use [the standard fixed-width integers](http://en.cppreference.com/w/c/types/integer). — Some programmer dude, Jun 04 '18 at 09:00
And note that `redundantDataMessage[3] = vehicleSpeedToWord` is basically the same as `redundantDataMessage[3] = (BYTE) vehicleSpeedToWord`, which should answer your question (as currently asked). — Some programmer dude, Jun 04 '18 at 09:03
Thank you for the answer, but I would like to know if my approach would work. Also, it would be more convenient in my application to use the datatypes I mentioned. — Wballer3, Jun 04 '18 at 09:04
I guess that this would solve my problem: memcpy( &redundantDataMessage[3], &vehicleSpeedToWord, sizeof( vehicleSpeedToWord ) ); — Wballer3, Jun 04 '18 at 09:10
You're on the right track, but it's still not quite correct. I assume you want to write to the *fourth* element because you have heard of [*endianness*](https://en.wikipedia.org/wiki/Endianness)? Well that's not quite how it works. — Some programmer dude, Jun 04 '18 at 09:17
Some programmer dude : The message is 6 bytes long, third and fourth are for the vehicleSpeedToWord. The message uses little endian. Will it work then? — Wballer3, Jun 04 '18 at 09:21
I already have the right endian, since I use boost native_to_little_inplace() to get the correct endian in vehicleSpeedToWord . But now I need to set 2 bytes in my message. — Wballer3, Jun 04 '18 at 09:24
Your `memcpy` call will write to the fourth and *fifth* element. — Some programmer dude, Jun 04 '18 at 09:42
If you're doing this to achieve serialisation, *please* just use a proper serialisation library (e.g. Protocol Buffers), which will take into account stuff like endianness and other portability issues you could run into by just copying raw bytes... — Sean Burton, Jun 04 '18 at 11:35
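
The truncation described in the comments above can be checked with a small sketch. It assumes 8-bit bytes and uses fixed-width stand-ins for the question's BYTE and WORD typedefs; the function names and the 0xAA filler value are hypothetical and chosen only for illustration. Assigning a WORD to one array element stores only its low byte and leaves the neighbouring elements alone.

```cpp
#include <cassert>
#include <cstdint>

typedef std::uint8_t  BYTE;  // stand-in for the question's unsigned char typedef
typedef std::uint16_t WORD;  // stand-in for the question's unsigned short typedef

// Performs the single-element assignment from the question on `message`.
// The explicit cast has the same effect as the implicit conversion.
inline void assign_like_question(BYTE (&message)[6], WORD value)
{
    message[3] = static_cast<BYTE>(value);
}

// Fills a message with a marker value and checks which bytes change.
inline void truncation_demo()
{
    BYTE message[6] = {0xAA, 0xAA, 0xAA, 0xAA, 0xAA, 0xAA};
    assign_like_question(message, 0x1234);
    assert(message[3] == 0x34);  // only the low byte of 0x1234 is stored
    assert(message[2] == 0xAA);  // the element before is untouched
    assert(message[4] == 0xAA);  // the element after is untouched
}
```

So the assignment in the question compiles, but the high byte of the value is silently discarded.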

score 0 · Answer 1 · edited Jun 19 '18 at 07:09

I would like to set the third and fourth bytes of this message [fn. redundantMessage] to the value of vehicleSpeedToWord.

Little endian or big endian? Assuming unsigned short is exactly 16-bit (i.e. sizeof(unsigned short) == 2 && CHAR_BIT == 8), then:

// little endian
// set the third byte of redundantMessage to  (vehicleSpeedToWord&0xff)
redundantMessage[2] = vehicleSpeedToWord;
// sets the fourth byte of redundantMessage to ((vehicleSpeedToWord&0xff00)>>8)
redundantMessage[3] = vehicleSpeedToWord>>8;

or

// big endian
redundantMessage[2] = vehicleSpeedToWord>>8;
redundantMessage[3] = vehicleSpeedToWord;

If you want to use your host endianness, you need to tell the compiler to assign WORD data:

*reinterpret_cast<WORD*>(&redundantMessage[2]) = vehicleSpeedToWord;

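but this is not really reliable: the cast assumes that &redundantMessage[2] is suitably aligned for a WORD, and accessing an unsigned char array through a WORD lvalue breaks the strict aliasing rules.
Also, short is not guaranteed to be 16-bit, only at least 16-bit, so an implementation is free to make it wider. It is best to use fixed-width integer types: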

#include <cstdint>
typedef uint8_t BYTE;
typedef uint16_t WORD;
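
Putting the little-endian shifts together with the fixed-width types, a minimal sketch could look like the following. The helper name store_u16_le and its offset parameter are hypothetical, and the sketch assumes CHAR_BIT == 8.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: writes `value` into buffer[offset] and
// buffer[offset + 1], low byte first (little-endian), regardless of the
// host's own byte order.
inline void store_u16_le(std::uint8_t* buffer, std::size_t offset, std::uint16_t value)
{
    buffer[offset]     = static_cast<std::uint8_t>(value & 0xFFu);
    buffer[offset + 1] = static_cast<std::uint8_t>((value >> 8) & 0xFFu);
}

// The third and fourth bytes of the message are at indices 2 and 3.
inline void store_u16_le_demo()
{
    std::uint8_t redundantMessage[6] = {};
    store_u16_le(redundantMessage, 2, 0x1234);
    assert(redundantMessage[2] == 0x34);  // low byte first
    assert(redundantMessage[3] == 0x12);  // high byte second
}
```

Because the bytes are extracted arithmetically, the result does not depend on the endianness of the machine running the code.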

Thank you, it is little endian! The Device I will send the message to expects little endian. — Wballer3, Jun 04 '18 at 09:28

Acorn · Answer 2 · 2018-06-04T09:28:41.457

As you proposed, the best way to do it is using std::memcpy(). However, you need to pass the address, not the value; and if you really meant the third and fourth bytes, it should start at 2, rather than 3:

std::memcpy(&redundantDataMessage[2], &vehicleSpeedToWord, sizeof(vehicleSpeedToWord));

Of course, you may do it "manually" by fiddling with the bits, e.g. (assuming CHAR_BIT == 8):

// high byte first, i.e. big-endian order; swap the two indices for little-endian
const BYTE high = vehicleSpeedToWord >> 8;
const BYTE low = vehicleSpeedToWord & static_cast<WORD>(0x00FF);
redundantDataMessage[2] = high;
redundantDataMessage[3] = low;

Do not be concerned about the performance of std::memcpy(): for a small fixed size like this, compilers typically generate the same code as for the manual version.
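
As a rough illustration of the std::memcpy() approach, here is a sketch with two hypothetical helpers, store_u16_host and load_u16_host. Note that std::memcpy() copies the bytes in the host's own order, so the layout in the buffer depends on the machine unless the value was converted beforehand.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: copies the object representation of `value`
// (host byte order) into buffer[offset] and buffer[offset + 1].
inline void store_u16_host(unsigned char* buffer, std::size_t offset, std::uint16_t value)
{
    std::memcpy(buffer + offset, &value, sizeof value);
}

// Hypothetical helper: reads the two bytes back into a uint16_t.
inline std::uint16_t load_u16_host(const unsigned char* buffer, std::size_t offset)
{
    std::uint16_t value;
    std::memcpy(&value, buffer + offset, sizeof value);
    return value;
}

// Round trip through indices 2 and 3 (the third and fourth bytes).
inline void memcpy_demo()
{
    unsigned char message[6] = {};
    store_u16_host(message, 2, 0x1234);
    assert(load_u16_host(message, 2) == 0x1234);
}
```

Unlike the reinterpret_cast variant, this has no alignment or aliasing requirements on the destination.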

Another point discussed in the comments is endianness. If you are dealing with a network protocol, you must use whatever endianness it specifies and convert accordingly. The best approach is to convert your WORD beforehand from your architecture's endianness to the protocol's endianness (the conversion is the identity function if the two match).

Compilers/environments typically define a set of functions to deal with that. If you need portable code, wrap them inside your own function or implement your own; see "How do I convert between big-endian and little-endian values in C++?" for more details.
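
One way to implement your own conversion is sketched below. All three function names are hypothetical, the host check is done at run time purely for simplicity, and only little-endian and big-endian hosts are considered.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Returns true when the least significant byte of a uint16_t is stored first.
inline bool host_is_little_endian()
{
    const std::uint16_t probe = 1;
    unsigned char first_byte;
    std::memcpy(&first_byte, &probe, 1);
    return first_byte == 1;
}

// Swaps the two bytes of a 16-bit value.
inline std::uint16_t byte_swap_u16(std::uint16_t v)
{
    return static_cast<std::uint16_t>((v << 8) | (v >> 8));
}

// Identity on little-endian hosts, byte swap on big-endian hosts.
inline std::uint16_t host_to_little_u16(std::uint16_t v)
{
    return host_is_little_endian() ? v : byte_swap_u16(v);
}

// After conversion, a plain memcpy yields little-endian bytes on the wire.
inline void conversion_demo()
{
    unsigned char wire[2];
    const std::uint16_t converted = host_to_little_u16(0x1234);
    std::memcpy(wire, &converted, sizeof converted);
    assert(wire[0] == 0x34);
    assert(wire[1] == 0x12);
}
```

If the platform already provides such functions, wrapping those is usually preferable to a hand-written swap.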

Richard Hodges · Answer 3 · 2018-06-04T10:37:08.020

You don't say whether you want the data to be stored in little-endian format (e.g. Intel processors) or big-endian (network byte order).

Here's how I would tackle the problem.

I have provided both versions for comparison.

#include <cstdint>
#include <type_traits>
#include <cstddef>
#include <iterator>

struct little_endian {}; // low bytes first
struct big_endian {}; // high bytes first

template<class T>
auto integral_to_bytes(T value, unsigned char* target, little_endian)
-> std::enable_if_t<std::is_unsigned_v<T>>
{
    for(auto count = sizeof(T) ; count-- ; )
    {
        *target++ = static_cast<unsigned char>(value & T(0xff));
        value /= 0x100;
    }

}

template<class T>
auto integral_to_bytes(T value, unsigned char* target, big_endian)
-> std::enable_if_t<std::is_unsigned_v<T>>
{
    auto count = sizeof(T);
    auto first = std::make_reverse_iterator(target + count);

    while(count--)
    {
        *first++ = static_cast<unsigned char>(value & T(0xff));
        value /= 0x100;
    }

}


int main()
{
    extern std::uint16_t get_some_value();
    extern void foo(unsigned char*);

    unsigned char buffer[6];
    std::uint16_t some_value = get_some_value();

    // little-endian: the third and fourth bytes are at indices 2 and 3
    integral_to_bytes(some_value, buffer + 2, little_endian());
    foo(buffer);

    // big-endian
    integral_to_bytes(some_value, buffer + 2, big_endian());
    foo(buffer);

}

You can take a look at the resulting assembler here. You can see that either way, the compiler does a very good job of converting logical intent into very efficient code.

Update: we can improve the style without any cost in emitted code. Modern C++ compilers are amazing:

#include <cstdint>
#include <type_traits>
#include <cstddef>
#include <iterator>

struct little_endian {}; // low bytes first
struct big_endian {}; // high bytes first

template<class T, class Iter> 
void copy_bytes_le(T value, Iter first)
{
    for(auto count = sizeof(T) ; count-- ; )
    {
        *first++ = static_cast<unsigned char>(value & T(0xff));
        value /= 0x100;
    }
}

template<class T, class Iter>
auto integral_to_bytes(T value, Iter target, little_endian)
-> std::enable_if_t<std::is_unsigned_v<T>>
{
    copy_bytes_le(value, target);
}

template<class T, class Iter>
auto integral_to_bytes(T value, Iter target, big_endian)
-> std::enable_if_t<std::is_unsigned_v<T>>
{
    copy_bytes_le(value, 
                  std::make_reverse_iterator(target + sizeof(T)));
}


int main()
{
    extern std::uint16_t get_some_value();
    extern void foo(unsigned char*);

    unsigned char buffer[6];
    std::uint16_t some_value = get_some_value();

    // little-endian: the third and fourth bytes are at indices 2 and 3
    integral_to_bytes(some_value, buffer + 2, little_endian());
    foo(buffer);

    // big-endian
    integral_to_bytes(some_value, buffer + 2, big_endian());
    foo(buffer);
}
