Reading a uint16_t from an array of uint8_t gives the wrong result?

Question

I have an array of bytes:

uint8_t* data = 10101010 01000001 00000000 00010010 00000000 00000010..........



uint8_t get_U8(uint8_t * data, int* offset)
{
    uint8_t tmp = *((uint8_t*)(data + *offset));
    *offset += sizeof(uint8_t);
    return tmp;
}

uint16_t get_U16(uint8_t* data, int* offset)
{
    uint16_t tmp = *((uint16_t*)(data + *offset));
    *offset += sizeof(uint16_t);
    return tmp;
}

offset here is 2.

get_U8(data, 0) = 10101010 = 170  ===> OK

get_U8(data, 1) = 01000001 = 65   ===> OK

get_U8(data, 2) = 00000000 = 0    ===> OK

get_U8(data, 3) = 00010010 = 18   ===> OK

but

get_U16(data, 2) = 4608    ===> NOT OK (should be 18)

4608 = 00010010 00000000

So I understand that the 2 bytes are inverted.

I don't understand why get_U16 is inverting the position of the bytes, and it's not a big endian / little endian issue because here it's the first 8 bits inverted with the 8 second bits.

I just expect get_U16 to take the 16 bits at the given position and return 18 here.

Can anyone tell me what I am doing wrong?

`and it's not big endian / little endian issue because here it's the first 8 bits inverted with the 8 second bits.` It is exactly that, a endian issue. — tkausl, Mar 21 '19 at 12:31
It's a little endian issue _and_ undefined behavior due to strict aliasing violation. [What is the strict aliasing rule?](https://stackoverflow.com/questions/98650/what-is-the-strict-aliasing-rule) — Lundin, Mar 21 '19 at 12:37
In addition, some processors require accesses to be naturally aligned, so a 16-bit access must be aligned to a 16-bit boundary, which isn't guaranteed when making half-word accesses at arbitrary offsets in a byte array. What is the hardware platform being used here? — njuffa, Mar 21 '19 at 12:37

Thomas Jager · Answer 1 · 2019-03-21T12:38:58.703

I don't understand why get_U16 is inverting the position of the bytes, and it's not big endian / little endian issue because here it's the first 8 bits inverted with the 8 second bits.

This is incorrect. Endianness is about the order of the bytes within larger data types. In this case, the least-significant byte seems to be stored at the lowest address, which is what little endian means.

When you read a uint16_t at the address of index 2 (call it p), the memory at p contains the lower byte of the value and the memory at p + 1 contains the upper byte. Your bytes at p and p + 1 are 00000000 and 00010010, so the value read is 00010010 00000000 = 4608.
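To make the point above concrete, here is a minimal sketch that interprets the same two bytes in both byte orders using only shifts, so the result does not depend on the host machine. The function names u16_big_endian and u16_little_endian are hypothetical and are not part of the question's code.

```c
#include <stdint.h>

/* Hypothetical helper: the byte at the lower address is the MOST significant. */
uint16_t u16_big_endian(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/* Hypothetical helper: the byte at the lower address is the LEAST significant. */
uint16_t u16_little_endian(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}
```

With the question's data, the bytes at index 2 and 3 are 0x00 and 0x12. Read as big endian they give 18, and read as little endian they give 4608, which is the value the question reports.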

score 2 · Answer 2 · answered Mar 21 '19 at 12:51

You should handle everything manually.

#include <stdint.h>
#include <string.h>

uint8_t get_U8(uint8_t * data, int* offset)
{
    uint8_t tmp;
    // I think the following should work even on systems where `sizeof(uint8_t) != 1`
    memcpy(&tmp, &((unsigned char*)data)[*offset], sizeof(uint8_t));
    *offset += sizeof(uint8_t);
    return tmp;
}

uint16_t get_U16(uint8_t* data, int* offset)
{
    uint8_t tmp1 = get_U8(data, offset);
    uint8_t tmp2 = get_U8(data, offset);
    // shift the high byte by 8 bits (one byte), not 16
    uint16_t tmp = tmp1 << 8 | tmp2;
    // or tmp = tmp2 << 8 | tmp1; depending on the endianness you want to have
    return tmp;
}

Doing:

uint16_t tmp = *((uint16_t*)(data + *offset));

is bad, very bad for buffers whose offset you advance manually (*offset += shift). It can very easily cause undefined behavior (read as: segmentation fault) if data + *offset is not aligned for uint16_t. Don't do this. You want a uint16_t from two bytes? Read byte by byte and combine the bytes with bit shifts, and only bit shifts.
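As a rough sketch of the alternatives to the pointer cast, assuming the same data/offset calling convention as the question: one reader copies the bytes with memcpy and returns them in the host's native byte order, the other builds the value from individual bytes with shifts and always treats the data as big endian. Both work at odd offsets. The names get_U16_native and get_U16_be are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical name: reads a uint16_t in the host's native byte order.
   memcpy copies byte by byte, so an unaligned offset is not a problem
   and no uint16_t pointer ever aliases the byte buffer. */
uint16_t get_U16_native(const uint8_t *data, int *offset)
{
    uint16_t tmp;
    memcpy(&tmp, data + *offset, sizeof tmp);
    *offset += (int)sizeof tmp;
    return tmp;
}

/* Hypothetical name: reads a uint16_t stored as big endian, whatever
   the byte order of the host is. */
uint16_t get_U16_be(const uint8_t *data, int *offset)
{
    uint16_t hi = data[*offset];      /* byte at the lower address */
    uint16_t lo = data[*offset + 1];  /* byte at the higher address */
    *offset += 2;
    return (uint16_t)((hi << 8) | lo);
}
```

With the question's data and an offset of 2, get_U16_be returns 18 on any host, while get_U16_native returns 4608 on a little-endian host and 18 on a big-endian one. Neither function checks the buffer length, so the caller has to make sure two bytes are available at the offset.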

it's not big endian / little endian issue because here it's the first 8 bits inverted with the 8 second bits.

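In short: that's exactly how endianness works. On a little-endian machine the byte at the lower address is the least significant one, so compared with the order you wrote the bytes in, the first 8 bits are swapped with the second 8 bits.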

Can anyone tell me what I am doing wrong?

You are doing nothing wrong and your code may work as it is.

"You are doing nothing wrong and your code may work as it is." The cast to `uint16_t*` is plenty wrong for multiple reasons: endianness, alignment and pointer aliasing. — Lundin, Mar 21 '19 at 13:01
