I have an array of bytes:

uint8_t* data = 10101010 01000001 00000000 00010010 00000000 00000010..........



uint8_t get_U8(uint8_t * data, int* offset)
{
    uint8_t tmp = *((uint8_t*)(data + *offset));
    *offset += sizeof(uint8_t);
    return tmp;
}

uint16_t get_U16(uint8_t* data, int* offset)
{
    uint16_t tmp = *((uint16_t*)(data + *offset));
    *offset += sizeof(uint16_t);
    return tmp;
}

offset here is 2.

get_U8(data, 0) = 10101010 = 170  ===> OK

get_U8(data, 1) = 01000001 = 65   ===> OK

get_U8(data, 2) = 00000000 = 0    ===> OK

get_U8(data, 3) = 00010010 = 18   ===> OK

but

get_U16(data, 2) = 4608    ===> NOT OK (should be 18)

4608 = 00010010 00000000 

So I understand that the 2 bytes are inverted.

I don't understand why get_U16 is inverting the position of the bytes, and it's not a big endian / little endian issue because here it's the first 8 bits inverted with the 8 second bits.

I am expecting get_U16 to simply take the 16 bits at the given position and return 18 here.

Can anyone tell me what I am doing wrong?

static_cast
iliès
    `and it's not big endian / little endian issue because here it's the first 8 bits inverted with the 8 second bits.` It is exactly that, an endianness issue. – tkausl Mar 21 '19 at 12:31
    It's a little endian issue _and_ undefined behavior due to strict aliasing violation. [What is the strict aliasing rule?](https://stackoverflow.com/questions/98650/what-is-the-strict-aliasing-rule) – Lundin Mar 21 '19 at 12:37
    In addition, some processors require accesses to be naturally aligned, so a 16-bit access must be aligned to a 16-bit boundary, which isn't guaranteed when making half-word accesses at arbitrary offsets in a byte array. What is the hardware platform being used here? – njuffa Mar 21 '19 at 12:37

2 Answers

"I don't understand why get_U16 is inverting the position of the bytes, and it's not big endian / little endian issue because here it's the first 8 bits inverted with the 8 second bits."

This is incorrect. Endianness concerns the order of the bytes within multi-byte data types. In this case, the least-significant byte appears to be stored at the lowest address, i.e. the platform is little-endian.

When you read a uint16_t from the address of index 2 (call it p), the byte at p holds the lower byte of the value and the byte at p + 1 holds the upper byte. With the bytes 0x00 and 0x12 at those positions, the result is 0x12 * 256 + 0x00 = 4608.

Thomas Jager

You should handle everything manually.

#include <stdint.h>
#include <string.h>

uint8_t get_U8(uint8_t * data, int* offset)
{
    uint8_t tmp;
    // Copy one byte out of the buffer; uint8_t, where it exists, always has size 1.
    memcpy(&tmp, &((unsigned char*)data)[*offset], sizeof(uint8_t));
    *offset += sizeof(uint8_t);
    return tmp;
}

uint16_t get_U16(uint8_t* data, int* offset)
{
    uint8_t tmp1 = get_U8(data, offset); // first byte in the buffer
    uint8_t tmp2 = get_U8(data, offset); // second byte in the buffer
    // Big-endian data (first byte is most significant): shift by 8 bits, not 16.
    uint16_t tmp = (uint16_t)((tmp1 << 8) | tmp2);
    // For little-endian data use (uint16_t)((tmp2 << 8) | tmp1) instead.
    return tmp;
}

Doing:

uint16_t tmp = *((uint16_t*)(data + *offset));

is bad, very bad, for buffers whose offset you advance manually (*offset += shift). It can very easily cause undefined behavior (in practice, possibly a crash such as a segmentation fault) if data + *offset is not aligned for uint16_t. Don't do this. You want a uint16_t from two bytes? Read byte by byte and combine the bytes with bit shifts, and only bit shifts.

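"it's not big endian / little endian issue because here it's the first 8 bits inverted with the 8 second bits."

In short: that's exactly how endianness works. On a little-endian machine the first 8 bits in memory are the low byte and the second 8 bits are the high byte, so the two bytes appear swapped compared with the order you wrote them in.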

"Anyone can tell me what I am doing wrong?"

You are doing nothing wrong and your code may work as it is.

KamilCuk
    "You are doing nothing wrong and your code may work as it is." The cast to `uint16_t*` is plenty wrong for multiple reasons: endianness, alignment and pointer aliasing. – Lundin Mar 21 '19 at 13:01