Returning 16-bits given a pointer in C

Question

A chunk is represented by a 64-bit long integer, which is broken into 4 16-bit sections.

I need to return a 16-bit section using the function below.

unsigned short get_16bitsection(unsigned long *start, int index) {
// Fill this in 
}

I get that, but I'm asking about byte order. Is this presuming little-endian or big-endian architecture? — tadman, Jul 09 '20 at 01:37
Why is `start` a pointer? That's very misleading. You could just pass in a value. — tadman, Jul 09 '20 at 01:47
`p = start; p = p + (3 - index);` Your confusion stems from how pointer arithmetic works: incrementing a pointer moves it to the next object of its pointed-to type, not to the next byte. For example, if p points to a 2-byte int at address 0x1200, incrementing it makes it point to 0x1202. — Babajan, Jul 09 '20 at 02:44

score 4 · Answer 1 · answered Jul 09 '20 at 01:59

It is tempting to use casts to achieve this, but it is a common misconception that "everything is just bytes" and thus that you can do that safely. A rule called strict aliasing actually prohibits doing so. Your code may appear to work, particularly on older and less sophisticated compilers, but in the age of heavy optimisations you are really playing with fire by violating the language rules like that.

Instead, you should copy the bytes you need into a uint16_t, then return it:

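#include <stdint.h>  /* uint16_t, uint64_t */
#include <string.h>  /* memcpy */

uint16_t get_16bitsection(uint64_t *start, int index) {
  uint16_t result;
  /* Copy the two bytes that start at byte offset index*2 within the chunk. */
  memcpy(&result, (char*)start + index*sizeof(uint16_t), sizeof(uint16_t));
  return result;
}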

Here I cast to char* so that we can navigate byte-wise through your chunk (this aliasing is a specifically permitted exception to the usual strict-aliasing rule), then apply an offset of index*sizeof(uint16_t) to reach the desired index (assuming little endian, which you have specified). Finally, we copy the bytes into result, and return it.

If you're concerned about performance, don't be. You were already copying a uint16_t from local scope into the calling scope; just now it has a name. And if this function is any slower than the aliasing-violating version, then that's evidence that you've confused the optimiser into going too far.
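The function above assumes a little-endian host. As a minimal sketch, the same memcpy technique could be made to cope with big-endian hosts by checking the byte order at run time. The names `host_is_little_endian` and `get_16bitsection_any` are hypothetical, and the sketch assumes the host is purely little- or big-endian (no mixed byte orders):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* The function from the answer, repeated so the sketch compiles on its own. */
uint16_t get_16bitsection(uint64_t *start, int index) {
  uint16_t result;
  memcpy(&result, (char *)start + index * sizeof(uint16_t), sizeof(uint16_t));
  return result;
}

/* Hypothetical helper: returns 1 if the least significant byte of a
   multi-byte integer is stored first (little-endian), 0 otherwise. */
int host_is_little_endian(void) {
  uint16_t probe = 1;
  unsigned char first;
  memcpy(&first, &probe, 1);  /* look at the first byte in memory */
  return first == 1;
}

/* Hypothetical wrapper: index 0 is always the least significant section.
   Assumes a purely little- or big-endian host. */
uint16_t get_16bitsection_any(uint64_t *start, int index) {
  assert(index >= 0 && index < 4);  /* a chunk holds four sections */
  int slot = host_is_little_endian() ? index : 3 - index;
  return get_16bitsection(start, slot);
}
```

With a chunk holding 0x123456789abcdef0, `get_16bitsection_any(&chunk, 0)` yields 0xdef0 and `get_16bitsection_any(&chunk, 3)` yields 0x1234 under either byte order.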

This is an interesting question where the answers list some bad things that can happen when you violate strict aliasing: https://stackoverflow.com/questions/2958633/gcc-strict-aliasing-and-horror-stories And here's an example of a failure on x86 systems (interestingly enough it involves `unsigned short` aliasing): https://stackoverflow.com/questions/46790550/c-undefined-behavior-strict-aliasing-rule-or-incorrect-alignment/46790815#46790815 So x86 isn't immune to address alignment restrictions either. — Andrew Henle, Jul 09 '20 at 02:14

score 0 · Answer 2 · answered Jul 09 '20 at 02:05

Just use a union.

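#include <stdio.h>

int main(void) {
    /* unsigned long long is at least 64 bits wide; long may be only 32. */
    unsigned long long x = 0x123456789abcdef0ULL;

    union {
        unsigned long long x;
        unsigned short arr[4];
    } c;

    c.x = x;
    printf("%04x %04x %04x %04x\n", c.arr[0], c.arr[1], c.arr[2], c.arr[3]);
    return 0;
}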

Result on a little-endian machine:

def0 9abc 5678 1234

sizzzzlerz


Actually I take it back; it could be explicitly valid since C11. I'm a bit behind on C. Still, this requires modifying the declaration of the source data. – Asteroids With Wings Jul 09 '20 at 02:07
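One way around the objection in the comment above might be to copy the pointed-to value into a local union, which keeps the question's pointer-based signature. This is a minimal sketch that relies on the union reinterpretation the comment describes as valid. `get_16bitsection_union` is a hypothetical name, and the result still depends on the host's byte order:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical wrapper: the caller's data stays a plain uint64_t; only a
   local copy is viewed through the union. */
uint16_t get_16bitsection_union(const uint64_t *start, int index) {
  union {
    uint64_t whole;
    uint16_t parts[4];
  } c;

  assert(index >= 0 && index < 4);  /* a chunk holds four sections */
  c.whole = *start;
  /* parts[] follows memory order, so index 0 is the least significant
     section only on a little-endian host. */
  return c.parts[index];
}
```

For a chunk holding 0x123456789abcdef0, index 0 gives 0xdef0 on a little-endian host and 0x1234 on a big-endian one.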

score 0 · Answer 3 · answered Jul 09 '20 at 13:51

Returning 16-bits given a pointer
A chunk is represented by a 64-bit long integer, which is broken into 4 16-bit sections

To access the data in an endian-independent, portable way and retrieve the sections from index 0 (least significant 16 bits) to index 3 (most significant 16 bits), use >>.

As unsigned long may be only 32 bits wide, I recommend unsigned long long or uint_least64_t.
Consider making the pointer const so that the function can also be used on const data.

unsigned short get_16bitsection(const unsigned long long *start, int index) {
  #define MASK_16BIT 0xFFFFu
  return MASK_16BIT & (*start >> (16*index));
}

The mask is useful on rare machines where unsigned short is not 16 bits wide. In any case, I prefer a mask over a cast: it is a gentler way to reduce the range.
Alternatively, use a cast, (unsigned short) or (uint16_t), though this is slightly less portable, as uint16_t may not exist and unsigned short may be wider than 16 bits.
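As a brief usage sketch, the shift-and-mask function can be called once per index to split a whole chunk. `split_chunk` is a hypothetical helper that is not part of the answer, and the function is repeated here (with the mask written inline) so the block compiles on its own:

```c
#include <assert.h>

/* Shift-and-mask extraction: index 0 is the least significant section. */
unsigned short get_16bitsection(const unsigned long long *start, int index) {
  assert(index >= 0 && index < 4);  /* a chunk holds four sections */
  return (unsigned short)(0xFFFFu & (*start >> (16 * index)));
}

/* Hypothetical helper: fills out[0..3] with the four sections, least
   significant first. */
void split_chunk(const unsigned long long *start, unsigned short out[4]) {
  for (int i = 0; i < 4; i++) {
    out[i] = get_16bitsection(start, i);
  }
}
```

For a chunk holding 0x123456789abcdef0, `split_chunk` fills `out` with 0xdef0, 0x9abc, 0x5678 and 0x1234, whatever the host's byte order, because shifts operate on the value and not on its bytes in memory.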

tadman · Accepted Answer · 2020-07-09T01:52:52.977

Maybe I'm missing the point here but it could be as easy as this:

unsigned short get_16bitsection_be(unsigned long *start, int index) {
  unsigned short *p = (unsigned short*) start;

  return p[3 - index];
}

unsigned short get_16bitsection_le(unsigned long *start, int index) {
  unsigned short *p = (unsigned short*) start;

  return p[index];
}

The only difference between the two is the byte order: on a big-endian machine the sections are stored most significant first, so the index is reversed.
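To see why the two functions index in opposite directions, it can help to look at the chunk's bytes in memory order. This is a minimal sketch; `chunk_bytes` is a hypothetical helper, and it assumes `unsigned long long` is exactly 8 bytes wide:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: copies the chunk's bytes into out[] in the order
   they are stored in memory. Assumes an 8-byte unsigned long long. */
void chunk_bytes(const unsigned long long *chunk, unsigned char out[8]) {
  assert(sizeof *chunk == 8);  /* check the stated assumption */
  memcpy(out, chunk, 8);
}
```

For a chunk holding 0x123456789abcdef0, a little-endian host stores the bytes as f0 de bc 9a 78 56 34 12, so the first 16-bit slot is the least significant section (0xdef0). A big-endian host stores 12 34 56 78 9a bc de f0, so the first slot is the most significant section (0x1234) and the least significant one sits in slot 3.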

Note you should consider using stdint.h to give these types more meaningful names and make it clear what you're actually doing:

uint16_t get_16bitsection_le(uint64_t *start, int index) {
  uint16_t *p = (uint16_t*) start;

  return p[index];
}

uint16_t get_16bitsection_be(uint64_t *start, int index) {
  uint16_t *p = (uint16_t*) start;

  return p[3 - index];
}

You were on the right track with your second approach, but that code is heavily cluttered by a lot of things that don't matter, plus the * 8 offset which makes no sense.

Comments are not for extended discussion; this conversation has been [moved to chat](https://chat.stackoverflow.com/rooms/217508/discussion-on-answer-by-tadman-returning-16-bits-given-a-pointer-in-c). — Samuel Liew, Jul 09 '20 at 04:42
