Here's the code under consideration:

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>

char buffer[512];
int pos;
int posf;
int i;
struct timeval *tv;

int main(int argc, char **argv)
{
    pos = 0;
    for (i = 0; i < 512; i++) buffer[i] = 0;
    for (i = 0; i < 4; i++)
    {
        printf("pos = %d\n", pos);
        *(int *)(buffer + pos + 4) = 0x12345678;
        pos += 9;
    }

    for (i = 0; i < 9 * 4; i++)
    {
        printf(" %02X", (int)(unsigned char)*(buffer + i));
        if ((i % 9) == 8) printf("\n");
    }
    printf("\n");

    // ---  

    pos = 0;
    for (i = 0; i < 512; i++) buffer[i] = 0;
    *(int *)(buffer + 4) = 0x12345678;
    *(int *)(buffer + 9 + 4) = 0x12345678;
    *(int *)(buffer + 18 + 4) = 0x12345678;
    *(int *)(buffer + 27 + 4) = 0x12345678;

    for (i = 0; i < 9 * 4; i++)
    {
        printf(" %02X", (int)(unsigned char)*(buffer + i));
        if ((i % 9) == 8) printf("\n");
    }
    printf("\n");

    return 0;
}

And the output of the code is:

pos = 0
pos = 9
pos = 18
pos = 27
 00 00 00 00 78 56 34 12 00
 00 00 00 78 56 34 12 00 00
 00 00 78 56 34 12 00 00 00
 00 78 56 34 12 00 00 00 00

 00 00 00 00 78 56 34 12 00
 00 00 00 00 78 56 34 12 00
 00 00 00 00 78 56 34 12 00
 00 00 00 00 78 56 34 12 00

I cannot understand why

*(int *)(buffer + pos + 4) = 0x12345678;

ends up being stored at an address aligned to the size of int (4 bytes). I expect the following steps when this statement executes:

  1. The pointer to buffer, which is a char*, is increased by the value of pos (0, 9, 18, 27) and then by 4. The result is a char* pointing to char array index [pos + 4].
  2. The char* in the parentheses is converted to int*, so the resulting pointer addresses a 4-byte integer at base location (buffer + pos + 4), integer array index [0].
  3. The bytes 78 56 34 12 are stored, in this order, at the resulting int* location (little-endian system).

Instead, I see the pointer in the parentheses being aligned to the size of int (4 bytes), whereas direct addressing with constants (see the second piece of code) works as expected.
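
One reading that is consistent with the first output block is that the word store simply drops the two low bits of the address. The sketch below models only that assumption; it is not a statement about how every ARM core behaves. The function name `model_truncated_store32` is made up for illustration, and offsets are taken relative to a buffer that is assumed to start on a 4-byte boundary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: emulate a 32-bit little-endian store whose target
 * offset has its two low bits ignored (rounded down to a multiple of 4).
 * Assumes buf itself starts on a 4-byte boundary. */
static void model_truncated_store32(unsigned char *buf, size_t len,
                                    size_t offset, uint32_t value)
{
    size_t base = offset & ~(size_t)3;      /* drop the two low bits */

    assert(base + 4 <= len);                /* stay inside the buffer */
    buf[base + 0] = (unsigned char)(value & 0xff);
    buf[base + 1] = (unsigned char)((value >> 8) & 0xff);
    buf[base + 2] = (unsigned char)((value >> 16) & 0xff);
    buf[base + 3] = (unsigned char)((value >> 24) & 0xff);
}
```

Under this model, the requested offsets 4, 13, 22 and 31 become 4, 12, 20 and 28, which matches where the bytes 78 56 34 12 appear in the first dump.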

  • target CPU is i.MX287 (ARM9);
  • target operating system is OpenWrt Linux [...] 3.18.29 #431 Fri Feb 11 15:57:31 2022 armv5tejl GNU/Linux;
  • compiled on Linux [...] 4.15.0-142-generic #146~16.04.1-Ubuntu SMP Tue Apr 13 09:27:15 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux, installed in a virtual machine;
  • GCC compiler version is gcc (Ubuntu 5.4.0-6ubuntu1~16.04.12) 5.4.0 20160609;
  • I compile as part of the whole system image build; the flags are CFLAGS = -Os -Wall -Wmissing-declarations -g3.

Update: thanks to Andrew Henle, I now replace

*(int*)(buffer + pos + 4) = 0x12345678;

with

        buffer[pos + 4] = value & 0xff;
        buffer[pos + 5] = (value >> 8) & 0xff;
        buffer[pos + 6] = (value >> 16) & 0xff;
        buffer[pos + 7] = (value >> 24) & 0xff;

and I can't believe I must do this on a 32-bit microprocessor system, whatever its architecture, and that GCC is not able to slice an int into bytes or partial words and perform a read-modify-write (RMW) on those parts.
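
The four byte assignments can be wrapped in a small helper so the call site stays readable. This is a minimal sketch; `store_le32` and `load_le32` are illustrative names, not library functions, and the byte order is fixed to little endian to match the output shown in the question.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store a 32-bit value byte by byte, least significant byte first.
 * Works at any offset because only single-byte accesses are used. */
static void store_le32(unsigned char *dst, uint32_t value)
{
    assert(dst != NULL);
    dst[0] = (unsigned char)(value & 0xff);
    dst[1] = (unsigned char)((value >> 8) & 0xff);
    dst[2] = (unsigned char)((value >> 16) & 0xff);
    dst[3] = (unsigned char)((value >> 24) & 0xff);
}

/* Reassemble the value from four bytes in the same order. */
static uint32_t load_le32(const unsigned char *src)
{
    assert(src != NULL);
    return (uint32_t)src[0]
         | ((uint32_t)src[1] << 8)
         | ((uint32_t)src[2] << 16)
         | ((uint32_t)src[3] << 24);
}
```

With the question's buffer, a call could look like `store_le32((unsigned char *)buffer + pos + 4, value);`. Because the byte order is spelled out, the stored layout does not depend on the endianness of the CPU.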

chqrlie
Anonymous
  • The [strict aliasing](https://stackoverflow.com/questions/98650/what-is-the-strict-aliasing-rule) violations in your code are rampant. Code such as `*(int*)(buffer + 27 + 4) = 0x12345678;` invokes undefined behavior, and given you're running on ARM, you also risk `SIGBUS`, because ARM chips in general don't allow the misaligned accesses that sheltered x86 programmers are usually unaware of, [like this](https://stackoverflow.com/questions/26158510/gcc-float-pointer-casting-in-c-causing-sigbus-error). In short, you simply cannot take an array of `char` and treat any random part of it as an `int`. – Andrew Henle Mar 19 '22 at 16:42
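
To see which of the question's offsets are misaligned for `int`, a check along the following lines may help. It is only a sketch: `is_aligned_for_int` is an illustrative name, it relies on C11 `alignof`, and it assumes that converting a pointer to `uintptr_t` yields the numeric address, which holds on common flat-address platforms but is implementation-defined.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Return nonzero if p is suitably aligned for an int object.
 * Assumption: (uintptr_t)p is the plain numeric address. */
static int is_aligned_for_int(const void *p)
{
    assert(p != NULL);
    return ((uintptr_t)p % alignof(int)) == 0;
}
```

Where `int` has 4-byte alignment and the buffer starts on a 4-byte boundary, `buffer + 4` passes this check while `buffer + 13`, `buffer + 22` and `buffer + 31` do not. Passing the check only rules out the misalignment; it does not address the aliasing concern raised in the comment.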

1 Answer

char* pointer in the brackets is being converted to the int*, causing resulting pointer addressing integer of 4 bytes size at base location (buffer + pos + 4) and integer array index [0]

This invokes undefined behavior (UB) when the alignment requirements of int are not met.

Instead, copy with memcpy(). A good compiler will emit valid, optimized code.

// Needs #include <string.h>; the compound literal requires C99 or later.
// *(int*)(buffer + pos + 4) = 0x12345678;
memcpy(buffer + pos + 4, &(int){0x12345678}, sizeof (int));
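
Applied to the question's layout (9-byte records with an `int` field at offset 4), the same idea might look like the sketch below. The names `write_field`, `read_field`, `RECORD_SIZE` and `FIELD_OFFSET` are made up for illustration; only the `memcpy()` calls come from the answer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { RECORD_SIZE = 9, FIELD_OFFSET = 4 };   /* layout from the question */

/* Copy an int into the field of the given record without forming an int*. */
static void write_field(char *buffer, size_t buffer_len,
                        size_t record, int value)
{
    size_t off = record * RECORD_SIZE + FIELD_OFFSET;

    assert(off + sizeof value <= buffer_len);   /* stay inside the buffer */
    memcpy(buffer + off, &value, sizeof value);
}

/* Copy the field back out into a properly aligned local int. */
static int read_field(const char *buffer, size_t buffer_len, size_t record)
{
    int value;
    size_t off = record * RECORD_SIZE + FIELD_OFFSET;

    assert(off + sizeof value <= buffer_len);
    memcpy(&value, buffer + off, sizeof value);
    return value;
}
```

The bytes land in the native byte order of the CPU, so on a little-endian target each record should show 78 56 34 12 starting at its fifth byte, as in the second dump of the question.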
chux - Reinstate Monica
  • Excellent, thank you. It also solves portability problem when size of `int` may be different, and doing `>>` manually in a loop over the number of bytes in an `int` is a crazy way of programming. – Anonymous Mar 19 '22 at 18:25
  • "It also solves portability problem when size of int may be different" --> not quite. When `int` is 16-bit, initializing with `0x12345678` leads to implementation-defined behavior. Consider instead `memcpy(buffer + pos + 4, &(unsigned){0x12345678}, sizeof (unsigned));` or `memcpy(buffer + pos + 4, &(uint32_t){0x12345678}, sizeof (uint32_t));` depending on your goals. – chux - Reinstate Monica Mar 19 '22 at 18:32
  • Thank you. The fixed value was just an example; in my original, more complex code, I used an `int` variable. – Anonymous Mar 19 '22 at 18:51