
Looking at the code:

#include <stdio.h>
#include <stdint.h>

int main() {

    char foo[512]={};

    printf("%d", *((uint32_t*)foo));

    return 0;
}

I'm having a hard time understanding what *((uint32_t*)foo) does. Using different values in the array, I'm getting all kinds of return values. What exactly does it point to, and what is the return value?

Lundin
Broman3100
  • It means "Pretend that foo is a pointer to a `uint32_t` (even though it isn't); and dereference it." – Brendan Mar 20 '23 at 08:45
  • `foo` is a pointer to the first position of your `char` array. You then convert this `char` pointer into a `uint32_t` pointer using `(uint32_t*)`. After this, you dereference the pointer using `*(expression)`, which fetches the value stored there. However, since you're telling it to treat it as a 4 byte value (`uint32_t`), the returned value is made up of the 4 bytes starting at the first position. – jvieira88 Mar 20 '23 at 08:49
  • `char foo[512]={};` isn't valid C. `*((uint32_t*)foo)` invokes undefined behavior. – Lundin Mar 20 '23 at 09:05
  • One rule of thumb is: if you are a beginner in C, you should pretty much _never_ use the cast operator. There are numerous ways in which it can go wrong and it is rarely ever the correct solution to any beginner (to intermediate) problem. – Lundin Mar 20 '23 at 09:21
  • Please don't "fix" code as pointed out in answers, since that will make those remarks obsolete and confusing for future readers. In case you have follow-up questions etc. it is better to post a new question. – Lundin Mar 20 '23 at 10:17

3 Answers

  • char foo[512]={}; is invalid syntax; empty initializer lists aren't allowed in C before C23. You have to use {0} if you wish to initialize it.
  • (uint32_t*)foo is fishy since a uint32_t* isn't necessarily compatible with char*. Furthermore, the char array might not be aligned. 1) The rule of thumb is that we can cast from any object pointer type to a character pointer type, but not the other way around.
  • *((uint32_t*)foo) invokes undefined behavior, possibly in several ways. foo could be misaligned. And it is also a strict pointer aliasing violation (What is the strict aliasing rule?). The TL;DR is basically that the compiler is free to assume that the char array is never used in the code you posted, since it is free to assume that a char will never get accessed through a uint32_t*.

Ignoring all of the above - which we shouldn't since undefined behavior means anything could happen - then it is likely (but not guaranteed) that the compiler will grab 4 bytes from the char array and reinterpret them as a uint32_t. Assuming char is 8 bits, then it will (probably) do so according to the CPU endianness. That is, if we do char foo[512]={'A','B','C','D'}; and the CPU has little endian format, then the 'A' will end up in the least significant byte of the uint32_t and the 'D' in the most significant byte (see: What is CPU endianness?). So using ASCII, it would become the number 0x44434241.

Please note that %d is the wrong format specifier for printing uint32_t. You should use %u or the most correct form printf("%"PRIu32, ...) from inttypes.h.


1) C17 6.3.2.3 is the relevant rule in the C standard:

A pointer to an object type may be converted to a pointer to a different object type. If the resulting pointer is not correctly aligned for the referenced type, the behavior is undefined. Otherwise, when converted back again, the result shall compare equal to the original pointer.

Lundin
  • Re “`(uint32_t*)foo` is fishy since a `uint32_t*` isn't necessary compatible with `char*`”: Compatibility of those types is not an issue. Compatibility of the types they point to could be. – Eric Postpischil Mar 20 '23 at 09:26
  • @EricPostpischil Splitting hairs... but I added the relevant quote to the answer. – Lundin Mar 20 '23 at 09:33
  • @Lundin good answer - but much too complicated assuming OP's level. It is like explaining octonion algebra to children who can only add and subtract. – 0___________ Mar 20 '23 at 09:47
  • @0___________ As I wrote in a comment to the question, the best approach for beginners is to never use casts and never write code this way to begin with. As soon as you do, you are no longer on beginner territory. I guess what they are mostly after is the link explaining endianness. – Lundin Mar 20 '23 at 09:55
  • It is indeed a bit confusing to understand for me... But I get the point, I'm not using the casting myself, just scrolling through someone else's code and came up to that. That said, The answer was very helpful, thanks. – Broman3100 Mar 20 '23 at 10:14
*((uint32_t*)foo)

In this expression, foo is type-casted to a uint32_t pointer and then dereferenced.

Dereferencing a cast of a variable from one type of pointer to a different type is usually in violation of the strict aliasing rule¹, and

*((uint32_t*)foo)

does exactly that. So the expression invokes undefined behaviour.

Furthermore, foo might not be properly aligned:

From C11:

1 The behavior is undefined in the following circumstances:

....

Conversion between two pointer types produces a result that is incorrectly aligned (6.3.2.3)

With 6.3.2.3p7 saying

[...] If the resulting pointer is not correctly aligned [68] for the referenced type, the behavior is undefined. [...]

Unaligned data is data at an address (pointer value) that is not evenly divisible by its alignment (which is usually its size).

NB that empty initializer lists are not valid till C23, and just because int32_t happens to be int on your compiler/platform doesn't mean that it might not be long on another.

%d is not the correct format specifier for an int32_t. If you don't want to use the specific macros for fixed-width integer types, another approach is to cast to intmax_t / uintmax_t and use %jd and %ju respectively.

Footnote:

¹ See: What is the strict aliasing rule?

Harith
  • When quoting the C standard you shouldn't use Annex J, which is only an informative summary. Instead you should go where Annex J points at, in this case 6.3.2.3, and find the actual normative rule there. In this case C11 6.3.2.3/7. – Lundin Mar 20 '23 at 09:57
  • @Lundin Acknowledged. – Harith Mar 20 '23 at 10:03

It is called "pointer punning". It is used to reinterpret the binary representation of one type as another type. It invokes undefined behaviour (UB) and has to be avoided.

A much better (and safe) way is to use the memcpy function. In many cases, modern optimizing compilers will optimize the memcpy call out.

Example:

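#include <stdint.h>
#include <string.h>

uint32_t charAsuint32(const char *charr)
{
    uint32_t result;

    /* Copy the first sizeof(uint32_t) bytes; no aliasing or alignment issues. */
    memcpy(&result, charr, sizeof(result));
    return result;
}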

And generated code:

charAsuint32:
        mov     eax, DWORD PTR [rdi]
        ret

PS: You use the wrong format specifier to display a uint32_t value, and that is UB too.

Your example:

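#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    char foo[512]={0x11,0x22,0x33,0x44};
    uint32_t u32;

    /* Copy the first 4 bytes of the array into the integer. */
    memcpy(&u32, foo, sizeof(u32));

    printf("%"PRIx32"\n", u32);

    return 0;
}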

https://godbolt.org/z/E1eTc8djj

0___________