Question:
Can I have two pointers of different types (uint32_t * and char *) pointing to the very same address?
Here is why I want to have this:
I want to convert UTF-8 to UTF-32 and vice versa in C.
Let's say I have a variable of type uint32_t that contains one UTF-32 encoded Unicode character, and I already know that it needs 4 bytes when encoded in UTF-8. Its binary representation is this:
00000000000aaabbbbbbccccccdddddd
a, b, c and d are 4 different ranges where each bit can be 0 or 1.
With clever bitwise &, | and << operations I can rearrange these bits so that in the end they are distributed like this:
00000aaa00bbbbbb00cccccc00dddddd
And then I can set some bits (using | again) to get this:
11110aaa10bbbbbb10cccccc10dddddd
When I split this into 4 consecutive char variables in an array, I have this:
11110aaa 10bbbbbb 10cccccc 10dddddd
which is exactly the UTF-8 encoding of the same unicode character.
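To make the bit shuffling concrete, this is roughly what I have in mind. It is only a minimal sketch: the function name spread_utf8_4 and the masks are my own illustration, and it assumes the code point is already known to be in the 4-byte range U+10000..U+10FFFF.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: spreads the 21 payload bits of a code point
 * over four bytes and adds the UTF-8 marker bits.
 * Assumption: cp is in the 4-byte range 0x10000..0x10FFFF. */
static uint32_t spread_utf8_4(uint32_t cp)
{
    assert(cp >= 0x10000u && cp <= 0x10FFFFu);

    uint32_t r = ((cp & 0x1C0000u) << 6)   /* aaa    -> bits 26..24 */
               | ((cp & 0x03F000u) << 4)   /* bbbbbb -> bits 21..16 */
               | ((cp & 0x000FC0u) << 2)   /* cccccc -> bits 13..8  */
               |  (cp & 0x00003Fu);        /* dddddd -> bits 5..0   */

    /* Set the marker bits: 11110000 10000000 10000000 10000000 */
    return r | 0xF0808080u;
}
```

For example, spread_utf8_4(0x1F600) gives 0xF09F9880, which read from the most significant byte down is F0 9F 98 80, the UTF-8 encoding of U+1F600.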
So the very same 4 bytes in memory shall be one single uint32_t variable and at the same time an array of 4 char variables. That is, I want to have this:
uint32_t *utf32;
char utf8[4];
utf32 is a pointer that points to a single uint32_t variable, 4 bytes long.
utf8 is a pointer to an array of 4 char elements, each 1 byte long.
And I want both pointers to point to the very same address, so that I can write a UTF-32 encoded character into the variable utf32, transform it in place, and then read the result from the array utf8. Is this possible? If so, how can I do it?
(I used this technique very often when I was coding in COBOL in the previous millennium, because in COBOL it's easy to overlay the same region of memory with many different definitions. But I don't know how to do it in C.)
I have found a lot of questions dealing with 2 pointers pointing to the same address, but in those questions the pointers always have the same type. Some other questions are about why you get an error if a pointer defined with a certain type points to an address that was defined with another type. But I didn't find anything about two pointers of different types sharing the same address.