This is more of a curiosity than anything else, but I was wondering: how legitimate would this code be for implementing memcpy() in a bare-metal environment?
#define MY_MEMCPY(DST, SRC, SIZE) \
{ struct tmp { char mem[SIZE]; }; *((struct tmp *) ((void *) DST)) = *((struct tmp *) ((void *) SRC)); }
We can then test it using
#include <stdio.h>
#define MY_MEMCPY(DST, SRC, SIZE) \
{ struct tmp { char mem[SIZE]; }; *((struct tmp *) ((void *) DST)) = *((struct tmp *) ((void *) SRC)); }
int main () {
char buffer[100] = "Hello world";
printf("%s\n", buffer);
MY_MEMCPY(buffer, "one", 4)
printf("%s\n", buffer);
MY_MEMCPY(buffer, "two", 4)
printf("%s\n", buffer);
MY_MEMCPY(buffer, "three", 6)
printf("%s\n", buffer);
return 0;
}
which prints
Hello world
one
two
three
From what I understand, it would not violate the strict aliasing rule, since a pointer to a struct is always equal to a pointer to its first member, and in this case the first member is an array of char. See 6.7.2.1p15:
A pointer to a structure object, suitably converted, points to its initial member (or if that member is a bit-field, then to the unit in which it resides), and vice versa.
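To make the quoted paragraph concrete, here is a minimal sketch that checks the address equality it describes. The names struct tmp4 and struct_ptr_equals_first_member are invented for this example, and the check shows only that the two pointers compare equal; it says nothing by itself about which lvalue types may be used to access the object.

```c
#include <stddef.h>

/* Hypothetical wrapper type, same shape as the struct in the macro. */
struct tmp4 { char mem[4]; };

/* Returns 1 when a pointer to the struct and a pointer to its first
   member compare equal after conversion to void *, and the member
   sits at offset 0. */
int struct_ptr_equals_first_member(void)
{
    struct tmp4 s = { "abc" };

    return (void *) &s == (void *) &s.mem[0]
        && offsetof(struct tmp4, mem) == 0;
}
```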
It would also not have alignment problems, since its _Alignof() is 1.
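The alignment claim can be checked on a given target rather than assumed. The sketch below uses a hypothetical struct tmp16 and a helper named tmp16_alignment (both invented here); as far as I know the standard does not promise that a struct containing only a char array has alignment 1, so the value is whatever the target ABI decides.

```c
#include <stddef.h>

/* Hypothetical wrapper type with a fixed size, for illustration only. */
struct tmp16 { char mem[16]; };

/* Compile-time check that no padding was added on this target. */
_Static_assert(sizeof(struct tmp16) == 16, "unexpected padding");

/* Reports the alignment the compiler chose for the wrapper struct. */
size_t tmp16_alignment(void)
{
    return _Alignof(struct tmp16);
}
```

On a target where the alignment matters, the same expression could go into a _Static_assert so the build fails instead of the copy misbehaving.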
EDIT #1
For string literals only, we can create another version of the macro that does not require the length to be passed as an argument. Of course, the destination still needs enough memory to hold the new string.
Here is the modified version:
/*
**WARNING** This macro works only when `SRC` is **a literal string** or
in all other cases where its size can be calculated using `sizeof()`
*/
#define MY_LITERAL_MEMCPY(DST, SRC) \
{ struct tmp { char mem[sizeof(SRC)]; }; *((struct tmp *) ((void *) DST)) = *((struct tmp *) ((void *) SRC)); }
We can test it using
#include <stdio.h>
/*
**WARNING** This macro works only when `SRC` is **a literal string** or
in all other cases where its size can be calculated using `sizeof()`
*/
#define MY_LITERAL_MEMCPY(DST, SRC) \
{ struct tmp { char mem[sizeof(SRC)]; }; *((struct tmp *) ((void *) DST)) = *((struct tmp *) ((void *) SRC)); }
int main () {
char buffer[100] = "Hello world";
printf("%s\n", buffer);
MY_LITERAL_MEMCPY(buffer, "one")
printf("%s\n", buffer);
MY_LITERAL_MEMCPY(buffer, "two")
printf("%s\n", buffer);
MY_LITERAL_MEMCPY(buffer, "three")
printf("%s\n", buffer);
return 0;
}
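The warning in the comment is worth spelling out, because sizeof() gives three different answers depending on what SRC is. This is a small sketch with invented function names (literal_size, pointer_size, array_size) showing the cases:

```c
#include <stddef.h>

/* A string literal is an array, so sizeof counts every character
   plus the terminating '\0': 6 for "three". */
size_t literal_size(void)
{
    return sizeof("three");
}

/* A pointer is not an array: sizeof yields the size of the pointer
   itself, so the macro would copy the wrong number of bytes. */
size_t pointer_size(void)
{
    const char *p = "three";
    return sizeof(p);
}

/* A named array yields its full declared size (32 here), not the
   length of the string stored in it. */
size_t array_size(void)
{
    char a[32] = "three";
    return sizeof(a);
}
```

So passing a char * as SRC to MY_LITERAL_MEMCPY() would compile without complaint and copy only sizeof(char *) bytes.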
EDIT #2
In case you are worried about any possible padding added by a hypothetical alien compiler, adding a _Static_assert() will make the macro very safe:
MY_MEMCPY():
#define MY_MEMCPY(DST, SRC, SIZE) \
{ struct tmp { char mem[SIZE]; }; _Static_assert(sizeof(struct tmp) \
== SIZE, "You have a very stupid compiler"); \
*((struct tmp *) ((void *) DST)) = *((struct tmp *) ((void *) SRC)); }
MY_LITERAL_MEMCPY():
/*
**WARNING** This macro works only when `SRC` is **a literal string** or
in all other cases where its size can be calculated using `sizeof()`
*/
#define MY_LITERAL_MEMCPY(DST, SRC) \
{ struct tmp { char mem[sizeof(SRC)]; }; _Static_assert(sizeof(struct tmp) \
== sizeof(SRC), "You have a very stupid compiler"); \
*((struct tmp *) ((void *) DST)) = *((struct tmp *) ((void *) SRC)); }
EDIT #3
Discussion about the legitimacy of the code
If it is legal to cast any memory location to a char *, then we can map each single byte of a non-char type to a different char * variable:
some_non_char_type test;
char * one = (char *) &test;
char * two = (char *) &test + 1;
char * three = (char *) &test + 2;
...
char * last = (char *) &test + sizeof(test) - 1;
If the code above is legal, it is also legal to collectively map all the bytes above to a single char array, since we are mapping adjacent bytes:
char (* all_of_them)[sizeof(some_non_char_type)] = (char (*)[sizeof(some_non_char_type)]) &test;
In this case we would access them as (*all_of_them)[0], (*all_of_them)[1], (*all_of_them)[2], etc.
If it is legal to map a collection of adjacent bytes to a char array, then it is legal to cast such an array to a single-member aggregate type, provided that the compiler does not add padding to the latter:
struct tmp {
char mem[sizeof(some_non_char_type)];
};
_Static_assert(sizeof(struct tmp) == sizeof(some_non_char_type),
"You have a very stupid compiler");
struct tmp * wrap = (struct tmp *) &test;
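For comparison, the first step of the argument (touching an object one byte at a time through a character pointer) is enough on its own to write a copy routine. Below is a minimal sketch that relies on that step only, without the struct wrapper. The names copy_bytes and roundtrip_u32 are invented for this example, and this is the conventional byte loop, not the macro technique from the question.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: copies n bytes one at a time through
   unsigned char pointers. The regions must not overlap. */
void *copy_bytes(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    while (n--)
        *d++ = *s++;
    return dst;
}

/* Copies a uint32_t into a byte buffer and back again.
   Returns 1 when the value survives the round trip. */
int roundtrip_u32(uint32_t value)
{
    unsigned char bytes[sizeof value];
    uint32_t back = 0;

    copy_bytes(bytes, &value, sizeof value);
    copy_bytes(&back, bytes, sizeof back);
    return back == value;
}
```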
EDIT #4
This is a reply to Nate Eldredge's answer, as it seems that with optimizations enabled the compiler can make wrong assumptions. It is enough to explicitly tell the compiler about our aliasing by adding a simple *((char *) DST) = 0 before copying the data to DST. Here are the new versions of the macros, which will also work with optimizations enabled:
MY_MEMCPY():
#define MY_MEMCPY(DST, SRC, SIZE) \
{ struct tmp { char mem[SIZE]; }; _Static_assert(sizeof(struct tmp) \
== SIZE, "You have a very stupid compiler"); *((char *) DST) = 0; \
*((struct tmp *) ((void *) DST)) = *((struct tmp *) ((void *) SRC)); }
MY_LITERAL_MEMCPY():
/*
**WARNING** This macro works only when `SRC` is **a literal string** or
in all other cases where its size can be calculated using `sizeof()`
*/
#define MY_LITERAL_MEMCPY(DST, SRC) \
{ struct tmp { char mem[sizeof(SRC)]; }; _Static_assert(sizeof(struct tmp) \
== sizeof(SRC), "You have a very stupid compiler"); *((char *) DST) = 0; \
*((struct tmp *) ((void *) DST)) = *((struct tmp *) ((void *) SRC)); }