
I have a statically allocated array of chars. Can I reuse this array to store objects of different types without violating the strict aliasing rule? I don't understand strict aliasing very well, but here's an example of code that does what I want to do:

#include <stdio.h>

static char memory_pool[256 * 1024];

struct m1
{
    int f1;
    int f2;
};

struct m2
{
    long f1;
    long f2;
};

struct m3
{
    float f1;
    float f2;
    float f3;
};

int main()
{
    void *at;
    struct m1 *m1;
    struct m2 *m2;
    struct m3 *m3;

    at = &memory_pool[0];
    
    m1 = (struct m1 *)at;
    m1->f1 = 10;
    m1->f2 = 20;

    printf("m1->f1 = %d, m1->f2 = %d;\n", m1->f1, m1->f2);

    m2 = (struct m2 *)at;
    m2->f1 = 30L;
    m2->f2 = 40L;

    printf("m2->f1 = %ld, m2->f2 = %ld;\n", m2->f1, m2->f2);

    m3 = (struct m3 *)at;
    m3->f1 = 5.0;
    m3->f2 = 6.0;
    m3->f3 = 7.0;

    printf("m3->f1 = %f, m3->f2 = %f, m3->f3 = %f;\n", m3->f1, m3->f2, m3->f3);

    return 0;
}

I've compiled this code using gcc with -Wstrict-aliasing=3 -fstrict-aliasing, and it works as intended:

m1->f1 = 10, m1->f2 = 20;
m2->f1 = 30, m2->f2 = 40;
m3->f1 = 5.000000, m3->f2 = 6.000000, m3->f3 = 7.000000;

Is that code safe? Assume memory_pool is always large enough.

curiousguy
valentinsp
  • 1
    Why don't you just use a union? – Bill Lynch May 15 '21 at 05:53
  • I'm not an expert nor language-lawyer, but I believe this is UB because aliasing concerns multiple names _of the same type_ being used to refer to the same memory location - as you're using different types that makes this more like a `union` - in which case you should use `union` specifically. – Dai May 15 '21 at 05:53
  • 1
    @BillLynch Is there any difference between doing this and using a `union`? – valentinsp May 15 '21 at 06:03
  • [C11 Standard - §6.5 Expressions (p7)](http://port70.net/~nsz/c/c11/n1570.html#6.5p7) (last bullet) – David C. Rankin May 15 '21 at 08:24
  • The rule is weird, but no. According to https://port70.net/~nsz/c/c11/n1570.html#6.5p6, the effective type of a declared object is its declared type (char), and that prevents you from accessing that object through a different type (https://port70.net/~nsz/c/c11/n1570.html#6.5p7). If you had somehow allocated the array so it had no declared type, then your code would be fine, because each write through a struct mX pointer imprints the struct mX type into the object as its new effective type. – Petr Skocik May 15 '21 at 08:47
  • https://stackoverflow.com/questions/38510557/can-a-char-array-be-used-with-any-data-type – Petr Skocik May 15 '21 at 08:52
  • @PSkocik - it is generally read the opposite of that. Since a `char` type is an effective type for all pointer types, there is no strict aliasing violation using it for access of another type. (but I agree, the wording is clear as *Mud*...) – David C. Rankin May 15 '21 at 08:58
  • @PSkocik It's weird. If you can't do that, then how would one ever be able to write a custom allocator, for example? – valentinsp May 15 '21 at 09:03
  • @DavidC.Rankin The current rules are that you can read an object of declared type X_type through a character pointer (https://port70.net/~nsz/c/c11/n1570.html#6.5p7), but not an object of type char through an X_type pointer. It's been discussed on this site numerous times before. – Petr Skocik May 15 '21 at 09:14
  • @valentinsp That's what's weird about it. You officially can't unless you use a function/builtin, such as `malloc`, `calloc`, `mmap`, or `alloca`, that returns a pointer to an object with *no declared type* and then slice *that* object up. This is somewhat unfortunate because except for `alloca` (which isn't standard and can't be always used) that means an extra syscall that could have been avoided if you could just slice up static char arrays. – Petr Skocik May 15 '21 at 09:21
  • 2
    @valentinsp Practically, though, you can if you use some translation unit isolation, most compilers will have no way of knowing whether `void *myalloc(void);` or some `extern char *myblock;` will return a pointer with a declared type or not, so they'll have to conservatively treat accesses through it as accesses to memory with no declared type (even if the type is declared in some inaccessible translation unit). – Petr Skocik May 15 '21 at 09:28
  • 1
    @PSkocik -- yes, that is the exact section I provided the link for... `"An object shall have its stored value accessed only by an lvalue expression that has one of the following types: --- a character type"`. Yes, there are many discussions on this and all end in a qualified uncertainty. Here you have a block of memory for some object that is stored block of type `char`. It can be accessed through either the object type or type `char`. Getting it into the block of char will take a `memcpy()` and that is discussed in (P6). There is also the caveat on whether the access will modify the object. – David C. Rankin May 15 '21 at 17:29
  • @PSkocik: The set of constructs for which the Standard does or does not mandate support will seem much less weird if one considers that the authors saw no need to forbid compilers from doing things that would have been recognized as silly, or render them unsuitable for many purposes. – supercat May 19 '21 at 21:58

2 Answers


Is it possible to use a character array as a memory pool without violating strict aliasing?

No. The rule in C 2018 6.5 7 says an object defined as array of char may be accessed as:

  1. a type compatible with array of char,
  2. a qualified version of a type compatible with array of char,
  3. a type that is the signed or unsigned type corresponding to array of char,
  4. a type that is the signed or unsigned type corresponding to a qualified version of array of char,
  5. an aggregate or union type that includes array of char among its members, or
  6. a character type.

3 and 4 are not possible for array of char; they apply only when the original type is an integer type. In your various examples with structures, the structures are not types compatible with array of char (nor are their members), ruling out 1 and 2. They do not include array of char among their members, ruling out 5. They are not character types, ruling out 6.
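The rule is asymmetric: item 6 lets any object be inspected through a character type, but nothing in the list lets an array of char be accessed through a structure type. As a minimal sketch of the permitted direction only, the helpers `byte_at` and `all_bytes_zero` below are hypothetical names introduced for illustration and are not part of the question's code:

```c
#include <assert.h>
#include <stddef.h>

struct m1
{
    int f1;
    int f2;
};

/* Hypothetical helper: returns byte i of the object representation of *p.
   Reading an object through unsigned char falls under item 6
   (a character type), so this access is permitted. */
static unsigned char byte_at(const struct m1 *p, size_t i)
{
    const unsigned char *bytes = (const unsigned char *)p;
    assert(i < sizeof *p);
    return bytes[i];
}

/* Hypothetical helper: returns 1 if every byte of *p is zero, else 0. */
static int all_bytes_zero(const struct m1 *p)
{
    for (size_t i = 0; i < sizeof *p; i++)
        if (byte_at(p, i) != 0)
            return 0;
    return 1;
}
```

The question's code goes the other way (the declared object is a char array and the lvalues are structures), which is the direction the list does not cover.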

I've compiled this code using gcc with -Wstrict-aliasing=3 -fstrict-aliasing, and it works as intended:

The sample output shows that the code produced desired output in one test. This is not equivalent to showing it works as intended.

Is that code safe?

No. The code can be made safe in certain situations. First, declare the pool with appropriate alignment, such as static _Alignas(max_align_t) char memory_pool[256 * 1024];. (max_align_t is defined in <stddef.h>.) That makes the pointer conversions partially defined.
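Separately from the alignment fix, one way to keep every access to the pool within 6.5 7 is to never form a structure lvalue on the pool at all and instead copy objects in and out with memcpy, as one of the comments on the question suggests. This is a swapped-in technique, not something this answer proposes, and the helpers `pool_store` and `pool_load` are hypothetical names used only for this sketch:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct m2
{
    long f1;
    long f2;
};

/* Pool declared with the strictest fundamental alignment (C11). */
static _Alignas(max_align_t) char memory_pool[256 * 1024];

/* Hypothetical helper: copy size bytes from src into the pool at offset.
   memcpy accesses the pool as characters, so the pool is never
   accessed through an lvalue of structure type. */
static void pool_store(size_t offset, const void *src, size_t size)
{
    assert(offset <= sizeof memory_pool);
    assert(size <= sizeof memory_pool - offset);
    memcpy(memory_pool + offset, src, size);
}

/* Hypothetical helper: copy size bytes from the pool at offset into dst. */
static void pool_load(void *dst, size_t offset, size_t size)
{
    assert(offset <= sizeof memory_pool);
    assert(size <= sizeof memory_pool - offset);
    memcpy(dst, memory_pool + offset, size);
}
```

A caller would keep an ordinary `struct m2` variable, call `pool_store(0, &value, sizeof value)` to save it, and later `pool_load(&value, 0, sizeof value)` to read it back. The cost is a copy per access, which compilers can often optimize away for small fixed sizes, though that is not guaranteed.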

Second, if you are using GCC or Clang and request -fno-strict-aliasing, the compiler provides an extension to the C language that relaxes C 2018 6.5 7. Alternatively, in some cases, it may be possible to deduce from knowledge of the compiler and linker design that your program will work even if 6.5 7 is violated: If the program is compiled in separate translation units, and the object modules contain no type information or no fancy link-time optimization is used, and no aliasing violation occurs in the translation unit that implements the memory pool, then there cannot be adverse consequences from violating 6.5 7 because no way exists for the C implementation to distinguish code that violates 6.5 7 in regard to the memory pool from code that does not. Additionally, you must know that the pointer conversions work as desired, that they effectively produce pointers to the same addresses (rather than merely intermediate data that can be converted back to the original pointer value but not directly used as a pointer to the same memory).

The deduction that there are no adverse consequences is fragile and should be used with care. For example, it is easy to accidentally violate 6.5 7 in the translation unit implementing the memory pool, as by storing a pointer in a freed memory block or by storing size information in a hidden header preceding an allocated block.

Eric Postpischil
  • It's too bad that the authors of the Standard failed to make clear that they made no effort to systematically ensure that the Standard defined behavior in every situation where the Committee not only expected compilers to work a certain way, *but couldn't imagine them doing otherwise*, and that in situations where existing compilers would process a construct consistently, quality implementations should be expected to follow precedent absent a documented and compelling reason to deviate from it. – supercat May 19 '21 at 22:17

The Standard deliberately refrains from requiring that all implementations be suitable for low-level programming, but allows implementations intended for low-level programming to extend the language to support such usage by specifying their behaviors in more cases than mandated by the Standard. Even when using compilers designed for low-level programming, however, using a character array as a memory pool is generally not a good idea. For compatibility with the widest range of compilers and platforms, one should declare memory-pool objects as either an array of the type with the widest alignment, or a union containing a character array along with the type having the widest alignment, e.g.

 static uint64_t my_memory_pool_allocation[(MY_MEMORY_POOL_SIZE+7)/8];
 void *my_memory_pool_start = my_memory_pool_allocation;

or

 union
 {
   unsigned char bytes[MY_MEMORY_POOL_SIZE];
   double alignment_force;
 } my_memory_pool_allocation;
 void *my_memory_pool_start = my_memory_pool_allocation.bytes;
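As a rough sketch of how the union form might be carved into allocations, the following bump allocator hands out aligned slices of the byte array. The names `union my_pool_storage`, `my_memory_pool_used`, and `my_pool_alloc`, and the pool size of 1024, are assumptions made for illustration. Whether later accesses through the returned pointer are covered by the Standard is the very point in dispute on this page, so the sketch assumes a compiler that behaves as this answer describes (for example gcc or clang with -fno-strict-aliasing):

```c
#include <assert.h>
#include <stddef.h>

#define MY_MEMORY_POOL_SIZE 1024 /* assumed size, for illustration only */

union my_pool_storage
{
    unsigned char bytes[MY_MEMORY_POOL_SIZE];
    double alignment_force;
};

static union my_pool_storage my_memory_pool_allocation;
static size_t my_memory_pool_used = 0; /* hypothetical bookkeeping */

/* Hypothetical bump allocator: returns size bytes whose offset in the pool
   is a multiple of align, or NULL if the pool is exhausted. align must be
   a power of two no larger than the alignment of the union itself. */
static void *my_pool_alloc(size_t size, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    assert(align <= _Alignof(union my_pool_storage));

    /* Round the current offset up to the next multiple of align. */
    size_t start = (my_memory_pool_used + align - 1) & ~(align - 1);
    if (start > MY_MEMORY_POOL_SIZE || size > MY_MEMORY_POOL_SIZE - start)
        return NULL;

    my_memory_pool_used = start + size;
    return my_memory_pool_allocation.bytes + start;
}
```

Because the union is at least as strictly aligned as double, rounding the offset is enough to align the returned address for any type whose alignment does not exceed that; types with stricter requirements would need a different alignment-forcing member.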

Note that clang and gcc may be configured to extend the language in a manner suitable for low-level programming by using the -fno-strict-aliasing flag, and commercial compilers can often support low-level concepts like memory pools even when using type-based aliasing, since they recognize pointer-type conversions as barriers to likely-erroneous type-based aliasing assumptions.

If a void* is initialized to the address of a static object whose symbol is used in no other context, I don't think any commonplace compiler is going to care about the type that was used for the initialization. Jumping through the hoops to follow the Standard here is a fool's errand. When not using -fno-strict-aliasing, neither clang nor gcc will handle all of the corner cases mandated by the Standard, and with -fno-strict-aliasing, they'll extend the semantics of the language to allow memory pools to be used conveniently whether the Standard requires them to or not.

supercat