Do FreeRTOS heap implementations violate C aliasing rules?

Question

Looking at the code for heap 1 in FreeRTOS...

#if ( configAPPLICATION_ALLOCATED_HEAP == 1 )

/* The application writer has already defined the array used for the RTOS
* heap - probably so it can be placed in a special segment or address. */
    extern uint8_t ucHeap[ configTOTAL_HEAP_SIZE ];
#else
    static uint8_t ucHeap[ configTOTAL_HEAP_SIZE ];
#endif /* configAPPLICATION_ALLOCATED_HEAP */

...we see that a heap is just an array of uint8_t objects.

But then, in its void* pvPortMalloc(size_t xWantedSize) function, it defines a uint8_t* called pucAlignedHeap, and a size_t called xNextFreeByte.

Our return value pvReturn is then defined in this block...

 /* Check there is enough room left for the allocation and. */
        if( ( xWantedSize > 0 ) &&                                /* valid size */
            ( ( xNextFreeByte + xWantedSize ) < configADJUSTED_HEAP_SIZE ) &&
            ( ( xNextFreeByte + xWantedSize ) > xNextFreeByte ) ) /* Check for overflow. */
        {
            /* Return the next free byte then increment the index past this
             * block. */
            pvReturn = pucAlignedHeap + xNextFreeByte;
            xNextFreeByte += xWantedSize;
        }

...and is then expected to be used by the programmer to store whatever data they want:

//Some example:
my_struct* x = pvPortMalloc(sizeof(my_struct));

But since the underlying data type is an array of uint8_t, doesn't that mean that any real usage of the heap violates C's aliasing requirements?

And if that's true, then why are they allowed to violate these requirements without worrying about UB? FreeRTOS is hardly a small hobby project, so they must know what they're doing, and yet it surely looks like this is UB. Why can they do this, but I can't? They do not appear to have -fno-strict-aliasing defined, so I don't think it's that.

I believe there's an exception in the aliasing rules for `char`, and it's reasonable for that to extend to `uint8_t`. I'm not a language lawyer so I can't produce a proper answer. — Mark Ransom, May 07 '23 at 03:56
Does https://stackoverflow.com/questions/38515179/is-it-possible-to-write-a-conformant-implementation-of-malloc-in-c answer your question? — KamilCuk, May 07 '23 at 04:09
OT: What type would you expect a `malloc` implementation to use? There is no such thing as `static void ucHeap[ configTOTAL_HEAP_SIZE ];` and use of `unsigned char` would solve nothing... — Support Ukraine, May 07 '23 at 05:08
@SupportUkraine, to clarify: I expect there to be no "standards compliant" way for a memory pool to be allocated. I was just confirming my suspicions. — Idunnoanymore, May 15 '23 at 05:42

score 1 · Accepted Answer · answered May 08 '23 at 18:44

Because there are many tasks that would never require the ability to recycle storage to hold multiple unrelated kinds of objects within its lifetime, the C Standard does not require that all implementations support such recycling. The Standard allows implementations to extend the language by supporting usage patterns beyond those mandated, and any implementation which is suitable for tasks that would require recycling storage within its lifetime will necessarily extend the language in that fashion. The Standard waives jurisdiction over such things, however.

In the language processed by clang and gcc when not using -fno-strict-aliasing, once any storage has been written via an lvalue of non-character type, that storage will have that type as its Effective Type "for that access and for subsequent accesses that do not modify the stored value". Because that phrase doesn't say "all subsequent accesses until the stored value is modified using some other type", there is no way to usefully change the Effective Type of storage once it has been written. The storage may be written using other types, but after storage is written using two or more incompatible non-character types, any attempt to read it using any non-character type would be incompatible with at least one of the Effective Types written to the storage, and thus invoke UB.

Thus, all code that repurposes storage for use as different types within its lifetime will violate the aliasing constraints given in the Standard unless it limits itself to using character-type reads. Programmers shouldn't jump through hoops to satisfy such constraints, however, because implementations that are suitable for tasks that would require reuse of storage will support such tasks regardless of whether or not the Standard would require that they do so. Unfortunately, the Standard offers no guidance as to what constructs should be supported, treating such support as a Quality of Implementation issue.
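As an illustration of the "character-type" route mentioned above, here is a minimal sketch that reuses one byte pool for two unrelated types while only ever touching the pool through memcpy, which accesses storage as bytes. The names pool, pool_roundtrip_u32 and pool_roundtrip_float are hypothetical and are not part of FreeRTOS; this is not how FreeRTOS users normally access heap memory, only one way to stay inside the rules as written.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical byte pool standing in for a heap such as ucHeap. */
static unsigned char pool[16];

/* Store a uint32_t into the pool and read it back. Only memcpy touches
 * the pool, so the storage is accessed as bytes, never as uint32_t. */
uint32_t pool_roundtrip_u32(uint32_t value)
{
    uint32_t out;
    memcpy(pool, &value, sizeof value);
    memcpy(&out, pool, sizeof out);
    return out;
}

/* Reuse the same bytes for a float afterwards, again via memcpy only. */
float pool_roundtrip_float(float value)
{
    float out;
    memcpy(pool, &value, sizeof value);
    memcpy(&out, pool, sizeof out);
    return out;
}
```

Calling pool_roundtrip_u32 and then pool_roundtrip_float repurposes the same storage without ever reading it through an lvalue of an incompatible non-character type. Optimizing compilers commonly turn small fixed-size memcpy calls like these into plain loads and stores, so the cost is often nil, but that is an implementation detail rather than a guarantee.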

Support Ukraine · Answer 2 · 2023-05-07T05:01:36.377

Why can they do this, but I can't?

You can... but it comes with a price.

The C standard defines a number of behavior rules that any standard compliant implementation must adhere to.

Further, the C standard leaves a number of things to the implementation. This is called implementation-defined behavior. Quoted from draft N1570 of the 2011 C standard:

3.4.1 implementation-defined behavior: unspecified behavior where each implementation documents how the choice is made

On top of that the standard has the concept of undefined behavior.

3.4.3 undefined behavior: behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this International Standard imposes no requirements

For undefined behavior a note says:

Possible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message)

Now if you are willing to write C code that can be used only on specific implementations/platforms/environments, you can write non-standard-compliant code that works fine as long as the targeted implementation defines the behavior. No problem. And there is lots of code out there doing that.

The price is that your code can't be used on just any implementation; it is only guaranteed to work on those that define the behavior you rely on.

BTW:

The specific code mentioned in the question uses uint8_t. By doing that, the code is limited to implementations that support uint8_t. And as written in "7.20.1.1 Exact-width integer types" of N1570, the C standard doesn't require all implementations to provide that type.
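To make the uint8_t point concrete, here is a small sketch of how portable code can detect whether the type exists. It relies on the fact that <stdint.h> defines the UINT8_MAX macro only when uint8_t itself is provided. The names heap_byte_t, EXAMPLE_HEAP_SIZE, example_heap and example_heap_size are hypothetical; FreeRTOS itself simply uses uint8_t.

```c
#include <stddef.h>
#include <stdint.h>

/* UINT8_MAX is defined only if the implementation provides uint8_t. */
#ifdef UINT8_MAX
typedef uint8_t heap_byte_t;
#else
/* Fallback for implementations without an exact 8-bit type. */
typedef unsigned char heap_byte_t;
#endif

/* Hypothetical heap size, chosen only for this example. */
#define EXAMPLE_HEAP_SIZE 64u

static heap_byte_t example_heap[EXAMPLE_HEAP_SIZE];

/* Report the pool size in bytes; sizeof(heap_byte_t) is 1 either way. */
size_t example_heap_size(void)
{
    return sizeof example_heap;
}
```

Either branch yields a type of size 1, so the array occupies EXAMPLE_HEAP_SIZE bytes on any conforming implementation. The difference is only that the fallback still compiles where uint8_t does not exist.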

score 0 · Answer 3 · answered May 07 '23 at 11:19

When the allocation routines are compiled in a separate translation unit without link-time optimization, then the compiler has no information about what object types they return pointers to when it is compiling other translation units.

When the compiler is compiling a translation unit that uses the memory allocation routines, it is possible the translation unit will later be linked with a translation unit that conforms to the C aliasing rules. In particular, it could be linked with an object module that returns pointers to dynamically allocated memory, which initially has no effective type. Therefore, the compiler must produce an object module for the current translation unit that will work correctly if it is linked with such a module.

The effect of the aliasing rules in the C standard is to allow the compiler to optimize some code that receives pointers to objects of different types. For example, given a routine void foo(int *p, float *q), the compiler could assume that p and q point to different memory, and therefore it can commute operations on p[i] and q[j]. When the memory allocation routines are in a separate translation unit, such situations never arise with regard to the addresses it returns, so there is no effect from the aliasing rules.
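A minimal sketch of the kind of routine described above (the name store_both is hypothetical). Since int and float are incompatible types, the compiler is allowed to assume that the two pointers refer to different memory:

```c
/* Because *p and *q have incompatible types, the compiler may assume the
 * store through q cannot change *p. */
int store_both(int *p, float *q)
{
    *p = 1;
    *q = 2.0f;
    /* The compiler is permitted to return the constant 1 here instead of
     * reloading *p from memory. */
    return *p;
}
```

When the two pointers really do refer to distinct objects, as in `int i; float f; store_both(&i, &f);`, the result is 1 whether or not the optimization is applied. Only a caller that passes overlapping storage through both pointers could observe a difference, and that is the case the aliasing rules leave undefined.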

Clang and gcc have logic that assumes that if some location written as type T1, some stuff happens that they can't track but definitely doesn't involve T1, the storage is observed as holding some bit pattern as type T2, more stuff they can't track happens that may or may not involve T1, and an object of type T1 whose bit pattern matches the observed T2 is written to the storage, and the storage is read as type T1, there is no way the storage can hold anything other than the first value that was written as type T1. I doubt compilation-unit boundaries are as robust an obstacle to TBAA as... — supercat, May 08 '23 at 20:07
...one might expect them to be. They would block most opportunities a compiler would have to employ TBAA usefully (as well as most useless and breaking ones), but I would expect clang and gcc to aggressively try to find ways of working around the normal limitations imposed by compilation-unit boundaries. — supercat, May 08 '23 at 20:09
