Is cache invalidation promised in this implementation

Question

Consider the following code:

    volatile uint32_t word;
    for (i=0; i<10; i++)
    {
        word = *(uint32_t *)(ADDRESS_IN_MEMORY);
        printf("%"PRIu32, word);
        some_function_compiled_in_other_object();  /* this function may or may not change the memory content at address ADDRESS_IN_MEMORY */
    }

So, since word is volatile, we know that word = *(uint32_t *)(ADDRESS_IN_MEMORY) will indeed be executed 10 times. But is there any promise regarding the system cache here? I would expect the compiled code to invalidate ADDRESS_IN_MEMORY before/after each read from this address, so that word is loaded with the value from system memory and not from the cache. Is that promised?

Does the answer depend on whether or not the compiler knows that some_function_compiled_in_other_object changes the value at memory address ADDRESS_IN_MEMORY?

Possible duplicate of [C volatile variables and Cache Memory](https://stackoverflow.com/questions/7872175/c-volatile-variables-and-cache-memory) — , Dec 13 '18 at 11:51
Did you leave `volatile` out of `*(uint32_t volatile*)(ADDRESS_IN_MEMORY)` intentionally? — user694733, Dec 13 '18 at 12:01
@user2162550 If `ADDRESS_IN_MEMORY` is a hardware register, then yes it does matter. But the register itself should be volatile qualified too, hopefully. And then it doesn't matter. The cast is very fishy though. — Lundin, Dec 13 '18 at 12:05

score 5 · Answer 1 · answered Dec 13 '18 at 12:05

So, since word is volatile, we know that word = *(uint32_t *)(ADDRESS_IN_MEMORY) will indeed be executed 10 times.

No.

Assume the CPU has some registers (and only allows values to be transferred to/from registers, not directly from one place in memory to another), and the compiled code actually does something more like this:

    for (i=0; i<10; i++)
    {
        CPU_register_1 = *(uint32_t *)(ADDRESS_IN_MEMORY);
        word = CPU_register_1;
        /* ... rest of the loop body ... */
    }

Now let's also assume that the compiler optimises the code. It knows that *(uint32_t *)(ADDRESS_IN_MEMORY) is NOT volatile, so it might convert it into something like this:

    CPU_register_1 = *(uint32_t *)(ADDRESS_IN_MEMORY);  /* loaded once, before the loop */
    for (i=0; i<10; i++)
    {
        word = CPU_register_1;
        /* ... rest of the loop body ... */
    }

score 3 · Accepted Answer · answered Dec 13 '18 at 12:16

The C standard knows nothing of cache memories. They are an application-specific detail outside the scope of the C language.

The volatile keyword is only concerned with optimizations performed by the compiler. The compiler needs to ensure that operations on volatile-qualified variables are sequenced in a certain order and not optimized away.

When reading a hardware register, you must always use volatile or otherwise the compiler can assume that the contents of the register are never changed since previous use.

So if ADDRESS_IN_MEMORY in your example is a number corresponding to an address, you have a bug there, since you read it as *(uint32_t *)(ADDRESS_IN_MEMORY); without a volatile qualifier on the pointed-to type. This bug isn't in the slightest related to cache memory.
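A minimal sketch of what the corrected read might look like, assuming the location is accessed through a volatile-qualified lvalue. The names `fake_register`, `read_word`, `change_memory` and `sum_reads` are made up for illustration, and an ordinary variable stands in for ADDRESS_IN_MEMORY so that the code can run on a hosted system. Note that this only constrains the compiler; it makes no promise about the cache.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the memory location. On real hardware this would be a
   fixed address such as ADDRESS_IN_MEMORY (assumption for illustration). */
static uint32_t fake_register = 0;

/* Hypothetical helper: read a 32-bit value through a volatile-qualified
   lvalue, so the compiler must emit a real load on every call. */
static uint32_t read_word(uintptr_t address)
{
    assert(address != 0);
    return *(volatile uint32_t *)address;
}

/* Hypothetical stand-in for some_function_compiled_in_other_object(). */
static void change_memory(void)
{
    fake_register += 1;
}

/* Reads the location `count` times, changing it between reads, and
   returns the sum of the values that were observed. */
static uint32_t sum_reads(unsigned count)
{
    uint32_t sum = 0;
    for (unsigned i = 0; i < count; i++) {
        sum += read_word((uintptr_t)&fake_register);
        change_memory();
    }
    return sum;
}
```

With the location starting at 0, ten reads should observe the values 0 through 9, since each read is performed again after the memory has been changed.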

Cache memory handling is handled by the CPU/branch prediction, not by the compiler nor the C language. And so you cannot affect it directly from application code, unless you access the MMU registers where you can specify the behavior. It is of course very system-specific. A sound system setup will not load memory-mapped hardware register access into data cache.

You can however write cache-friendly code, by accessing memory consecutively, always reading the next adjacent address from top to bottom, without any branches that can change access order.
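As a rough illustration of consecutive access, here is a minimal sketch that sums a matrix stored as a flat row-major array in two ways. The function names are hypothetical. Both functions return the same result; the first walks adjacent addresses, while the second jumps `cols` elements on every step. Whether the difference is measurable depends entirely on the system and on the matrix size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Cache-friendly: the inner loop visits adjacent addresses, so each
   fetched cache line is likely to be used completely before moving on. */
static uint64_t sum_row_major(const uint32_t *m, size_t rows, size_t cols)
{
    assert(m != NULL);
    uint64_t sum = 0;
    for (size_t r = 0; r < rows; r++)
        for (size_t c = 0; c < cols; c++)
            sum += m[r * cols + c];
    return sum;
}

/* Less cache-friendly: the inner loop strides `cols` elements at a time,
   so consecutive accesses may land in different cache lines. */
static uint64_t sum_column_major(const uint32_t *m, size_t rows, size_t cols)
{
    assert(m != NULL);
    uint64_t sum = 0;
    for (size_t c = 0; c < cols; c++)
        for (size_t r = 0; r < rows; r++)
            sum += m[r * cols + c];
    return sum;
}
```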

“Cache memory handling is handled by the CPU, not by the compiler” is not true in general. There are compilers that participate in managing cache, including compilers with automatic vectorization that do strip mining. — Eric Postpischil, Dec 13 '18 at 12:24
@EricPostpischil: They may optimize around assumptions about cache behavior or emit instructions to influence it (like prefetch) but they don't *control* the cache logic. — R.. GitHub STOP HELPING ICE, Dec 13 '18 at 15:39
@R..: There are compilers that issue cache commands, such as instructions to store data from cache to memory or to invalidate a cache line (which forces the next load for that line to come from memory). These are as definite actions as arithmetic instructions are, not mere hints or requests, and it makes no more sense to say such a compiler does not control cache than it does to say such a compiler does not control arithmetic because it merely emits instructions to influence the processor to do arithmetic. — Eric Postpischil, Dec 13 '18 at 16:08
