If you need to read the 64-bit value while the program is running, you probably can't do this safely without a mutex, as others have said. On the off chance that you only need to read the value after all of the threads have finished, you can implement it with an array of two 32-bit atomic variables.
Since your system can only guarantee atomicity of this type on 4-byte memory regions, you should use those instead to maximize performance, for instance:
#include <stdio.h>
#include <stdint.h>
#include <threads.h>
#include <stdatomic.h>
_Atomic uint32_t system_tick_counter_us[2];
Then increment one of those two 4-byte atomic variables whenever you want to increment the 8-byte one, check whether it overflowed, and if it did, atomically increment the other. Keep in mind that atomic_fetch_add_explicit returns the value of the atomic variable before it was incremented, so it's important to check for the value that will cause the overflow (UINT32_MAX), not zero.
/* The old value UINT32_MAX means this increment wrapped the low word. */
if (atomic_fetch_add_explicit(&system_tick_counter_us[0], 1, memory_order_relaxed) == UINT32_MAX)
    atomic_fetch_add_explicit(&system_tick_counter_us[1], 1, memory_order_relaxed);
However, as I mentioned, this can cause a race condition if the 64-bit value is constructed between system_tick_counter_us[0] overflowing and that same thread incrementing system_tick_counter_us[1]: the reader would pair a low word that has already wrapped around with a high word that has not yet been incremented. If you can find a way to guarantee that all threads are done executing the two lines above, then this is a safe solution.
The 64-bit value can be constructed as ((uint64_t)system_tick_counter_us[1] << 32) | (uint64_t)system_tick_counter_us[0] once you're sure the memory is no longer being modified.
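
To make the carry logic and the final reconstruction concrete, here is a minimal sketch that puts the pieces together. The helper names tick_increment, tick_read_quiescent and tick_demo_across_wrap are hypothetical (they are not part of any library), and the demo is deliberately single-threaded: it only shows that the carry into the high word works, not that concurrent reads are safe.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Low word at index 0, high word at index 1, as described above. */
static _Atomic uint32_t system_tick_counter_us[2];

/* Hypothetical helper: add one tick, carrying into the high word on wrap. */
static void tick_increment(void)
{
    /* fetch_add returns the value *before* the add, so the low word
       wrapped exactly when that old value was UINT32_MAX. */
    if (atomic_fetch_add_explicit(&system_tick_counter_us[0], 1,
                                  memory_order_relaxed) == UINT32_MAX)
        atomic_fetch_add_explicit(&system_tick_counter_us[1], 1,
                                  memory_order_relaxed);
}

/* Hypothetical helper: combine both words into one 64-bit value.
   Assumption: no thread can still be inside tick_increment(). */
static uint64_t tick_read_quiescent(void)
{
    uint64_t hi = atomic_load_explicit(&system_tick_counter_us[1],
                                       memory_order_relaxed);
    uint64_t lo = atomic_load_explicit(&system_tick_counter_us[0],
                                       memory_order_relaxed);
    return (hi << 32) | lo;
}

/* Single-threaded demo: start two ticks below the wrap point of the
   low word, add five ticks, and return the combined value. */
static uint64_t tick_demo_across_wrap(void)
{
    atomic_store_explicit(&system_tick_counter_us[0], UINT32_MAX - 1,
                          memory_order_relaxed);
    atomic_store_explicit(&system_tick_counter_us[1], 0,
                          memory_order_relaxed);
    for (int i = 0; i < 5; i++)
        tick_increment();
    return tick_read_quiescent();
}
```

Starting from 0xFFFFFFFE and adding five ticks, tick_demo_across_wrap() should return 0x100000003: the low word wraps once and the high word picks up the carry. In a real program you would call tick_read_quiescent() only after joining every thread that calls tick_increment(); joining should also give the reader a consistent view of both words, which is why relaxed loads are assumed to be enough here.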