I've been trying to wrap my head around this ordering model and how it's useful. This answer states:
no loads that are dependent on the newly loaded value can be reordered wrt. the atomic load. I.e. if they are after the atomic load in the source code, they will happen after the atomic load too.
However, this makes no sense to me, since it would be impossible to perform the later loads without the value read by the first atomic load. This post provides a good overview of the consistency models, and it states that the ARM and Power architectures have a weak ordering model, except that they enforce data-dependency ordering.
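To make the case concrete, here is a minimal C++ sketch of the pattern I have in mind: a pointer published with a release store and read with a consume load, followed by a load whose address depends on that pointer. The names `Payload`, `g_ptr`, `producer` and `consumer` are hypothetical and only there for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical payload type, used only for illustration.
struct Payload {
    int value;
};

// Shared pointer through which the payload is published.
std::atomic<Payload*> g_ptr{nullptr};

// Writer side: fill in the payload, then publish the pointer.
void producer(Payload* p) {
    p->value = 42;                              // plain store to the payload
    g_ptr.store(p, std::memory_order_release);  // publish the pointer
}

// Reader side: wait for the pointer, then read through it.
int consumer() {
    Payload* p;
    // Spin until the pointer has been published.
    while ((p = g_ptr.load(std::memory_order_consume)) == nullptr) {
        std::this_thread::yield();
    }
    // The address of this load is computed from p, so it carries a
    // dependency from the consume load above.
    return p->value;
}
```

The intent is that `producer` and `consumer` run on different threads. My understanding is that the dependent load `p->value` is the only thing consume is supposed to order here, which is exactly the load that seems impossible to hoist above the pointer load anyway.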
Does this mean that memory_order_consume is useless on every major architecture except Alpha?
If not, then given the example below, which instructions can be reordered, and where can they move to?
ldr r1, [r0]
ldr r7, [r0]
str r2, [r0]
ldr r3, [r1]
str r7, [r3]
ldr r4, [r3]