Is unaligned access in Cortex-M4 atomic?

Question

In the ARM documentation, it mentions that

The Cortex-M4 processor supports ARMv7 unaligned accesses, and performs all accesses as single, unaligned accesses. They are converted into two or more aligned accesses by the DCode and System bus interfaces.

It's not clear to me if this means the data access is atomic to the programmer or not. Then I found a StackOverflow comment interpreting the documentation as:

Actually some ARM processors like the Cortex-M3 support unaligned access in HW, so even an unaligned read/write is atomic. The access may span multiple bus cycles to memory, but there is no opportunity for another instruction to jump in between, so it is atomic to the programmer.

However, I looked around some more and found a claim that contradicts the previous one:

Another one is the fact that on cores beginning ARMv6 and later, in order for the hardware to “fix-up” an unaligned access, it splits it up into multiple smaller, byte loads. However, these are not atomic!.

So, who do I believe? For some context, I have setters/getters for each element in a packed struct in my project. In other words, some struct elements may be unaligned. I was wondering if accessing the struct elements will always be guaranteed to be atomic on Cortex-M4. If it's not, I am thinking I will have to enable/disable interrupts manually or add in some mutex, but I'd rather not if ARM Cortex M4 can just guarantee the data accesses to be atomic.

The exception behavior for these multiple access instructions means they are not suitable for use for writes to memory for the purpose of software synchronization. — old_timer, Nov 30 '19 at 04:38
The above is a quote from the ARM docs. First off, you need to know what instructions are being used, and from that read the ARM documentation for your core (ARMv7-M for the Cortex-M4). — old_timer, Nov 30 '19 at 04:40
Are you short on memory? Why are you using packed structs? I suspect it is to use structs across compile domains, which is just bad news in general, unpredictable, etc. Best to enable trapping of unaligned accesses, and fix any code that performs such an access. Then you don't have to worry about it (unless the code generates instructions that can be restarted). — old_timer, Nov 30 '19 at 04:42
If your instruction generates a write for example to address 0x1002 and the bus is 32 bits wide (assume 32 or 64), it does not turn this into two instructions but it needs to generate a transaction at address 0x1000 with two byte lanes enabled and another at 0x1004 with two byte lanes enabled. Likewise for a read of 0x1002 a read of 0x1000 is done and a read of 0x1004 is done of which two bytes from each are kept and fed to the processor core to save in the register. there is a lot of language related to this in the documentation. the quote above is probably the one you should run with. — old_timer, Nov 30 '19 at 04:45
how are you going to implement a mutex? be very careful with ldrex/strex as your mcu is uniprocessor. I would do lots of experiments first and confirm that ldrex/strex catches a modification (by an interrupt). — old_timer, Nov 30 '19 at 04:50
so so so much easier to just avoid the unaligned accesses, all of your problems go away. — old_timer, Nov 30 '19 at 04:50
It's not atomic. Any other memory observer can see two or more accesses being generated. However, if the processor guarantees precise exceptions (which Cortex-M3 can do AFAIK, though it's not configured that way by default), then the access will probably appear atomic with regard to interrupt handlers on the same processor. — EOF, Nov 30 '19 at 13:30
Quote from ARMv7-M reference manual "In ARMv7-M, the single-copy atomic processor accesses are: • All byte accesses. • All halfword accesses to halfword-aligned locations. • All word accesses to word-aligned locations " — R S, Nov 30 '19 at 14:24
Maybe you also want to read https://stackoverflow.com/questions/24010989/arm-single-copy-atomicity — Vroomfondel, Dec 02 '19 at 10:08
@old_timer I want to copy an incoming CAN message from hardware registers into an 8-bytes long packed struct using memcpy. The motivation is that it would be easier to read the content of the CAN message using struct members. However, I think I am going to refactor the code such that I can copy the CAN message from hardware registers into a struct that isn't packed to avoid this mess with unaligned access. — Ken Lin, Dec 05 '19 at 21:59
using structs across compile domains like that is problematic. just copy it to a buffer and use the data. — old_timer, Dec 06 '19 at 11:59
Apart from interrupts on your CPU, please keep in mind concurrent access by DMA and maybe other peripherals (and, of course, other CPUs) that may have access to the same memory. — HelpingHand, Apr 07 '20 at 14:55
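
For reference, the suggestion in the comments (copy the CAN payload into a plain buffer, then pull the fields out into a struct that is not packed) could look roughly like the sketch below. The field layout, the little-endian byte order and the names `can_status` and `can_status_decode` are all made up for illustration; the real message layout is not given in the question.

```c
#include <stdint.h>

/* Hypothetical message layout, NOT packed: the compiler is free to
   insert padding so every member is naturally aligned. */
struct can_status {
    uint8_t  mode;      /* payload byte 0 */
    uint16_t rpm;       /* payload bytes 1..2, assumed little-endian */
    uint32_t odometer;  /* payload bytes 3..6, assumed little-endian */
    uint8_t  counter;   /* payload byte 7 */
};

/* Assemble multi-byte values from single bytes, so no unaligned
   halfword or word access is ever issued. */
static uint16_t get_le16(const uint8_t *p)
{
    return (uint16_t)((uint16_t)p[0] | ((uint16_t)p[1] << 8));
}

static uint32_t get_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Decode an 8-byte payload (already copied out of the CAN registers)
   into the aligned struct. */
void can_status_decode(const uint8_t payload[8], struct can_status *out)
{
    out->mode     = payload[0];
    out->rpm      = get_le16(&payload[1]);
    out->odometer = get_le32(&payload[3]);
    out->counter  = payload[7];
}
```

Every member of the decoded struct is then naturally aligned, so each getter/setter touches a byte, an aligned halfword or an aligned word.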

score 3 · Answer 1 · edited Oct 28 '22 at 13:37

Nope, it isn't.

See section A3.5.3 of the ARMv7-M Architecture Reference Manual:

In ARMv7-M, the single-copy atomic processor accesses are:

All byte accesses.

All halfword accesses to halfword-aligned locations.

All word accesses to word-aligned locations

So, if you are copying a uint32_t that isn't aligned to a 32-bit boundary (which is allowed in ARMv7-M), the copy isn't atomic.
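
The rule quoted above is simple enough to write down as a predicate. This is only a sketch of the three listed cases; the function name `is_single_copy_atomic` is mine, and anything not on the list (including 64-bit accesses) is conservatively treated as not atomic.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true if a plain access of `size` bytes at `addr` is one of the
   single-copy atomic cases listed for ARMv7-M: any byte access, a
   halfword-aligned halfword access, or a word-aligned word access. */
bool is_single_copy_atomic(uintptr_t addr, unsigned size)
{
    switch (size) {
    case 1:  return true;               /* all byte accesses */
    case 2:  return (addr % 2u) == 0u;  /* halfword must be 2-byte aligned */
    case 4:  return (addr % 4u) == 0u;  /* word must be 4-byte aligned */
    default: return false;              /* not in the quoted list */
    }
}
```

For example, a 4-byte member that ends up at offset 2 of a packed struct fails the check even if the struct itself is word-aligned.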

Also quoting:

When an access is not single-copy atomic, it is executed as a sequence of smaller accesses, each of which is single-copy atomic, at least at the byte level.
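
To make "a sequence of smaller accesses" concrete, here is a small host-side model of the byte-lane picture from the comments (a 4-byte write to 0x1002 on a 32-bit bus). It is a simplified illustration, not a description of what the Cortex-M4 bus interface actually emits; the real core may choose different transfer sizes. The names `bus_xfer` and `split_access` are invented for the sketch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one aligned transfer on a 32-bit bus. */
struct bus_xfer {
    uint32_t word_addr;  /* word-aligned address of the transfer */
    uint8_t  lanes;      /* bit n set => byte at word_addr + n is active */
};

/* Split an access of `size` bytes (1..4) starting at `addr` into
   word-aligned transfers. Returns how many entries of `out` were filled
   (1 if the access fits in one word, 2 if it straddles a word boundary).
   Address wrap-around at the top of memory is not handled. */
size_t split_access(uint32_t addr, uint32_t size, struct bus_xfer out[2])
{
    size_t n = 0;
    uint32_t end = addr + size;              /* one past the last byte */

    while (addr < end && n < 2) {
        uint32_t base  = addr & ~3u;         /* containing aligned word */
        uint32_t first = addr - base;        /* first active lane */
        uint32_t last  = (end - base < 4u) ? (end - base) : 4u; /* exclusive */
        uint8_t  lanes = 0;

        for (uint32_t b = first; b < last; b++) {
            lanes |= (uint8_t)(1u << b);
        }
        out[n].word_addr = base;
        out[n].lanes     = lanes;
        n++;
        addr = base + 4u;                    /* continue in the next word */
    }
    return n;
}
```

With `addr = 0x1002` and `size = 4` the model produces two transfers, one at 0x1000 with the upper two lanes active and one at 0x1004 with the lower two. Another bus master such as DMA can observe memory between those two transfers, which is why the access as a whole is not single-copy atomic.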
