Your "rule", as you call it, is a bit detailed and specific, but it applies to pretty much any processor that uses one stack for everything.
As a general rule you should move your stack pointer first to "allocate" the stack space you want; that is how you keep the next thing (an interrupt, a function call) from trashing it. Then move it back to de-allocate.
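On ARM, for example, the allocate/use/de-allocate pattern looks something like this (a hand-written sketch, not taken from the code in question; offsets and register choices are made up):

```asm
    sub  sp, sp, #16      @ allocate first: nothing below sp is safe,
                          @ so claim the space before storing into it
    str  r0, [sp, #0]     @ now the data lives in allocated space,
    str  r1, [sp, #4]     @ an exception handler cannot clobber it
    ldr  r0, [sp, #0]     @ use it as needed
    add  sp, sp, #16      @ de-allocate on the way out
```

Doing the `sub` after the `str`s would leave a window where the data sits below the stack pointer, unprotected.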
In the case of ARM, likely the case for the code you linked: the ARM has banked registers, covered in one of the first chapters of the Architectural Reference Manual (you need to read that before analyzing or writing assembly language, especially the one picture showing the registers). As that picture describes, the exception modes each have their own stack pointer. So when, for example, an interrupt occurs, a different stack pointer is used to save state, and your data won't be clobbered.
User and System share a stack pointer, but that is so that kernel code, etc., can have access without getting trapped in user mode. System mode is not used for exceptions, so your code isn't going to suddenly switch states and clobber the stack.
Now, ARM is like any other brand, Ford for example: they make big trucks, little trucks, SUVs, small cars, grandpa cars, etc. ARM likewise has a wide array of processor cores. The Cortex-M is suited for microcontrollers and other small, tight spaces. It has one stack, and when an exception occurs the hardware saves state on that stack for you, clobbering anything you left below the stack pointer. So the code you pointed out would be bad there (granted, why would you be using printf on a Cortex-M?).
Compilers can be configured to use or not use a second stack register; the x86 world is used to this idea (sp plus bp as a frame pointer), but it is not required. For a (data) stack to be useful there needs to be a stack pointer and instructions for referencing into the used part of the stack: stack-pointer-relative addressing. On some platforms you can read the stack pointer and make a copy in another register to access the stack frame, leaving the stack pointer free to roam about.

With or without a frame pointer, it is an incredibly bad idea to touch the stack pointer in inline assembly. You need to know your toolchain well, and code like that requires constant maintenance: with every new release of the compiler, or every new system you compile the code on, you have to hand-examine the produced output to ensure your manipulation is still safe. If you are going to that level of effort, why use inline asm and burn all those man-hours (job security?) when you could use real assembly and make something safe and reliable the first time?

If you just want some more data for that function, make a local variable; it changes the subtraction on the sp, done, no inline assembly required. If you have a desire to look past the end of the stack, use assembly, not inline assembly. If you want to modify past the stack pointer, or quickly allocate for some reason without using local variables, then again use assembly, and move the stack pointer first on systems where you have to, to avoid corruption of the data you are playing with.
Other than crashing the system, it doesn't make much sense to mess with the stack pointer in inline assembly. That has nothing to do with ARM or x86 or fill-in-the-blank.
What they have done there is write the entire function in assembly using inline assembly. That may just be a consequence of their build system choices: you can feed assembly into the GNU C compiler and produce an object just like you can with C (and if you are using inline assembly you have to write compiler-specific code anyway, so you already know which compiler you are using). The point is that there are other, less ugly ways they could have done it; unfortunately it is not an uncommon sight. If running on a not-Cortex-M, that code is safe-ish as is, but you can't add a function call in the middle of it or you will trash your data; they move the stack pointer just before the call rather than up front like a normal solution would. You would have to track down the author to ask the "why did they do that" question.
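By way of illustration, the same idea as a standalone .S file fed to the same gcc driver, with the stack pointer moved up front the normal way (file name, symbol names, and the `helper` call are all made up for the sketch):

```asm
@ scratch.S -- assemble with the same driver used for the C objects:
@   arm-none-eabi-gcc -c scratch.S -o scratch.o
    .text
    .global do_work
do_work:
    push {r4, lr}          @ prologue: save what we will clobber
    sub  sp, sp, #16       @ allocate scratch space up front
    str  r0, [sp, #0]      @ the argument now lives in owned stack space
    bl   helper            @ a call here is safe: sp has already moved
    ldr  r0, [sp, #0]
    add  sp, sp, #16       @ de-allocate
    pop  {r4, pc}          @ epilogue: restore and return
```

Because the allocation happens in the prologue, you can add or reorder calls in the body without re-auditing the stack discipline each time.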