
The 64-bit Windows ABI defines a generalized exception handling mechanism, which I believe is shared across C++ exceptions and structured exceptions available even in other languages such as C.

If I'm writing an x86-64 assembly routine to be assembled with nasm and linked into a C or C++ library, what accommodations do I need to make on Windows in terms of generating unwind info and so on?

I'm not planning on generating any exceptions directly in the assembly code, although I suppose it is possible that the code may get an access violation if a user-supplied buffer is invalid, etc.

I'd like to write the minimum possible to get this to work, especially since it seems that nasm has poor support for generating unwind info and using MASM is not an option for this cross-platform project. I do need to use (hence save and restore) non-volatile registers.

BeeOnRope

2 Answers


As a general rule, Windows x64 requires all functions to provide unwind information. The only exception is for leaf functions which do not modify rsp and do not modify any nonvolatile registers.

Raymond Chen
  • What is the consequence of not providing unwind information, for example, in a leaf function which modifies non-volatile registers and rsp? I assume it will simply crash on an exception rather than try to unwind, or are there other consequences such as stack clobber issues with interrupt or asynchronous task handling? – BeeOnRope Jul 26 '17 at 17:26
  • The system assumes that the absence of unwind information means that the function is a leaf which does not modify `rsp` or any nonvolatile registers. It will attempt to unwind the exception by restoring `rip` to the value from the top of the stack, and not restoring callee-save registers. The results are now undefined because you are operating with garbage data. – Raymond Chen Jul 26 '17 at 17:30
  • Well I expect the results are not totally arbitrary since for example the OS still must prevent the process from escaping its security sandbox, messing up other processes and presumably still cleans up the process resources on a crash. Or are you suggesting that something worse than that could happen on Windows? Perhaps a [better question](https://stackoverflow.com/q/45335145/149138) is what happens when a fault occurs during stack unwinding. – BeeOnRope Jul 26 '17 at 18:47
  • It's undefined but still constrained by the process security boundary. Maybe the value popped into `rip` happens to land in the middle of a valid (but unrelated) function, at which point the stack unwinder tries to call the except handler of a function that doesn't even have an activation record. At this point, you are executing effectively random code, and the behavior is unpredictable. If you're lucky, you crash immediately. If you're unlucky, the program manages to limp along in a corrupted state. If you're very unlucky, it installs malware. – Raymond Chen Jul 26 '17 at 20:47
  • Thanks Raymond, it makes sense. I guess I'm also [wondering](https://stackoverflow.com/q/45335145/149138) what happens if this pointer-chase-to-nowhere causes a GP as it probably will in most cases. Outside of the debugger will it be handled "as well" as a normal uncaught access violation, or be one of those silent crashes where the app disappears without the usual 0xC0000005 dialog or whatever. – BeeOnRope Jul 26 '17 at 20:53
  • If you don't have unwind information then stack walking will fail. This means that ETW profiling of your process will not reliably get call stacks, which makes it harder to profile and therefore harder to optimize your software. Your profiling will show functions that consume CPU time but it won't show you where they are called from, which needlessly complicates your task. So, if you don't care about exceptions or performance then maybe you can get away without unwind information. If you don't care about profiling then why are you writing assembly language? – Bruce Dawson Dec 04 '17 at 19:19
  • @BruceDawson - I care about performance, which is exactly why I'm writing assembly language. I do my profiling in Linux which has a variety of tools comparable to ETW, and I expect assembly routines to execute with the same overall performance on the same hardware on Windows or Linux (modulo some minor issues such as THP). That said, for profiling assembly, you often don't need full stacks, if you know your time is being spent in the assembly routine you can concentrate on sampling of that method specifically without regard to the exact call stack. – BeeOnRope Dec 10 '17 at 20:16
  • @BeeOnRope - Sure, sometimes optimizing is just about making a routine run faster. But sometimes it is about calling it less often. Not having call stacks seems risky - you may end up without the information you need. There is essentially no benefit to skipping the metadata. – Bruce Dawson Dec 12 '17 at 04:03
  • @BruceDawson - the benefit in my case is less work. I'd like to support Windows and the majority of my application is in C and C++ which works fine, but I have some "kernels" written in assembly and the assembler has limited support for generating this metadata. If it was easy I wouldn't care, and I agree stacks are great. – BeeOnRope Dec 12 '17 at 05:17
  • @BeeOnRope You can limit your work to the function boundary. On entry, create a standard frame, save all the nonvolatile registers (with appropriate unwind codes), and then go crazy. When finished going crazy, restore the registers and return. If an exception needs to be unwound during the crazy part, the unwind codes will recover the nonvolatile registers from wherever you saved them, and the stack walk will unwind from the proper return address, and nobody gets hurt. – Raymond Chen Dec 12 '17 at 06:21
  • @raymond - yup, that's what I do already since it's just easy. The real problem is that `nasm` doesn't directly support generation of the `.pdata` so you are left scratching out the unwind info by hand. – BeeOnRope Dec 12 '17 at 19:28
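
For the hand-written unwind info mentioned in the comments above, here is a minimal sketch in C of how the bytes of a version 1 `UNWIND_INFO` record could be computed before pasting them into the assembly source as data. The type `prolog_step`, the function `encode_unwind_info` and the `REG_*` names are hypothetical and exist only for this illustration; the opcode and register numbers are the ones given in the x64 ABI documentation. It assumes a prologue with no frame register and no exception handler, and it handles only operations that occupy a single slot (`UWOP_PUSH_NONVOL` and `UWOP_ALLOC_SMALL`).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Unwind operation codes and register numbers from the x64 ABI docs.
   The REG_* spellings are made up for this sketch. */
enum { UWOP_PUSH_NONVOL = 0, UWOP_ALLOC_SMALL = 2 };
enum { REG_RBX = 3, REG_RSI = 6 };

/* One prologue instruction (hypothetical helper type). */
typedef struct {
    uint8_t end_offset; /* offset of the NEXT instruction from function start */
    uint8_t op;         /* UWOP_* code, 4 bits */
    uint8_t info;       /* register number, or (bytes / 8) - 1 for ALLOC_SMALL */
} prolog_step;

/* Encode a version 1 UNWIND_INFO with Flags = 0 and no frame register.
   `steps` is given in prologue order; the record stores them reversed.
   Returns the number of bytes written, or 0 if `out` is too small. */
size_t encode_unwind_info(const prolog_step *steps, uint8_t count,
                          uint8_t prolog_size, uint8_t *out, size_t cap)
{
    size_t slots = (count + 1u) & ~(size_t)1; /* code array padded to even length */
    size_t need = 4 + 2 * slots;
    if (cap < need)
        return 0;

    memset(out, 0, need);
    out[0] = 1;           /* Version in low 3 bits, Flags in high 5 bits */
    out[1] = prolog_size; /* SizeOfProlog */
    out[2] = count;       /* CountOfCodes */
    out[3] = 0;           /* FrameRegister and FrameOffset: unused */

    size_t pos = 4;
    for (int i = count - 1; i >= 0; --i) {
        assert(steps[i].op < 16 && steps[i].info < 16);
        out[pos++] = steps[i].end_offset;
        out[pos++] = (uint8_t)((steps[i].info << 4) | steps[i].op);
    }
    return need;
}
```

As a usage example, the prologue `push rbx` / `push rsi` / `sub rsp, 40` (instruction lengths 1, 1 and 4 bytes) would be described by the steps `{1, UWOP_PUSH_NONVOL, REG_RBX}`, `{2, UWOP_PUSH_NONVOL, REG_RSI}`, `{6, UWOP_ALLOC_SMALL, 4}` with a prologue size of 6. The resulting 12 bytes would go into the unwind data section, and the function would still need a matching `.pdata` entry holding the image-relative begin address, end address and unwind info address.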

Judging by the context of your question, what you really want to know is the practical consequences of not providing unwind information for your non-leaf assembly functions on x64 Windows. Since C++ exceptions are implemented on top of SEH exceptions, when I talk about exceptions below, I mean both "native" exceptions (an access violation, something raised using RaiseException, etc.) and C++ exceptions. Here's a list off the top of my head:

  • Exceptions won't be able to pass through your function

It's important to note that this point is not about throwing an exception, or an access violation happening directly in your function. Let's say your assembly code calls into a C++ function, which throws an exception. Even if the caller of your assembly function has a matching catch block, it will never be able to catch the exception, as unwinding will stop at your function without the unwind data.

  • When walking the stack, the stack walk will stop at the function without unwind data (or go astray; the point is, you will get an invalid call stack)

Basically, anything that walks the stack is screwed if your function is present on the call stack (debuggers when displaying the call stack, profilers, etc.).

  • Registered Unhandled Exception Filters will not be called back if an exception gets thrown, and your assembly function is on the call stack

This interferes with anything that relies on UEFs, such as custom crash handlers. Or something potentially more relevant: std::terminate won't be called if your program throws a C++ exception that goes unhandled, even though the C++ standard requires it. The MSVC runtime uses a UEF to implement this, so it won't work either.
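
To make the stack-walk point above concrete, here is a toy model in C of a single unwind step. It is not how the real Windows unwinder is implemented: `context`, `unwind_desc` and `unwind_one` are hypothetical names, the stack is modeled as an array of 8-byte slots, and only one saved register (`rbx`) is tracked. Passing `NULL` for the description models a function with no unwind data, which is then treated as a leaf that never touched `rsp`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified register state (hypothetical, for illustration only). */
typedef struct {
    size_t   sp;  /* stack pointer as an index into stack[], in 8-byte slots */
    uint64_t rbx; /* one nonvolatile register */
} context;

/* Simplified stand-in for a function's unwind data. */
typedef struct {
    size_t alloc_slots; /* slots reserved by "sub rsp, N" */
    int    pushed_rbx;  /* nonzero if the prologue did "push rbx" */
} unwind_desc;

/* Undo one frame and return the caller's return address.
   desc == NULL models missing unwind data: the walker assumes a leaf
   function, so it reads the return address straight from the stack top. */
uint64_t unwind_one(const uint64_t *stack, context *ctx, const unwind_desc *desc)
{
    assert(stack != NULL && ctx != NULL);
    if (desc != NULL) {
        ctx->sp += desc->alloc_slots;      /* undo the stack allocation */
        if (desc->pushed_rbx)
            ctx->rbx = stack[ctx->sp++];   /* restore the saved register */
    }
    return stack[ctx->sp++];               /* pop the return address */
}
```

With a stack holding two locals, a saved `rbx` and then the return address, the call with a description yields the real return address and the caller's `rbx`. The call with `NULL` returns the first local as if it were a return address and leaves `rbx` holding the callee's value, which is the "go astray" case.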


Are you developing a 3rd party library? If that's the case, the importance of the above points will depend on the use case of your clients.

Donpedro
  • Thanks. Yes, among other things I'm looking for a breakdown of the consequences of not including this info. For example, I may be able to guarantee that there are no callouts to C++ from the assembly (e.g., because there are no callouts at all), but I do want to adjust `rsp` and possibly use non-volatile registers. My code doesn't explicitly trigger the unwind mechanism either. ISTM, then that the main issue would then be users of the code who use SEH to try to trap access violations within this code, which would presumably fail as Raymond points out. – BeeOnRope Jul 26 '17 at 20:59
  • And just to clarify - these may very well be _leaf functions_ (i.e., no calls) in the general sense of the term, but not in the Windows 64 ABI sense of the term, which also implies no non-volatile reg or `rsp` modifications. – BeeOnRope Jul 26 '17 at 21:00
  • There are various types of exceptions that are handled and continued by the default top-level exception filter, such as stack guard page exceptions and Win32 resource copy-on-write. Clients likely would not be happy if they blew up when the stack was too close to a 4KB boundary, or when they passed a pointer to copy-on-write resources or to a memory-mapped file that encounters an I/O error. – Raymond Chen Jul 27 '17 at 01:08
  • @RaymondChen you are right about the handling of resource writes (although it's a 16-bit compatibility thing, I heard). Stack guard page exceptions however, are handled by the VMM transparently in kernel mode (user mode code never sees it, as far as I know). I'm not sure about the "memory-mapped file that encounters an I/O error" case, but I would guess it gets the same treatment as stack guard page exceptions. – Donpedro Jul 27 '17 at 19:18
  • @Donpedro My memory appears to be unreliable. But nevertheless, the overall point stands. You have to do it right or your process will experience undefined behavior. – Raymond Chen Jul 28 '17 at 01:37