I'm writing this for Android (ARM only), but I believe the principle is the same for generic Linux as well.

I'm trying to capture the stack trace from within the signal handler, so that I can log it when my app crashes. This is what I've come up with using <unwind.h>.
Initialization:

struct sigaction signalhandlerDescriptor;
memset(&signalhandlerDescriptor, 0, sizeof(signalhandlerDescriptor));
signalhandlerDescriptor.sa_flags = SA_SIGINFO;
signalhandlerDescriptor.sa_sigaction = signalHandler;
sigaction(SIGSEGV, &signalhandlerDescriptor, 0);

The code itself:

struct BacktraceState
{
    void** current;
    void** end;
    void* pc;
};

inline _Unwind_Reason_Code unwindCallback(struct _Unwind_Context* context, void* arg)
{
    BacktraceState* state = static_cast<BacktraceState*>(arg);
    state->pc = (void*)_Unwind_GetIP(context);
    if (state->pc)
    {
        if (state->current == state->end)
            return _URC_END_OF_STACK;
        else
            *state->current++ = reinterpret_cast<void*>(state->pc);
    }
    return _URC_NO_REASON;
}

inline size_t captureBacktrace(void** addrs, size_t max, unsigned long pc)
{
    BacktraceState state = {addrs, addrs + max, (void*)pc};
    _Unwind_Backtrace(unwindCallback, &state);
    personality_routine();

    return state.current - addrs;
}

inline void dumpBacktrace(std::ostream& os, void** addrs, size_t count)
{
    for (size_t idx = 0; idx < count; ++idx) {
        const void* addr = addrs[idx];
        const char* symbol = "";

        Dl_info info;
        if (dladdr(addr, &info) && info.dli_sname) {
            symbol = info.dli_sname;
        }

        int status = -3;
        char * demangledName = abi::__cxa_demangle(symbol, 0, 0, &status);
        os << "#" << idx << ": " << addr << "  " << (status == 0 ? demangledName : symbol) << "\n";
        free(demangledName);
    }
}

void signalHandler(int sig, siginfo_t *siginfo, void *uctx)
{
    ucontext_t* context = static_cast<ucontext_t*>(uctx);
    unsigned long PC = context->uc_mcontext.arm_pc;
    unsigned long SP = context->uc_mcontext.arm_sp;

    Logger() << __PRETTY_FUNCTION__ << "Fatal signal:" << sig;
    const size_t maxNumAddresses = 50;
    void* addresses[maxNumAddresses];
    std::ostringstream oss;

    const size_t actualNumAddresses = captureBacktrace(addresses, maxNumAddresses, PC);
    dumpBacktrace(oss, addresses, actualNumAddresses);
    Logger() << oss.str();
    exit(EXIT_FAILURE);
}

Problem: if I get the PC register by calling _Unwind_GetIP(context) in unwindCallback, I get the complete trace for the signal handler stack. That is a separate stack, and obviously not what I want. So I tried supplying the PC taken from the ucontext in the signal handler, and got a weird result: I get one stack entry, and it is the correct entry - the function which caused the signal in the first place. But it's logged twice (even the address is the same, so it's not a symbol name lookup bug). Obviously, that's not good enough - I need the whole stack. And I wonder if this result is merely accidental (i.e. it shouldn't work in general).

Now, I read I need to also supply the stack pointer, which I apparently can get from ucontext, same as PC. But I don't know what to do with it. Do I have to unwind manually instead of using _Unwind_Backtrace? If so, can you give me sample code? I've been searching for the better part of a day, and still couldn't find anything I could copy and paste into my project.

For what it's worth, here's the libunwind source which contains _Unwind_Backtrace definition. Thought I could figure something out if I see its source, but it's way more complicated than I expected.

Barney
Violet Giraffe
  • FWIW, the way the Dalvik VM collects native stack traces from other threads starts here: https://android.googlesource.com/platform/dalvik/+/kitkat-release/vm/interp/Stack.cpp#1389 . The implementation is sending signals to other threads to cause them to do the collection. – fadden Apr 10 '15 at 15:38
  • @fadden: unfortunately, it uses `corkscrew/backtrace.h`, which is not available in the NDK, and I've heard that library has been removed from Android 5. – Violet Giraffe Apr 14 '15 at 09:49
  • On a related note - Ensure that `-fno-omit-frame-pointer` is passed in the `CFLAGS` to the compiler to allow tracing the call-graph. Frame pointers will be silently disabled by the optimisation flags (eg.`-O2`). – TheCodeArtist Apr 15 '15 at 04:21
  • @TheCodeArtist: yep, it is passed. As I've mentioned already, I have no problem getting the stacktrace of the thread where `dumpBacktrace` is called, even with -O3. – Violet Giraffe Apr 15 '15 at 05:03
  • @VioletGiraffe hmmm... ok. checkout the answers to this [**question**](http://stackoverflow.com/q/77005/319204). Especially [this one](http://stackoverflow.com/a/77336/319204) looks promising. Using `execinfo.h`, are the `backtrace()` family of functions available within a native executable on Android? – TheCodeArtist Apr 15 '15 at 10:04
  • @TheCodeArtist: that was the first thing I tried. `<execinfo.h>` is not present in the NDK so I can't even include it. – Violet Giraffe Apr 15 '15 at 10:22
  • Can't you from signal handler return to another function in crashing thread by changing pc in ucontext? then in that function get the back trace and exit the app? – auselen Apr 17 '15 at 13:03
  • @auselen: I didn't get the idea. First, what would I set PC to? Second, does setting PC in `ucontext` actually alter the PC register state to alter the execution flow? And third, are you sure that returning will get me back onto the original stack? – Violet Giraffe Apr 17 '15 at 13:05
  • set pc to a function printing what you want, altering pc works, setting up the stack might be the hard part but if your printing function doesn't mess things up it should be ok. this was my play thingy: https://github.com/auselen/agoapf/blob/master/signal_catcher/signal_catcher.c – auselen Apr 17 '15 at 13:46
  • @auselen: what does the line `mcontext->arm_pc += (mcontext->arm_cpsr & 0x20) == 0x20 ? 2 : 4;` mean? – Violet Giraffe Apr 17 '15 at 14:12
  • I was talking to myself like I should have put some comment there :) thats for moving pc forward. For arm mode it's four bytes ahead, for thumb two. – auselen Apr 17 '15 at 14:25
  • What about this http://stackoverflow.com/a/5426269/1163019? – auselen Apr 17 '15 at 14:33
  • Ah, gotcha. Still not sure how to do what you suggested - I'm not familiar with such specifics of CPU operation, let alone ARM CPU. Do you suggest that I put some function's address on the stack somewhere (where?) so that I get there when I execute `return`? Is it even possible to get back onto the main stack from a signal handler stack? – Violet Giraffe Apr 17 '15 at 14:33
  • @auselen: I've seen that answer, it's clearly for x86. It uses x86 registers. Pretty sure ARM is different enough for that not to work. – Violet Giraffe Apr 17 '15 at 14:34
  • Can you share a buildable ndk example? – auselen Apr 17 '15 at 17:55
  • I suppose I can extract my code into hello-jni. Will do on Monday (don't have the Android environment set up at home). But **all** the code is here. Really. Just call `sigaction` to register my signal handler - same way as you did in the sample linked earlier today. – Violet Giraffe Apr 17 '15 at 19:16
  • I wanted some buildable example because this whole thing depends on how you build, what you pass mostly because it is c++. So if you share some minimal example, it would nice. – auselen Apr 20 '15 at 08:20
  • @auselen: Ah, fair enough. I don't have anything complicated in my makefiles, just a couple extra GCC flags to make this specific thing work, namely - `-rdynamic` and `-funwind-tables`. Let me work on an example. – Violet Giraffe Apr 20 '15 at 09:15
  • Here is an almost header-only library, which is capable of printing out the stack nicely in case of segmentation faults. https://github.com/vmarkovtsev/DeathHandler – Jens Munk Jul 17 '15 at 16:29
  • @JensMunk: thanks, I'll give a try! There's no mention of Android support, though. I believe it won't work there. Will try it out on Monday and report back. – Violet Giraffe Jul 17 '15 at 17:35
  • @violet giraffe I have used the library on ARM7 and ARM9 and it is POSIX compliant. The output is formatted very nicely with colors - perfect for catching important information from crashes, and no memory allocation is made in the handler (like mentioned above) – Jens Munk Jul 17 '15 at 17:52
  • Having a similar issue - wondering if it is possible to get the full call-stack using this? One thing I've noticed in the _Unwind_Backtrace call-back is that you can actually alter the value of the pointer assigned to state->current and everything seems to still work. Example: `void* pcPtr = state->pc; pcPtr++;` prior to assigning it here: `*state->current++ = reinterpret_cast<void*>(pcPtr);` At one point I thought I might be able to change the address of the pc and get a different part of the call-stack. – Tim Jan 28 '16 at 03:31
  • Interestingly enough - when I use this code - I get the same result as this answer: http://stackoverflow.com/questions/8115192/android-ndk-getting-the-backtrace That is - I never get the stack-frame (function) that actually raised the signal. I only get the call-stack starting at the signal handler. I'd like to believe it is a compiler flag - but I have all of the following: `-funwind-tables -fomit-frame-pointer -fexceptions -fasynchronous-unwind-tables -g`. I'm suspicious that the _Unwind_Backtrace function is really running the show (in regards to what part of the call-stack you get). – Tim Jan 28 '16 at 18:13
  • @Tim: I gave up on this task a long time ago and don't exactly remember what was my last result, but I definitely had this problem at some point. You can easily get the stack trace, but there's only the signal handler in it. Which is why I assumed the handler gets its own separate stack, and created this question. – Violet Giraffe Jan 28 '16 at 18:42
  • Oh - you were never able to get the actual function that raised the signal? Okay - I had misunderstood this bit of your post "I get one stack entry, it is the correct entry - the function which caused the signal in the first place.". I thought that had meant you actually did get the function that raised the signal. Sounds like arm/android might be kicking off a different thread to handle signals. iOS has something like that called Grand Central Dispatch - but it is similarly not useful - as it prevents you from getting the call-stack that triggered the signal. – Tim Jan 28 '16 at 18:49
  • @Tim: Hmm, as I said, I hardly remember my progress. I guess you're right, I did get the culprit function at some point, but only this one entry - never managed to get the whole stack. – Violet Giraffe Jan 28 '16 at 18:53
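
For readers puzzled by the one-liner `mcontext->arm_pc += (mcontext->arm_cpsr & 0x20) == 0x20 ? 2 : 4;` discussed in the comments above, here is a minimal sketch of the same arithmetic as a standalone function. The function and constant names are made up for illustration. It relies on bit 5 of the ARM CPSR being the Thumb state bit, and, like the original line, it assumes a 16-bit Thumb instruction; 32-bit Thumb-2 encodings would need 4 bytes instead.

```cpp
#include <cstdint>

// Bit 5 of the ARM CPSR (the T bit) is set while the core runs Thumb code.
constexpr uint32_t kCpsrThumbBit = 0x20;

// Hypothetical helper: returns the PC advanced past one instruction.
// Assumption: in Thumb state the instruction is a narrow (2-byte) encoding.
// A wide Thumb-2 instruction is 4 bytes long, which this sketch ignores.
inline uint32_t skipOneInstruction(uint32_t pc, uint32_t cpsr)
{
    const uint32_t instructionSize = (cpsr & kCpsrThumbBit) ? 2u : 4u;
    return pc + instructionSize;
}
```

In a handler this would be applied to `uc_mcontext.arm_pc` and `uc_mcontext.arm_cpsr` before returning, so that execution resumes after the faulting instruction rather than re-executing it.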

3 Answers

In order to get the stack trace of the code which caused the SIGSEGV, instead of the stack trace of the signal handler, you have to get the ARM registers from ucontext_t and use them for unwinding.

But that is hard to do with _Unwind_Backtrace(). So if you use libc++ (LLVM STL) and compile for 32-bit ARM, it is better to try the precompiled libunwind bundled with modern Android NDKs (at sources/cxx-stl/llvm-libc++/libs/armeabi-v7a/libunwind.a). Here is some sample code.


// This method can only be used on 32-bit ARM with libc++ (LLVM STL).
// Android NDK r16b contains "libunwind.a" for armeabi-v7a ABI.
// This library is even silently linked in by the ndk-build,
// so we don't have to add it manually in "Android.mk".
// We can use this library, but we need matching headers,
// namely "libunwind.h" and "__libunwind_config.h".
// For NDK r16b, the headers can be fetched here:
// https://android.googlesource.com/platform/external/libunwind_llvm/+/ndk-r16/include/
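// Standard headers used by the sample. Including a libc++ header first
// also makes sure _LIBCPP_VERSION is defined before it is tested below.
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <signal.h>
#include <sys/ucontext.h>
#if _LIBCPP_VERSION && __has_include("libunwind.h")
#include "libunwind.h"
#endif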

struct BacktraceState {
    const ucontext_t*   signal_ucontext;
    size_t              address_count = 0;
    static const size_t address_count_max = 30;
    uintptr_t           addresses[address_count_max] = {};

    BacktraceState(const ucontext_t* ucontext) : signal_ucontext(ucontext) {}

    bool AddAddress(uintptr_t ip) {
        // No more space in the storage. Fail.
        if (address_count >= address_count_max)
            return false;

        // Reset the Thumb bit, if it is set.
        const uintptr_t thumb_bit = 1;
        ip &= ~thumb_bit;

        // Ignore null addresses.
        if (ip == 0)
            return true;

        // Finally add the address to the storage.
        addresses[address_count++] = ip;
        return true;
    }
};

void CaptureBacktraceUsingLibUnwind(BacktraceState* state) {
    assert(state);

    // Initialize unw_context and unw_cursor.
    unw_context_t unw_context = {};
    unw_getcontext(&unw_context);
    unw_cursor_t  unw_cursor = {};
    unw_init_local(&unw_cursor, &unw_context);

    // Get more contexts.
    const ucontext_t* signal_ucontext = state->signal_ucontext;
    assert(signal_ucontext);
    const sigcontext* signal_mcontext = &(signal_ucontext->uc_mcontext);
    assert(signal_mcontext);

    // Set registers.
    unw_set_reg(&unw_cursor, UNW_ARM_R0, signal_mcontext->arm_r0);
    unw_set_reg(&unw_cursor, UNW_ARM_R1, signal_mcontext->arm_r1);
    unw_set_reg(&unw_cursor, UNW_ARM_R2, signal_mcontext->arm_r2);
    unw_set_reg(&unw_cursor, UNW_ARM_R3, signal_mcontext->arm_r3);
    unw_set_reg(&unw_cursor, UNW_ARM_R4, signal_mcontext->arm_r4);
    unw_set_reg(&unw_cursor, UNW_ARM_R5, signal_mcontext->arm_r5);
    unw_set_reg(&unw_cursor, UNW_ARM_R6, signal_mcontext->arm_r6);
    unw_set_reg(&unw_cursor, UNW_ARM_R7, signal_mcontext->arm_r7);
    unw_set_reg(&unw_cursor, UNW_ARM_R8, signal_mcontext->arm_r8);
    unw_set_reg(&unw_cursor, UNW_ARM_R9, signal_mcontext->arm_r9);
    unw_set_reg(&unw_cursor, UNW_ARM_R10, signal_mcontext->arm_r10);
    unw_set_reg(&unw_cursor, UNW_ARM_R11, signal_mcontext->arm_fp);
    unw_set_reg(&unw_cursor, UNW_ARM_R12, signal_mcontext->arm_ip);
    unw_set_reg(&unw_cursor, UNW_ARM_R13, signal_mcontext->arm_sp);
    unw_set_reg(&unw_cursor, UNW_ARM_R14, signal_mcontext->arm_lr);
    unw_set_reg(&unw_cursor, UNW_ARM_R15, signal_mcontext->arm_pc);

    unw_set_reg(&unw_cursor, UNW_REG_IP, signal_mcontext->arm_pc);
    unw_set_reg(&unw_cursor, UNW_REG_SP, signal_mcontext->arm_sp);

    // unw_step() does not return the first IP.
    state->AddAddress(signal_mcontext->arm_pc);

    // Unwind frames one by one, going up the frame stack.
    while (unw_step(&unw_cursor) > 0) {
        unw_word_t ip = 0;
        unw_get_reg(&unw_cursor, UNW_REG_IP, &ip);

        bool ok = state->AddAddress(ip);
        if (!ok)
            break;
    }
}

void SigActionHandler(int sig, siginfo_t* info, void* ucontext) {
    const ucontext_t* signal_ucontext = (const ucontext_t*)ucontext;
    assert(signal_ucontext);

    BacktraceState backtrace_state(signal_ucontext);
    CaptureBacktraceUsingLibUnwind(&backtrace_state);
    // Do something with the backtrace - print, save to file, etc.
}

Here is a sample backtrace testing app with 3 implemented backtracing methods, including the method shown above.

https://github.com/alexeikh/android-ndk-backtrace-test
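
The sample above does not show how `SigActionHandler` gets registered. The following is a minimal sketch of one way to do it with plain POSIX calls, not taken from the linked project. `InstallHandler`, `DemoHandler` and the 64 KiB stack size are assumptions made for illustration. It installs the handler with `SA_SIGINFO` so the `ucontext` argument is delivered, and with `SA_ONSTACK` plus `sigaltstack()` so the handler can still run if the crash was caused by a stack overflow.

```cpp
#include <signal.h>
#include <cstddef>

// Hypothetical stand-in for SigActionHandler: it only records the signal.
static volatile sig_atomic_t g_lastSignal = 0;

static void DemoHandler(int sig, siginfo_t* /*info*/, void* /*ucontext*/)
{
    g_lastSignal = sig;
}

// Registers `handler` for `sig`. Returns true on success.
// Note: sigaltstack() applies to the calling thread only, so other threads
// would need their own alternate stacks.
static bool InstallHandler(int sig, void (*handler)(int, siginfo_t*, void*))
{
    // Assumption: 64 KiB is enough stack for the handler's work.
    static char altStackMemory[64 * 1024];

    stack_t altStack = {};
    altStack.ss_sp = altStackMemory;
    altStack.ss_size = sizeof(altStackMemory);
    altStack.ss_flags = 0;
    if (sigaltstack(&altStack, nullptr) != 0)
        return false;

    struct sigaction action = {};
    sigemptyset(&action.sa_mask);
    action.sa_flags = SA_SIGINFO | SA_ONSTACK;
    action.sa_sigaction = handler;
    return sigaction(sig, &action, nullptr) == 0;
}
```

With the real handler this would be called once at startup, for example `InstallHandler(SIGSEGV, SigActionHandler);`. Because the handler then runs on a different stack from the crashed code, taking the registers from `ucontext` as shown above is what makes the unwind start in the right place.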

Alexei Khlebnikov

First, you need to read the section on "async signal safe" functions:

http://man7.org/linux/man-pages/man7/signal.7.html

That's the entire set of functions that are safe to call in a signal handler. About the worst thing you can do is to call anything that calls malloc()/free() under the hood - or do it yourself.
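
As a rough illustration of what staying inside that set can look like, the sketch below prints an address in hexadecimal using only a stack buffer and `write()`, which is on the async-signal-safe list. The helper names `formatHex` and `writeAddressLine` are invented for this example and are not part of any library.

```cpp
#include <cstddef>
#include <cstdint>
#include <unistd.h>

// Hypothetical helper: formats `value` as "0x..." into `buf` without calling
// malloc, printf or iostreams. Returns the number of characters written
// (no terminating NUL is added), or 0 if the buffer is too small.
inline size_t formatHex(uintptr_t value, char* buf, size_t bufSize)
{
    static const char digits[] = "0123456789abcdef";
    char reversed[2 * sizeof(uintptr_t)];
    size_t count = 0;
    do {
        reversed[count++] = digits[value & 0xF];
        value >>= 4;
    } while (value != 0 && count < sizeof(reversed));

    if (bufSize < count + 2)
        return 0;

    size_t pos = 0;
    buf[pos++] = '0';
    buf[pos++] = 'x';
    while (count > 0)
        buf[pos++] = reversed[--count];
    return pos;
}

// Hypothetical helper: writes "0xADDR\n" to the file descriptor `fd`.
inline bool writeAddressLine(int fd, uintptr_t address)
{
    char line[2 + 2 * sizeof(uintptr_t) + 1];
    size_t length = formatHex(address, line, sizeof(line) - 1);
    if (length == 0)
        return false;
    line[length++] = '\n';
    return write(fd, line, length) == static_cast<ssize_t>(length);
}
```

A handler could call `writeAddressLine(STDERR_FILENO, pc)` for each captured address and leave symbol lookup and demangling to an offline tool, since `abi::__cxa_demangle` allocates memory.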

Second, get it working outside of a signal handler first.

Third, these are probably apropos:

How to get C++ backtrace on Android

Android NDK: getting the backtrace

Community
Andrew Henle
  • **1.** Thanks. I'm aware that calling `malloc()` is unsafe (as well as doing many other things), but there's no way to avoid it. Besides, the app has already crashed, worst thing that can happen is I won't get the stack trace logged. Which is what's happening right now. **2.** It does work outside of signal handler, that was the first thing I've tested (with PC acquired by `_Unwind_GetIP`, obviously). **3.** The first link seems irrelevant, the second one contains a useful answer upon which my code is built. – Violet Giraffe Apr 10 '15 at 11:59
  • **1.** No, worst thing is the app will deadlock and never exit unless something explicitly kills it. And there are infinitely many ways to avoid calling malloc(), even indirectly. **2.** You need to get it to work using a ucontext starting point. **3.** If you really need the stack backtrace, just run `system( "pstack PID" );` to emit the current stack trace. While `system()` isn't async-signal-safe itself, it's usually built on `fork()/exec()`, which are. Or you can roll your own `fork()/exec()`. – Andrew Henle Apr 10 '15 at 13:27
  • **2.** I understand, but HOW?.. **3.** I'll try that, but I seriously suspect it requires root, which is unacceptable. – Violet Giraffe Apr 10 '15 at 13:42
  • **2.** You might have to use google-breakpad to get a ucontext: https://code.google.com/p/google-breakpad/ **3.** I don't have a running Android install right now, but I don't think offhand that you'd need root to pstack your own process. – Andrew Henle Apr 11 '15 at 14:28
  • **2.** I've looked at Breakpad more than once, it looks extremely complicated and a pain to get working (overkill, too). **3.** I do have `ucontext` in the code in my question already. I can get SP from there, as well as many other register values. But I have no idea how to use it. The best idea I have yet is find the source code of `_Unwind_Backtrace` in the AOSP sources (a task of its own), extract its code into my project (not the best idea because it may not be portable across different Android platform versions) and substitute SP for the one I've got from `ucontext`. – Violet Giraffe Apr 11 '15 at 15:54
  • This does the job. The library is header only and it is perfectly safe. https://github.com/vmarkovtsev/DeathHandler – Jens Munk Jul 17 '15 at 16:30
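
The "roll your own `fork()/exec()`" idea from the comments above could look roughly like the sketch below. `runAndWait` is a made-up name, and whether a tool such as `pstack` exists on a given Android device is not verified here (the thread itself leaves that open). The sketch only uses `fork`, `execv`, `waitpid` and `_exit`. It avoids `system()` and touches no heap memory in the parent.

```cpp
#include <cerrno>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: runs the program argv[0] with arguments argv[1..] in a
// child process and waits for it. Returns the child's exit status (127 if the
// exec failed), or -1 if fork/waitpid failed or the child ended abnormally.
inline int runAndWait(const char* const argv[])
{
    const pid_t child = fork();
    if (child < 0)
        return -1;

    if (child == 0) {
        // Child: replace the process image. execv only returns on failure.
        execv(argv[0], const_cast<char* const*>(argv));
        _exit(127);
    }

    // Parent: wait for the child, retrying if a signal interrupts the wait.
    int status = 0;
    pid_t waited;
    do {
        waited = waitpid(child, &status, 0);
    } while (waited < 0 && errno == EINTR);

    if (waited < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

The argument array, including the PID rendered as a string, would have to be prepared before the crash or built without `snprintf`, and a careful handler would save and restore `errno` around the call.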

As part of getting unwinding through signal handlers (e.g. throwing an exception from one) working on arm-linux-eabihf I also obtained working backtraces from within a signal handler.

I'm pretty sure this is glibc-specific and therefore won't work on Android, but maybe it can be adapted or be useful for inspiration: https://github.com/mvduin/arm-signal-unwind

Matthijs