
I'm a bit confused about isLandingPad on BasicBlocks in LLVM. I have the following code, where I create an empty BasicBlock and then call isLandingPad on it:

#include "llvm/IR/IRBuilder.h"
#include "llvm/IR/LLVMContext.h"
#include "llvm/IR/Module.h"
#include <cassert>
#include <cstdio>
#include <vector>

using namespace llvm;

int main(void)
{
    // Start with a LLVM context.
    LLVMContext TheContext;

    // Make a module.
    Module *TheModule = new Module("mymod", TheContext);

    // Make a function
    std::vector<Type*> NoArgs = {};
    Type *u32 = Type::getInt32Ty(TheContext);
    FunctionType *FT = FunctionType::get(u32, NoArgs, false);
    Function *F = Function::Create(FT, Function::ExternalLinkage, "main", TheModule);

    // Make an empty block
    IRBuilder<> Builder(TheContext);
    BasicBlock *BB = BasicBlock::Create(TheContext, "entry", F);
    Builder.SetInsertPoint(BB);

    auto fnp = BB->getFirstNonPHI();
    assert(fnp == nullptr);

    // I think this should crash.
    auto islp = BB->isLandingPad();
    printf("isLP = %d\n", islp);

    // If we inline the implementation of the above call, we have the following
    // (which *does* crash).
    auto islp2 = isa<LandingPadInst>(BB->getFirstNonPHI());
    printf("isLP2 = %d\n", islp2);

    return 0;
}

which outputs:

isLP = 0
codegen: /usr/lib/llvm-7/include/llvm/Support/Casting.h:106: static bool llvm::isa_impl_cl<llvm::LandingPadInst, const llvm::Instruction *>::doit(const From *) [To = llvm::LandingPadInst, From = const llvm::Instruction *]: Assertion `Val && "isa<> used on a null pointer"' failed.

According to the LLVM source of isLandingPad (https://llvm.org/doxygen/BasicBlock_8cpp_source.html#l00470) this should segfault when the BasicBlock is empty (since we are calling isa on a nullptr). However, when I run this program the call to isLandingPad succeeds and returns false. Interestingly, when I inline the function definition of isLandingPad (as seen further below), it crashes as expected.

I'm clearly doing something wrong here, but I don't see in what way the BB->isLandingPad() call is different to the inlined version, and why isLandingPad doesn't crash, when it should according to the source.

ptersilie
  • isLandingPad() doesn't demand that the block be well-formed or complete or anything. It simply checks: Is there an invoke instruction anywhere that will jump to this block in case of an exception? Yes or no? That's all. It'll work in case your function is half-compiled and the invoke is ready but not yet the exception handler. – arnt Mar 03 '20 at 16:38
  • Oh, and `BB->isLandingPad()` isn't ever guaranteed to segfault, and particularly not in this case, since it doesn't even try to access any instructions in the block itself. – arnt Mar 03 '20 at 16:41

2 Answers


If the code "should segfault", that implies it invokes undefined behavior (UB) at runtime. The compiler may be optimizing on the assumption that UB never occurs in your program. Since that assumption is false here, the optimized code can produce the bogus result isLP == false that you observe.

You should never invoke undefined behavior. Restructure your code so that it never calls functions with arguments that can trigger UB (e.g., check the result of getFirstNonPHI before calling isa<LandingPadInst> or isLandingPad).
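The check-before-use pattern can be sketched without LLVM itself. The types below (`Opcode`, `Instr`, `Block`) and the function `isLandingPadSafe` are hypothetical stand-ins invented for illustration, not LLVM API. `firstNonPHI` only mirrors the behavior seen in the question, where `getFirstNonPHI()` returns `nullptr` for an empty block.

```cpp
#include <vector>

// Hypothetical stand-ins for llvm::Instruction and llvm::BasicBlock.
enum class Opcode { PHI, LandingPad, Other };

struct Instr {
  Opcode op;
};

struct Block {
  std::vector<Instr> instrs;

  // Mirrors the behavior shown in the question: returns nullptr when the
  // block contains no non-PHI instruction (for example, an empty block).
  const Instr *firstNonPHI() const {
    for (const Instr &I : instrs)
      if (I.op != Opcode::PHI)
        return &I;
    return nullptr;
  }
};

// Null-safe query: test the pointer before inspecting what it points to.
bool isLandingPadSafe(const Block &BB) {
  const Instr *First = BB.firstNonPHI();
  return First != nullptr && First->op == Opcode::LandingPad;
}
```

With real LLVM types, the same shape would be a null test on the result of `BB->getFirstNonPHI()` before handing it to `isa<LandingPadInst>`. Later LLVM releases also provide an `isa_and_nonnull<>` helper in `Casting.h`, but it may not be available in older releases such as the LLVM 7 used in the question.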

Specifically you should not assume that UB (such as dereferencing nullptr or an address near it) has a well-defined effect such as "it will segfault" because the compiler may reorganize your code (assuming UB never happens) in ways that will eliminate the effect you expect (e.g., it will generate code that doesn't attempt to load from nullptr).

Inlining and optimization levels strongly affect the generated code, which is why you see different results (an invalid return value vs. a segfault) in different cases.


palotasb
  • To be clear, `isa()` is the source of UB in OP's program? The LLVM docs are pretty sparse. – Edd Barrett Mar 03 '20 at 18:04
  • Thanks for the explanation and the quick reply. This makes sense. I was so hung up on why the inlined code leads to a different outcome that it simply didn't occur to me that my expectation might be wrong. I eventually realised that I needed to check if the block is empty first, but I was still curious why this was happening. – ptersilie Mar 04 '20 at 10:06
  • I need to clarify that I only mentioned a "segfault" since this is how this error manifested in my actual program. But I just realised that the simplified version I posted above actually triggers an assertion, which seems to suggest that the `isa` function does guard against nullpointers. So I'm still not entirely sure why `isLandingPad` doesn't trigger the assertion, while `isa` does. – ptersilie Mar 04 '20 at 10:07
  • I don't believe this program invokes UB if the assertion in `BB->isLandingPad()` actually triggers, since it's reasonably well defined what an assertion failure does. The issue here is, I think, that it's not triggering. – davmac Mar 04 '20 at 10:23

LLVM itself is (at least on my system) compiled with assertions disabled, so the assertion doesn't trigger. When you inline it in your code, you are compiling with assertions enabled, so it does trigger.

Note that since isa<...> is a template, it is compiled into every compilation unit that instantiates it. In this case there are at least two: one in LLVM and one in your program. Strictly speaking, both instantiations should be identical (the "one definition rule"), or you have UB anyway. The practical upshot in a case like this one is that calls to isa<...>() from either compilation unit might end up calling the version instantiated in the other one. However, it's likely that the calls to isa<...>() are being inlined, i.e. you end up with a version of isa<...>() specific to each compilation unit that instantiates it.
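The effect of the assertion setting can be reproduced in a few lines of standard C++. This is only a model under stated assumptions: the function names `libraryStyleCheck` and `userStyleCheck` are hypothetical, and the two halves of one file stand in for the two separately compiled units. It relies on the standard rule that `assert` is redefined according to `NDEBUG` each time `<cassert>` is included.

```cpp
// Stand-in for code inside a prebuilt library: assertions disabled.
#ifndef NDEBUG
#define NDEBUG
#endif
#include <cassert>

// Hypothetical name; models a check compiled into a release-mode library.
bool libraryStyleCheck(const int *P) {
  assert(P && "used on a null pointer"); // expands to nothing under NDEBUG
  return P != nullptr; // explicit test keeps this model well defined
}

// Stand-in for user code: assertions enabled (NDEBUG not defined).
#undef NDEBUG
#include <cassert>

// Hypothetical name; models the same check compiled into the user's program.
bool userStyleCheck(const int *P) {
  assert(P && "used on a null pointer"); // aborts the program if P is null
  return P != nullptr;
}
```

Calling `libraryStyleCheck(nullptr)` quietly returns false, while `userStyleCheck(nullptr)` stops with an assertion failure like the one in the question. Unlike this model, the real `isa<>` has no explicit null test after the assertion, so with assertions disabled its result on a null pointer should be treated as undefined rather than as a dependable `false`.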

davmac
  • That's it! So, because LLVM is compiled without debug info, using `isLandingPad` won't trigger the assertion. But when I inline `isa` into my program (which I compiled with `-g`), it compiles with debug info so the assert triggers. It all makes sense now. – ptersilie Mar 04 '20 at 10:35
  • @ptersilie to be clear, I don't think `-g` (or the lack of it) controls assertions. IIRC you need to explicitly compile with `-DNDEBUG` (or `#define NDEBUG` in your code) to disable them. – davmac Mar 04 '20 at 10:49