I have a very old (and huge) Win32 project that makes heavy use of NULL pointer checks done by taking the address of a dereferenced pointer. Like this:

int* x = NULL; //somewhere
//... code
if (NULL == &(*(int*)x)) //somewhere else
    return;

And yes, I know that this code is stupid and needs to be refactored, but that is impossible because of the huge amount of code. Right now I need to compile this project under macOS Sierra in Xcode, which leads to big problems: in release mode (with optimization enabled) the condition behaves incorrectly (so-called undefined behaviour, caused by dereferencing a NULL pointer).

According to this document for GCC there is an option -fno-delete-null-pointer-checks, but it does not seem to work for LLVM when O1, O2 or O3 optimization is enabled. So the question is: how can I force the LLVM 8.0 compiler to allow such dereferences?

UPDATE. A real working example that reproduces the problem.

//somewhere 1
class carr
{
public:
    carr(int length)
    {
        xarr = new void*[length];

        for (int i = 0; i < length; i++)
            xarr[i] = NULL;
    }

    //some other fields and methods

    void** xarr;
    int& operator[](int i)
    {
        return *(int*)xarr[i];
    }
};

//somewhere 2
carr m(5);

bool something(int i)
{
    int* el = &m[i];
    if (el == NULL)
        return FALSE; //executes in debug mode (no optimization)

    //other code
    return TRUE; //executes in release mode (optimization enabled)
}

At -O0 and -O1, something keeps the null check, and the code "works":

something(int):                          # @something(int)
    pushq   %rax
    movl    %edi, %eax
    movl    $m, %edi
    movl    %eax, %esi
    callq   carr::operator[](int)
    movb    $1, %al
    popq    %rcx
    retq

But at -O2 and above, the check is optimized out:

something(int):                          # @something(int)
    movb    $1, %al
    retq
  • 5
    [Corresponding bug report](https://llvm.org/bugs/show_bug.cgi?id=9251). It's not promising: the flag is indeed ignored for now (it was unrecognized at first). – Quentin Dec 02 '16 at 08:50
  • 1
    `-fno-delete-null-pointer-checks` isn't supposed to affect `&*(int*)x`, that's still supposed to be allowed to be `NULL`. Checking with clang on http://gcc.godbolt.org/, with simply `bool b(short *p) { return 0 == &*(int*)p; }`, clang generates correct code. Please post a minimal complete program where your compiler generates incorrect code. –  Dec 02 '16 at 08:59
  • @hvd I've posted a real example. I'm not sure whether this problem is related to GCC; I've only seen it in Apple LLVM 8.0 –  Dec 02 '16 at 09:06
  • 4
    @hvd something that `&` returns should not possibly be null -- it's the address of something. Dereferencing a null pointer triggers UB, so `bool b(short *p) { return true; }` would be a *valid* optimisation of your function according to the standard. – Quentin Dec 02 '16 at 09:10
  • 3
    @Quentin For C, it's made explicit that `&*p` is allowed even if `p` is `NULL`, and for C++, the intent has been stated to be the same and that's what compilers do. It's a different story for references, but there are no references here. See http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_active.html#232 Edit: there *are* now references in the edited question. That explains it. –  Dec 02 '16 at 09:15
  • @hvd I did remember it for C, but had missed the debate about it in C++. Not sure if I want to dive into that right now, but thank you :p – Quentin Dec 02 '16 at 09:24
  • @Quentin so any idea how this issue can be fixed without rewriting the code? –  Dec 02 '16 at 09:26
  • @AlekDepler I've been rummaging through documentation for a bit, but I don't see a solution. If this pattern is always the same, you can detect and replace it with the help of a regex, but that won't ensure that there are none left. Edit: wait, no you can't in your real case... – Quentin Dec 02 '16 at 09:28
  • Would be a duplicate if the OP's question wasn't "How do I allow it anyway": [C++ standard: dereferencing NULL pointer to get a reference?](http://stackoverflow.com/q/2727834/11683) – GSerg Dec 02 '16 at 09:39
  • Is the something() function duplicated in various forms through the codebase, or only in a few places? How about the carr class? The answer will affect the possible (practical) remedies. – Jeremy Dec 02 '16 at 09:45
  • This seems very similar to yesterday's question. The overwhelming consensus of everyone was that your code needs re-writing. Don't be afraid to re-write poor code! It's an important step in any development process. – djgandy Dec 02 '16 at 10:10
  • Honestly, you are better off fixing the code. That will take less time than all the time you're going to spend debugging this mess. The code in the sample could be fixed without changing the call site, by having `m[i]` return a proxy that overloads `operator&`. – M.M Dec 02 '16 at 10:15
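
The proxy idea from the last comment can be sketched roughly as follows. This is only an illustration under stated assumptions, not the project's actual code: the class name `int_ref_proxy` is hypothetical, `carr` is reduced to the fields shown in the question, and C++11 is assumed for `= delete`. The point is that `&m[i]` now calls an overloaded `operator&` that returns the stored pointer, so no reference is ever formed from a null pointer and the optimizer has no undefined behaviour to exploit.

```cpp
#include <cstddef>

// Hypothetical proxy returned by carr::operator[] instead of int&.
class int_ref_proxy {
public:
    explicit int_ref_proxy(int* p) : p_(p) {}

    // &m[i] lands here: hands back the raw pointer without dereferencing it.
    int* operator&() const { return p_; }

    // Read access; the caller must have checked for NULL first.
    operator int&() const { return *p_; }

    // Write access; the same precondition applies.
    int_ref_proxy& operator=(int value) { *p_ = value; return *this; }

private:
    int* p_;
};

class carr {
public:
    explicit carr(int length) : xarr(new void*[length]) {
        for (int i = 0; i < length; ++i)
            xarr[i] = NULL;
    }
    ~carr() { delete[] xarr; }
    carr(const carr&) = delete;             // assumption: copying is not needed
    carr& operator=(const carr&) = delete;

    int_ref_proxy operator[](int i) {
        return int_ref_proxy(static_cast<int*>(xarr[i]));
    }

    void** xarr;
};

carr m(5);

// Call site unchanged from the question, apart from bool literals.
bool something(int i) {
    int* el = &m[i];   // well-defined even when the slot is NULL
    if (el == NULL)
        return false;
    return true;
}
```

Call sites that bind the result to a named `int&` or pass it to templates may still need adjusting, so this is a starting point rather than a drop-in replacement.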

1 Answer

Do a text-based search for NULL. Then run the compiler with warnings enabled and print out all the warnings on paper (if you still have such technology). Now, for each NULL, is it a problematic one or a good one? If problematic, rename it XNULL.

Now it's likely that the C++ checks can fail on a small system with, say, 640k installed, because 640k is enough for anybody, but not on your modern system with many GB. So just strip them out once relabelled. If that's not the case, make XNULL a "dummy object" with a valid address in C++ eyes.

(From the example, it looks like the code is a Lisp interpreter. Lisp needs both a null pointer and a dummy pointer; there's no other easy way of writing the interpreter.)
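
A minimal sketch of the "dummy object" variant, assuming the `carr` class from the question and a hypothetical sentinel named `xnull_storage` (neither the sentinel nor the `XNULL` macro definition comes from the original project; C++11 is assumed for `= delete`). Empty slots point at a real object, so `operator[]` always dereferences a valid address and the comparison survives optimization:

```cpp
#include <cstddef>

// Hypothetical sentinel: a real int whose address means "no element here".
static int xnull_storage = 0;
#define XNULL (&xnull_storage)

class carr {
public:
    explicit carr(int length) : xarr(new void*[length]) {
        for (int i = 0; i < length; ++i)
            xarr[i] = XNULL;   // empty slots point at the sentinel, not at null
    }
    ~carr() { delete[] xarr; }
    carr(const carr&) = delete;             // assumption: copying is not needed
    carr& operator=(const carr&) = delete;

    int& operator[](int i) {
        // Every slot holds the address of a live int, so this is well-defined.
        return *static_cast<int*>(xarr[i]);
    }

    void** xarr;
};

carr m(5);

bool something(int i) {
    int* el = &m[i];
    if (el == XNULL)   // compares two valid addresses; cannot be folded away
        return false;
    return true;
}
```

Any code that stores a plain NULL into `xarr`, or that compares elements against NULL rather than XNULL, would have to be relabelled as well for this to hold.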

Malcolm McLean