Some of the "practical" (funny way to spell "buggy") code that was broken looked like this:
void foo(X* p) {
p->bar()->baz();
}
and it forgot to account for the fact that p->bar()
sometimes returns a null pointer, which means that dereferencing it to call baz()
is undefined.
Not all the code that was broken contained explicit if (this == nullptr)
or if (!p) return;
checks. Some cases were simply functions that didn't access any member variables, and so appeared to work OK. For example:
struct DummyImpl {
bool valid() const { return false; }
int m_data;
};
struct RealImpl {
bool valid() const { return m_valid; }
bool m_valid;
int m_data;
};
template<typename T>
void do_something_else(T* p) {
if (p) {
use(p->m_data);
}
}
template<typename T>
void func(T* p) {
if (p->valid())
do_something(p);
else
do_something_else(p);
}
In this code when you call func<DummyImpl*>(DummyImpl*)
with a null pointer there is a "conceptual" dereference of the pointer to call p->DummyImpl::valid()
, but in fact that member function just returns false
without accessing *this
. That return false
can be inlined and so in practice the pointer doesn't need to be accessed at all. So with some compilers it appears to work OK: there's no segfault for dereferencing null, p->valid()
is false, so the code calls do_something_else(p)
, which checks for null pointers, and so does nothing. No crash or unexpected behaviour is observed.
With GCC 6 you still get the call to p->valid()
, but the compiler now infers from that expression that p
must be non-null (otherwise p->valid()
would be undefined behaviour) and makes a note of that information. That inferred information is used by the optimizer so that if the call to do_something_else(p)
gets inlined, the if (p)
check is now considered redundant, because the compiler remembers that it is not null, and so inlines the code to:
template<typename T>
void func(T* p) {
if (p->valid())
do_something(p);
else {
// inlined body of do_something_else(p) with value propagation
// optimization performed to remove null check.
use(p->m_data);
}
}
This now really does dereference a null pointer, and so code that previously appeared to work stops working.
In this example the bug is in func
, which should have checked for null first (or the callers should never have called it with null):
template<typename T>
void func(T* p) {
if (p && p->valid())
do_something(p);
else
do_something_else(p);
}
An important point to remember is that most optimizations like this are not a case of the compiler saying "ah, the programmer tested this pointer against null, I will remove it just to be annoying". What happens is that various run-of-the-mill optimizations like inlining and value range propagation combine to make those checks redundant, because they come after an earlier check, or a dereference. If the compiler knows that a pointer is non-null at point A in a function, and the pointer isn't changed before a later point B in the same function, then it knows it is also non-null at B. When inlining happens points A and B might actually be pieces of code that were originally in separate functions, but are now combined into one piece of code, and the compiler is able to apply its knowledge that the pointer is non-null in more places. This is a basic, but very important optimization, and if compilers didn't do that everyday code would be considerably slower and people would complain about unnecessary branches to re-test the same conditions repeatedly.