I'm trying to understand a problem we cleared recently when using Clang 5.0 and Undefined Behavior Sanitizer (UBsan). We have code that processes a buffer in the forward or backward direction. The reduced case is similar to the code shown below.
The 0-len (written as inc = 0-inc in the reduced case) may look a little unusual, but it is needed for early Microsoft .Net compilers. Clang 5.0 and UBsan produced integer overflow findings:
adv-simd.h:1138:26: runtime error: addition of unsigned offset to 0x000003f78cf0 overflowed to 0x000003f78ce0
adv-simd.h:1140:26: runtime error: addition of unsigned offset to 0x000003f78ce0 overflowed to 0x000003f78cd0
adv-simd.h:1142:26: runtime error: addition of unsigned offset to 0x000003f78cd0 overflowed to 0x000003f78cc0
...
Lines 1138, 1140, 1142 (and friends) are the increment, which may stride backwards due to the 0-len:
ptr += inc;
According to "Pointer comparisons in C. Are they signed or unsigned?" (which also discusses C++), pointers are neither signed nor unsigned. Our offsets were unsigned, and we relied on unsigned integer wrap to achieve the reverse stride.
The code was fine under GCC UBsan and under UBsan in Clang 4 and earlier. We eventually cleared it for Clang 5.0 with help from the LLVM devs. Instead of size_t we needed to use ptrdiff_t.
My question is, where was the integer overflow/undefined behavior in the construction? How did ptr + <unsigned> result in signed integer overflow and lead to undefined behavior?
Here is an MCVE that mirrors the real code.
#include <cstddef>
#include <cstdint>
using namespace std;

uint8_t buffer[64];

int main(int argc, char* argv[])
{
    uint8_t * ptr = buffer;
    size_t len = sizeof(buffer);
    size_t inc = 16;

    // This sets up processing the buffer in reverse.
    // A flag controls it in the real code.
    if (argc%2 == 1)
    {
        ptr += len - inc;
        inc = 0-inc;
    }

    while (len > 16)
    {
        // process blocks
        ptr += inc;
        len -= 16;
    }

    return 0;
}