15

I'm looking to implement a function that determines whether a given pointer points into a given buffer. The specification:


template <typename T>
bool points_into_buffer (T *p, T *buf, std::size_t len);

If there is some n, 0 <= n && n < len, for which p == buf + n, returns true.

Otherwise, if there is some n, 0 <= n && n < len * sizeof(T), for which reinterpret_cast<char *>(p) == reinterpret_cast<char *>(buf) + n, the behaviour is undefined.

Otherwise, returns false.


The obvious implementation would look something like

template <typename T>
bool points_into_buffer (T *p, T *buf, std::size_t len) {
    return p >= buf && p < buf + len;
}

but that has undefined behaviour in standard C++: relational comparisons of pointers are only defined for pointers into the same array.

An alternative would be to use the standard library's comparer objects:

template <typename T>
bool points_into_buffer (T *p, T *buf, std::size_t len) {
    return std::greater_equal<T *>()(p, buf) && std::less<T *>()(p, buf + len);
}

which is guaranteed to return true when I want it to return true, and avoids undefined behaviour, but allows for false positives: given int a; int b;, it allows a result of true for points_into_buffer(&a, &b, 1).

It can be implemented as a loop:

template <typename T>
bool points_into_buffer (T *p, T *buf, std::size_t len) {
    for (std::size_t i = 0; i != len; i++)
        if (p == buf + i)
            return true;
    return false;
}

However, compilers have trouble optimising away that loop.

Is there a valid way of writing this, where with current compilers and optimisations enabled, the result is determined in constant time?

  • What makes you think that using comparsion functors does not incur undefined behaviour? They are defined to use the plain relational operators, so there should be no difference here. – Horstling Jan 04 '15 at 14:57
  • 3
    @Horstling C++11 [comparisons]p8 "For templates `greater`, `less`, `greater_equal`, and `less_equal`, the specializations for any pointer type yield a total order, even if the built-in operators `<`, `>`, `<=`, `>=` do not." –  Jan 04 '15 at 14:58
  • @hvd Interesting (and a bit counterintuitive). Didn't know that one, thx. – Horstling Jan 04 '15 at 15:01
  • I think the answer is you don't: C++ permits strange memory address systems that are not simple linear ones, outside of pointers into same data structure. I would be interested in the architecure that the `std::less` solution breaks on. – Yakk - Adam Nevraumont Jan 04 '15 at 15:09
  • @Yakk Some segmented architecture, where the segment is effectively stored in the low-order bits, might break it, but I don't know of any real-world implementation that does that. –  Jan 04 '15 at 15:14
  • @hvd even if std::less always yields a total order, it doesn't mean the result will be correct with your use. Same problem as with my answer. – Jerem Jan 04 '15 at 15:51
  • @JeremyLaumon Yes, that's why I pointed out in my question that it can give false positives. If that were a safe approach, I wouldn't have asked this question. :) (It cannot give false negatives, by the way. When the result of `p < q` is defined, `std::less()(p, q)` must give that same result, and similarly for the other operators.) –  Jan 04 '15 at 16:00
  • How about an assert that adjacent pointers produce adjacent integers when cast to `unit_ptr_t`? Still not perfect. – Yakk - Adam Nevraumont Jan 04 '15 at 17:09
  • 2
    This doesn't help at all from a language lawyer's point of view but it might be comforting to know in practice that even the GNU portability library (gnulib) [makes the same assumption](https://www.gnu.org/software/gnulib/manual/html_node/Portability-guidelines.html#Portability-guidelines) and it doesn't seem to have been reported yet that this breaks on a real-world platform. – 5gon12eder Jan 04 '15 at 17:14
  • I think you can vet known implementations and validate others: if the pointers to first and end, cast to uintptr, produce a difference equal to the exoected byte count, then there is no room for an integer that would be between them that is not part of the array. Document assumptions and ideally include a static_assert. – JDługosz Feb 22 '15 at 23:51
  • 1
    @jdlugosz That again assumes a linear address space, which is common, but not universal, and something I wanted to avoid assuming. –  Feb 22 '15 at 23:54
  • It *verifies* that the array in question is contiguous in the manner needed by the simple exclusion test. If that's true then you test the ends only and if not you need to compare every element address in a loop. – JDługosz Feb 22 '15 at 23:58
  • Fwiw, if some system did have non-consecutive addresses, the internal, implementation of pointer comparisons would contain logic to deal, with that, and the library ought to give non-standard helper functions to call for such work. But the gnu site indicates that there aren't any at present, and any future system will be under pressure to hide the funny stuff inside the cast to uintptr_t so that common library code works. – JDługosz Feb 23 '15 at 00:03
  • @jdlugosz There have already been systems that don't have a simple linear address space, and the GNU site doesn't claim otherwise. (At the very least, there's x86 segment+offset "fat" pointers, both for real mode and for those few operating systems that used segments in protected mode.) It only says they're not aware of any that's a "practical porting target", which is probably true. –  Feb 23 '15 at 00:09
  • I remember 80286 protected mode (Turbo C++, Watcom) and 8086 "Large" compiler mode. The latter was ordered but not consecutive. I don't recall how the former was arranged for arrays > 64k, but it was a common targer at one point and lots of code was written for it. With a system of 4MB of memory, maybe I just never had any buffers/arrays larger than 64K ? – JDługosz Feb 23 '15 at 00:22
  • I would suppose that anyone who made a new architecture with funny details (for good reasons I'm sure) that was not very tiny and wanted to run the common body of open software would take it upon themselves to make gnu compilers work with existing code. – JDługosz Feb 23 '15 at 00:27
  • @jdlugosz That's one option. Another is that they take it upon themselves to port common software to their platform, re-writing code that makes not-fully-portable assumptions. Either way, it's probably something that won't cause problems for code written by you or me, unless we actually end up using such a platform ourselves. –  Feb 23 '15 at 00:33

2 Answers2

8

As far as I can tell, this is a portable implementation of the function I'm after for all possible implementations:

#ifdef UINTPTR_MAX

bool points_into_buffer(std::uintptr_t p, std::uintptr_t buf, std::size_t len)
{
  const auto diff = p + 0u - buf;
  if (diff < len)
    // #1
    if (reinterpret_cast<char *>(p) == reinterpret_cast<char *>(buf) + diff)
      return true;
  for (std::size_t n = 0; n != len; n++)
    if (reinterpret_cast<char *>(p) == reinterpret_cast<char *>(buf) + n)
      // #2
      if (reinterpret_cast<char *>(p) - reinterpret_cast<char *>(buf) != diff)
        return true;
  return false;
}

template <typename T>
bool points_into_buffer(T *p, T *buf, std::size_t len)
{
  return points_into_buffer(reinterpret_cast<std::uintptr_t>(p),
                            reinterpret_cast<std::uintptr_t>(buf),
                            len * sizeof(T));
}

#else

template <typename T>
bool points_into_buffer(T *p, T *buf, std::size_t len)
{
  for (std::size_t n = 0; n != len; n++)
    if (p == buf + n)
      return true;
  return false;
}

#endif

In general, diff is not guaranteed to have a meaningful value. But that's okay: the function returns true if and only if it finds some n such that reinterpret_cast<char *>(p) == reinterpret_cast<char *>(buf) + n. It only uses diff as a hint to find the value of n faster.

It relies on the compiler optimising conditions that are not necessarily known at compile time in general, but are known at compile time for a particular platform. The conditions for the if statements marked as #1 and #2 are determined by GCC at compile time to always be true and false respectively, because of how diff is defined, allowing GCC to see that no useful action is performed inside the loop, and allowing the entire loop to be dropped.

The generated code for points_into_buffer<char> and points_into_buffer<int> looks like:

bool points_into_buffer(char*, char*, unsigned int):
        movl    4(%esp), %edx
        movl    $1, %eax
        movl    12(%esp), %ecx
        subl    8(%esp), %edx
        cmpl    %edx, %ecx
        ja      L11
        xorl    %eax, %eax
L11:    rep ret

bool points_into_buffer(int*, int*, unsigned int):
        movl    4(%esp), %edx
        movl    12(%esp), %eax
        subl    8(%esp), %edx
        leal    0(,%eax,4), %ecx
        movl    $1, %eax
        cmpl    %edx, %ecx
        ja      L19
        xorl    %eax, %eax
L19:    rep ret

On systems where std::uintptr_t is not available, or where addresses are more complicated than simple integers, the loop is used instead.

1

If you cast the pointers to large enough unsigned integers and add number of bytes instead of number of objects, the undefined behavior goes away.

template <typename T>
bool points_into_buffer (T *p, T *buf, std::size_t len) {
    uintptr_t ip = (uintptr_t)p;
    uintptr_t ibuf = (uintptr_t)buf;
    return ip >= ibuf && ip < (ibuf + sizeof(T) * len);
}

This code doesn't detect if p is not correctly aligned, but you could easily add a test with a %.

Jerem
  • 1,725
  • 14
  • 24
  • 3
    This is not guaranteed to work on all implementations. You're assuming a linear address space, which is common, but not universal. –  Jan 04 '15 at 15:11
  • Good point. It's a defined behavior (actually implementation defined), but it may not be a correct behavior. I'm not sure there is a platform independent solution then. – Jerem Jan 04 '15 at 15:48