12

I often see code that adds a value, such as a length, to a pointer and then uses the result, e.g.

T* end = buffer + bufferLen; // T* + size_t

if (p < end)

However, is it possible for the buffer to have been allocated close enough to the end of memory that "buffer + bufferLen" overflows (e.g. 0xFFFFFFF0 + 0x10), so that "p < end" is false even though p is a valid element address (e.g. 0xFFFFFFF8)?

If it is possible, how can it be avoided, given that so many things work with a begin/end range where end is the element after the last one?
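To make the scenario concrete, here is a minimal sketch that models 32-bit addresses as plain unsigned integers. It is only a model: the helper names `end_address` and `naive_in_range` are hypothetical, and real pointer arithmetic that ran off the end of the address space would be undefined behavior rather than a well-defined wrap.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: models a 32-bit address as an unsigned integer.
// Unsigned arithmetic wraps modulo 2^32, which mimics what a flat
// 32-bit address computation would do in hardware.
std::uint32_t end_address(std::uint32_t buffer, std::uint32_t byte_len) {
    return buffer + byte_len;  // wraps around on overflow
}

// Returns true when the naive "p < end" test accepts address p.
bool naive_in_range(std::uint32_t p, std::uint32_t buffer, std::uint32_t byte_len) {
    assert(p >= buffer);  // p is assumed to lie at or after the buffer start
    return p < end_address(buffer, byte_len);
}
```

With the numbers from the question, `end_address(0xFFFFFFF0, 0x10)` wraps to 0, so `naive_in_range(0xFFFFFFF8, 0xFFFFFFF0, 0x10)` rejects an address that lies inside the buffer.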

Wug
Fire Lancer

4 Answers

9

From the standard:

5.9 Relational operators [expr.rel]

If two pointers point to elements of the same array or one beyond the end of the array, the pointer to the object with the higher subscript compares higher.

So you don't need to worry; a conformant implementation will ensure that the past-the-end pointer compares correctly to the rest of the array. In addition,

3.7.4.1 Allocation functions [basic.stc.dynamic.allocation]

[...] The pointer returned shall be suitably aligned so that it can be converted to a pointer of any complete object type with a fundamental alignment requirement (3.11) and then used to access the object or array in the storage allocated [...]

The implication is that the pointer returned should be able to be treated as the pointer to the beginning of an array of appropriate size, so 5.9 continues to hold. This would be the case if the allocation function call is the result of calling operator new[] (5.3.4:5).
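As an illustration of that guarantee, the following sketch checks that every element address of an array compares below the one-past-the-end pointer. The function name `count_below_end` is hypothetical and exists only for this example.

```cpp
#include <cassert>
#include <cstddef>

// Counts how many element addresses in [buffer, buffer + len) pass "p < end".
// On a conforming implementation the result is always len.
std::size_t count_below_end(const int* buffer, std::size_t len) {
    assert(buffer != nullptr);
    const int* end = buffer + len;  // one past the last element: valid to form and compare
    std::size_t count = 0;
    for (std::size_t i = 0; i < len; ++i) {
        const int* p = buffer + i;  // address of element i
        if (p < end) {
            ++count;
        }
    }
    return count;
}
```

For a buffer obtained from `new int[16]`, calling `count_below_end(buffer, 16)` should return 16, because all pointers involved refer to the same array or to one past its end.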

As a practical matter, if you're on a platform where it is conceivable for the allocator to (non-conformantly) return a block of memory ending at 0xFFFFFFFF, you could in most cases write

if (p != end)
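A minimal sketch of that loop shape follows; `sum_range` is a hypothetical name. The caveat behind "in most cases" is that an inequality test only terminates correctly when p advances one element at a time and therefore lands exactly on end.

```cpp
#include <cassert>

// Sums the half-open range [begin, end) using an inequality test,
// so the loop never relies on the ordering of begin and end.
long sum_range(const int* begin, const int* end) {
    assert(begin != nullptr && end != nullptr);
    long total = 0;
    for (const int* p = begin; p != end; ++p) {
        total += *p;  // end itself is never dereferenced
    }
    return total;
}
```

This is the same termination test that standard iterator loops use, which is why it also works for iterators that do not support `<`.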
ecatmur
  • Interesting; does this imply that the last memory address can never be used (similar to `0`)? On x86_64/x86, nobody cares, but on embedded devices a byte could be "a lot". – bitmask Aug 20 '12 at 13:06
  • There's another quote in § 24.2.1.7 - *"range [i,j) refers to the elements in the data structure starting with the element pointed to by i and up to **but not including the element pointed to by j**. Range [i,j) is valid if and only if j is reachable from i."* – Flexo Aug 20 '12 at 13:07
  • 1
    Though it sounds related does this actually apply to the situation the OP is describing? Seeing as T is not an array. – Andreas Brinck Aug 20 '12 at 13:08
  • @AndreasBrinck if `buffer` was the result of `new[]` then it points to the beginning of an array (5.3.4:5). – ecatmur Aug 20 '12 at 13:20
  • @bitmask I think you're right; there's no way for the allocation function to know whether it's being called for a scalar object (which can occupy `0xff...fff`) or an array object (which cannot). – ecatmur Aug 20 '12 at 13:25
  • @ecatmur The result of `new[]` is not of array type, but your answer is probably right anyway (although I'm not 100% sure, you have a +1 from me). – Andreas Brinck Aug 20 '12 at 13:40
  • @AndreasBrinck indeed, 5.3.4:5 *when the allocated object is an array, the new-expression yields a pointer to the initial element of the array* - which is what I said :) – ecatmur Aug 20 '12 at 13:55
  • I was looking at what the VC10 compiler outputs, and from what I can tell with T=int, the addition was a single lea instruction, the if was a cmp and a jae, and the new int[bufferLen] allocation just went to "malloc(size)". So how are these guarantees actually met in modern OSes/compilers? Is Windows leaving some VM invalid that is larger than it thinks a single object will ever be (say 4KB?), or is something else going on? What about other OSes? – Fire Lancer Aug 21 '12 at 00:46
  • @FireLancer it's sufficient to simply never allocate the top address in memory; this can be done by reserving as little as a byte. In practice Windows reserves at least 1GB of virtual memory for the system; see e.g. http://stackoverflow.com/questions/5680766/how-does-a-memory-map-of-a-windows-process-look-like – ecatmur Aug 21 '12 at 08:46
1

It is not possible for elements of a contiguous memory allocation to have non-contiguous addresses. end always has an address of higher value than start.

In the case that the allocation happens to end at exactly 0xFFFFFFFF for example, meaning end will be 0x00000000, this would be a bug and the code should be fixed to accommodate that scenario.

On some platforms, though, this scenario is impossible by design, and relying on that can be a reasonable compromise for the sake of simplicity. For example, I would not hesitate to write if (p < end) in a Windows user-mode application.
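If code does have to accommodate a platform where that could happen, one possible approach (an assumption for illustration, not something this answer prescribes) is to bound the loop by an element count, so that no end pointer is ever formed. The function name `contains` is hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Searches buffer[0..len) by index. No one-past-the-end pointer is
// computed, so there is no end address that could wrap around.
bool contains(const int* buffer, std::size_t len, int value) {
    assert(buffer != nullptr || len == 0);
    for (std::size_t i = 0; i < len; ++i) {
        if (buffer[i] == value) {
            return true;
        }
    }
    return false;
}
```

The trade-off is that the function no longer fits the begin/end convention used by the standard algorithms, so it suits code that already carries a pointer and a length.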

tenfour
1

True, in many [start, end) pair algorithms end points past the last valid entry. But your implementation should never dereference end; the last entry actually accessed should be end-1, which is guaranteed to be in a valid region. If your algorithm dereferences *end, that is a bug. In fact, there are test allocators that intentionally place the region on the very last bytes of a valid page, immediately followed by an unallocated region. With such allocators, an algorithm that dereferences *end will cause a protection fault.

FLG_HEAP_PAGE_ALLOCS

Turns on page heap debugging, which verifies dynamic heap memory operations, including allocations and frees, and causes a debugger break when it detects a heap error.

This option enables full page heap debugging when set for image files and standard page heap debugging when set in system registry or kernel mode.

  • Full page heap debugging (for /i) places an inaccessible page at the end of an allocation.

  • Standard page heap debugging (for /r or /k) examines allocations as they are freed.

Setting this flag for an image file is the same as typing gflags /p enable /full for the image file at the command line

As for the issue of pointer overflow: no operating system allocates the page containing virtual address (VA) 0xFFFFFFFF, in the same way that no operating system ever allocates the page containing 0x00000000. For such an overflow to occur, the size of *start would have to be big enough for start+1 to jump over all the reserved VA space at the end of the valid range. But in that case the address allocated for start must be at least one such size below the last valid VA, which implies that start+1 will be valid (it follows that start+N is also always valid as long as start was allocated as sizeof(*start)*N).
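The arithmetic behind that argument can be sketched with 32-bit addresses modeled as integers and the sum computed in 64 bits so that it cannot wrap. The function name `end_stays_in_range` and the parameter `top` (an exclusive upper limit for addresses the system hands out) are hypothetical, and the concrete limit in the usage note is illustrative only.

```cpp
#include <cassert>
#include <cstdint>

// Returns true when start + elem_size * count does not exceed top.
// The sum is computed in 64 bits, so it cannot wrap for 32-bit inputs.
bool end_stays_in_range(std::uint32_t start, std::uint32_t elem_size,
                        std::uint32_t count, std::uint32_t top) {
    assert(start <= top);  // the allocation is assumed to begin inside the valid range
    const std::uint64_t end =
        std::uint64_t{start} + std::uint64_t{elem_size} * count;
    return end <= top;
}
```

With an assumed limit of 0x7FFF0000, an allocation of four 4-byte elements at 0x7FFEFFF0 ends exactly at the limit and passes, whereas eight 4-byte elements starting at 0xFFFFFFF0 would run past 0xFFFFFFFF and fail the check.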

Remus Rusanu
-1

Don't worry about it. Your allocator (probably new, but maybe something else) won't give you something so close to the end of memory that it wraps around.

Worry about bounds checking instead. You won't ever get an allocation that wraps around like this, so as long as you don't overrun arrays (which has undefined behavior anyway), you won't end up wrapping around.

It's also useful to note that large chunks of process address space are reserved for the kernel. On most operating systems, this high-order area is reserved.

Wug
  • And I guess, we're supposed to take your word for it? – bitmask Aug 20 '12 at 13:00
  • @bitmask: seeing as memory address 0 is reserved, I'd say it follows logically that it's not going to be allocated as part of a contiguous allocation at the end of memory that wraps around to low addresses. – Wug Aug 20 '12 at 13:02
  • That doesn't stop integer addition from overflowing. – bitmask Aug 20 '12 at 13:04
  • I suppose if you're treating a non-array as an array, but that's an entirely different issue. You're not going to get a block allocation that wraps around, so as long as you're doing bounds checking correctly, this will NEVER EVER happen. – Wug Aug 20 '12 at 13:04
  • @Wug if the array allocation sits on the very edge of address space, "one past" the array will wrap to 0. – tenfour Aug 20 '12 at 13:08
  • You're not supposed to write one past the end of arrays, are you? Also, the standard seems to indicate that the one-past element will always compare higher, i.e. you won't get one where the one-past element has address 0. – Wug Aug 20 '12 at 13:09
  • @Wug - you're not supposed to *read/write* past the end, but you're very much allowed to construct a pointer to one past the end. (It's strictly UB to even construct a pointer further than that though) – Flexo Aug 20 '12 at 13:24
  • @Flexo: That's ok, but the standard indicates that such a pointer will always compare greater. In other words, as long as you're not relying on undefined behavior, you do not have to worry about it. – Wug Aug 20 '12 at 13:26