11

Is it safe in C to keep a pointer out-of-bounds (without dereferencing it) for further arithmetic?

void f(int *array)
{
    int *i = array - 1;  // OOB

    while(...) {
        ++i;
        ...
    }
}

void g(int *array, int *end /* past-the-end pointer: OOB */)
{
    while(array != end) {
        ...
        ++array;
    }
}

I can imagine some extreme cases, e.g. if the array starts at the first address in memory or ends at the last one...

rafoo

3 Answers

10

Moving a pointer to one element past the last element is allowed, but moving it any further, or moving it before the first element, is not allowed.

Quote from N1570 6.5.6 Additive operators (point 8):

When an expression that has integer type is added to or subtracted from a pointer, the result has the type of the pointer operand. If the pointer operand points to an element of an array object, and the array is large enough, the result points to an element offset from the original element such that the difference of the subscripts of the resulting and original array elements equals the integer expression. In other words, if the expression P points to the i-th element of an array object, the expressions (P)+N (equivalently, N+(P)) and (P)-N (where N has the value n) point to, respectively, the i+n-th and i−n-th elements of the array object, provided they exist. Moreover, if the expression P points to the last element of an array object, the expression (P)+1 points one past the last element of the array object, and if the expression Q points one past the last element of an array object, the expression (Q)-1 points to the last element of the array object. If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined. If the result points one past the last element of the array object, it shall not be used as the operand of a unary * operator that is evaluated.
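As a minimal sketch of what the quoted paragraph permits, the loop below walks a half-open range [begin, end), where end is the one-past-the-end pointer. The function name sum_range is hypothetical and only serves the illustration; the point is that end is compared against but never dereferenced.

```c
/* Hypothetical helper: sums the elements in the half-open range [begin, end).
   end may be the one-past-the-end pointer of the array. */
long sum_range(const int *begin, const int *end)
{
    long total = 0;
    for (const int *p = begin; p != end; ++p) {
        total += *p;   /* p != end here, so p points to a real element */
    }
    return total;      /* the loop stops when p == end; end is never dereferenced */
}
```

For example, with int a[] = {1, 2, 3, 4}, the call sum_range(a, a + 4) evaluates to 10, and a + 4 is only ever used in the comparison.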

ilkkachu
MikeCAT
  • So, I imagine that's why C++ can use algorithm templates (working with past-the-end iterators) with pointers. Thx – rafoo Jan 10 '21 at 00:32
4

A pointer may point to one element past the last element of the array, and pointer arithmetic may be done between that pointer and a pointer to an element of the array.

Such a pointer cannot be dereferenced, but it can be used in pointer arithmetic. For example, the following is valid:

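char arr[10];
char *p1, *p2;
p1 = arr + 10;               // one past the last element: valid, but must not be dereferenced
p2 = arr + 5;                // points to arr[5]
ptrdiff_t diff = p1 - p2;    // pointer subtraction yields ptrdiff_t (from <stddef.h>)
printf("diff=%td\n", diff);  // prints diff=5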

A pointer may not point before the first element.

This is spelled out in section 6.5.6p8 of the C standard:

When an expression that has integer type is added to or subtracted from a pointer, the result has the type of the pointer operand. If the pointer operand points to an element of an array object, and the array is large enough, the result points to an element offset from the original element such that the difference of the subscripts of the resulting and original array elements equals the integer expression. In other words, if the expression P points to the i-th element of an array object, the expressions (P)+N (equivalently, N+(P)) and (P)-N (where N has the value n) point to, respectively, the i+n-th and i−n-th elements of the array object, provided they exist. Moreover, if the expression P points to the last element of an array object, the expression (P)+1 points one past the last element of the array object, and if the expression Q points one past the last element of an array object, the expression (Q)-1 points to the last element of the array object. If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined. If the result points one past the last element of the array object, it shall not be used as the operand of a unary * operator that is evaluated.

Note the sentence beginning with "Moreover": it states that a pointer may be created to point one element past the end of the array. Nothing in the paragraph allows a pointer to point before the start of the array.
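One way to avoid the array - 1 starting point from the question's f() is to start at the first element and advance at the end of each iteration. The following is only a sketch under that assumption; the function find_first_negative and its size parameter n are hypothetical, since the question elides the loop body and condition.

```c
#include <stddef.h>

/* Hypothetical rewrite of the question's f(): the cursor starts at array
   itself instead of array - 1, so it never leaves [array, array + n]. */
const int *find_first_negative(const int *array, size_t n)
{
    const int *end = array + n;   /* one past the end: allowed */
    for (const int *i = array; i != end; ++i) {
        if (*i < 0)
            return i;             /* points to a real element */
    }
    return end;                   /* not found: caller must not dereference this */
}
```

The caller compares the result against array + n to tell "not found" apart from a hit, which mirrors how past-the-end iterators are used in C++.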

dbush
0

As others have pointed out, you are allowed to point one past the end. But remember that it is NOT allowed to point one element before the first. So be careful when you write algorithms that traverse arrays backwards, because this snippet is invalid:

void foo(int *arr, int *end) {
    while(end-- != arr) { // Ouch, bad idea...
        // Code
    }
    // Here, end would be arr - 1 (one before the first element), which is UB
}

That's because, when end points at the same element as arr, the condition will be false, but after that, end is decremented once more and will point to one element before the array, thus invoking undefined behavior.

Do note that apart from that, the code works fine. To fix the bug, you can do this instead:

void foo(int *arr, int *end) {
    while(end != arr) {
        end--; // The decrement now happens inside the loop, at the very beginning

        // Code
    }

    // And here, end is equal to arr, which is perfectly fine
}

The code in the loop will work exactly as before. The only difference is that end will not be decremented the last time.
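Another option, sketched here as an alternative rather than as part of the fix above, is to traverse backwards with an unsigned index instead of a pointer. The helper last_index_of is hypothetical. Unlike a pointer stepping before the array, a size_t wrapping around below zero is well-defined, and the wrapped value is never used as an index.

```c
#include <stddef.h>

/* Hypothetical helper: returns the index of the last element equal to value,
   or n if there is none. */
size_t last_index_of(const int *arr, size_t n, int value)
{
    /* i-- > 0 tests the old value of i, then decrements, so the body
       sees i = n-1, n-2, ..., 0. No pointer before arr is ever formed. */
    for (size_t i = n; i-- > 0; ) {
        if (arr[i] == value)
            return i;
    }
    return n;
}
```

With int a[] = {7, 2, 7, 9}, last_index_of(a, 4, 7) returns 2, and a value that is absent yields 4.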

klutt
  • Did you mean end--? As written, the while loop is exited when the decremented end == arr, so arr[0] is never accessed in `Code`, and after the loop end == arr. (Your final paragraph describes postdecrement) On the other hand, does the postdecrement invoke undefined behaviour, as that illegal pointer value is never used (neither dereferenced nor for comparison)? That is, is there "evaluation" of it? – drRobertz Jan 10 '21 at 11:17
  • @drRobertz But yes, it does invoke UB, even though you're not dereferencing it. That's why this is so dangerous. – klutt Jan 10 '21 at 11:19
  • yes, dereferencing or comparing to the pointer would invoke UB, but this program (assuming that there is no code after the loop at `//Here, end has the value arr[-1]`) does not use `end`, when it has an illegal value, in any way. That would mean that `(void) --end` could have UB. I never thought of this, maybe should be a language-lawyer question. – drRobertz Jan 10 '21 at 11:42
  • @drRobertz Read what the standard says. *"If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined."* If you point to *one before*, you're not pointing at *the same* object. – klutt Jan 10 '21 at 18:04