
In C, pointer arithmetic and comparisons are said to be well defined as long as the pointers refer to elements of the same array or to one element past its end. What about one element before the first element of the array? Is that okay as long as I do not dereference it?

Given

int a[10], *p;
p = a;

(1) Is it legal to write --p?

(2) Is it legal to write p-1 in an expression?

(3) If (2) is okay, can I assert that p-1 < a?

There is a practical concern behind this. Consider a reverse() function that reverses a C string terminated by '\0'.

#include <stdio.h>

void reverse(char *p)
{
    char *b, t;

    b = p;
    while (*p != '\0')
        p++;
    if (p == b)      /* Do I really need */
        return;      /* these two lines? */
    for (p--; b < p; b++, p--)
        t = *b, *b = *p, *p = t;
}

int main(void)
{
    char a[] = "Hello";

    reverse(a);
    printf("%s\n", a);
    return 0;
}

Do I really need to do the check in the code?

Please share your ideas from both the language-lawyer and the practical perspective, and how you would cope with such situations.

aafulei
    (1) It might be legal to write it, but the result is undefined behaviour if you execute it. With segmented architectures (Intel 80286, 80386, etc), the result might be completely befuddling. (2) Ditto. (3) N/A, but the answer is no. For your hardware and your o/s, you might well be safe, but the C standard does not guarantee it. – Jonathan Leffler Feb 11 '20 at 07:28
    Does this answer your question? [What are all the common undefined behaviours that a C++ programmer should know about?](https://stackoverflow.com/questions/367633/what-are-all-the-common-undefined-behaviours-that-a-c-programmer-should-know-a) Look specifically under the section for pointers. – Isaiah Mar 10 '20 at 02:10

2 Answers


(1) Is it legal to write --p?

It's "legal" as in the C syntax allows it, but it invokes undefined behavior. For the purpose of finding the relevant section in the standard, --p is equivalent to p = p - 1 (except p is only evaluated once). Then:

C17 6.5.6/8

If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined.

The evaluation itself invokes undefined behavior, so it does not matter whether you dereference the pointer or not: the undefined behavior has already happened by the time the pointer value is computed.

Furthermore:

C17 6.5.6/9:

When two pointers are subtracted, both shall point to elements of the same array object, or one past the last element of the array object;

If your code violates a "shall" that appears outside of a constraint section in the ISO standard, it invokes undefined behavior.

(2) Is it legal to write p-1 in an expression?

Same as (1), undefined behavior.
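For illustration, here is a minimal sketch of how a backward scan can stay within the rule quoted above. The helper name find_last and its parameters are made up for this example. It starts from the one-past-the-end pointer, decrements before each access, and reports "not found" with NULL instead of a pointer in front of the array.

```c
#include <stddef.h>

/* Hypothetical helper: returns a pointer to the last element of a[0..n-1]
   that equals value, or NULL if there is none. */
const int *find_last(const int *a, size_t n, int value)
{
    const int *p = a + n;   /* one past the end: valid to form and to compare */

    while (p != a) {        /* stop before p could ever move below a */
        --p;                /* p was above a, so it now points at a real element */
        if (*p == value)
            return p;
    }
    return NULL;            /* no sentinel pointer "before the array" is needed */
}
```

With int a[10] as in the question, find_last(a, 10, x) never evaluates a - 1, not even when x is absent or when n is 0.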


As for examples of how this could cause problems in practice: imagine that the array is placed at the very beginning of a valid memory page. When you decrement outside that page, there could be a hardware exception or a pointer trap representation. This isn't a completely unlikely scenario for microcontrollers, particularly when they are using segmented memory maps.
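Applied to the reverse() function from the question, one possible rewrite is sketched below. It is only an assumption about how the loop could be restructured, not the code behind any link on this page. The end pointer is decremented only while at least two characters remain between the two pointers, so it never goes below the start of the string and no separate empty-string check is needed.

```c
#include <string.h>

/* Reverses the NUL-terminated string s in place. */
void reverse(char *s)
{
    char *b = s;               /* first character not yet swapped */
    char *e = s + strlen(s);   /* just past the last character not yet swapped;
                                  initially points at the terminating '\0' */

    while (e - b > 1) {        /* at least two characters left to exchange */
        char t;

        --e;                   /* e > b >= s here, so e stays inside the array */
        t = *b;
        *b = *e;
        *e = t;
        ++b;
    }
}
```

For an empty string, b and e are equal, the loop body never runs, and no pointer before s is formed. For a string of odd length, the middle character is left untouched.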

Lundin
  • Could you please provide some comments to the example code in the question? Is the check necessary / proper / sufficient? – aafulei Feb 11 '20 at 07:38
  • It is possible that `*p == '\0'` at the beginning. This check intends to prevent from `p--` in the for loop. – aafulei Feb 11 '20 at 07:45
  • @aafulei Yeah I realized. You do need the extra check because of the badly written loop. The correct fix is to rewrite the loop, example: https://godbolt.org/z/R4TuwT – Lundin Feb 11 '20 at 07:50
  • Thank you for your code. It's a clever one. I would suggest you put it in your answer so that more people can see it. The only thing is that when there is an odd number of characters in the string (not counting `'\0'`), the middle character gets swapped with itself at the end. But that's fine. Also, please bear with me a little longer for cross-validation before I accept the answer. – aafulei Feb 11 '20 at 08:08
  • If `p-1` in an expression is invalid, `p=p-1` would be invalid. And `p--` is `p=p-1`. Would you argue that decrementing a pointer is invalid? – harper Feb 11 '20 at 08:16
  • @harper All three cases are undefined behavior, given that p points at the first element of the array. – Lundin Feb 11 '20 at 10:27
  • @Lundin So you say: decrementing a pointer is undefined behavior if it accidentally points to an array? This must be a misunderstanding. How would you make sure that a pointer does not point to an array that you don't know? – harper Mar 08 '20 at 12:31
  • @harper No, pointer arithmetic is _only_ defined for arrays. A pointer to a single item is regarded as an array with size 1. Therefore all your examples are UB _unless_ `p` points in the middle of an array, then they are fine. – Lundin Mar 09 '20 at 07:34

Using that kind of pointer arithmetic is bad coding practice, as it can lead to a whole class of hard-to-debug problems.

I have had to use this kind of thing only once in more than 20 years. I was writing a callback function, but I did not have access to the proper data. The calling function provided a pointer into a proper array, and I needed the byte just before that pointer.

Since I had access to the entire source code, had verified the behavior several times to prove that I got what I needed, and had it reviewed by colleagues, I decided it was OK to let it go to production.

The proper solution would have been to change the calling function to pass the proper pointer, but that was not feasible given the time and money involved (that part of the software was licensed from a third party).

So, a[-1] is possible, but it should be used ONLY with great care and in very particular situations. Otherwise, there is no good reason to do that kind of self-hurting Voodoo ever.


Note: on closer analysis, it is obvious that in my example I did not access an element before the beginning of a proper array, but the element before a pointer that was guaranteed to point inside that same array.


Referring to the code provided:

  • it is NOT OK to use p[-1] with reverse(a);
  • it is OK(-ish) to use it with reverse(a+1), because you remain inside the array.
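To make the distinction concrete, here is a small sketch with a hypothetical helper named byte_before, which is not part of the original code. It reads the element in front of the pointer it is given, which is only defined when the caller guarantees that the pointer does not point at the first element of its array.

```c
/* Hypothetical callback-style helper: the caller must guarantee that p points
   at least one element into an array (or one past its end), never at the
   first element. */
unsigned char byte_before(const unsigned char *p)
{
    return p[-1];   /* same as *(p - 1); defined only if p - 1 is in the same array */
}
```

Given unsigned char buf[4], calling byte_before(buf + 1) reads buf[0] and is fine, whereas byte_before(buf) would evaluate buf - 1 and is undefined behavior.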
virolino