4

Possible Duplicate:
checking if pointer points within an array

If I have an array and a size, and I want to check if a given pointer points to an element inside an array, is there any way to do so in standard C or C++ without invoking UB?

Does this work?

bool is_inside(someType * array, int size, someType * other_pointer){
    for (int i = 0; i < size; i++)
        if (array + i == other_pointer)
            return true;
    return false;
}

edit: It is my understanding that you cannot use comparisons other than == and != for pointers not pointing to the same array without UB (although in practice it works as expected). Was I wrong in this?

Community
  • 1
  • 1
zounds
  • 773
  • 8
  • 17

4 Answers4

3

It depends on what you mean by UB.

Specifically, for pointer comparisons, section 5.9 "Relational Operators" of the C++ standard says:

If two pointers p and q of the same type point to different objects that are not members of the same object or to different functions or if only one of them is null, the results of p<q, p>q, p<=q and p>=q are unspecified.

Note that the behaviour is unspecified (meaning that the result of the comparison could be true or false - in other words the result doesn't tell you anything useful - but the implementation isn't required to specify which) as opposed to undefined (meaning that the compiler or the resulting program could do anything at all).

However, so far I have only seen one family of implementations that didn't do the expected thing with code like Kirill's:

bool inside = (other_pointer >= array) && (other_pointer < array+size);

Those implementations are compilers intended for building MS-DOS real-mode programs, in which addresses have a paragraph and offset part. The addresses FF00:0010 and FF01:0000 point to the same memory location but if I recall correctly the compilers weren't guaranteed to behave in the expected way, except when compiling for some memory models (certainly the HUGE model but perhaps others).

However, if either p or q does not point to an existing object (because for example the pointer was freed) then the behaviour is going to be undefined no matter what you do. So you can't use this kind of method to figure out if a pointer is still valid.

James Youngman
  • 3,623
  • 2
  • 19
  • 21
  • 1
    FF00:0000 translates to FF00*10+0000=FF000, while FF001:0010 would have translated, if it had been a correct pointer like FF01:0010, to FF01*10+0010=FF020. Check the math. – Alexey Frunze Jun 22 '12 at 22:40
  • Ah I was wrong. It's merely unspecified. Thanks. – zounds Jun 22 '12 at 22:41
2

Arrays are guaranteed to be contiguous in memory, so just check that the pointer is within the bounds of the address of the first element and the last.

Ed S.
  • 122,712
  • 22
  • 185
  • 265
  • 1
    It should also be checked that the difference between the pointer and the address of the array is a multiple of `sizeof(someType)`. – Alexey Frunze Jun 22 '12 at 22:21
  • 3
    @Alex: Actually, I disagree. As R. Martinho Fernandes says, that would be an invalid pointer anyway. – Ed S. Jun 22 '12 at 22:24
  • You don't want the function to return true for an invalid pointer. – Alexey Frunze Jun 22 '12 at 22:27
  • 2
    @Alex: I don't think it's any better to return false. Why do you have an invalid pointer in the first place? I'd say undefined behavior, I'm not even going to specify what happens. What functions are you aware of that specify their behavior when passed an invalid pointer? – Ed S. Jun 22 '12 at 22:29
  • @Alex : The only way to get an invalid pointer is to invoke UB, at which point all bets are off anyway. – ildjarn Jun 22 '12 at 22:30
  • I offered a defensive check as UB doesn't always result in something horrible (as in program crashing or hanging or producing apparently bogus results). – Alexey Frunze Jun 22 '12 at 22:34
  • @Alex: I believe you are missing the point. There is nothing "defensive" about your proposal, it is no more or less correct than mine. To get the invalid pointer you had to invoke UB. You cannot say "oh, well not all UB causes bad things to happen, so let's check it anyway". That's just silly. The program is already in an undefined sense, you are not making the situation any better or worse. Again, please point me to a single C or C++ standard library function which defines its behavior when passed an invalid pointer. You won't, because it makes no sense to do so – Ed S. Jun 22 '12 at 22:37
  • @Alex: furthermore, try as you might, you cannot define that behavior *because UB has already been invoked*! You cannot guarantee that the function will return the correct result in the face of previously invoked UB. It makes no sense. You're attempting to define the undefinable. – Ed S. Jun 22 '12 at 22:39
  • 1
    In most practical implementations of C and C++ the magic UB isn't all that magic, pointers are just numbers that the CPU uses as addresses. If you're writing something like kernel system call functions, you really really want to check things like this, regardless of whether or not the language standard tells you it's been UB already or there will be UB. – Alexey Frunze Jun 22 '12 at 22:44
  • @Alex: That makes absolutely no sense. You cannot reason about UB. If you decide to rely upon specific behavior that a specific compiler exhibits when targeting a specific platform when faced with UB then you are making a poor choice. The fact remains that the program is in an undefined state, and as a result you simply cannot define the behavior of your function, so it is a complete waste of time and cycles to attempt to do so. Hell, if you return true in that case then you may very well be hiding a bug in another part of your program! – Ed S. Jun 22 '12 at 22:46
  • That makes perfect practical sense. The standard doesn't want to define or deal with what can or cannot happen on a specific hardware. That's fine. I understand that. – Alexey Frunze Jun 22 '12 at 22:49
  • @Alex: I know you do, I didn't mean to come off sounding like you didn't (yet that's exactly what I did...). I just don't think there is any benefit to what you propose. I actually think it may do more harm than good because it could silently hide an error in the calling code. – Ed S. Jun 22 '12 at 22:50
  • Or it may not hide it. If the extra check I proposed is augmented with calling `terminate()` (or whatever it is), then the error will more likely be revealed right there than not. – Alexey Frunze Jun 22 '12 at 22:54
  • You can get invalid pointers by casting integers without invoking UB. Of course, that's only sensible to do for constant expressions with value 0 and integers you obtained by casting pointers (if the integer type is large enough), but still. So checking the validity isn't entirely superfluous. – Daniel Fischer Jun 22 '12 at 22:56
0

Your solution will work but is not optimal as it contains an unnecessary loop.

What you should do, is to take a pointer to the array, the array size in bytes, and check if the pointed address falls within the address space occupied by that array.

zxcdw
  • 1,629
  • 1
  • 10
  • 18
  • 1
    It should also be checked that the difference between the pointer and the address of the array is a multiple of `sizeof(someType)`. – Alexey Frunze Jun 22 '12 at 22:21
  • 2
    The element size is unnecessary (hint: it's given by the mention of `someType` in the arguments). – R. Martinho Fernandes Jun 22 '12 at 22:22
  • 1
    @Alex That is only the case where the array itself starts from an address which is a multiple of `sizeof(someType)`. – zxcdw Jun 22 '12 at 22:27
  • Pointing into the "middle" of an array element isn't a good thing irrespective of where the array starts. – Alexey Frunze Jun 22 '12 at 22:29
  • How would you do it portably without a loop? – CB Bailey Jun 22 '12 at 22:29
  • @CharlesBailey Simply by comparing if the pointer is within the space occupied by the array **and** it's address-array's address is a multiple of element size. If it is not, then the pointer is pointing in the "middle" of the element, which isn't a good thing as Alex mentions. Or am I missing something which makes it unportable? – zxcdw Jun 22 '12 at 22:33
  • 1
    @zxcdw: No. You should not care about invalid pointers (nor could you do anything sensible in the face of them). If you are passed an invalid pointer then the behavior is already undefined as you had to invoke UB to get the invalid pointer in the first place – Ed S. Jun 22 '12 at 22:34
  • @zxcdw: But how? You can't use `<` because of the unspecified behavior if the given object isn't an element of the array. – CB Bailey Jun 22 '12 at 22:38
  • @zxcdw: You guys are missing a crucial point. You cannot define the behavior of your function when UB has already been invoked. The program itself is in an undefined state already, yet you wish to claim that your function is perfectly well defined in the face of that fact. You cannot. – Ed S. Jun 22 '12 at 22:41
0

You can simply write:

bool inside = (other_pointer >= array) && (other_pointer < array+size);

This is legal check.

Kirill Kobelev
  • 10,252
  • 6
  • 30
  • 51
  • 1
    It should also be checked that the difference between the pointer and the address of the array is a multiple of `sizeof(someType)`. – Alexey Frunze Jun 22 '12 at 22:22
  • 1
    Also, beware of overflows in addition and subtraction. – Alexey Frunze Jun 22 '12 at 22:22
  • @Alex what overflows? Pointers are not numbers. – R. Martinho Fernandes Jun 22 '12 at 22:25
  • @R.MartinhoFernandes you are doing `array+size`. That can overflow. – Alexey Frunze Jun 22 '12 at 22:25
  • 1
    @Alex No, it can't. Only *numbers* can overflow. – R. Martinho Fernandes Jun 22 '12 at 22:27
  • The result of that is unspecified if it doesn't point into the array, see §5.9/2, you should use `std::less` instead. – Jonathan Wakely Jun 22 '12 at 22:27
  • @R.MartinhoFernandes Think again. – Alexey Frunze Jun 22 '12 at 22:27
  • 4
    @Alex "If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined." from §5.7/6. – R. Martinho Fernandes Jun 22 '12 at 22:28
  • 1
    @R.MartinhoFernandes Oh, we're assuming the correct size given. In that case, you're right. – Alexey Frunze Jun 22 '12 at 22:30
  • @Alex: You must assume size is correct; making sure it is correct is a burden that should be placed on the caller. If `size` is incorrect what would the function do about it exactly? How could it know? – Ed S. Jun 22 '12 at 22:32
  • 2
    Assuming an incorrect size was given, the only sensible implementation of that function would be `std::terminate()`. There's nothing you can do. – R. Martinho Fernandes Jun 22 '12 at 22:33
  • 1
    @Alex : If you have a secret to developing some API that is smart enough to know when it's being lied to, please enlighten the rest of us. ;-] – ildjarn Jun 22 '12 at 22:33
  • 4
    @JonathanWakely: I believe that `std::less` for pointers is only guaranteed to be a suitable order for use in sets, etc. It isn't guaranteed to be suitable for determining whether and object is an element of an array. Arrays could appear interleaved. – CB Bailey Jun 22 '12 at 22:35
  • @ildjarn Other than checking whether the inputs make sense according to the information already available, I can't offer anything else. – Alexey Frunze Jun 22 '12 at 22:36
  • @Charles : But in order to be useful in `set<>`, `map<>`, etc., it must produce a strict weak ordering, which is all that's needed here. – ildjarn Jun 22 '12 at 22:38
  • @ildjarn: But it could be a completely different ordering to the limited ordering that `<` must give to individual arrays. – CB Bailey Jun 22 '12 at 22:41
  • @Alex: Nothing can be determined to make sense in a program that has invoked UB previously. That is the point. – Ed S. Jun 22 '12 at 22:42
  • @Charles : Ah, I finally understand what you mean (based on your comment on zxcdw's answer). Sorry for the noise. :-] – ildjarn Jun 22 '12 at 22:43