C++ Array out of Range access to calculate pointer valid?

Question

Is the following code guaranteed to work?

int* arr = new int[2];
std::cout << &arr[0x100];

Is this considered good practice or would it be cleaner to add an offset the regular way?

Edit: By "working" I mean that it should print the pointer to the theoretical member at 0x100. Basically if this is equivalent to "std::cout << ((unsigned int)arr + 0x100*sizeof(int));".

Um. Like reading 254 entries off the end of the array? Sorry, but reading past the end of an array is never clean or good practice. Either way is also UB. — Michael Dorgan, Jan 24 '18 at 21:53
@StoryTeller I don't think it's equivalent. Logically it is, but `&arr[0x100]` is equivalent to `&*(arr+0x100)`, which dereferences the address and thus is undefined behavior. `arr + 0x100` is just an address computation. — Jens, Jan 24 '18 at 21:57
I only do this to calculate the pointer to the theoretical member and never want to access any data. — iBent, Jan 24 '18 at 21:57
@Jens - This isn't what I think. This is what the C++ standard says about pointer arithmetic. — StoryTeller - Unslander Monica, Jan 24 '18 at 21:58
@StoryTeller I am not concerned about the addition which is of course valid. But then §5.2.1 includes the dereferencing of the result. — Jens, Jan 24 '18 at 22:04
@Jens - The addition is not valid, no matter how much one may think it is. Says as much in black and white over at §5.7.5 — StoryTeller - Unslander Monica, Jan 24 '18 at 22:09
@StoryTeller And you are right again. I found the reference in §8.7.4 where it says that "P + J and J + P (where J has the value j) point to the (possibly-hypothetical) element x[i + j] if 0 <= i + j <= n; otherwise, the behavior is undefined.". Thanks. — Jens, Jan 24 '18 at 22:14
@Jens - Well, you aren't far off the mark yourself. Most compilers I worked with go the length of making it defined. It's just that one should be careful when picking an implementation to read the fine print on this matter. — StoryTeller - Unslander Monica, Jan 24 '18 at 22:16

9Breaker · Answer 1 · 2018-01-24T22:59:08.780

With my compiler (Cygwin GCC), taking the address of this element gives the same value as the equivalent pointer arithmetic, although both are undefined behavior (UB). I found the following passage helpful. It comes from the article Jens links in the comment below, http://blog.llvm.org/2011/05/what-every-c-programmer-should-know.html:

It is also worth pointing out that both Clang and GCC nail down a few behaviors that the C standard leaves undefined. The things I'll describe are both undefined according to the standard and treated as undefined behavior by both of these compilers in their default modes.

Dereferences of Wild Pointers and Out of Bounds Array Accesses: Dereferencing random pointers (like NULL, pointers to free'd memory, etc) and the special case of accessing an array out of bounds is a common bug in C applications which hopefully needs no explanation. To eliminate this source of undefined behavior, array accesses would have to each be range checked, and the ABI would have to be changed to make sure that range information follows around any pointers that could be subject to pointer arithmetic. This would have an extremely high cost for many numerical and other applications, as well as breaking binary compatibility with every existing C library.

The pointer arithmetic itself is also UB. Even if the compiler hands you an address, you cannot dereference it, so there is really no use in having it at all. Merely computing the address is UB and should not appear in code.

See this answer for out-of-bounds pointers: Why is out-of-bounds pointer arithmetic undefined behaviour?

My sample code:

    #include <iostream>

    int main() {
        int* arr = new int[2];
        std::cout << arr << std::endl;
        std::cout << &(arr[0])<< std::endl;
        std::cout << &(arr[1])<< std::endl;
        std::cout << &arr[0x100] << std::endl; // UB, cannot be dereferenced
        std::cout << &arr[256] << std::endl;   // same index as 0x100: UB, cannot be dereferenced
        std::cout << arr + 0x100; // UB here too, no use in having this address
        delete[] arr;
    }

Sample Output:

0x60003ae50
0x60003ae50
0x60003ae54
0x60003b250
0x60003b250
0x60003b250

I recommend to read http://blog.llvm.org/2011/05/what-every-c-programmer-should-know.html ... — Jens, Jan 24 '18 at 22:16
*"C++ allows getting these addresses"* - No it doesn't, the standard says it's UB — UnholySheep, Jan 24 '18 at 22:28
@Jens Thanks for the link, I updated based on the info. Very helpful. — 9Breaker, Jan 24 '18 at 22:29
Now this answer still makes it sound like the code doesn't invoke UB - it does, just like your sample code. The pointer arithmetic is also UB — UnholySheep, Jan 24 '18 at 22:31
@UnholySheep thanks for the help, I saw my flaw there and edited accordingly. Do you agree now? — 9Breaker, Jan 24 '18 at 22:51
@Jens, do you agree with this answer now? I appreciate the help. — 9Breaker, Jan 24 '18 at 22:51

Ernie Mur · Answer 2 · 2018-01-26T01:16:19.163

In the first line you allocate 2 integer values. In the second line, you access memory outside this range. This is not allowed at all.

Edit: Some interesting comments here. But I cannot understand why it should be necessary to cite the standard for such a simple answer, or why pointer arithmetic is discussed here so much.

From a logical view, std::cout << &arr[0x100] consists of 3 steps:

1. access the nonexistent member of the array
2. get the address of the nonexistent member
3. use the address of the nonexistent member

If the first step is invalid, aren't all the following undefined?

edited Jan 26 '18 at 01:16

answered Jan 24 '18 at 21:54

Cite the standard or give a bit more reason as to why this is UB. This answer is only a comment as it stands. – Michael Dorgan Jan 24 '18 at 21:55
Well, I never really access the data as it should only give me the pointer to the theoretical member at 0x100. – iBent Jan 24 '18 at 21:55
@MichaelDorgan I don't think that the statement "accessing memory you don't own is UB" requires documentation. I **do** however think that he needs to explain the implicit assumption that the memory is actually "accessed" (as OP doesn't seem to understand what the subscript operator is doing) – scohe001 Jan 24 '18 at 21:57
Actually, I think OP (Original OP - lol) does - he states as much. Regardless, I was pretty sure that pointer arithmetic more than 1 entry past the end of owned memory was considered UB as well - dereferenced or not. I saw this in another post earlier today. Hmm... – Michael Dorgan Jan 24 '18 at 21:58
"*Well, I never really access the data*" makes it sound like he doesn't realize it's actually de-referencing. But I don't believe pointer arithmetic outside owned memory should be UB...I'd be happy to be proved wrong on this one if you can find that post tho! :) – scohe001 Jan 24 '18 at 22:02
I wrote "access" memory outside the range because of the [] operator. According to my understanding, it first accesses the integer, which does not exist, and only after this is the address evaluated. But it is never good practice, which is what the original question asked. I still don't understand what the OP wanted to achieve with this expression. – Ernie Mur Jan 24 '18 at 22:32
@iBent: I do not see a connection between your first code snippet and your function "removeFirst". The function works if m_size is the size of the array (here 2) and T is an int. You just move the contents of the array one integer distance to the lower index. But why would one need to access an int outside the array in C++? – Ernie Mur Jan 24 '18 at 22:47
@Ernie Mur yeah, I admit that my example is a bit bad. I actually meant "if (m_size) m_size--;". Consider m_size being 0 when the function is being called. (ofc "if (m_size == 0) return;" would be better here). – iBent Jan 24 '18 at 22:58
