
I am programming C++ using gcc on an obscure system called linux x86-64. I was hoping that maybe there are a few folks out there who have used this same specific system (and might also be able to help me understand what counts as a valid pointer on it). I do not need to access the location the pointer points to; I just want to calculate it via pointer arithmetic.

According to section 3.9.2 of the standard:

A valid value of an object pointer type represents either the address of a byte in memory (1.7) or a null pointer.

And according to [expr.add]/4:

When an expression that has integral type is added to or subtracted from a pointer, the result has the type of the pointer operand. If the expression P points to element x[i] of an array object x with n elements, the expressions P + J and J + P (where J has the value j) point to the (possibly-hypothetical) element x[i + j] if 0 ≤ i + j ≤ n; otherwise, the behavior is undefined. Likewise, the expression P - J points to the (possibly-hypothetical) element x[i − j] if 0 ≤ i − j ≤ n; otherwise, the behavior is undefined.
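To make that concrete for myself, here is my reading of the allowed range (the names and sizes are my own, chosen purely for illustration; only the last line steps outside 0..n):

void expr_add_illustration()
{
    int x[4];            // the array object, n == 4
    int* P = &x[1];      // P points to element x[i] with i == 1
    int* a = P + 2;      // i + j == 3 -> points to x[3]: fine
    int* b = P + 3;      // i + j == 4 -> one past the end: fine, as long as it is never dereferenced
    int* c = P + 4;      // i + j == 5 -> outside 0..n: undefined behavior per [expr.add]/4
}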

And according to a stackoverflow question on valid C++ pointers in general:

Is 0x1 a valid memory address on your system? Well, for some embedded systems it is. For most OSes using virtual memory, the page beginning at zero is reserved as invalid.

Well, that makes it perfectly clear! So, besides NULL, a valid pointer is a byte in memory, no, wait, it's an array element including the element right after the array, no, wait, it's a virtual memory page, no, wait, it's Superman!

(I guess that by "Superman" here I mean "garbage collectors"... not that I read that anywhere, just smelled it. Seriously, though, all the best garbage collectors don't break in a serious way if you have bogus pointers lying around; at worst they just don't collect a few dead objects every now and then. Doesn't seem like anything worth messing up pointer arithmetic for.)

So, basically, a proper compiler would have to support all of the above flavors of valid pointers. I mean, a hypothetical compiler having the audacity to generate undefined behavior just because a pointer calculation is bad would be dodging at least the 3 bullets above, right? (OK, language lawyers, that one's yours).

Furthermore, many of these kinds of "valid" memory are next to impossible for a compiler to know about. There are just so many ways of creating a valid memory byte (think lazy segfault trap microcode, sideband hints to a custom pagetable system that I'm about to access part of an array, ...), mapping a page, or simply creating an array.
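Here is just one of those ways, sketched with plain POSIX mmap (nothing gcc-specific; the 1 MiB size and the index are arbitrary). The compiler sees only an opaque pointer coming back, yet every byte in the mapping is perfectly valid:

#include <sys/mman.h>
#include <iostream>

int main()
{
    // Anonymous mapping: memory the compiler knows nothing about statically.
    void* mem = mmap(nullptr, 1 << 20, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) return 1;
    char* bytes = static_cast<char*>(mem);
    bytes[12345] = 'x';   // a valid byte that no C++ declaration ever mentioned
    std::cout << static_cast<void*>(bytes + 12345) << "\n";
    munmap(mem, 1 << 20);
}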

Take, for example, a largish array I created myself, and a smallish array that I let the default memory manager create inside of that:

#include <iostream>
using namespace std;

// largish is never defined in C++; the asm directive just pins the symbol to address 0.
extern const char largish[1000000000000000000L];
asm("largish = 0");

int main()
{
  char* smallish = new char[1000000000];   // from the default allocator (new)
  cout << "largish base = " << (long)largish << "\n"
       << "largish length = " << sizeof(largish) << "\n"
       << "smallish base = " << (long)smallish << "\n";
}

Result:

largish base = 0
largish length = 1000000000000000000
smallish base = 23173885579280

(Don't ask how I knew that the default memory manager would allocate something inside of the other array. It's an obscure system setting. The point is I went through weeks of debugging torment to make this example work, just to prove to you that different allocation techniques can be oblivious to one another).

Given the number of ways of managing memory and combining program modules that are supported in linux x86-64, a C++ compiler really can't know about all of the arrays and various styles of page mappings.

Finally, why do I mention gcc specifically? Because it often seems to treat any pointer as a valid pointer... Take, for instance:

char* super_tricky_add_operation(char* a, long b) {return a + b;}

While after reading all the language specs you might expect the implementation of super_tricky_add_operation(a, b) to be rife with undefined behavior, it is in fact very boring, just an add or lea instruction. Which is so great, because I can use it for very convenient and practical things like non-zero-based arrays if nobody is putzing with my add instructions just to make a point about invalid pointers. I love gcc.
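For the record, this is roughly what I see from gcc -O2 on x86-64, followed by the "convenient and practical" use I have in mind. The helper name and indices are hypothetical, and forming the offset pointer is the very arithmetic whose validity this question is about:

// super_tricky_add_operation(char*, long):   (roughly, gcc -O2, x86-64)
//     lea rax, [rdi+rsi]
//     ret

// A hypothetical non-zero-based array: a[first_index] aliases storage[0].
char* make_offset_array(long first_index, long last_index)
{
    char* storage = new char[last_index - first_index + 1];
    // storage - first_index points outside the allocation; that is exactly
    // the kind of pointer [expr.add]/4 frowns upon.
    return super_tricky_add_operation(storage, -first_index);
}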

In summary, it seems that any C++ compiler supporting standard linkage tools on linux x86-64 would almost have to treat any pointer as a valid pointer, and gcc appears to be a member of that club. But I'm not quite 100% sure (given enough fractional precision, that is).

So... can anyone give a solid example of an invalid pointer in gcc linux x86-64? By solid I mean leading to undefined behavior. And explain what gives rise to the undefined behavior allowed by the language specs?

(or provide gcc documentation proving the contrary: that all pointers are valid).

personal_cloud
  • Comments are not for extended discussion; this conversation has been [moved to chat](https://chat.stackoverflow.com/rooms/189341/discussion-on-question-by-personal-cloud-what-is-a-valid-pointer-in-gcc-linux-x8). If you want to provide your perspective, post an answer. If you do not feel that the question is answerable in its current state, cast a vote to close. – Cody Gray - on strike Mar 03 '19 at 10:52
  • @Cody Gray Excellent idea! I have posted an answer based on input from the (recently converted to chat) extended discussion. – personal_cloud Mar 03 '19 at 10:59
  • Have you looked into creating a non-zero based array abstract data type? – Galik Mar 03 '19 at 11:11
  • Do you know what *undefined behaviour* is? It's not a crash. It's not setting your computer on fire. It's not calling police, not stealing your girlfriend, not starting a nuclear war. Or all of these things. It's just *behaviour the standard refuses to talk about*, nothing more. Why do you expect to find any particularly funny assembly code in `super_tricky_add_operation` again? – n. m. could be an AI Mar 03 '19 at 21:10
  • "By solid I mean leading to undefined behavior." How do you plan to identify undefined behaviour? By looking at your computer and observing a crash? You cannot do that. By looking at your computer and observing it has caught fire? You cannot do that. Not by watching your home SWATted, not by watching your girlfriend leaving, not by watching the world end in the nuclear apocalypse. You can only identify UB **by reading the standard**. If the standard says your program has UB, it has UB (see the definition of UB in the previous comment). – n. m. could be an AI Mar 03 '19 at 21:16
  • @n.m. My goal is to understand how GCC has interpreted the (vague) language standard around pointer validity. If we can see it making use of the language assumptions in the assembly code it generates, that would be a very good clue. Vague standard does not automatically imply that GCC doesn't support something. – personal_cloud Mar 03 '19 at 22:09
  • There is absolutely nothing vague about pointer validity. \[basic.compound] *Every value of pointer type is one of the following: (3.1) — a pointer to an object or function (the pointer is said to point to the object or function), or (3.2) — a pointer past the end of an object (8.7), or (3.3) — the null pointer value (7.11) for that type, or (3.4) — an invalid pointer value.* The compiler doesn't need to interpret this in any special way. It can assume all pointers you do anything with are valid. – n. m. could be an AI Mar 03 '19 at 22:50
  • @n.m. OK. But haven't we established that there are many many ways to create an "object"? And C++ doesn't provide a single construct or facade interface to discover all these different kinds of objects (apart from trying to access them), just the overall range of the address space. If I create a new object allocator, am I obligated to "tell" the language about it somehow? – personal_cloud Mar 03 '19 at 23:10
  • No "we" have not. You can declare and define an object, or you can create one with the new operator. That makes, let's count them on our thumbs, one, two, that's two ways of creating objects. You don't "discover" objects. You know where they are. Overall I have an impression that you don't know what you are asking about. Is it about symptoms of UB? Is it about creating objects? Is it about pointer validity? This is way too broad. One question at a time please. – n. m. could be an AI Mar 04 '19 at 05:46
  • @n.m. What about mmap, malloc, I/O, shared pages, trapped pages, etc.... These are all valid arrays! No, I don't know where all these things are from a simple API, and neither does the compiler. Yes, my question is about UB symptoms. As explained in the answers, GCC *does* know the overall range of the virtual address space, and uses this in comparison optimizations. That is how the UB shows up in practice. (Or, all the UB can all be avoided by using `uintptr_t`, though then you need to adjust it in multiples of `sizeof(elem)` and cast it back to pointer before accessing the designated memory) – personal_cloud Mar 04 '19 at 06:00
  • "These are all valid arrays!" Says who? Only the standard determines what is a valid pointer and what isn't. Can you quote relevant standard language? There is a defect report that shows accessing malloc'd memory without placement new'ing an object in it first (a common idiom that comes from C) is UB. This is unfortunate but that's what the standard currently says. – n. m. could be an AI Mar 04 '19 at 06:03
  • @n.m. Placement new is optional for C types like `int`, since C++ is backwards compatible with C. I guess this includes the "mmap, malloc, I/O, shared pages, trapped pages" etc. I don't see how placement new would work with those things when another process/library etc created the data. And even for placement new, I don't think the compiler is allowed to build an external tracking structure for it (where is the memory resource for that?). Placement new should just be calling the class constructor which typically only updates values in the class itself, and maybe allocates some members. – personal_cloud Mar 04 '19 at 06:07
  • In any case, if you assume that malloc creates a valid character array, that's another way of creating an object. C++ doesn't have mmap or any other way to allocate memory. If a pointer comes from a function that is unknown to the implementation, such as one written in a different language, an implementation must assume that the pointer is valid, otherwise it would be pretty difficult to interface with other languages. But then you are creating objects outside of a C++ program. It is not in the scope of the C++ standard to tell how it's done. – n. m. could be an AI Mar 04 '19 at 06:16
  • "Placement new is optional for C types like int" No it is not. "since C++ is backwards compatible with C" No it is not. – n. m. could be an AI Mar 04 '19 at 06:17
  • An implementation is pretty much allowed to keep track of all objects. When interfacing to another language, you would have to tell the implementation where foreign-created objects are, in some implementation-specific way. gcc doesn't track objects; it's not this kind of implementation. It assumes pointers it doesn't know about are valid. It is your responsibility to never do anything funny with invalid pointers. – n. m. could be an AI Mar 04 '19 at 06:32

3 Answers

3

Usually pointer math does exactly what you'd expect regardless of whether pointers are pointing at objects or not.

UB doesn't mean it has to fail; it only means it's allowed to make the whole rest of the program behave strangely in some way. UB doesn't mean that just the pointer-compare result can be "wrong"; it means the entire behaviour of the whole program is undefined. This tends to happen with optimizations that depend on a violated assumption.

Interesting corner cases include an array at the very top of virtual address space: a pointer to one-past-the-end would wrap to zero, so start < end would be false?!? But pointer comparison doesn't have to handle that case, because the Linux kernel won't ever map the top page, so pointers into it can't be pointing into or just past objects. See Why can't I mmap(MAP_FIXED) the highest virtual page in a 32-bit Linux process on a 64-bit kernel?


Related:

GCC does have a max object size of PTRDIFF_MAX (which is a signed type). So for example, on 32-bit x86, an array larger than 2GB isn't fully supported for all cases of code-gen, although you can mmap one.

See my comment on What is the maximum size of an array in C? - this restriction lets gcc implement pointer subtraction (to get a size) without keeping the carry-out from the high bit, for types wider than char where the C subtraction result is in objects, not bytes, so in asm it's (a - b) / sizeof(T).
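A minimal sketch of what that looks like (my paraphrase of the point above; the asm in the comment is roughly what gcc -O2 emits):

#include <cstddef>

std::ptrdiff_t count_ints(int* a, int* b)
{
    return a - b;   // roughly: mov rax, rdi / sub rax, rsi / sar rax, 2
                    // the arithmetic shift is only correct because the byte
                    // difference is assumed to fit in a signed ptrdiff_t
}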


Don't ask how I knew that the default memory manager would allocate something inside of the other array. It's an obscure system setting. The point is I went through weeks of debugging torment to make this example work, just to prove to you that different allocation techniques can be oblivious to one another.

First of all, you never actually allocated the space for largish[]. You used inline asm to make it start at address 0, but did nothing to actually get those pages mapped.

The kernel won't overlap existing mapped pages when new uses brk or mmap to get new memory from the kernel, so in fact static and dynamic allocation can't overlap.

Second, char[1000000000000000000L] is roughly 2^60 bytes. Current x86-64 hardware and software only support canonical 48-bit virtual addresses (sign-extended to 64-bit). This will change with a future generation of Intel hardware which adds another level of page tables, taking us up to 48+9 = 57-bit addresses. (Still with the top half used by the kernel, and a big hole in the middle.)
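A quick way to express the canonical-address rule (my own sketch, assuming the current 48-bit scheme and gcc's arithmetic right shift on signed integers):

#include <cstdint>

bool is_canonical_48(std::uintptr_t a)
{
    // Bits 63..47 must all be copies of bit 47, i.e. the value is the
    // sign-extension of a 48-bit address.
    std::intptr_t s = static_cast<std::intptr_t>(a);
    return (s >> 47) == 0 || (s >> 47) == -1;
}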

Your unallocated space from 0 to ~2^60 covers all user-space virtual memory addresses that are possible on x86-64 Linux, so of course anything you allocate (including other static arrays) will be somewhere "inside" this fake array.


Removing the extern const from the declaration (so the array is actually allocated, https://godbolt.org/z/Hp2Exc) runs into the following problems:

//extern const 
char largish[1000000000000000000L];
//asm("largish = 0");

/* rest of the code unchanged */
  • RIP-relative or 32-bit absolute (-fno-pie -no-pie) addressing can't reach static data that gets linked after largish[] in the BSS, with the default code model (-mcmodel=small, where all static code+data is assumed to fit in 2GB)

    $ g++ -O2 large.cpp
    /usr/bin/ld: /tmp/cc876exP.o: in function `_GLOBAL__sub_I_largish':
    large.cpp:(.text.startup+0xd7): relocation truncated to fit: R_X86_64_PC32 against `.bss'
    /usr/bin/ld: large.cpp:(.text.startup+0xf5): relocation truncated to fit: R_X86_64_PC32 against `.bss'
    collect2: error: ld returned 1 exit status
    
  • compiling with -mcmodel=medium places largish[] in a large-data section where it doesn't interfere with addressing other static data, but it is itself addressed using 64-bit absolute addressing. (Or -mcmodel=large does that for all static code/data, so every call is an indirect movabs reg,imm64 / call reg instead of call rel32.)

    That lets us compile and link, but then the executable won't run, because the kernel knows that only 48-bit virtual addresses are supported: it won't map the program in its ELF loader before running it, or, for a PIE, before running ld.so on it.

    peter@volta:/tmp$ g++ -fno-pie -no-pie -mcmodel=medium -O2 large.cpp
    peter@volta:/tmp$ strace ./a.out 
    execve("./a.out", ["./a.out"], 0x7ffd788a4b60 /* 52 vars */) = -1 EINVAL (Invalid argument)
    +++ killed by SIGSEGV +++
    Segmentation fault (core dumped)
    peter@volta:/tmp$ g++ -mcmodel=medium -O2 large.cpp
    peter@volta:/tmp$ strace ./a.out 
    execve("./a.out", ["./a.out"], 0x7ffdd3bbad00 /* 52 vars */) = -1 ENOMEM (Cannot allocate memory)
    +++ killed by SIGSEGV +++
    Segmentation fault (core dumped)
    

(Interesting that we get different error codes for PIE vs non-PIE executables, but still before execve() even completes.)


Tricking the compiler + linker + runtime with asm("largish = 0"); is not very interesting, and creates obvious undefined behaviour.

Fun fact #2: x64 MSVC doesn't support static objects larger than 2^31-1 bytes. IDK if it has a -mcmodel=medium equivalent. Basically GCC fails to warn about objects too large for the selected memory model.

<source>(7): error C2148: total size of array must not exceed 0x7fffffff bytes

<source>(13): warning C4311: 'type cast': pointer truncation from 'char *' to 'long'
<source>(14): error C2070: 'char [-1486618624]': illegal sizeof operand
<source>(15): warning C4311: 'type cast': pointer truncation from 'char *' to 'long'

Also, it points out that long is the wrong type for pointers in general (because Windows x64 is an LLP64 ABI, where long is 32 bits). You want intptr_t or uintptr_t, or something equivalent to printf("%p") that prints a raw void*.
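A portable way to print those addresses (instead of casting to long) is to cast to void* for streaming, or to go through uintptr_t when integer arithmetic is needed; the function name here is just for illustration:

#include <cstdint>
#include <iostream>

void print_address(const char* p)
{
    std::cout << static_cast<const void*>(p) << "\n";           // like printf("%p")
    std::cout << reinterpret_cast<std::uintptr_t>(p) << "\n";   // integer view, correct width even on LLP64
}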

Peter Cordes
  • Thank you for this perspective; I agree that the Kernel will not allocate `largish`, and that trying to involve the linker in `largish` causes much bigger problems. But the purpose of `largish` is to satisfy the language requirement on pointer arithmetic, not to make the Kernel do something. Where does the language spec say that an "array" (for the purposes of [expr.add]/4) must be allocated by the Kernel? (I mean, yes, people have interpreted it that way, under certain assumptions, but it's not the only possible interpretation) – personal_cloud Mar 03 '19 at 21:13
  • For that matter, how does elementary arithmetic interact with the Kernel at all? Wouldn't this be obvious in the `.o` file? But if I add pointers all I see is an `lea` or `add` instruction, neither of which touches the kernel. – personal_cloud Mar 03 '19 at 21:15
  • @personal_cloud: right, none of it involves the kernel at all. UB doesn't mean that it *must* fail, it means it's *allowed* to fail and/or be super-weird. Your hacks with `largish[]` have created a pointer which doesn't actually point to an object. But anyway, this answer was just trying to address a flaw in your premise, and the part of the question I quoted. I didn't get very far into figuring out what else you're actually asking. – Peter Cordes Mar 03 '19 at 21:29
  • C++ imposes a vague requirement on pointer validity for the purposes of pointer arithmetic. What I'm asking is how does GCC interpret that requirement. It clearly can handle many cases that do not involve Kernel allocation, including various custom allocators, hardware drivers within the kernel itself, lazy-mapping schemes, mmaps to bad files, custom-located arrays that get partly- or fully- placed in the linker later on... so many examples. Is there an overall principle, or just a few exceptions around `null` (including comparisons wrapping around 0). – personal_cloud Mar 03 '19 at 21:32
  • @personal_cloud: usually pointer math does exactly what you'd expect regardless of whether pointers are pointing at objects or not. Like I said, UB doesn't mean it *has* to fail. Interesting corner cases include an array at the very top of virtual address space: a pointer to one-past-the-end would wrap to zero, so `start < end` would be false?!? But pointer comparison doesn't have to handle that case, because the Linux kernel won't ever map the top page, so pointers into it can't be pointing into or just past objects. [See this Q&A](//stackoverflow.com/q/47712502). – Peter Cordes Mar 03 '19 at 21:36
  • Yes, when it comes to pointer arithmetic, I have all along thought that GCC is just making assumptions about the overall range of the virtual address space. The answer should emphasize that. OK, you kind of cover it with the `PTRDIFF_MAX`. I will accept. – personal_cloud Mar 03 '19 at 21:44
  • @personal_cloud: after seeing your comments, I realized that was an important part of the answer and moved it higher up. – Peter Cordes Mar 03 '19 at 21:46
  • Thanks for reworking to directly address my question at the top of your answer. PS That's interesting that `mmap` can work around the `PTRDIFF_MAX` assumption. I wonder if whatever it's doing could be used to enable a wider range of pointer arithmetic. But I guess that's more a topic for my [related question on non-zero-based arrays](https://stackoverflow.com/questions/54951999/). – personal_cloud Mar 03 '19 at 21:58
  • @personal_cloud: no, `mmap` can't truly "work around" it. It's a Unix system call, and doesn't care about C implementation limits, so it doesn't artificially limit allocation sizes. (And internally the kernel uses unsigned integer math to deal with sizes. Plus, on a 64-bit kernel, 3GB is a trivial size. A 32-bit kernel might still handle it, though, if compiled with a 3:1 user:kernel split so that much user-space virtual address space was available). But if you pass pointers to the start and end of a 2.5G mmap region to `size_t sz(int *end, int*start) {return end-start;}`, it's UB. – Peter Cordes Mar 03 '19 at 22:26
2

The Standard does not anticipate the existence of any storage beyond that which the implementation provides via objects of static, automatic, or thread duration, or the use of standard-library functions like calloc. It consequently imposes no restrictions on how implementations process pointers to such storage, since from its perspective such storage doesn't exist, pointers that meaningfully identify non-existent storage don't exist, and things that don't exist don't need to have rules written about them.

That doesn't mean that the people on the Committee weren't well aware that many execution environments provided forms of storage that C implementations might know nothing about. They expected, however, that people who actually worked with various platforms would be better placed than the Committee to determine what kinds of things programmers would need to do with such "outside" addresses, and how best to support such needs. No need for the Standard to concern itself with such things.

As it happens, there are some execution environments where it is more convenient for a compiler to treat pointer arithmetic like integer math than to do anything else, and many compilers for such platforms treat pointer arithmetic usefully even in cases where they're not required to do so. For 32-bit x86 and 64-bit x64, I don't think there are any bit patterns for invalid non-null addresses, but it may be possible to form pointers that don't behave as valid pointers to the objects they address.

For example, given something like:

char x = 1, y = 2;
ptrdiff_t delta = (uintptr_t)&y - (uintptr_t)&x;  // byte distance, computed via integers
char *p = &x + delta;   // same bit pattern as &y, but formed by arithmetic on &x
*p = 3;

even if pointer representation is defined in such a way that using integer arithmetic to add delta to the address of x would yield y, that would in no way guarantee that a compiler would recognize that operations on *p might affect y, even if p holds y's address. Pointer p would effectively behave as though its address was invalid even though the bit pattern would match that of y's address.
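A sketch of how that could become visible (my own example, not from the original answer): a compiler that refuses to treat *p as a possible write to y is free to fold the final read of y to the constant 2, even though p's bit pattern equals that of &y.

#include <stdint.h>
#include <stddef.h>

char x = 1, y = 2;

int observe(void)
{
    ptrdiff_t delta = (uintptr_t)&y - (uintptr_t)&x;
    char *p = &x + delta;   // UB: the arithmetic leaves the object x
    *p = 3;                 // may or may not be treated as modifying y
    return y;               // a conforming compiler may still return 2 here
}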

supercat
  • x86-64 only has 48-bit virtual addresses (or 57-bit with 5-level page tables on future HW). The canonical addresses are ones that are correctly sign-extended to 64-bit, so the usable ranges are the low and high 47-bit ranges at the top and bottom of virtual address space. You could call non-canonical pointers "invalid non-null addresses", but they still work as integers as long as you never dereference them. See also [Should pointer comparisons be signed or unsigned in 64-bit x86?](//stackoverflow.com/q/47687805) – Peter Cordes Mar 05 '19 at 23:08
  • That's a great example with `ptrdiff_t delta = (uintptr_t)&y - (uintptr_t)&x;` because both `&x` and `&x+delta` are valid, but they do not point to the same object, and therefore subtly violate [expr.add]/4. Also a great explanation of how it could cause an aliasing optimization to unexpectedly change the program's results later. Thank you. – personal_cloud Mar 06 '19 at 03:05
  • For some reason, there seems to be some serious debate as to whether `(char*)(delta+(uintptr_t)&x);` should be capable of accessing `y`, but I question why any implementation that wouldn't be willing to honor such semantics should define `uintptr_t` in the first place [it's purely optional]. IMHO, integer-to-pointer conversions have great big neon signs that should cause any compiler that isn't being willfully blind to recognize that the resulting pointer might be capable of accessing just about any object whose address has been converted to an integer type, and I really can't think... – supercat Mar 06 '19 at 06:33
  • ...of many non-contrived situations where that would seriously impede what would otherwise be useful optimizations. To be sure, the Standard would allow such optimization, but only because the standard *never* requires that a pointer produced by casting a `uintptr_t` actually be usable to access any object (it merely requires that `(char*)(uintptr_t)&x` compares equal to `&x`--not that it would be usable to access `x`). The authors of the Standard naively thought it wasn't necessary to say compiler writers shouldn't do silly things. – supercat Mar 06 '19 at 06:39
-1

The following examples show that GCC specifically assumes at least the following:

  • A global array cannot be at address 0.
  • An array cannot wrap around address 0.

Examples of unexpected behavior arising from arithmetic on invalid pointers in gcc linux x86-64 C++ (thank you melpomene):

  • largish == NULL evaluates to false in the program in the question.
  • unsigned n = ...; if (ptr + n < ptr) { /* overflow */ } can be optimized to if (false); see the sketch after this list.
  • int arr[123]; int n = ...; if (arr + n < arr || arr + n > arr + 123) can be optimized to if (false).
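A minimal sketch of the second bullet (the function name is mine; gcc -O2 is widely observed to do this, but treat it as an illustration rather than a guarantee):

bool would_wrap(char* ptr, unsigned long n)
{
    // Because ptr + n is only defined when the result stays within (or one
    // past the end of) the same object, gcc is allowed to fold this to false.
    return ptr + n < ptr;
}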

Note that these examples all involve comparisons of invalid pointers, and therefore may not affect the practical case of non-zero-based arrays. I have therefore opened a new question of a more practical nature.

Thank you everyone in the chat for helping to narrow down the question.

personal_cloud
  • GCC knows it (and the linker) will never put static data at address `0`, so `largish == NULL` doesn't even need to check at runtime, it's known to be false. Violating the compiler's assumptions with `asm("largish=0");` is basically undefined behaviour. – Peter Cordes Mar 03 '19 at 21:32
  • @Peter Cordes Correct. I suspect that basically all the discontinuities are around 0. Basically, assuming that a "valid array" doesn't start at 0, nor wrap around 0. That's what this answer is pointing out. ... Though it could be clarified a bit. – personal_cloud Mar 03 '19 at 21:37