I am looking for a gcc-supported C++ language extension to enable the allocation of non-zero-based array pointers. Ideally I could simply write:

#include <iostream>
using namespace std;

// Allocate elements array[lo..hi-1], and return the new array.
template<typename Elem>
Elem* Create_Array(int lo, int hi)
{
  return new Elem[hi-lo] - lo;
  // FIXME what about [expr.add]/4.
  // How do we create a pointer outside the array bounds?
}

// Deallocate an array previously allocated via Create_Array.
template<typename Elem>
void Destroy_Array(Elem* array, int lo, int hi)
{
  delete[](array + lo);
}


int main() 
{  
  const int LO = 1000000000;
  const int HI = LO + 10;
  int* array = Create_Array<int>(LO, HI);
  for (int i=LO; i<HI; i++)
    array[i] = i;
  for (int i=LO; i<HI; i++)
    cout << array[i] << "\n";
  Destroy_Array(array, LO, HI);
} 

The above code seems to work, but is not defined by the C++ standard. Specifically, the issue is [expr.add]/4:

When an expression that has integral type is added to or subtracted from a pointer, the result has the type of the pointer operand. If the expression P points to element x[i] of an array object x with n elements, the expressions P + J and J + P (where J has the value j) point to the (possibly-hypothetical) element x[i + j] if 0 ≤ i + j ≤ n; otherwise, the behavior is undefined. Likewise, the expression P - J points to the (possibly-hypothetical) element x[i − j] if 0 ≤ i − j ≤ n; otherwise, the behavior is undefined.

In other words, behavior is undefined for the line marked FIXME in the code above, because it calculates a pointer that is outside the range x[0..n] for the 0-based array x.

Is there some --std=... option to gcc to tell it to allow non-zero-based array pointers to be directly calculated?

If not, is there a reasonably portable way to emulate the return new Type[hi-lo] - lo; statement, perhaps by casting to long and back? (but then I would worry about introducing more bugs)

Furthermore, can this be done in a way that needs only one register to track each array, as in the code above? For example, accessing array1[i], array2[i], array3[i] should require only three registers for the array pointers plus one register for i. (Similarly, when cold-fetching an array reference, we should be able to fetch the non-zero-based pointer directly, without doing extra arithmetic merely to establish the reference in registers.)

personal_cloud
    Just create an array/vector class that uses a 0 index under the hood but takes a user index range for its `operator []`. For every problem there is an abstraction that solves it :) – NathanOliver Mar 01 '19 at 20:44
  • @François Andrieux good point. I have removed the language-lawyer tag. – personal_cloud Mar 01 '19 at 20:46
  • @NathanOliver OK but then wouldn't `operator[]` contain some expression like `array[offset + i]`, and doesn't this double the number of registers required to represent each array? I've added a sentence to the question to clarify the efficiency goal. – personal_cloud Mar 01 '19 at 20:48
  • Possibly. If you go the `std::array` route and `offset` is a compile time constant it will probably be optimized away. Even if you go the `std::vector` route the compiler might be able to do some clever optimization. – NathanOliver Mar 01 '19 at 20:51
  • @NathanOliver That is a good point -- often I am retrieving the array reference outside the loop, and the compiler could pre-calculate the user-index-array-base under the hood. But sometimes the array reference is a cold fetch. Can't we just explicitly tell gcc what we want? – personal_cloud Mar 01 '19 at 20:53
    @personal_cloud I suspect not. There just isn't an extension for circumventing every inconvenient c++ rule. Those rules are generally there for a reason, and if circumventing them was easy, they likely wouldn't need to exist in the first place. – François Andrieux Mar 01 '19 at 20:54
  • Not that I am aware of but someone else might know. – NathanOliver Mar 01 '19 at 20:55
  • @François Andrieux Good point. So maybe this *is* a language-lawyer question after all? – personal_cloud Mar 01 '19 at 20:56
    @personal_cloud The question itself been established that it's against the rules of the language. If what you want to ask is *why* or what passage prohibits this, then it would be a language lawyer question, but it would be a distinct question. It seems like the answer to this question is either simply "there is no such extension/guarantee" or "yes, use [compiler flag]". Edit : Following your last comment, that might be a language lawyer question. "Does a pointer need to be valid to use operator `[]`", for example. But that's a different question. – François Andrieux Mar 01 '19 at 20:59
  • @François Andrieux Correct, I am looking for either one of those answers. – personal_cloud Mar 01 '19 at 21:01
  • @personal_cloud As far as I know, pointer arithmetic requires valid pointers. I don't believe you can "undo" an arithmetic operation that would produce an invalid pointer by performing the reverse of that operation. But I'm not 100% sure. For one, it would imply pointers are guaranteed to respect modulo arithmetic but I'm not aware of any such guarantee. – François Andrieux Mar 01 '19 at 21:01
  • @NathanOliver I posted my own answer that includes your suggestion to use `operator[]` overload while still being optimal in register usage and number of operations to access an element. Hopefully I didn't miss a hidden dragon somewhere... – personal_cloud Mar 03 '19 at 22:04

1 Answer

Assuming you're using gcc on Linux x86-64: the intptr_t and uintptr_t types (from <cstdint>) can hold any pointer value, valid or not, and support integer arithmetic. uintptr_t is the better fit for this application because unsigned arithmetic has well-defined mod 2^64 semantics, while intptr_t arithmetic has UB cases (signed overflow).

As suggested in comments, we can use this to build a class that overloads operator[] and performs range checking:

#include <iostream>
#include <cassert>
#include <sstream>   // ostringstream
#include <stdexcept> // range_error, out_of_range
#include <cstddef>   // ptrdiff_t
#include <cstdint>   // uintptr_t
using namespace std;


// Safe non-zero-based array. Includes bounds checking.
template<typename Elem>
class Array {
  uintptr_t array; // base value for non-zero-based access
  int       lo;    // lowest valid index
  int       hi;    // highest valid index plus 1

public:

  Array(int lo, int hi)
    : array(), lo(lo), hi(hi)
  {
    if (lo > hi)
      {
        ostringstream msg; msg<<"Array(): lo("<<lo<<") > hi("<<hi<< ")";
        throw range_error(msg.str());
      }
    static_assert(sizeof(uintptr_t) == sizeof(void*),
          "Array: uintptr_t size does not match ptr size");
    static_assert(sizeof(ptrdiff_t) == sizeof(uintptr_t),
          "Array: ptrdiff_t size does not match ptr (efficiency issue)");
    Elem* alloc = new Elem[hi-lo];
    assert(alloc); // redundant: new throws bad_alloc on failure
    array = (uintptr_t)(alloc) - (uintptr_t)(lo * sizeof(Elem));
    // Convert offset to unsigned to avoid overflow UB.
  }


  //////////////////////////////////////////////////////////////////
  // UNCHECKED access utilities (these method names start with "_").

  uintptr_t _get_array(){return array;}
  // Provide direct access to the base pointer (be careful!)

  Elem& _at(ptrdiff_t i)
  {return *(Elem*)(array + (uintptr_t)(i * sizeof(Elem)));}
  // Return reference to element (no bounds checking)
  // On GCC 5.4.0 with -O3, this compiles to an 'lea' instruction

  Elem* _get_alloc(){return &_at(lo);}
  // Return zero-based array that was allocated

  ~Array() {delete[](_get_alloc());}


  //////////////////////////////
  // SAFE access utilities

  Elem& at(ptrdiff_t i)
  {
    if (i < lo || i >= hi)
      {
        ostringstream msg;
        msg << "Array.at(): " << i << " is not in range ["
            << lo << ", " << hi << ")";
        throw out_of_range(msg.str());
      }
    return _at(i);
  }

  int get_lo() const {return lo;}
  int get_hi() const {return hi;}
  int size()   const {return hi - lo;}

  Elem& operator[](ptrdiff_t i){return at(i);}
  // std::vector is wrong; operator[] is the typical use and should be safe.
  // It's good practice to fix mistakes as we go along.

};


// Test
int main() 
{  
  const int LO = 1000000000;
  const int HI = LO + 10;
  Array<int> array(LO, HI);
  for (int i=LO; i<HI; i++)
    array[i] = i;
  for (int i=LO; i<HI; i++)
    cout << array[i] << "\n";
}

Note that it is still not possible to cast the invalid "pointer" value held in the uintptr_t directly back to a pointer type, per GCC's Arrays and Pointers documentation on implementation-defined behavior:

When casting from pointer to integer and back again, the resulting pointer must reference the same object as the original pointer, otherwise the behavior is undefined. That is, one may not use integer arithmetic to avoid the undefined behavior of pointer arithmetic as proscribed in C99 and C11 6.5.6/8.

This is why the array field must be of type uintptr_t and not Elem*. In other words, behavior is defined so long as the uintptr_t value is adjusted to point back into the original object before it is converted back to Elem*.

personal_cloud
    According to https://gcc.gnu.org/onlinedocs/gcc/Arrays-and-pointers-implementation.html: `intptr_t` doesn't avoid UB. *the resulting pointer must reference the same object as the original pointer, otherwise the behavior is undefined. That is, one may not use integer arithmetic to avoid the undefined behavior of pointer arithmetic as proscribed in C99 and C11 6.5.6/8.* In practice it will typically work, because there aren't usually assumptions that depend on it. But that doesn't mean it's strictly safe or a good idea. – Peter Cordes Mar 03 '19 at 22:05
  • Why are you using signed `int` as your index type in `operator[]`? That seems weird, unless you want to allow negative indices. – Peter Cordes Mar 03 '19 at 22:14
  • @Peter Cordes. Yes I want to allow negative indices definitely. Re: `intptr_t` documentation: Oops, yes, I missed that. Note that this restriction applies only when *casting* the integer back to a pointer. So I believe we would be OK if we keep the intermediate pointer in an `intptr_t` variable. So long as it points back to the original object before it becomes a pointer again. – personal_cloud Mar 03 '19 at 22:16
    You could avoid the existence of a *pointer* outside any object by using `intptr_t array_intbase`. So the value only exists as a pointer type after adding `array_intbase + i*sizeof(Elem)`. You shouldn't have `Elem *array` because it's not a real pointer. But yes, storing the already-offset array base is more efficient than doing `array[i - lo]`, and should compile to x86 / ARM scaled-index addressing modes when appropriate / possible. – Peter Cordes Mar 03 '19 at 22:17
  • @Peter Cordes Re: use `intptr_t array`. Precisely, that seems to be the correct solution. I appreciate your help. Do you want to post the final solution, or should I? (either way is fine with me) – personal_cloud Mar 03 '19 at 22:18
  • Anyway, using signed `int` could result in unnecessary truncation and sign-extension to 64-bit on x86-64. Probably better to use `ptrdiff_t` as a signed pointer-width type. (Or maybe `ssize_t`). – Peter Cordes Mar 03 '19 at 22:21
  • @Peter Cordes I'm on x86-64. `int` should just cause sign-extension to 64-bit, no? Everything gets promoted to the widest type, right? I don't see a real problem, except for limiting my index range... But I guess your point is just that if I *happened* to have the indices already in 64-bit registers sign-extended, I could avoid needless additional sign-extension... fair enough. – personal_cloud Mar 03 '19 at 22:23
  • Right, but you don't want to force the compiler to redo sign-extension inside loops. For your wrappers to fully optimize away as often as possible after inlining, you should probably use a signed pointer-width type. – Peter Cordes Mar 03 '19 at 22:27
  • I would be wary of overflow issues even with the current code, it seems a lot simpler for the class to just store the memory allocation address instead of subtracting `lo` from it – M.M Mar 03 '19 at 22:48
  • @M.M Why is overflow important? Everything is 64-bit. We are just adding and subtracting mod 2^64. The combination result is known to equal the intended array location mod 2^64, therefore it *is* the intended array location. If we don't subtract `lo` from the allocation address, then we do an extra operation (and possible fetch, if not doing range checking) later when executing the `operator[]`. – personal_cloud Mar 03 '19 at 22:53
  • It would be undefined behaviour to have signed integer overflow – M.M Mar 03 '19 at 22:54
  • @M.M: good point, a large negative `lo` could wrap a high starting value for `intptr_t` past `INTPTR_MAX`, especially on 32-bit platforms where they're the same width; using `uintptr_t` would make wraparound well-defined, and still correctly get back to the right `uintptr_t` value. (On 2's complement platforms at least, where signed and unsigned addition are the same binary operation, so casting to/from `unsigned` can give you safe signed overflow.) – Peter Cordes Mar 03 '19 at 22:54
  • @M.M. Wow, I hadn't realized that signed overflow was undefined. Surely the actual behavior of `+` and `-` is just mod 2^64, so it's pretty lame that the spec doesn't capture that. Fortunately, you're right, unsigned integers do not have this problem, as noted [here](https://stackoverflow.com/questions/16188263/i). I'll update the solution. – personal_cloud Mar 03 '19 at 22:57
  • Doing `array[i - lo]` every time we index has potential costs, but the `array-lo` part could be hoisted out of loops. Still it's not totally free. – Peter Cordes Mar 03 '19 at 22:57
  • @personal_cloud: Yes, in practice it's often going to be fine when compiling for x86 if gcc/clang don't see any UB at compile time. The main danger is optimizations based on the assumption of no overflow cause problems. e.g. `if (x + y < x)` is transformed to `if (y<0)` pretty early by gcc8. [Why/how does gcc compile the undefined behaviour in this signed-overflow test so it works on x86 but not ARM64?](//stackoverflow.com/a/54510856). See also [Does the C++ standard allow for an uninitialized bool to crash a program?](//stackoverflow.com/a/54125820) for more about understanding UB. – Peter Cordes Mar 03 '19 at 23:02
  • @Peter Cordes Thanks for explaining. OK, I've updated the code to cast the `ssize_t`s to `uintptr_t` to get the mod 2^64 semantics. Hopefully *this time* it's perfect... – personal_cloud Mar 03 '19 at 23:05
  • If you only care about non-huge `lo` / `hi`, you could still use `int` there to keep your class at 16 bytes instead of 24 (on x86-64), and let the compiler sign-extend them when comparing against a `ssize_t` arg if assert is enabled. – Peter Cordes Mar 03 '19 at 23:09
  • Also, correction, `ssize_t` is not appropriate. It's POSIX only, and the standard defines its range as `[-1, SSIZE_MAX]`, so it's not guaranteed to support negative values other than -1. [What is the difference between ssize\_t and ptrdiff\_t?](//stackoverflow.com/a/10884990) – Peter Cordes Mar 03 '19 at 23:12
  • @Peter Cordes Hmm, OK. `ptrdiff_t` should work here... surely it's actually 64-bit as HW doesn't give us many choices. OK, I've re-tested and posted updated code. Thanks for the reference on `ssize_t`. – personal_cloud Mar 03 '19 at 23:13
  • Yes, `ptrdiff_t` or `intptr_t` would be good choices for the wrapper function args, and `int` for the class members (or wider if you really want this to work for huge objects). `ptrdiff_t` probably the most appropriate. (And yes it's a 64-bit type in the x86-64 System V ABI; objects larger than 2GB can exist, even if *static* objects larger than that require `-mcmodel=medium`.) In theory a C implementation that limited object sizes to 2GB could have 32-bit `ptrdiff_t` but 64-bit `intptr_t`. But AFAIK that's not the case on any real 64-bit C implementations. – Peter Cordes Mar 03 '19 at 23:19
    Suggested improvements: use C++11 `static_assert` to check type sizes at compile time only. Also provide an `Elem*` function that undoes the offset and returns a raw pointer to the actual array. You already get unchecked access to the array if you compile this with `-DNDEBUG` to make assert a no-op. (Or make it like std::vector: a `.at()` function does range-checking before dereferencing, and `operator[]` doesn't. That way efficient access is still possible through clean wrappers.) – Peter Cordes Mar 04 '19 at 02:04
  • Ok, you have it backwards compared to std::vector: your `_at` doesn't do bounds checking but `[]` does. That's probably going to be confusing for potential users of your wrapper. Also, `assert(alloc);` is not generally an acceptable way to check for allocation failure. Your class still has to work properly when compiled with `-DNDEBUG`. In C++ normally you'd throw an exception if allocation fails inside a constructor (because there's no return value you can use). (In practice on x86-64 Linux, memory overcommit makes it nearly impossible for `new` to return 0.) – Peter Cordes Mar 04 '19 at 05:14
  • It might be more natural to take a `lo` offset and a *size* in the constructor, not a high index. IDK, maybe your use-case for this makes that natural. – Peter Cordes Mar 04 '19 at 05:22
  • @Peter Cordes: Good idea to add `at` like `std::vector` and use exceptions instead of assertions for bounds checking. I've updated the code. I used to be a `(lo, size)` guy too, but once I switched over all my classes to use a `(lo, hi)` interval type, everything was **so** much cleaner. Think of how much simpler the interval intersection checks are, for example... and just generating some consecutive intervals... I'm never going back to `(lo, size)`. True, you often want to calculate the size, but `(lo, hi)` is more symmetrical for so many algorithms. Like a basic lo..hi-1 loop on the index.. – personal_cloud Mar 04 '19 at 18:58
  • re: "True, you often want to calculate the size". That is a good point... I added a `size()` method. – personal_cloud Mar 04 '19 at 19:22