11

I have the following type:

std::vector<std::vector<int>> indicies

where the size of the inner vector is always 2. The problem is, that vectors are non-contiguous in memory. I would like to replace the inner vector with something contiguous so that I can cast the flattened array:

int *array_a = (int *) &(a[0][0])

It would be nice if the new type has the [] operator, so that I don't have to change the whole code. (I could also implement it myself if necessary). My ideas are either:

std::vector<std::array<int, 2>>

or

std::vector<std::pair<int, int>>

How do these look in memory? I wrote a small test:

#include <iostream>
#include <array>
#include <vector>
int main(int argc, char *argv[])
{
    using namespace std;

    vector<array<int, 2>> a(100);

    cout << sizeof(array<int, 2>) << endl;

    for(auto i = 0; i < 10; i++){
        for(auto j = 0; j < 2; j++){
            cout << "a[" << i << "][" << j << "] " 
                <<&(a[i][j]) << endl;
        }
    }
    return 0;
}

which results in:

8
a[0][0] 0x1b72c20
a[0][1] 0x1b72c24
a[1][0] 0x1b72c28
a[1][1] 0x1b72c2c
a[2][0] 0x1b72c30
a[2][1] 0x1b72c34
a[3][0] 0x1b72c38
a[3][1] 0x1b72c3c
a[4][0] 0x1b72c40
a[4][1] 0x1b72c44
a[5][0] 0x1b72c48
a[5][1] 0x1b72c4c
a[6][0] 0x1b72c50
a[6][1] 0x1b72c54
a[7][0] 0x1b72c58
a[7][1] 0x1b72c5c
a[8][0] 0x1b72c60
a[8][1] 0x1b72c64
a[9][0] 0x1b72c68
a[9][1] 0x1b72c6c

It seems to work in this case. Is this behavior in the standard or just a lucky coincidence? Is there a better way to do this?

Stein
  • 3,179
  • 5
  • 27
  • 51
  • [The elements of a vector are stored contiguously](http://stackoverflow.com/q/849168/238902) – default Dec 11 '16 at 00:30
  • 3
    I think the question is: Could there be padding in `std::pairs` and `std::arrays`? Just that the `std::vector` stores its elements contiguously is not enough here. – Wintermute Dec 11 '16 at 00:39
  • A vector of vectors **is not guaranteed to store the elements contiguously**. Only the objects themselves (the inner vectors as addresses or whatever representation is being used) are stored contiguously, but not the data each individual inner-vector data pointer points to. – vsoftco Dec 11 '16 at 00:50
  • 1
    @Wintermute http://stackoverflow.com/questions/19103244/is-the-size-of-stdarray-defined-by-standard –  Dec 11 '16 at 01:00
  • I don't believe you can rely on this: https://stackoverflow.com/questions/40476058/does-stdvectorsimd-wrapper-have-contiguous-data-in-memory/40476277#40476277 – Galik Dec 11 '16 at 01:31

3 Answers3

2

An array<int,2> is going to be a struct containing an array int[2]; the standard does not directly mandate it, but there really is no other sane and practical way to do it.

See 23.3.7 [array] within the standard. There is nothing in the standard I can find that requires sizeof(std::array<char, 10>)==1024 to be false. It would be a ridiculous QOI (quality of implementation); every implementation I have seen has sizeof(std::array<T,N>) == N*sizeof(T), and anything else I would consider hostile.

Arrays must be contiguous containers which are aggregates that can be initialized by up to N arguments of types convertible to T.

The standard permits padding after such an array. I am aware of 0 compilers who insert such padding.

A buffer of contiguous std::array<int,2> is not guaranteed to be safely accessed as a flat buffer of int. In fact, aliasing rules almost certainly ban such access as undefined behaviour. You cannot even do this with a int[3][7]! See this SO question and answer, and here, and here.

Most compilers will make what you describe work, but the optimizer might decide that access through an int* and through the array<int,2>* cannot access the same memory, and generate insane results. It does not seem worth it.

A standards compliant approach would be to write an array view type (that takes two pointers and forms an iterable range with [] overloaded). Then write a 2d view of a flat buffer, with the lower dimension either a runtime or compile time value. Its [] would then return an array view.

There is going to be code in boost and other "standard extension" libraries to do this for you.

Merge the 2d view with a type owning a vector, and you get your 2d vector.

The only behaviour difference is that when the old vector of vector code copies the lower dimension (like auto inner=outer[i]) it copies data, afer it will instead create a view.

Yakk - Adam Nevraumont
  • 262,606
  • 27
  • 330
  • 524
  • IMO it's permitted to alias a struct consisting of `int[2]` as ints – M.M Dec 11 '16 at 04:26
  • @m.m But not a struct containing 2 such arrays (the layout compatibility/same address refers to the first member), nor an array of 2 such structs as one buffer of `int`. So `int* ptr=?; ptr[3];` *cannot* refer to a member of any `array` in a defined way under the standard, regardless of what `?` is. – Yakk - Adam Nevraumont Dec 11 '16 at 05:08
  • The standard requires an array to have no padding between its elements. That applies recursively to arrays of arrays. The layout of int[4] and int[2][2] and array[2] is the same by definition. – Maxim Egorushkin Dec 15 '16 at 05:26
  • @max does it require that a struct has no padding *after* its elements? ie, that `struct A{ char x[2]; };` must have `sizeof(A)==2`? If so, where? – Yakk - Adam Nevraumont Dec 15 '16 at 08:46
  • http://en.cppreference.com/w/c/language/sizeof : When applied to an operand that has structure or union type, the result is the total number of bytes in such an object, including internal and trailing padding. The trailing padding is such that if the object were an element of an array, the alignment requirement of the next element of this array would be satisfied, in other words, sizeof(T) returns the size of an element of a T[] array. – Maxim Egorushkin Dec 15 '16 at 14:13
  • In other words, the trailing padding is only added to satisfy the layout requirements of an array (no padding between elements). What would be the reason for trailing padding in `struct X { char x[2]; };`? – Maxim Egorushkin Dec 15 '16 at 14:23
  • @MaximEgorushkin That is not the standard. I was asking where the standard mandated that. I'd bet the standard mandates that `T[N]` has size `N*sizeof(T)`, but even that I'd want a quote for (are compilers *permitted* to put padding at the *end* of an array?), which you claimed the standard required above. Second, aliasing rules in the standard are strict; even with no padding, access through different types of pointers and references can not be defined. – Yakk - Adam Nevraumont Dec 15 '16 at 14:25
  • Your sentences 1 and 3 in the answer are wrong, hence the downvote. I am not going to add any more evidence because your assertions are made without references, which means they can be dismissed without references. – Maxim Egorushkin Dec 15 '16 at 14:33
  • @Yakk Look at the C++ version for sizeof, they did not bother elaborating. – Maxim Egorushkin Dec 15 '16 at 14:37
  • @MaximEgorushkin I have now mentioned the [array] section in the standard, which is extremely brief. I did not quote it in its entirety, but it places no direct restrictions on `sizeof(std::array)`; `sizeof` is never mentioned. Citations for the pointer aliasing rules in C++ may follow. I am uncertain what your last comment refers to ("C++ version for sizeof"). – Yakk - Adam Nevraumont Dec 15 '16 at 14:45
  • @MaximEgorushkin And 3 other SO answers cited for aliasing of arrays-of-arrays into arrays being illegal; arrays-of-structs-containing-arrays being aliased into an array is less so. Some of those answers cite parts of the standard. The standard doesn't state "this is illegal", but treating pointers-to-X as pointers-to-Y is only permitted in narrow ways, and this is not one of them. There is no citation I can provide that "the standard never permits this" other than the whole standard, because a pargraph anywhere added to the standard *could* permit it; you cannot cite a negative. – Yakk - Adam Nevraumont Dec 15 '16 at 14:55
  • @Yakk I do not have access to the standard pdf's at the moment, so I cannot look it up. You are right about strict alignment requirements and casting UB. But I would need to look up std::array. However, it is original Boost rationale was to make plain C arrays better C++ citizens with begin/end and disable the pointer decay, if my memory serves me right. In other words, it was meant to be just a type safe wrapper over the underlying built-in array type. – Maxim Egorushkin Dec 15 '16 at 14:59
  • I meant _C++ version of sizeof __documentation__._ – Maxim Egorushkin Dec 15 '16 at 15:03
2

Is there a better way to do this?

I recently finished yet-another-version of Game-of-Life.

The game board is 2d, and yes, the vector of vectors has wasted space in it.

In my recent effort I chose to try a 1d vector for the 2d game board.

typedef std::vector<Cell_t*>  GameBoard_t;

Then I created a simple indexing function, for when use of row/col added to the code's readability:

inline size_t gbIndx(int row, int col)
  { return ((row * MAXCOL) + col); }

Example: accessing row 27, col 33:

Cell_t* cell = gameBoard[ gbIndx(27, 33) ];

All the Cell_t* in gameBoard are now packed back to back (definition of vector) and trivial to access (initialize, display, etc) in row/col order using gbIndx().


In addition, I could use the simple index for various efforts:

void setAliveRandom(const GameBoard_t& gameBoard)
{
   GameBoard_t  myVec(m_gameBoard); // copy cell vector

   time_t seed = std::chrono::system_clock::
        now().time_since_epoch().count();

   // randomize copy's element order
   std::shuffle (myVec.begin(), myVec.end(), std::default_random_engine(seed));

   int count = 0;
   for ( auto it : myVec )
   {  
      if (count & 1)  it->setAlive(); // touch odd elements
      count += 1;
   }
}

I was surprised by how often I did not need row/col indexing.

2785528
  • 5,438
  • 2
  • 18
  • 20
  • g++ v6.2 tells me that "std::default_random_engine(...)" is implementation defined. I now use std::mt19937_64 gen(rd) with std::random_device rd; – 2785528 Jan 19 '17 at 04:35
-4

As far as I know, std::vector are contiguous in memory. Take a look at this questions:

Why is std::vector contiguous?,

Are std::vector elements guaranteed to be contiguous?

In case you have to resize an inner vector, you wouldn't have the whole structure contiguous, but the inner vectors would still be it. If you use a vector of vectors, though, you'd have a fully contiguous structure (and I edit here, sorry I misunderstood your question) meaning that the pointers that point to your inner vectors will also be contiguous.

If you want to implement a structure that is always contiguous, from the first element of the first vector to the last element of the last vector, you can implement it as a custom class that has a vector<int> and elems_per_vector that indicates the number of elements in each inner vector.

Then, you can overload the operator(), so to access to a(i,j) you are actually accessing a.vector[a.elems_per_vector*i+j]. To insert new elements, though, and in order to keep the inner vectors at constant size between them, you'll have to make as many inserts as inner vectors you have.

Community
  • 1
  • 1
J.Checa
  • 1
  • 1
  • A ´std::vector` stores its elements contiguously but not locally. That is to say, in a vector of vectors, each inner vector stores its elements contiguously, but the last element of one inner vector will not generally be contigously stored with the first element of the next inner vector, which is what OP requires. – Wintermute Dec 11 '16 at 00:42
  • 1
    @J. Checa *If you use a vector of vectors, though, you'd have a fully contiguous structure* That's false. A vector of vectors stores indeed the inner vectors (as objects) contiguously. However, each inner vector stores internally a pointer that represents its data. The pointers themselves are not at all guaranteed to point to contiguously memory areas. – vsoftco Dec 11 '16 at 00:45
  • Sorry, I misunderstood the OP question. He seem to want all the elements, from the first element of first vector, to the last elemento of last vector, to be aligned in memory. Yes, indeed, if you create a vector of vectors, you get a contiguous zone of memory with pointers to the inner vectors. Also, my last statement led to confusion, I meant to say that all your elements will be contiguous, referring to the pointers themselves. I'll edit my answer to avoid confusion. Thanks @vsoftco and Wintermute for your comments :D – J.Checa Dec 11 '16 at 01:03