16

I have a std::vector<int> and a pointer int* that points to an element in the vector. Let’s say the pointer points to the third element: pointer=&vector.at(2). If I now shuffle the vector, will it still point to the same element (the third) or will it point the the new location where the element which used to be the third has now moved?

After that, I’d like to make the question a little bit more general: How do pointers and iterators to elements in a vector behave when the vector is expanded or reduced?

davepmiller
  • 2,620
  • 3
  • 33
  • 61
dani
  • 3,677
  • 4
  • 26
  • 60
  • 1
    A pointer is and address, a memory location. it points to whatever happens to be at its address; it always points to the same address unless the pointer itself is changed. – The Paramagnetic Croissant Feb 23 '15 at 22:21
  • 1
    Take a moment to think about what is a pointer? Pointer is an address to memory. Doesn't matter how you shuffled the vector, the pointer would still point to the same address. – Tony J Feb 23 '15 at 22:22
  • They stay the same, and that's why it's a problem. – iwolf Feb 23 '15 at 22:22
  • You can't move objects in C++. (You can have move constructors etc., but those aren't what "moving an object" means in the context of this question) – user253751 Feb 24 '15 at 07:04

8 Answers8

17

The pointer will continue to point to the same location, so when you shuffle, it'll point to whatever element has been moved into the location you specified.

When you expand the size of a vector, all existing pointers and iterators into the vector can become invalid. When you shuffle, they continue to refer to the same location, which will (usually) contain a different value than it did before shuffling.

Reducing the size of a vector will depend on exactly how you do that. One way is to create a temporary vector as a copy of the current vector, swap the two, then destroy the temporary (usually implicitly, by letting it go out of scope). If you do this, the pointers will be into the temporary, and be invalidated when it's destroyed.

If you use shrink_to_fit that (probably) won't invalidate iterators/pointers, but may not have any effect (the standard specifies that it's a non-binding request, and doesn't say anything about it invalidating iterators/pointers).

Jerry Coffin
  • 476,176
  • 80
  • 629
  • 1,111
  • 5
    "all existing pointers and iterators into the vector become invalid" may be true, but probably more correct as "existing pointers and iterators are not guaranteed to stay valid". – thang Feb 23 '15 at 22:31
  • Swapping with an empty vector will do more than *probably* invalidating them, won't it? `std::vector::swap` guarantees that pointers to elements remain valid, but pointing into the other container (http://stackoverflow.com/questions/8190950/may-stdvector-make-use-of-small-buffer-optimization). So it's deterministic, the pointers are invalid if and only if you do something on the formerly-empty vector to make them invalid, such as destroying it. – Steve Jessop Feb 23 '15 at 23:44
  • 4
    @thang Those mean the same thing. – philipxy Feb 24 '15 at 00:03
  • @SteveJessop: good point. I'm not sure what I was thinking when I wrote that part, but it wasn't really very accurate. I've tried to improve it a bit, but if you see any other problems I'd be happy to hear about them. – Jerry Coffin Feb 24 '15 at 00:19
  • 2
    How do you actually `shrink_to_fit` without invalidating pointers and iterators? – T.C. Feb 24 '15 at 06:05
  • 1
    @T.C.: [Some] memory managers can split an allocated block in place. Others would likely just ignore the request--but either way, the standard doesn't seem to give any permission for `shrink_to_fit` to invalidate iterators. – Jerry Coffin Feb 24 '15 at 07:34
  • 3
    @JerryCoffin See [LWG 2223](http://www.open-std.org/jtc1/sc22/wg21/docs/lwg-active.html#2223) about the `shrink_to_fit` vs iterator invalidation issue. – dyp Feb 24 '15 at 11:54
11

If I now shuffle the vector, will it still point to the same element (the third) or will it point the the new location where the used-to-third element has moved?

Shuffling elements is just a matter of copying/swapping elements through the various "buckets" in the array, while your pointer just points to "that fixed position in memory". So, it'll keep pointing to whatever stays in third position in the array.

Then I like to make the question a little bit more general: How do pointer and iterators to elements in a vector behave when the vector is expanded, reduced or shuffled?

Expand: all iterators/references/pointers may be invalidated.

Reduced: as far as they point to elements before those removed, they are kept valid unless you do a shrink_to_fit. Iterators/pointers to elements you removed are obviously invalid.

Shuffled: you are moving around stuff without causing reallocations, so iterators and references are still valid.

Notice that all this stuff is typically reported in most C++ documentation sources.


The conceptual rule to remember for vectors is that they are just a box around a dynamic array, and iterators and pointers to elements are conceptually the same thing (actually, std::vector<T>::iterator could be a typedef for T *). The same holds for references (which are pointers in disguise).

If an operation may need reallocate the array (=the array needs to grow, or you explicitly requested it to shrink), then all iterators/pointers/references are going to be invalidated. If you remove elements, then pointers pointing past the "conceptual end" of the vector will point to invalid elements. If the size stays the same, no reallocation needs to occur.

Matteo Italia
  • 123,740
  • 17
  • 206
  • 299
10

If the vector is shuffled without being resized then the pointer still points to the same location , which will probably contain a different element.

If the vector is resized to be larger, then the pointer is said to be "invalidated" and it has the same status as an uninitialized pointer, i.e. evaluating it or trying to read through it causes undefined behaviour.

M.M
  • 138,810
  • 21
  • 208
  • 365
  • How can you say that the pointer would **probably** point to a different element, without knowing its content? Imagine a vector of only 3's... – Theolodis Feb 24 '15 at 09:20
  • @Theolodis it seems improbable to me that someone would be shuffing a vector containing only 3's – M.M Feb 24 '15 at 09:21
9

The address will not change, but the value stored at that address will.

#include <iostream>
#include <algorithm>

static void print_vec(const std::vector<int>& v) {
    for (int i = 0; i < v.size();  ++i) {
        std::cout << "Value: " << v.at(i) << " Address: " << &v.at(i) << std::endl;
    }
}

static void shuffle_vec(std::vector<int>& v) {
    auto engine = std::default_random_engine{};
    std::shuffle(v.begin(), v.end(), engine);
}

int main() {
    std::vector<int> v{1, 2, 3, 4, 5};
    std::cout << "Before Shuffle: " << std::endl;
    print_vec(v);
    shuffle_vec(v);
    std::cout << "After Shuffle: " << std::endl;
    print_vec(v);

    return 0;
}

Output:

Before Shuffle: 
Value: 1 Address: 0x16eb320
Value: 2 Address: 0x16eb324
Value: 3 Address: 0x16eb328
Value: 4 Address: 0x16eb32c
Value: 5 Address: 0x16eb330
After Shuffle: 
Value: 3 Address: 0x16eb320
Value: 1 Address: 0x16eb324
Value: 5 Address: 0x16eb328
Value: 4 Address: 0x16eb32c
Value: 2 Address: 0x16eb330
davepmiller
  • 2,620
  • 3
  • 33
  • 61
6

In practice, a vector is a code-maintained contiguous buffer of data. Each element is set up adjacent to the next in an array-like fashion.

When elements move around, in practice, they just move around. Pointers point to locations in that buffer, so if the element moves, in practice the pointer just ends up pointing somewhere else.

However, the C++ standard is more strict. When an iterator is invalidated, so are references and pointers to that location. There are a number of operations that can invalidate an iterator that do not, logically, change the fact that the array is actually going to be the same buffer. For example, if you .erase an element, it invalidates every iterator at that location and afterwards.

In practice a pointer to the element will end up pointing at what was the "next" element in the list, but the standard doesn't guarantee that.

std::shuffle does not invalidate any iterators. It just changes the values stored there. So a pointer to the nth element will both in practice, and in theory, still point to the nth element.

When the vector is expanded, if you expand it beyond .capacity(), all iterators are invalidated. In practice it actually moves the data to a new location, and the pointers are now danging pointers.

When you reduce (via .erase(it) or .erase(a,b)), all iterators at or after the first argument are invalidated. This means that references and pointers to these elements are also invalidated. In practice, they will now refer to elements "further down the chain" (if such elements exist), as neither .erase will cause your buffer to reallocate, but this is not guaranteed by the standard.

There are other operations that can invalidate iterators. .shrink_to_fit() can, as can vector<X>(vec).swap(vec) (the C++03 version of shrink-to-fit), and .reserve() and operations that grow the size beyond .capacity().

The operations that cause .capcity() to change will actually make your pointers invalid (or those that make the pointers point beyond-the-end) in practice and in theory.

Yakk - Adam Nevraumont
  • 262,606
  • 27
  • 330
  • 524
  • Why do you say "in practice" several times? (it's like that in theory too) – M.M Feb 23 '15 at 23:23
  • @matt `vec.erase(vec.begin())` in theory invalidates every pointer to any element of the vector. In practice, they still point into elements, but the standard does not talk about 'points to next element now' half-invalidation. I cannot think of an implementation that would break it, short of actually marking pointers invalid with a pathological language implementation. – Yakk - Adam Nevraumont Feb 23 '15 at 23:58
  • Code not following the standard has undefined behaviour. Bringing up typical implementations of undefined behaviour is misleading and irrelevant. (What to you is the *point* of doing so?) – philipxy Feb 24 '15 at 04:50
  • @philipxy I can think of 3 good reasons, and 1 mediocre one. First, recognizing what kind of mistakes you have to be careful of, because testing will not find it: knowing what the expected behavior (even if undefined) tells you lots. Second, I could be wrong: while references and iterators may be invalidated, the contiguity guarantees and pointer arithmetic guarantees in the standard might make pointers more well defined than I think! Third, when you have a situation where every implementation MUST behave some way, future standards may mandate it. Forth, sometimes you just gotta cheat. – Yakk - Adam Nevraumont Feb 24 '15 at 14:43
5

Read the documentation for every function you call. If you don't know when and how you may call it and what it does then why are you using it?

In general you cannot rely on implementation notions like addresses or arrays and you cannot rely on a test program. You must read when an iterator is or is not invalidated for what elements for a particular container, iterator and operator.

vector::shrink_to_fit invalidates all iterators
vector::resize to same or smaller invalidates exactly the iterators beyond the new size
vector::resize to larger invalidates all iterators

From the C++14 standard [iterator.requirements.general]:

[P]ointers are iterators. The effect of dereferencing an iterator that has been invalidated is undefined.

http://en.cppreference.com/w/cpp/container/vector

std::vector is a sequence container that encapsulates dynamic size arrays.
The elements are stored contiguously, which means that elements can be accessed not only through iterators, but also using offsets on regular pointers to elements.

iterator     RandomAccessIterator

Iterator invalidation
swap, std::swap     Never
shrink_to_fit     Always
resize     If the vector changed capacity, all of them. If not, only those after the insertion point.

http://en.cppreference.com/w/cpp/container/vector/resize

Vector capacity is never reduced when resizing to smaller size because that would invalidate all iterators, rather than only the ones that would be invalidated by the equivalent sequence of pop_back() calls.

After vector::shuffle iterators/pointers are unchanged but dereference to new values.

Because shuffle uses swap which leaves iterators unchanged:

http://en.cppreference.com/w/cpp/algorithm/random_shuffle

template< class RandomIt, class URNG >
void shuffle( RandomIt first, RandomIt last, URNG&& g );

RandomIt must meet the requirements of ValueSwappable and RandomAccessIterator.

http://en.cppreference.com/w/cpp/concept/Iterator

Iterator is the base concept used by other iterator types: InputIterator, OutputIterator, ForwardIterator, BidirectionalIterator, and RandomAccessIterator. Iterators can be thought of as an abstraction of pointers. [...]

`- lvalues of type It satisfy Swappable [...]

http://en.cppreference.com/w/cpp/concept/ValueSwappable

Type T is ValueSwappable if
    1) Type T satisfies the Iterator requirements
    2) For any dereferencable object x of type T (that is, any value other than the end iterator), *x satisfies the Swappable requirements.

http://en.cppreference.com/w/cpp/concept/Swappable

using std::swap;
swap(u, t);

After the call, the value of t is the value held by u before the call, and the value of u is the value held by t before the call.

philipxy
  • 14,867
  • 6
  • 39
  • 83
3

As people have mentioned, the pointer is "pointing" to a location in memory, regardless of the contents that are there. There is actually some really interesting stuff you can do, like having an array of 5 elements, but printing out the value at position 6, even though it is not within the scope of your array. By accessing an array with something like array[5] when you have only declared it to be 5 elements long, you will end up with undefined behaviour, essentially meaning that a variety of things could happen, with each run potentially returning something completely different. See philipxy's comments below for some very useful links delving into this concept.

So with that out of the way, here is a little bit of code you could test to actually see this effect.

#include <vector>
#include <iostream>

using namespace std;

int main()
{
    vector<int> values (5);

    for (int i = 0; i < 5; i++)
        values[i] = i;

    for (int i = 0; i < 5; i++)
        cout << values[i] << " ";

    //Initialise the pointer so that it is pointing at the first element in vector
    int* pointer = &values[0];

    //By incrementing, we expect it to be pointing at the second element, which should be 1
    pointer++;

    cout << endl << "Pointer " << *pointer << endl;

    //Reverse the order of the vector
    reverse(values.begin(), values.end());

    for (int i = 0; i < 5; i++)
        cout << values[i] << " ";

    cout << endl << "Pointer " << *pointer << endl; 

    return 0;
}

The result of this code is:

Results of output from code

So we can see that the pointer hasn't actually changed where it is pointing, but that cell in memory has been altered, so dereferencing the pointer will yield a different result.

Stevo
  • 397
  • 6
  • 26
  • Accessing an array or vector beyond its size is undefined behaviour. You cannot correctly say it will do anything in particular. – philipxy Feb 24 '15 at 04:14
  • @Philipxy Yeah, I think that is a reasonable comment. All I meant was that trying to go outside the defined region of the array won't necessarily be stopped, it will happily try to use the cell, whether successful or not. But if that statement is incorrect, could you please indicate scenarios where this is not the case? I'm always interested in learning more, and I will then update the answer accordingly. – Stevo Feb 24 '15 at 06:29
  • 1
    The C/C++ (etc) standards define what programs do, and some have [undefined behavior](http://blog.regehr.org/archives/213). One simply cannot expect anything but what is specified. Read about the [as-if rule](http://stackoverflow.com/questions/15718262/what-exactly-is-the-as-if-rule). Or [specification](https://en.wikipedia.org/wiki/Programming_language_specification) & [semantics](https://en.wikipedia.org/wiki/Semantics_%28computer_science%29) of [programming languages](https://en.wikipedia.org/wiki/Programming_language). See an earlier comment by me: What to you is the *point* of doing so? – philipxy Feb 24 '15 at 09:20
  • @philipxy I have finished reading the first page of the undefined behaviour link you have provided, and am very interested to continue with the other two pages. The as-if rule was also a very interesting read, though seemed somewhat funny in the cases where it didn't necessarily need to be applied (with the copy/move). Thank you so much for linking to this information, it really is a very interesting read, even though a few of the concepts were a little obscure to me. I'll edit my answer now to reflect what I've learnt. – Stevo Feb 24 '15 at 10:51
2

It completely depends on how "std::vector" is implemented. I am not sure if there are any guarantees about that.

EDIT: I just learned, that the C++ standard indeed is a lot stricter than I thought (thanks to philipxy). It does specify that vector must internally behave like a C Array. See

http://herbsutter.com/2008/04/07/cringe-not-vectors-are-guaranteed-to-be-contiguous/

So forget about the rest, at least if you have an implementation which conforms to at least C++03.

If you would for example implement std::vector as a linked list (not likely), then shuffling, reducing size, etc will not do anything.

If the "std::vector" internally uses something like "int []" to store its elements (likely), then reshuffling the elements will probably mean that your pointer will now point to a different value then before (what Stevo tried out).
If you resize your vector in this case, then again it depends entirely on internal implementation. A resize might allocate a new "int []" and copy the old content over. In this case your pointer will point to now unallocated memory, so all havoc might break loose.
If you are lucky (depending on the implementation) then a shrinking or growing of the vector by a "small" amount might not do anything (your pointer is still valid).

Summary: Don't do that ;-) (using pointers and afterwards modifying your container).

Ingo Blackman
  • 991
  • 8
  • 13