15

In my code a have a global vector of Node object and a local vector of Node pointers:

#include<cstdio>
#include<cstdlib>
#include<vector>

using namespace std;

class Node {
    int n;

public:
    Node(int i) : n(i);
    int getN() { return n; }
};

vector<Node> v;

int main() {
    vector<Node*> p;
    v.push_back(Node(1));
    p.push_back(&v[0]);
    printf("first node id : %d\n", (*p[0]).getN());

    return 0;
}

I inserted a node object to global vector & inserted the pointer of that object in the local vector. Output of my above code is:

first node id : 1

However, if I change my main function to this:

int main()
{
    vector<Node*> p;
    v.push_back(Node(1));
    p.push_back(&v[0]);
    v.push_back(Node(2));
    p.push_back(&v[1]);
    printf("first node id : %d\n", (*p[0]).getN());

    return 0;
}

The code prints a garbage value:

first node id : 32390176

I can't figure out the problem. Does the vector data structure changes the references of each object after an insertion ? How can I fix this ?

einpoklum
  • 118,144
  • 57
  • 340
  • 684
palatok
  • 1,022
  • 5
  • 20
  • 30
  • Made your code slightly more terse (and using an initialization list in the ctor). – einpoklum Jan 10 '16 at 17:00
  • 1
    I'm pretty sure we have a duplicate for that. Consulting the reference documentation (see [Iterator invalidation](http://en.cppreference.com/w/cpp/container/vector)) should have been clear enough anyways. – πάντα ῥεῖ Jan 10 '16 at 17:04
  • @πάνταῥεῖ A good starting point. But while reference-invalidation implies iterator-invalidation, the reverse does not hold. – Deduplicator Jan 10 '16 at 17:18
  • @palatok: You know, you could just have your example with plain `int`s rather than `Node`s wrapping `int`s, it'll be just the same... – einpoklum Jan 10 '16 at 17:23
  • @πάνταῥεῖ: Many novice programmers don't know what iterators are when they start using C++ vectors, and would probably not realize something like "Iterator Invalidation" is relevant to what they're doing. – einpoklum Jan 10 '16 at 17:24

4 Answers4

18

"Does a vector change the references after an insertion?"

Possibly, yes. An std::vector may reallocate its (heap) storage when you add/push_back() additional elements, invalidating all pointers:

Iterator [read: Pointer] Invalidation

(for operations) push_back, emplace_back ... If the vector changed capacity, all of them [i.e. all iterators are invalidated]. If not, only end().

"How can I fix this?"

The above invalidation rule does not apply if a vector's capacity does not change due to an insertion - since vectors do not reallocate storage unnecessarily. So if you pre-set your vector's capacity to 2 in your example (say, with v.reserve(2)), the pointer will remain valid. If you don't know the size in advance, but you can delay the construction of the second vector (with the pointers), you don't have to reserve, you'll just have the size after inserting the last element.

The approaches above are highly unrecommended, however. If you were to make your vector constant - at least in the scope of a function in which you would construct and use the second vector - you would have a strong guarantee of non-reallocation. Alternatively, if you could determine the size in advance you might use an std::array, and it would be more fitting to use pointers into that container's storage:

Iterator Invalidation

As a rule, iterators to an array are never invalidated throughout the lifetime of the array.

You might also consider storing indices into your vector (although there, as well, the vector might shrink, invalidating the indices, or you might insert elements in the middle etc).

Anyway, I suspect that you might not actually want to do any of this, i.e. it seems to be a not-so-good solution to a problem which could be handled with a different approach altogether.

PS - If the vector has a custom allocator then everything I've written might be irrelevant.

Community
  • 1
  • 1
einpoklum
  • 118,144
  • 57
  • 340
  • 684
6

Yes, a push_back() on a vector invalidates all references (and pointers) to elements in that vector if it has to reallocate. There are various ways to work around this. If you know that your vector will have a particular number of nodes, you can use reserve(). In your example, you could reserve two elements:

int main()
{
    v.reserve(2);
    .
    .
    .
}

This will make sure the vector has preallocated enough storage so that it doesn't need to reallocate.

If you don't know the size ahead of time, then you'll have to change something about your approach. You might use a std::deque instead of a std::vector, since std::deque doesn't invalidate references when you use push_back(). You might store indices instead of pointers. Or you might need to push all the nodes into your vector before making the pointers.

int main()
{
    v.push_back(Node(1));
    v.push_back(Node(2));

    vector<Node*> p;
    p.push_back(&v[0]);
    p.push_back(&v[1]);

    printf("first node id : %d\n", (*p[0]).getN());

    return 0;
}
Vaughn Cato
  • 63,448
  • 5
  • 82
  • 132
  • Thanks, I can do that for now. But what if I don't know the size ? Is there any solution for that situation ? One of the reason for using vector in this code that i don't know the size :( – palatok Jan 10 '16 at 17:06
  • @palatok: I've extended my answer. – Vaughn Cato Jan 10 '16 at 17:11
  • Thanks, Inserting all nodes before collecting pointers seems a very good workaround. Thanks a lot !!! – palatok Jan 10 '16 at 17:14
5

What you are doing is undefined behavior for your vector p because the vector v can change where it's objects are stored.

A std::vector's memory is contiguous, so it may, after a number of push_backs, have to allocate a new block memory and copy it's contents to the new block. This will invalidate all the pointers that happened to point to the old memory location.

gedamial
  • 1,498
  • 1
  • 15
  • 30
Anon Mail
  • 4,660
  • 1
  • 18
  • 21
  • How can I fix this ? Is there any workaround for my problem using vector ? I have to use vector & pointers – palatok Jan 10 '16 at 17:02
  • If you **have** to use a `std::vector`, you can `reserve` all of your necessary space upfront (but you have to know what the size of your vector will be). This will prevent reallocations. – Anon Mail Jan 10 '16 at 17:04
  • @palatok: What do you mean "fix this"? You can't change the standard library. What are you trying to accomplish? What are your requirements, and which parts of your design are you willing to change? – Beta Jan 10 '16 at 17:06
  • Sorry , I didn't mean fixing in standard library. I meant how can I fix my code with the same data structure. Anyway, now I got the idea to fix my problem, Thanks – palatok Jan 10 '16 at 17:16
1

You have stumbled upon one of the great known "dark corners" of C++: the great "iterator invalidation." Definitely familiarize yourself intimately with this:

Iterator invalidation rules

In particular, you are hitting this reality:

vector: all iterators and references before the point of insertion are unaffected, unless the new container size is greater than the previous capacity (in which case all iterators and references are invalidated) [23.2.4.3/1]

(the emphasis is mine)

Now, about your issue. You can either make sure the vector never re-allocates. Or, you can use a different container that does not have this issue. There are compromises among all the container types, depending on your needs. Check out the other question thoroughly and make an informed decision.

Community
  • 1
  • 1
johnbakers
  • 24,158
  • 24
  • 130
  • 258
  • 1
    It's not a "dark corner". Pointers into memory which you don't manage yourself are inherently untrustworthy... – einpoklum Jan 10 '16 at 17:11
  • @einpoklum well, "dark corner" is a vague and subjective term at best, but I have seen numerous posts on HN in the last couple months that specifically isolate iterator invalidation as a significant "dark corner" of the language, so I guess it is a popular sentiment. But you can call it what you want, it's a silly term. – johnbakers Jan 10 '16 at 17:13
  • I actually don't know what HN means, but - ok, fair enough... I guess it's more of a dark corner if you're coming from a language with iterators which are kept valid while things change under the hood. – einpoklum Jan 10 '16 at 17:22