1

I am writing a program and there was a very subtle error that was hard to recognize.

I examined the program for hours and it looks like it's from my misuse of resize() in allocating a new vector in 2D vector, from my misuse of back() in vector, or from updating a reference variable.

I wrote a simpler code that contains similar problem in my original program for demonstration:

int main() {

    vector<vector<int>> A(1);
    vector<int> &recent = A.back();

    recent.emplace_back(50);
    cout << A.back()[0] << endl; //prints 50

    A.resize(2);
    A[1] = vector<int>();
    cout << A.size() << endl;    //prints 2
    recent = A.back();
    cout << A[1].size() << endl; //prints 0
    recent.emplace_back(20);     //Segmentation fault occurs!!
    cout << A[1][0] << endl;
}

Segmentation fault occurs when I tried to emplace_back(20), although in my original program it doesn't throw any error and doesn't emplace the element either.

Possible cause of problem, in my opinion is:

  1. I used resize() to allocate a new vector after the current last position of the 2D vector A, because I didn't know how to emplace_back() an empty vector.

2, 3. In recent = A.back();, I'm not sure if I am updating the reference variable(defined as vector<int> &recent) correctly, and if back() gives the correct reference to the newly allocated vector at the end of the 2D vector A.

The logic looked perfectly fine, but obviously I am doing something wrong.

What am I doing wrong, and what can I do instead?

Eric
  • 2,635
  • 6
  • 26
  • 66
  • ` A[1] = vector();` can you explain what this line does? Because `resize` calls constructor by iteself. –  Sep 15 '15 at 10:06
  • @Satus I know. I added that expression just to see if it works with it. It didn't work with or without it. – Eric Sep 15 '15 at 10:07
  • `A.back()` already returns a reference that tracks the last element. Why add another layer? – TemplateRex Sep 15 '15 at 10:38
  • @TemplateRex the variable name for 2D vector in my program was too lengthy and I wanted to give a meaningful name to the last element in it. – Eric Sep 15 '15 at 10:41
  • then just add a reference with a short name to track the entire 2D vector, and access the `.back()` member of that shorthand: `auto& vec_ref = my_long_vector_vector; vec_ref.back();`. – TemplateRex Sep 15 '15 at 10:50
  • I added an answer to that effect. – TemplateRex Sep 15 '15 at 10:55

3 Answers3

3

References in C++ cannot be "updated". The call to resize may (and likely will) invalidate any reference to the original content of the vector. Thus recent is a dangling reference after A.resize(2);.

When creating the initial A here

std::vector<std::vector<int>> A(1);

the outer vector is required to be able to store one single vector. If you add another std::vector<int> to A the first element of A is likely to move to another memory location. Since recent will always refer to the old memory location you see the segfault.

See 'c++ Vector, what happens whenever it expands/reallocate on stack?' for how vectors work.

On the question how to circumvent this: If you know the size of the vector in advance you could use reserve to prevent the vector A from reallocating its contents. You'd nevertheless face the problem that references cannot be "reassigned". You can always use A.back() to refer to the last element.

You can use a function taking a reference argument which will be bound upon calling the function:

void do_stuff(std::vector<int> & recent)
{
  // do stuff with recent
}

std::vector<std::vector<int>> A;
while (condition)
{
  // add whatever to A
  A.emplace_back(std::vector<int>{});
  // do stuff with last element
  do_stuff(A.back());
}

Another way to do it is with scope:

std::vector<std::vector<int>> A(1);
{
  std::vector<int> &recent = A.back();
  recent.emplace_back(50);
  std::cout << A.back()[0] << std::endl; //prints 50
  A.resize(2);
} // recent goes out of scope here

std::cout << A.size() << std::endl;    //prints 2
{
  std::vector<int> &recent = A.back(); // another recent indepedant of first one
  std::cout << A[1].size() << std::endl; //prints 0
  recent.emplace_back(20); 
  std::cout << A[1][0] << std::endl;  // prints 20
}
Community
  • 1
  • 1
Pixelchemist
  • 24,090
  • 7
  • 47
  • 71
  • Is there any workaround for this? I am doing this in a while loop and want to keep a reference to the last vector in 2D vector. – Eric Sep 15 '15 at 10:08
  • Use pointer or simply `A.back()` without putting it into reference? –  Sep 15 '15 at 10:09
  • I try to minimize use of pointers. Does this seem like a proper situation to use a pointer? – Eric Sep 15 '15 at 10:10
  • @user2418202 Use `std::list` instead of `std::vector` if you insist on using pointers and/or references to existing items in a resizeable container. – PaulMcKenzie Sep 15 '15 at 10:11
  • Thank you. I just used the scope idea. The while loop contained a lot of `if`-`else if`, and I defined reference variable `recent` separately at scopes in which I will use the variable many times. – Eric Sep 15 '15 at 10:34
3

Let's step through the code line by line.

vector<vector<int>> A(1);
vector<int> &recent = A.back();

Here we create a vector with one default-constructed vector<int> as its contents. We then bind a reference to the last and only element.

recent.emplace_back(50);
cout << A.back()[0] << endl; //prints 50

We now emplace 50 into the sole vector and print it.

A.resize(2);

Now we resize the vector. If space needs to be reallocated, all iterators, pointers and references to the contents are now invalid.

A[1] = vector<int>();
cout << A.size() << endl;    //prints 2

This is fine, as there is enough space in A.

recent = A.back();

BANG

This assignment doesn't rebind recent, it tries to assign A.back() to the referencee. If space was reallocated for A, recent is no longer a valid reference, so we run off into the realm of undefined behaviour.

Quite honestly, using A.back() directly rather than maintaining a reference to it is probably your best bet. If you absolutely want to hold some kind of reference to the end, this is a reasonable use of a non-owning pointer.

TartanLlama
  • 63,752
  • 13
  • 157
  • 193
  • I didn't use A.back() directly because in my original program it would be `keys_in_transaction.back()` and pretty lengthy, and I wanted to give a name like `recent_keys`. Should I use something like `shared_ptr> recent = shared_ptr> (A.back());` even if I am not very good at using smart pointers or pointers? – Eric Sep 15 '15 at 10:18
  • You shouldn't use a `std::shared_ptr`, because that denotes ownership of the object. A raw pointer is more suitable. – TartanLlama Sep 15 '15 at 10:19
  • @user2418202 You should neither use a `shared_ptr` nor a raw pointer. – Pixelchemist Sep 15 '15 at 10:23
2

From the discussion in the comments, it appears that your original problem was:

vector<vector<int>> very_long_name_that_cannot_be_changed;

and that you want a shorthand notation to access the last element of this:

auto& short_name = very_long_name_that_cannot_be_changed;
short_name.resize(100);             // will expand the vector, but not change the reference
short_name.back().emplace_back(20); // presto, quick accesss to the last element.

This is proof against resizing, because the reference just tracks the vector, not its last element.

TemplateRex
  • 69,038
  • 19
  • 164
  • 304