Using iterators in nested loop through two vectors to match elements and erase from the vectors

Question

I wanted to use the following loops to match elements from two vectors and then delete them from the vectors:

for(auto it1=left_unprocessed_event_arrays.begin(); it1!=left_unprocessed_event_arrays.end(); ++it1){
  for(auto it2=right_unprocessed_event_arrays.begin(); it2!=right_unprocessed_event_arrays.end(); ++it2){
    if(it1->header.stamp.nsec==it2->header.stamp.nsec){
      matching_event_arrays.push_back({*it1,*it2});
      left_unprocessed_event_arrays.erase(it1);
      right_unprocessed_event_arrays.erase(it2);
    }
  }
}

I then realised I couldn't do it like this, because using the erase() is making the iterators invalid.
Searching a solution brought me to this. Here someone suggests using the pointer that is returned by erase(), and then incrementing the iterator in the else-bracket like this:

std::vector<std::string>::iterator iter;
for (iter = m_vPaths.begin(); iter != m_vPaths.end(); ) {
    if (::DeleteFile(iter->c_str()))
        iter = m_vPaths.erase(iter);
    else
        ++iter;
}

But When I increment only in the innerloop, it won't correctly go through the outer-loop. I'm struggling to see how I can make this work for my nested loop.
So my question is: How can I implement the solution of the linked answer for a nested loop? Or if there is another/better solution: what's that?

Have you considered iterating from the end to the start with indices? This way your delete will never affect where you're at. One could also argue it will be faster because you will move less stuff around. Also, instead of having O(n²) complexity, if you can sort your objects in some way, then you would only need the current index/iterator on each of your vector, that would be closer to O(n) — AlexG, Nov 27 '18 at 11:51
@AlexG How does it not affect where I am at? Let's consider we are at element 0 in both vectors and the if condition applies. Now I erase element 0 from both vectors and then I increment the index and at the same time my old element 1 is element 0 now. So I will skip element 1 from the original vector. Or am I wrong here? — Max, Nov 27 '18 at 11:55
@AlexG Well, I guess I could sort them, but sorting is also taking time, right?! So what do I win? And besides that: performance doesn't matter here, because the vectors will have approximately 1 to 5 elements — Max, Nov 27 '18 at 11:58
` left_unprocessed_event_arrays.erase(it1); right_unprocessed_event_arrays.erase(it2); ` it1 and it2 are invalidated, aren't they? So it1++ and it2++ is undefined behavior. — Matthieu Brucher, Nov 27 '18 at 11:59
@MatthieuBrucher I know. That's what I wrote in the question, isn't it? — Max, Nov 27 '18 at 11:59
If you found a match, do you want to continue the inner loop or not? This is not clear from either the code or your description. — Max Langhof, Nov 27 '18 at 12:20
@MaxLanghof I wanted to check each element of the first vector with each element of the second vector, no matter if I found a match or not. — Max, Nov 27 '18 at 12:32
@Max If the arrays are {1 2 3} and {1 2 2 2 3}, would you want to match the `2` once or thrice? It looks like once, but you did not specify. — Max Langhof, Nov 27 '18 at 12:33
@MaxLanghof Ah okay, now I see what you want to know. It's irrelevant for me, because I can guarantee that there are no duplicate elements in the the field that I use as a matching criterion. — Max, Nov 27 '18 at 12:35
By the way, do these arrays happen to be sorted by time stamp? This algorithm could be significantly more efficient in that case... — Max Langhof, Nov 27 '18 at 12:50
@Max First off, if you iterate backwards on a 10-element vector, you would start at index 9, then 8, then 7 and so on. That means regardless if elements[9] is deleted or not, you would go to elements[8] next. As for your second point, yes, sorting does take time; it's O(N log N), which means it beats O(n²) hands down for large datasets. Your OP did not mention any number of elements, but now that you add that precision, it might not be as relevant. That's why it's important to [MCVE] — AlexG, Nov 27 '18 at 14:02
@MaxLanghof Even though I mentioned before, they were not sorted, actually they are. But as I said: Both vectors have just 1-5 elements (in most cases they have exactly 1 element). So performance is not really important here. — Max, Nov 27 '18 at 14:34
@AlexG Oh I got you now. I guess that would've been an option now. And of course you are right, a MWE would've been better. Sorry for that. Still don't think this question deserves 4 downvotes — Max, Nov 27 '18 at 14:36

Max Langhof · Accepted Answer · 2018-11-27T12:56:38.067

Using standard algorithms clarifies both what you want to do and makes the control flow straightforward (unlike goto):

for (auto it1 = left_unprocessed.begin(); it1 != left_unprocessed.end(); )
{
  auto eventsMatch = [&event1 = *it1](const auto& event2) {
    return event1.header.stamp.nsec == event2.header.stamp.nsec;
  };

  auto it2 = std::find_if(right_unprocessed.begin(), right_unprocessed.end(), eventsMatch);

  if (it2 != right_unprocessed.end())
  {
    matching_event_arrays.push_back({*it1, *it2});
    it1 = left_unprocessed.erase(it1);
    right_unprocessed.erase(it2);
  }
  else
  {
    ++it1;
  }
}

snake_style · Answer 2 · 2018-11-27T12:22:00.233

See at this example, maybe it will helps:

int main()
{
    vector<int> a = { 1, 2, 3, 7, 2 };
    vector<int> b = { 2, 3, 4, 9 };

    auto cmp = [&b](int x) {return std::find(b.begin(), b.end(), x) != b.end();};

    while(1)
    {
        auto it = std::find_if(a.begin(), a.end(), cmp);
        if(it == a.end())
            break;
        auto val = *it;
        a.erase(std::remove(a.begin(), a.end(), val), a.end());
        b.erase(std::remove(b.begin(), b.end(), val), b.end());
    }
}

score 0 · Answer 3 · answered Nov 27 '18 at 12:17

0

I would use a goto to exit from the nested loop like this:

for(auto it1=left_unprocessed_event_arrays.begin();   i1!=left_unprocessed_event_arrays.end();){
  for(auto it2=right_unprocessed_event_arrays.begin(); it2!=right_unprocessed_event_arrays.end();){
    if(it1->header.stamp.nsec==it2->header.stamp.nsec){
      matching_event_arrays.push_back({*it1,*it2});
      it1 = left_unprocessed_event_arrays.erase(it1);
      it2 = right_unprocessed_event_arrays.erase(it2);
      goto next;
    }
    ++it2;
  }
  ++it1;
next:;
}

answered Nov 27 '18 at 12:17

j6t

9,150
1
15
35

I don't think this is a good place to use `goto`. You are not actually breaking out of multiple nested loops, only one (so it is not allowed by the [Cpp Core guidelines](https://github.com/isocpp/CppCoreGuidelines/blob/master/CppCoreGuidelines.md#es76-avoid-goto) either). Just that it looks ok in this instance is not a good reason to use it - it also "looks ok" if you replace an `if` by `goto`, yet that is universally frowned upon. – Max Langhof Nov 27 '18 at 12:55
More specifically, the two places where `it1` is changed are far apart, making it harder to verify that the loop actually terminates. That makes the code more fragile in face of changes (which is one of the main reasons why `goto` is generally best avoided). – Max Langhof Nov 27 '18 at 12:58
@MaxLanghof The loop is short enough. In this case, `goto` not only "looks ok", but "is ok". Be pragmatic. Guidelines are just that: guidelines. That said, your solution with std algorithms is preferable in the general case. Upvoted. – j6t Nov 27 '18 at 13:04

Using iterators in nested loop through two vectors to match elements and erase from the vectors

3 Answers3