2

While I was debugging, I figured out that std::remove doesn't behave as I expected. For example:

std::vector<int> nums {0,1,2,2,3,0,4,2};
int val{2};
std::remove(nums.begin(), nums.end(), val);

nums vector changes to [0, 1, 3, 0, 4, 0, 4, 2]. Where does that extra 0 and 4 come from? I would expect [0, 1, 3, 0, 4, 2, 2, 2].

Another example:

std::vector<int> nums {3,2,2,3};
int val{3};
std::remove(nums.begin(), nums.end(), val);

nums vector changes to [2, 2, 2, 3]. One 3 has been changed to 2 somehow. Could you explain this behavior?

Vlad from Moscow
  • 301,070
  • 26
  • 186
  • 335
kry23
  • 99
  • 5
  • 5
    Yes, it's expected behaviour. `std::remove` cannot remove anything from the vector **because it only has iterators to work with**. So what it actually does is move the items to be kept to the front of the vector, you are then supposed to actually shrink the vector yourself. You can do that with the iterator that `std::remove` returns. – john Jul 02 '23 at 15:48
  • 7
    Start by reading [a good `std::remove` reference](https://en.cppreference.com/w/cpp/algorithm/remove). Then learn about [the erase/remove idiom](https://en.wikipedia.org/wiki/Erase%E2%80%93remove_idiom). – Some programmer dude Jul 02 '23 at 15:49
  • *I would expect [0, 1, 3, 0, 4, 2, 2, 2].* Your expectations are mistaken, that's not how [`remove`](https://en.cppreference.com/w/cpp/algorithm/remove) works. – Eljay Jul 02 '23 at 15:50
  • 1
    @john The question is not confused about that part. – user17732522 Jul 02 '23 at 15:52
  • OK, I can see that your question is slightly different from what I originally thought. All `std::remove` has to do is make sure the items to be kept are at the start of the vector (and in the same order). The remaining items in the vector are unspecified. Specifically the overall contents of the vector may change. – john Jul 02 '23 at 15:53
  • @user17732522 Yes, I figured that out when I reread the question. – john Jul 02 '23 at 15:53
  • 2
    *I would expect [0, 1, 3, 0, 4, 2, 2, 2].* -- Wrong algorithm. If this is what you expected, then `std::stable_partition` is what you are looking for: `std::stable_partition(nums.begin(), nums.end(), [](int n) { return n != 2; });`. – PaulMcKenzie Jul 02 '23 at 15:57
  • 1
    [See this](https://godbolt.org/z/36fM5hKWW). – PaulMcKenzie Jul 02 '23 at 16:02
  • @user17732522 Please explain why have you reopened the question that is the exact duplicate of [STL remove doesn't work as expected?](https://stackoverflow.com/questions/6456870/stl-remove-doesnt-work-as-expected), that has the exact answer like your one: *The iterators in the range [new_last, last) are all still dereferenceable, but the elements that they point to are unspecified. Remove is stable, meaning that the relative order of elements that are not equal to value is unchanged.* – 273K Jul 02 '23 at 16:03
  • 1
    @273K The question itself is not a duplicate since they start from completely different assumptions and only one of the answers happens to contain the detail relevant for this question in a quote without focusing on it. I don't think that is enough to qualify as duplicate. – user17732522 Jul 02 '23 at 16:16
  • 1
    @user17732522 One of the answers there is the highly ranked accepted answer. If it answers this question, it could be closed as a dupe or as needs more details about why OP expects some specific behavior. – 273K Jul 02 '23 at 16:21
  • Thanks everyone for the replies. However, my question is not about why "re-move" doesn't delete/erase elements. I know the erase/remove idiom, and I am using it in my code. @273K My question is about why do we have undefined/random values. Why are all the specified element occurrences not being "re-moved" to the end? What is the advantage of this behavior? I am looking for the technical details underlying this behavior. That's why I don't think that it is a duplicate. – kry23 Jul 02 '23 at 19:37
  • @kry23: "*My question is about why do we have undefined/random values.*" Well... what else are they going to be? What *should* they be? The function is called "remove" not "swap to the end". The goal of the function is for the valid part of the sequence to no longer have those matching elements. What is inside of the *invalid* part of the sequence is irrelevant. – Nicol Bolas Jul 02 '23 at 19:48
  • @NicolBolas *"The function is called "remove" not "swap to the end"*, yes correct. The function name is remove, so I would just expect to re-move the to-be-kept elements to the front and re-move the specified element's to the end – kry23 Jul 02 '23 at 19:57
  • Why do you expect swaps with removed elements? STL developers are very smart people, they implement very optimized algorithms. The remove algorithm doesn't suppose to perform many swaps but rather memmove of memory blocks between elements to remove. The answer in the question that I consider to be duplicated, also answers your question, you haven't applied enough efforts for reading that answer and posted a duplicate. – 273K Jul 02 '23 at 21:07
  • 1
    @kry23 *"The function name is remove, so I would just expect to re-move the to-be-kept elements to the front and re-move the specified element's to the end"* -- You used "re-move" twice. Wouldn't that suggest the name should be "remove_remove"? :) – JaMiT Jul 02 '23 at 22:23
  • @kry23: "*so I would just expect to re-move the to-be-kept elements to the front and re-move the specified element's to the end*" That's not how English works. The word "remove" does not mean "move again". It means "get rid of". – Nicol Bolas Jul 02 '23 at 22:39
  • try use remove and erase? – kenash0625 Jul 03 '23 at 09:49
  • 1
    @kry23 -- The `std::remove` makes sure that the valid range is correct. What you see in the "removed area" is junk and cannot be used in any meaningful way *except* to be finally erased. The idea is that you shouldn't care what is in that area. Again, if you really wanted the algorithm to move the "about to be erased" elements and you want to do something with them, then that is a two step process -- `std::stable_partition` or `std::partition`, and then do what you want to the elements that are placed after the partition point, and then finally erase them. – PaulMcKenzie Jul 03 '23 at 14:10
  • @273K I know that they are smart people. I am not implying that something is wrong here. I just wanted to get the idea behind it. See the accepted answer to understand what I am looking for and apply enough effort for realizing that it's not a duplicate... – kry23 Jul 03 '23 at 22:17

4 Answers4

4

remove does not state that the elements past (and including) its return ending iterator will have any particular value. It does not necessarily swap anything to those values. It is perfectly legal for them to be untouched or to contain undefined values or whatever.

You should not be looking at the elements past the returned ending iterator. The only elements that matter are those on the half-open range of begin to the returned iterator.

Nicol Bolas
  • 449,505
  • 63
  • 781
  • 982
  • Ok, this is what it does as we can see from the debugging. But what is the intuition? Why does the name of the function not apply for the specified element occurrences? – kry23 Jul 02 '23 at 19:41
  • @kry23: Because they are outside of the valid range of elements. If you really understand what the erase/remove idiom is doing, then I don't really see how you have a question. It works the way it does because that's the job it's supposed to do. – Nicol Bolas Jul 02 '23 at 19:50
3

Where does that extra 0 and 4 come from?

They come from you. (So does the 2 after them.) All std::remove does is shift elements. It does not swap elements; it must use move-assignment (which is the same as copy-assignment for int).

Before: {0,1,2,2,3,0,4,2}
                 | | |
             ----- | |
             | ----- | (assignment, not swaps)
             | | -----
             | | |
             V V V
After:  {0,1,3,0,4,0,4,2}
         ^^^       ^^^^^ -- unchanged

Since the integers after the new "last" element (the shifted 4) are expected to be erased, there is no need to waste CPU cycles changing their values. (If you were dealing with something more complex than int, there might be an efficiency gain from moving values rather than copying them; in that case the values would change to something "valid but unspecified" -- i.e. the "moved-from" state. Still not a swap.)

Similarly:

Before: {3,2,2,3}
           | |
         --- |
         | ---
         | |
         V V
After:  {2,2,2,3}
             ^^^ -- unchanged

See also STL remove doesn't work as expected? which is about the same phenomenon, but from a different flawed expectation. (As I understand the current guidelines, since the question is from a different perspective, it should not be treated as a duplicate.)

See also possible implementation @ cppreference.com. The algorithm is fairly simple. Iterate over the container. When a to-be-removed element is found, do nothing. Otherwise, assign that value to the first element that has not yet been assigned something. After iterating, return an iterator to the first element that has not yet been assigned something. There is no cleanup attempted for the elements not assigned something. The values there are just whatever was left after assigning from (or skipping) them.

JaMiT
  • 14,422
  • 4
  • 15
  • 31
1

std::remove makes no guarantee about the values of the "removed" items at the end of the range. It only guarantees that the non-"removed" items will be at the front of the range and in their original relative order.

user17732522
  • 53,019
  • 2
  • 56
  • 105
-1

To remove elements from a vector equal to some value or that satisfy some condition you need to use the so-called erase-remove idiom.

The standard algorithm std::remove just moves to the left elements of the vector in positions of removed elements and returns an iterator that points to after the last kept element.

That is you need to write

nums.erase( std::remove(nums.begin(), nums.end(), val), nums.end() );

If your compiler supports the C++20 Standard then you could use general function std::erase like

std::erase( nums, val );

Here is a demonstration program.

#include <iostream>
#include <vector>
#include <algorithm>

int main()
{
    std::vector<int> nums { 0, 1, 2, 2, 3, 0, 4, 2 };
    int val{ 2 };

    nums.erase( std::remove( nums.begin(), nums.end(), val ), nums.end() );

    for (const auto &item : nums)
    {
        std::cout << item << ' ';
    }
    std::cout << '\n';

    nums = { 0, 1, 2, 2, 3, 0, 4, 2 };

    // Supported in C++20
    std::erase( nums, val );

    for (const auto &item : nums)
    {
        std::cout << item << ' ';
    }
    std::cout << '\n';

    nums = { 0, 1, 2, 2, 3, 0, 4, 2 };
}

The program output is

0 1 3 0 4
0 1 3 0 4
Vlad from Moscow
  • 301,070
  • 26
  • 186
  • 335