0

I have a vector of strings. For now take:

std::vector<std::string>{
"one","two","three","two","ten","six","ten".......
}

Now I want to filter out only unique strings in the vector, and in the same order as they are.

Is there a standard library function for this?

EDIT: I need to discard the repeated values.

Evg
  • 25,259
  • 5
  • 41
  • 83
lorem1213
  • 433
  • 4
  • 11

4 Answers4

1

If you want to remove the duplicate while keeping the order of the elements, you can do it (inplace) in O(n log(n)) like this:

std::vector<std::string> v{"one", "two", "three", "two",  "ten", "six", "ten"};

// Occurrence count
std::map<std::string, int> m;

auto it = std::copy_if(v.cbegin(), v.cend(), v.begin(),
                       [&m](std::string const& s) { return !m[s]++; });
                    // ^ keep s if it's the first time I encounter it

v.erase(it, v.end());

Explanation: m keeps track of the number of times each string has already been encountered while scanning through v. The copy_if won't copy a string if it has already been found.


If you want to remove all the elements that are duplicated, you can do it (inplace) in O(n log(n)) like this:

// Occurrence count
std::map<std::string, int> m;

// Count occurrences of each string in v
std::for_each(v.cbegin(), v.cend(), [&m](std::string const& s) { ++m[s]; } );

// Only keep strings whose occurrence count is 1.
auto it = std::copy_if(v.cbegin(), v.cend(), v.begin(),
                       [&m](std::string const& s) { return m[s] == 1; });
v.erase(it, v.end());
paolo
  • 2,345
  • 1
  • 3
  • 17
  • Why map? Using unordered_map here will give you the same result but much better performance close to O(n). – Vit Jan 12 '23 at 15:53
1

You can create a frequency map from the vector and create a new vector including only the elements with a frequency of 1.

std::vector<std::string> v{"one", "two", "three", "two", "ten", "six", "ten"};
std::unordered_map<std::string, int> freq;
for(const auto& e: v) ++freq[e];
std::vector<std::string> res;
res.reserve(v.size());
for (const auto& e: v)
    if (freq[e] == 1) res.push_back(e);
Unmitigated
  • 76,500
  • 11
  • 62
  • 80
1

This will count the occurrences of each string and erase elements with more than one occurrence.

std::unordered_map<std::string, size_t> m;
for ( auto & s : v ) ++m[s];

std::erase_if( v, [&](auto const &s){ return m[s] > 1;} );

The entire operation is O(n).

See it work in Compiler Explorer

Drew Dormann
  • 59,987
  • 13
  • 123
  • 180
  • 3
    How can it be `O(n)` ? You have a `O( log (n) )` lookup in the map for each of the `n` elements of the vector. – paolo Jul 23 '22 at 21:10
1

You can use:

std::vector<std::string> unique(const std::vector<std::string> &original)
{
  std::vector<std::string> result;
  std::set<std::string> aux;
  for (auto it = original.begin(); it != original.end(); ++it) {
    if(aux.find(*it) == aux.end()){
      aux.insert(*it);
      result.push_back(*it);
    }
  }
  return result;
}

It will keep the order of the original input vector:

output:

one
two
three
ten
six