My comment:
In that case, you should use a std::set<int>
because that is actually what matches your requirement. You could use also a std::map<int, int>
to map the key to the number of occurrences if you like.
OPs reply:
Can you add this to a full-fledged answer so that I can accept it for this question?
Here we go:
Just filtering duplicates:
#include <iostream>
#include <set>
int main()
{
int sample[] = { 5, 2, 3, 3, 2, 2 };
// add all values at most once
using Table = std::set<int>;
Table table;
for (int value : sample) table.insert(value);
// output the result
for (const Table::value_type& entry : table) {
std::cout << "Value " << entry << "\n";
}
}
Output:
Value 2
Value 3
Value 5
Demo on coliru
Counting the number of occurrences:
#include <iostream>
#include <map>
int main()
{
int sample[] = { 5, 2, 3, 3, 2, 2 };
// add all values at most once but count the number of occurrences
using Table = std::map<int, unsigned>;
Table table;
for (int value : sample) ++table[value];
// output the result
for (const Table::value_type& entry : table) {
std::cout << "Value " << entry.first << " (" << entry.second << " times)\n";
}
}
Output:
Value 2 (3 times)
Value 3 (2 times)
Value 5 (1 times)
Demo on coliru
The trick:
The std::map::operator[] inserts an element if the key is not yet there. This element (in this case std::pair<const int, unsigned>
) is default initialized which grants that it starts as { key, 0 }
.
So, there are two cases:
- The key is not yet there:
The element is created as { key, 0 }
and the value (.second
of the element) is incremented immediately which results in { key, 1 }
.
- The key is already there:
The value (.second
of the element) is incremented again.
A variation on filtering duplicates:
This keeps the original input order but removes repetitions (by book-keeping in a separate std::set
).
#include <iostream>
#include <set>
#include <vector>
int main()
{
using Sample = std::vector<int>;
Sample sample = { 5, 2, 3, 3, 2, 2 };
// remove duplicates
using Table = std::set<int>;
Table table;
Sample::iterator iterRead = sample.begin();
Sample::iterator iterWrite = sample.begin();
for (; iterRead != sample.end(); ++iterRead) {
if (table.insert(*iterRead).second) *iterWrite++ = *iterRead;
}
sample.erase(iterWrite, sample.end());
// output the result
for (const Sample::value_type& entry : sample) {
std::cout << "Value " << entry << "\n";
}
}
Output:
Value 5
Value 2
Value 3
Demo on coliru
The trick:
std::set::insert() returns a pair of iterator
and bool
.
The iterator
points to the key in the set (inserted or already been there).
The bool
denotes if the key was inserted (true
) or was already there (false
).
The other trick:
Just erasing every found duplicate from the std::vector
would result in the worse complexity O(n²).
Hence, two iterators are used, one for reading and one for writing. Thereby, every input value which is not yet in the bookkeeping table (and hence occurs the first time) is written back, otherwise not.
So, every value which occurred the first time is shifted towards the beginning and appended to the previous values which occurred the first time each. Additionally, the iterWrite
points past the last written element after the loop and can be used to erase the rest (which contains left input values which are all duplicates).
The complexity of this algorithm is O(n) – much better than the naive approach.
Btw. the standard algorithms std::remove(), std::remove_if() does it the same way.
Thus, the same algorithm could be achieved with std::remove_if()
:
#include <algorithm>
#include <iostream>
#include <set>
#include <vector>
int main()
{
using Sample = std::vector<int>;
Sample sample = { 5, 2, 3, 3, 2, 2 };
// remove duplicates
using Table = std::set<int>;
Table table;
Sample::iterator last
= std::remove_if(sample.begin(), sample.end(),
[&](int value) { return !table.insert(value).second; });
sample.erase(last, sample.end());
// output the result
for (const Sample::value_type& entry : sample) {
std::cout << "Value " << entry << "\n";
}
}
Output:
like above
Demo on coliru