3

I have a vector with integers. I would like to select the most common numbers. I would like to make a top 5 list. For example:

std::vector<int> numbers = {32, 32, 32, 12, 12, 11, 11, 11, 9};

the most common numbers are: TOP 1: 32, TOP 2: 11, TOP 3: 12, TOP 4: 9;

after I selected them, I would like to store it into an other vector: the most common numbers.

ΦXocę 웃 Пepeúpa ツ
  • 47,427
  • 17
  • 69
  • 97

5 Answers5

6

Here's another algorithm, the cost is going to be O(n) for any k, plus a decent cache locality.

1.First store all elements in a unordered_map O(N)

std::unordered_map<int, int> m;
for(const auto ele: numbers) {
  m[ele]++;   
}

2.Dump all elements in a vector of pairs O(N)

std::vector<std::pair<int, int>> temp;  
for(const auto& ele: m) {
 temp.emplace_back(ele.first ,  ele.second);   
}

3.Now use the nth_element to find kth rank O(N)

std::nth_element( temp.begin(), temp.begin()+k ,
temp.end(), [](const auto& p1, const auto& p2) {
        // Strict weak ordering
        if (p1.second > p2.second) {
            return true;
        }  if (p1.second < p2.second) {  
            return false;
        }
        return p1.first > p2.first; // We have to print large element first
    } );

4.Display the output

std::for_each( temp.begin(), temp.begin() +k - 1, [](const auto & p) {
    std::cout << p.first << " ";
});

Demo Here

P0W
  • 46,614
  • 9
  • 72
  • 119
3

You can create a unordered_map<int,int> mp where you can store the count of each number, like mp[32] = 3.

Next you need to find the top five elements

  1. Time Complexity : O(mlogm) : You can sort the map in descending order( to sort it you will have to use a extra vector), and take the first 5 elements.

  2. Time Complexity: O(m) : Or you can iterate 5 times over the whole map to get the top file elements. Each time you iterate find the number with the greatest frequency which is not present in our topFive vector yet.

m : number of entries in the map.

Deepak Patankar
  • 3,076
  • 3
  • 16
  • 35
2

I've made this example and put comments in-line. It needs C++11 at least.

#include <map>
#include <vector>
#include <iostream>
#include <algorithm>

int main(void) {
  std::map<int, int> ans;
  std::vector<int> numbers = {32, 32, 32, 12, 12, 11, 11, 11, 9};
  std::vector<std::pair<int, int>> sorted;
  std::vector<int> common;
  // Step 1 Accumulate into a map, counting occurrences
  for (auto number : numbers) {
    ans[number]++;
  }
  // Step 2 Make a linear list, putting the count first then the value
  for (auto& ent : ans) {
    sorted.emplace_back(ent.second, ent.first);
  }
  // Step 3 sort it, by count ascending
  std::sort(std::begin(sorted), std::end(sorted));

  // Step 4 Get commonest 5 (or fewer)
  for (int i = 1; i<= 5; ++i) {
    int index = sorted.size() - i;
    if (index >= 0) {
      common.push_back(sorted[index].second);
    }
  }

  // Step 5 print out
  for (auto i : common) {
    std::cout << i << std::endl;
  }
  return 0;
}
Peter Hull
  • 6,683
  • 4
  • 39
  • 48
2

you can do this: create a set so you get rid off duplicated, then find the frequency of every item of the set in the vector, with this result create a pair (something like int, int) push the pair in a vector and finally sort it using your own predicate:
now for the top x you can do a for loop or just resize the vector if you are sure what the consequences of this are.

std::vector<int> numbers{32, 32, 32, 12, 12, 11, 11, 11, 9};
std::set<int> mySet(numbers.begin(), numbers.end());
std::vector<std::pair<int, int>> result{};

for(const auto& x : mySet)
{
    result.push_back(std::make_pair(x , std::count(numbers.begin(), numbers.end(), x)));
}
std::sort(result.begin(), result.end(), [](const std::pair<int, int>& a, const std::pair<int, int>& b){return (b.second < a.second);});
//result.resize(3);
std::cout << "Top: " << std::endl;
for(const auto& x : result)
{
   std::cout << x.first << ':' << x.second << std::endl;
}

the result will be:

Top: 11:3 32:3 12:2 9:1

ΦXocę 웃 Пepeúpa ツ
  • 47,427
  • 17
  • 69
  • 97
  • 1
    Partial sort would do. Plus `std::count` is a linear. Your first loop is N*N worst case. – P0W Jan 30 '20 at 12:20
1

There are many many ways how to achieve this. One of which may be.

std::vector numbers = {32, 32, 32, 12, 12, 11, 11, 11, 9};
int maxNumber = *std::max_element(numbers.begin(), numbers.end())
std::vector<int> occurrences(maxNumber + 1, 0);
for(auto& value : numbers)
{
   occurrences[value]++;
}

Then you just need to sort the array whilst keeping track of the indexes. This is a topic of another question C++ sorting and keeping track of indexes .

Croolman
  • 1,103
  • 13
  • 40