5

Here is simple program to find the element which occurs most often in an array:

#include <cstdlib>
#include <iostream>
#include <vector>

using namespace std;

int main(int argc, char *argv[]) {
    int a[] = {1,2,3,4,4,4,5};
    int n = sizeof(a) / sizeof(int);
    int max = 0;
    int result = 0;
    int *b = new int[n];
    for (int i = 0;  i < n;  i++) {
        b[a[i]] = (b[a[i]] || 0) + 1;
        if (b[a[i]] > max) {
            max = b[a[i]];
            result = a[i];
        }
    }
    cout << result << endl;
    system("PAUSE");
    return EXIT_SUCCESS;
}

But it doesn't work; it prints 1. Why?

Eamon Nerbonne
  • 47,023
  • 20
  • 101
  • 166
user457463
  • 91
  • 1
  • 1
  • 6
  • Can you guarantee that the array is always sorted as per your example? Or, can you guarantee that all occurences of a given element will be consecutive? If so, simpler, faster algorithms with O(1) extra storage exist. – Eamon Nerbonne Sep 26 '10 at 15:25

5 Answers5

11

Since you are including vector anway, why not replace int *b=new int [n]; with std::vector<int> b(n)? This also takes care of releasing the memory, you forgot to delete[] b.

But as others have mentioned, your solution will break if the array contains elements larger than n. A better approach might be to count the elements with a mapping to int. That way, you can also count elements that cannot be used as an array index, for example strings.

There is also no reason to limit ourselves to arrays. Here is a generic solution that works with any container of any less-than comparable element type:

#include <algorithm>
#include <iterator>
#include <map>

struct by_second
{
    template <typename Pair>
    bool operator()(const Pair& a, const Pair& b)
    {
        return a.second < b.second;
    }
};


template <typename Fwd>
typename std::map<typename std::iterator_traits<Fwd>::value_type, int>::value_type
most_frequent_element(Fwd begin, Fwd end)
{
    std::map<typename std::iterator_traits<Fwd>::value_type, int> count;

    for (Fwd it = begin; it != end; ++it)
        ++count[*it];

    return *std::max_element(count.begin(), count.end(), by_second());
}

#include <iostream>
#include <vector>

int main()
{
    std::vector<int> test {1, 2, 3, 4, 4, 4, 5};
    std::pair<int, int> x = most_frequent_element(test.begin(), test.end());
    std::cout << x.first << " occured " << x.second << " times";
}
fredoverflow
  • 256,549
  • 94
  • 388
  • 662
  • 1
    Uhm, your edit is probably _technically_ fine, but given the question, I think your proposed code is overcomplicated. The questioner is still learning the basics, let's not overwhelm him with templates and iterator details just yet... – Eamon Nerbonne Sep 26 '10 at 15:35
  • @Loïc: But why is it required? Isn't a name introduced by a `typedef` always a type name? – fredoverflow Sep 26 '10 at 15:38
  • Some reference : http://publib.boulder.ibm.com/infocenter/comphelp/v8v101/index.jsp?topic=/com.ibm.xlcpp8a.doc/language/ref/keyword_typename.htm – Loïc Février Sep 26 '10 at 15:40
  • 3
    @FredOverflow the `typename` applies to `value_type`, not to `map_t`. If you omit the `typename`, then `map_t::value_type` is taken as a non-type and then the compiler is granted to reject `by_second` at the time parsing the template because `by_second` requires a type argument. – Johannes Schaub - litb Sep 26 '10 at 16:15
  • Would it be a bit simpler to build a `multiset` there instead of the map? – Cubbi Sep 26 '10 at 17:06
  • Yeah, that#s the classic example for a map which default-constructs an object of the referenced type upon accessing a non-existing index. Note that your algorithm will exhibit undefined behavior if the sequence passed to it is empty. – sbi Sep 26 '10 at 19:19
  • Oh, and constructing that comparator gets easier if you change it to this: `struct by_second { template bool operator()(const Pair& a, const Pair& b) {return a.second – sbi Sep 26 '10 at 19:21
  • 1
    @Eamon: A task like this (well, it was counting word occurrences in a text) was among the first few my students ever had to do. Of course, by that time they could only _use_ templates, but that was fine. Manually fiddling with memory is far worse than this. – sbi Sep 26 '10 at 19:22
  • @Fred: `most_frequent_element()` still invokes _UB_, though. However, thinking about this, since it returns an actual object of a type that's depending on a template parameter, there's probably no way around the requirement to pass it a non-empty sequence. – sbi Sep 26 '10 at 19:39
  • BTW, what's wrong with `typename std::iterator_traits::value_type` for the return type? – sbi Sep 26 '10 at 19:40
  • @sbi: I want to return the element and its frequency in a pair. `map::value_type` is defined as `pair`. – fredoverflow Sep 26 '10 at 19:52
6

You've not initialized your array b. You can do it like this:

int *b=new int [n]();
                  ^^

Once that is done you can increment your frequency array as:

b[a[i]]++;

in place of

b[a[i]]=(b[a[i]]||0)+1;

Your way of doing (b[a[i]]||0)+1 works in languages like Javascript, Perl where an un-initialized array element will have a special value called undef or null. C++ does not have anything like that and an uninitialized array will have garbage.

Community
  • 1
  • 1
codaddict
  • 445,704
  • 82
  • 492
  • 529
  • +1 for simple solution. Although this program is still brittle in that it will crash if array a has values like {1, 600, 765, 2000, 600} – AndyG Sep 26 '10 at 19:41
1

You need to initialize your array b to zero.

 b[a[i]]||0

will not work, you don't know what is in b originally.

Shorter code :

int main(int argc, char *argv[])
{
    int a[]={1,2,3,4,4,4,5};
    int n = sizeof(a)/sizeof(int );
    int *b=new int [n];
    fill_n(b,n,0); // Put n times 0 in b

    int val=0; // Value that is the most frequent
    for (int i=0;i<n;i++)
        if( ++b[a[i]] >= b[val])
            val = a[i];

    cout<<val<<endl;
    delete[] b;
    return 0;
}

ATTENTION : your code (mine too) can ONLY work if the maximum value is lower than the number of values in the array a.

If that's not the case and the maximum value is much greater that the number of elements, you might want to sort the array and then search in linear time (and constant space) for the maximum value.

Loïc Février
  • 7,540
  • 8
  • 39
  • 51
0

implementation that takes advantage of the fact map is sorted.

#include <iostream>
#include <vector>
#include <map>

using namespace std;

template<class T> pair<T, int> getSecondMostOccurance(vector<T> & v) {
    map<T, int> m;

    for ( int i = 0; i < v.size(); ++i ) {
        m[ v[i] ]++;
    }
    return *m.end();
}

int main() {
    int arr[] = {1, 4, 5, 4, 5, 4};
    pair<int, int> p = getSecondMostOccurance(vector<int>(arr, arr+7));
    cout << "element: " << p.first << " count: " << p.second << endl;
    return 0;
}
user167698
  • 1,833
  • 6
  • 20
  • 25
0

Here, O(n) in time, O(1) in space general solution that works on sorted ranges.

#include <iostream>

template <class ForwardIterator>
ForwardIterator
most_common(ForwardIterator first, ForwardIterator last) {
  /** Find the most common element in the [first, last) range.

      O(n) in time; O(1) in space.

      [first, last) must be valid sorted range.
      Elements must be equality comparable.
  */
  ForwardIterator it(first), max_it(first);
  size_t count = 0, max_count = 0;
  for ( ; first != last; ++first) {
    if (*it == *first) 
      count++;
    else {
      it = first;
      count = 1;
    }      
    if (count > max_count) {
      max_count = count;
      max_it = it;
    }
  }  
  return max_it;
}    
int main() {
  int a[] = {1, 2, 3, 4, 4, 4, 5};
  const size_t len = sizeof(a) / sizeof(*a);
  std::cout << *most_common(a, a + len) << std::endl;
}
jfs
  • 399,953
  • 195
  • 994
  • 1,670