3

Given an unsorted array, A[], containing n integers, how would one create an algorithm to return the element that occurred most often?

I figure you'd need a way to count the number of times an element occurs. I can only figure out an O(n^2) implementation. I know I need to use a sorting algorithm first, but if I were to use merge sort, that's already a total runtime of O(n log n). At that point I've only sorted the data, and I can't look at the elements without further increasing the runtime.

  • If sorting the array is a possibility, merge sort will do the trick, making it O(n log n); after that, a simple O(n) pass over the sorted array finishes the job – Daniel Mendel Aug 24 '15 at 21:37
  • 4
    Not an answer, but a clue. O(n log n) + O(n) is still O(n log n). Big-O notation only cares about the dominant term. – David Conneely Aug 24 '15 at 21:37
  • @DavidC I thought you would multiply them, not add them! If I had known that, I wouldn't have asked this question. Thank you! –  Aug 24 '15 at 21:44

3 Answers

11

Sort the array; then all repeated elements are next to each other.
Simply iterate over the array while holding two counters, maxSeen and currSeen: if the current element equals the previous one, increase currSeen; otherwise reset currSeen. Replace maxSeen (remembering the corresponding element) whenever currSeen is bigger than it.

Sorting is O(n log n) and iterating once is O(n), so the total is O(n log n + n) = O(n log n).

It's homework, so turning these guidelines into code should be your task. Good luck!
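
For reference, a minimal sketch of this sort-then-scan idea in Python (the question names no language, so the language and the function name most_frequent are my choice):

    def most_frequent(a):
        """Return the most frequently occurring element of a non-empty list."""
        a = sorted(a)                      # O(n log n)
        best_elem, best_count = a[0], 1    # maxSeen and the element it belongs to
        curr_count = 1                     # currSeen
        for prev, curr in zip(a, a[1:]):   # single O(n) pass over the sorted data
            if curr == prev:
                curr_count += 1
            else:
                curr_count = 1
            if curr_count > best_count:
                best_count, best_elem = curr_count, curr
        return best_elem

    most_frequent([3, 1, 3, 2, 3, 2])      # -> 3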


As a side note, this is at least as hard as the Element Distinctness Problem, so it cannot be done better than O(n log n) using comparison-based algorithms.


Side note two: it can be done in O(n) time and space using hash tables.
Create a histogram of the data, i.e. a hash map in which each key is an element and the value is its number of occurrences. Then the problem reduces to finding the maximal value in this hash table.
Building the table is O(n) average case, and finding the maximum is O(n).

This, however, uses O(n) extra memory and might deteriorate to O(n^2) time in the worst case.
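
A minimal sketch of that histogram variant, again in Python (a plain dict stands in for the hash map; the function name is mine):

    def most_frequent_hashed(a):
        """Return the most frequently occurring element of a non-empty list."""
        counts = {}
        for x in a:                          # build the histogram: O(n) average case
            counts[x] = counts.get(x, 0) + 1
        return max(counts, key=counts.get)   # scan the distinct keys for the largest count: O(n)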

amit
  • 1
    I hadn't heard of the Element Distinctness Problem. Thanks for including that link! – templatetypedef Aug 24 '15 at 21:44
  • @templatetypedef Look at the second link as well; it points to the article that proves the lower bound using the algebraic decision tree model. (Of course, it can be done in O(n) average-case time and space using hashes, but not using comparisons.) – amit Aug 24 '15 at 21:46
3

You surmise correctly that it's a good idea to start by sorting the array. This costs O(n log n), as you point out. Then you can scan the array and compute the frequency of each value one by one, remembering the most frequent value as you go along. This costs O(n). The total cost is O(n log n + n), which is in O(n log n).

To see why, consider O(2n log n). This is surely in O(n log n) because 2n log n is within a constant factor of n log n. And n log n + n is bounded above by 2n log n, so it is also in O(n log n).
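
Written out as a chain of inequalities (in LaTeX notation, valid once n ≥ 2 so that log n ≥ 1):

    n \log n + n \;\le\; n \log n + n \log n \;=\; 2\,n \log n \;\in\; O(n \log n)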

Michael Laszlo
  • I didn't realize that you would add the two runtimes, not multiply them. That's me being stupid. I appreciate it! –  Aug 24 '15 at 21:43
1

Just use a dictionary: store every number in it as a key, with its count as the value, initially 1. If a number appears again, increase that count; otherwise insert it into the dict.

Then traverse the key-value pairs in the dictionary; the key with the greatest value is the most frequent element.

Solved in n + n = 2n, which is O(n).
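
A minimal sketch of this dictionary idea in Python, using the standard library's collections.Counter to do the counting (the function name is mine):

    from collections import Counter

    def most_common_element(a):
        """Return the element with the highest count in a non-empty list."""
        counts = Counter(a)                          # dictionary of element -> count, built in O(n)
        element, _count = counts.most_common(1)[0]   # highest-count entry
        return element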

xoxocoder
  • Please note that this is a 4-year-old question that already has an accepted answer, and the hashmap/dictionary suggestion is already covered in that accepted answer. – Sameer Naik Apr 30 '20 at 21:06