Is there an efficient algorithm to achieve the following?

Raw data:-

var arr = [

 ["one", "two", "three", "four", "five"] ,
 ["one", "two", "three", "six", "seven"] ,
 ["one", "two", "eight", "four", "ten"]
 /* There can be more rows (arrays) */

];

Output:-

var arr2 = [

    ["one", "two"], /* this is because these are present in most of sub arrays (which in this case is 3 sub arrays) */
    ["three", "four"], /* after that three and four are present in most of subarrays (which in this case is 2 sub arrays) */
    ["five"], /* for those who have occurred one time order doesn't matter. Whichever comes first. */
    ["six"], 
    ["seven"],
    ["eight"],
    ["ten"]
    /* There can be even more rows... */

]

Conditions:-

  1. If an element occurs only once in the whole multidimensional array, it should be appended to the output array (arr2) as its own single-element array, NOT grouped with other elements that also occur only once. This special check applies only to single-instance elements. If an element occurs more than once in the whole multidimensional array, it can be appended to the output array (arr2) in a single array together with the other elements that occur the same number of times. (Sorry for updating this condition so late. Thank you Joseph Myers for pointing this out.)

Assumptions:-

  1. An element won't repeat within a subarray.
Omar Tariq
  • By group ["one","two"], do you mean that they both occur in the same subarrays, or that they might occur in different subarrays but 3 times each? – Vikram Bhat Feb 01 '14 at 04:47
  • Second option: `They might occur in different subarrays but 3 times each`. This point is clear from the second row / subarray of arr2, where `three` and `four` occur two times but are not necessarily in the same row / subarray. – Omar Tariq Feb 01 '14 at 04:56
  • Your example output does not make logical sense. It starts by returning elements together which occurred an equal number of times together in a single array. Those elements which occurred three times are returned in an array; those elements which occurred twice are returned in an array; but then those elements which occurred an equal number of times (once) in an array are not returned together! To make sense, the last result in your output should be a single array containing all those elements which occurred once. – Joseph Myers Feb 01 '14 at 05:44
  • @JosephMyers Thank you for pointing this out. That is indeed an inconsistency. However, I want the output to be exactly like this, so I'll update the question with a special condition for single-instance elements. – Omar Tariq Feb 01 '14 at 07:56
  • @OmarTariq Check out the line of code I added to split the last array of single instance elements into separate singleton arrays. – Joseph Myers Feb 03 '14 at 04:20

3 Answers

Use an array count[11] to count the occurrences of each number (1-10) across all subarrays. Keep a visited[11] array to check whether you have already encountered a number in the current subarray. Map each word to an index ("one" => 0, "two" => 1, "three" => 2, ...) using a HashMap. Then sort by the count array to obtain the sequence of the words.
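
A minimal sketch of this counting idea in JavaScript, for illustration only. The function name countWords, the use of a Map for the word-to-index mapping, and the { word, count } result shape are assumptions, not part of the answer. Indices are assigned as words are first seen rather than fixed in advance, and ties keep first-seen order.

```javascript
// Hypothetical helper (not from the answer): counts each word once per subarray
// and returns the words ordered by descending count.
function countWords(rows) {
  var indexOf = new Map(); // word -> index into words / count
  var words = [];
  var count = [];
  rows.forEach(function (row) {
    var visited = []; // visited[idx] is true once the word was seen in this subarray
    row.forEach(function (word) {
      if (!indexOf.has(word)) {
        indexOf.set(word, words.length);
        words.push(word);
        count.push(0);
      }
      var idx = indexOf.get(word);
      if (visited[idx]) return;
      visited[idx] = true;
      count[idx]++;
    });
  });
  // Sort indices by descending count; break ties by first appearance.
  var order = words.map(function (_, i) { return i; });
  order.sort(function (a, b) { return (count[b] - count[a]) || (a - b); });
  return order.map(function (i) {
    return { word: words[i], count: count[i] };
  });
}
```

Grouping the words that share a count is a separate pass over this sorted list.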

Vikram Bhat

Solution

Such an algorithm could involve three steps:

  1. Counting the occurrences of each array value across all subarrays of the two-dimensional array.
  2. Creating a new two-dimensional array and storing all array values having the same count together in new subarrays of this new 2D array.
  3. Sorting the subarrays of this new 2D array in descending order by the common count that their elements have within the original array.

Remark: Since the counts might be quite far from being consecutive, this new 2D array is implemented as an object / "sparse" array before being mapped into a real "dense" array in my example JavaScript implementation of this algorithm which follows:

function mostCommon(a) {
  var byFrequency = [], keyCount = {}, byCount = {}, k;
  /* Step 1 */
  a.forEach(function(x) {
    x.forEach(function(y) {
      if (!keyCount[y]) keyCount[y] = 1;
      else keyCount[y]++;
    });
  });
  /* Step 2 */
  for (k in keyCount) {
    if (!byCount[keyCount[k]]) {
      byCount[keyCount[k]] = [];
      byFrequency[byFrequency.length] = keyCount[k];
    }
    byCount[keyCount[k]].push(k);
  }
  /* Step 3 */
  byFrequency.sort(function(a,b) { return b-a; });
  return byFrequency.map(function(x) { return byCount[x]; });
}
var arr = [

 ["one", "two", "three", "four", "five"] ,
 ["one", "two", "three", "six", "seven"] ,
 ["one", "two", "eight", "four", "ten"]
 /* There can be more rows (arrays) */

];

console.log(JSON.stringify(mostCommon(arr)));
/* [["one","two"],["three","four"],["five","six","seven","eight","ten"]] */

Update

It is extremely easy to split the last array returned if indeed you wish to treat single elements differently than others, and if the last array contains single elements. I updated my function above with an optional flag to do this and an if statement / for loop that handles the special behavior that you desire.

function mostCommon(a, optSplitSingle) {
    var byFrequency = [], keyCount = {}, byCount = {}, i, k;
    a.map(function(x) {
        x.map(function(y) {
            if (!keyCount[y]) keyCount[y] = 1;
            else keyCount[y]++;
        });
    });
    for (k in keyCount) {
        if (!byCount[keyCount[k]]) {
            byCount[keyCount[k]] = [];
            byFrequency[byFrequency.length] = keyCount[k];
        }
        byCount[keyCount[k]].push(k);
    }
    byFrequency.sort(function(a,b) { return b-a; });
    a = byFrequency.map(function(x) { return byCount[x]; });
    if (optSplitSingle && byCount[1]) {
      /* Replace the last array with one singleton array per single-instance element */
      for (k=a.length-1, i=0; i<byCount[1].length; i++)
        a[k++] = [byCount[1][i]];
    }
    return a;
}
var arr = [

 ["one", "two", "three", "four", "five"] ,
 ["one", "two", "three", "six", "seven"] ,
 ["one", "two", "eight", "four", "ten"]
 /* There can be more rows (arrays) */

];

console.log(JSON.stringify(mostCommon(arr, true)));
/* [["one","two"],["three","four"],["five"],["six"],["seven"],["eight"],["ten"]] */
Joseph Myers

Something like

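var counts = Object.create(null); // plain lookup table without inherited keys

// Count how many times each word occurs across all subarrays.
for(var i = 0; i < arr.length; i++)
{
  for(var j = 0; j < arr[i].length; j++)
  {
     var word = arr[i][j];
     if(word in counts) counts[word].count++;
     else counts[word] = { "word":word, "count":1 };
  }
}

// Sort by descending count; the comparator must return a number, not a boolean.
var sortedList = Object.keys(counts).map(function(key)
{
   return counts[key];
}).sort(function(o1,o2)
{
   return o2.count - o1.count;
});

results in

[ { word: 'one', count: 3 },
  { word: 'two', count: 3 },
  { word: 'three', count: 2 },
  { word: 'four', count: 2 },
  { word: 'five', count: 1 },
  { word: 'six', count: 1 },
  { word: 'seven', count: 1 },
  { word: 'eight', count: 1 },
  { word: 'ten', count: 1 } ]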

Merging rows with equal count should be straightforward. Building the counts is O(N), since JavaScript property lookups use hashing, and sorting the results is O(M log M), where N = number of elements in arr and M = number of unique strings. Merging would be O(M).
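
One way the merge step might look, as a rough sketch. It assumes the input is a list of { word, count } entries already sorted by descending count; the entry shape and the function name mergeByCount are hypothetical. Entries with a count of 1 are kept as separate single-element arrays, as the question's condition asks.

```javascript
// Hypothetical helper: merges a count-sorted list into grouped subarrays in O(M).
function mergeByCount(sorted) {
  var result = [];
  var lastCount = null;
  sorted.forEach(function (entry) {
    if (entry.count > 1 && entry.count === lastCount) {
      // Same count as the previous entry: join its group.
      result[result.length - 1].push(entry.word);
    } else {
      // New count, or a single-instance element: start a new group.
      result.push([entry.word]);
    }
    lastCount = entry.count;
  });
  return result;
}

var sorted = [
  { word: "one", count: 3 }, { word: "two", count: 3 },
  { word: "three", count: 2 }, { word: "four", count: 2 },
  { word: "five", count: 1 }, { word: "six", count: 1 }
];
console.log(JSON.stringify(mergeByCount(sorted)));
// [["one","two"],["three","four"],["five"],["six"]]
```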

waTeim
  • Your code counts the occurrences of elements, and that is not an answer to the stated problem. Two sentences of hand-waving at the end is also not an answer. The question calls for an efficient *algorithm*. – Joseph Myers Feb 03 '14 at 04:18
  • No reason to belabor the obvious. If OP needed clarification, I would provide. – waTeim Feb 03 '14 at 04:29