4

I'm using Ruby 2.4. Let's say I have an array of strings (which are all just string-i-fied (is that a word?) integers ...

["1", "2", "5", "25", "5"]

How do I write a function that tells me if all the elements in the array occur no more than twice in the array? For example, this array

["1", "3", "3", "55", "3", "2"]

would return false because "3" occurs three times, but this array

["20", "10", "20", "10"]

would return true because none of the elements occur more than twice.

10 Answers10

6

You can determine frequency like this:

frequency = array.reduce(Hash.new(0)) do |counts, value|
  counts[value] += 1
  counts
end
# => { "1" => 1, "3" => 3, "55" => 1, "2" => 1 }

And you can check if any of them occur more than twice like this:

frequency.values.max > 2

If you want to wrap it up nicely, you can add it to Enumerable:

module Enumerable
  def frequency
    f = Hash.new(0)
    each { |v| f[v] += 1 }
    f
  end
end

And then your condition is as simple as:

array.frequency.values.max > 2

Note: this comes as part of Facets.

coreyward
  • 77,547
  • 20
  • 137
  • 166
  • `each_with_object` is often less crufty than `reduce` where you have to deliberately kick the hash forward to the next round. – tadman Jan 14 '17 at 01:12
  • 1
    @tadman I forget that it's part of Enumerable now, to be honest. Still, I think it's great to expose people to `map` and `reduce` and their flexibility when appropriate. – coreyward Jan 14 '17 at 01:29
  • 1
    Oh, for sure! It's just that `reduce` and `inject` have taken a back seat to some of these more special-purpose methods. The interesting thing is every other Enumerable method can be implemented using `inject`, it's the basic building block of that library. – tadman Jan 14 '17 at 01:32
  • Good. Alternatively, `array.each_with_object(Hash.new(0)) { |counts, value| return false if (counts[value] += 1) > 2 }; true` (or `== 3`). – Cary Swoveland Jan 14 '17 at 02:06
6

Enumerable#group_by will do the heavy lifting for this:

def no_element_present_more_than_twice?(a)   
  a.group_by(&:itself).none? do |_key, values|
    values.count > 2
  end
end

p no_element_present_more_than_twice?(["1", "3", "3", "55", "3", "2"])
# => false
p no_element_present_more_than_twice?(["20", "10", "20", "10"])
Wayne Conrad
  • 103,207
  • 26
  • 155
  • 191
6

I've taken it upon myself to benchmark all of your options for you :)

Running each test 1024 times. Test will take about 34 seconds.
_akuhn is faster than _vlasiak by 16x ± 1.0
_vlasiak is faster than _wayne by 3.5x ± 0.1
_wayne is faster than _cary by 10.0% ± 1.0%
_cary is faster than _oneneptune by 10.09% ± 1.0%
_oneneptune is similar to _coreyward
_coreyward is faster than _tadman by 10.0% ± 1.0%
_tadman is faster than _sagarpandya82 by 10.0% ± 1.0%
_sagarpandya82 is faster than _glykyo by 80.0% ± 1.0%

As you can see, @akuhn's answer performs far better than the other algorithms because it exits early once a match has been found.

Note: I edited the answers to produce the same result, but did not edit any of them for optimisation.

Here is the script which produced the benchmarks:

require 'fruity'

arr = Array.new(1000) { |seed|
  # seed is used to create the same array on each script run,
  # hence the same benchmark results will be produced
  Random.new(seed).rand(1..10).to_s
}

class Array
  def difference(other)
    h = other.each_with_object(Hash.new(0)) { |e,h| h[e] += 1 }
    reject { |e| h[e] > 0 && h[e] -= 1 }
  end
end

compare do
  _coreyward do
    arr.reduce(Hash.new(0)) { |counts, value|
      counts[value] += 1
      counts
    }.max[1] <= 2
  end

  _wayne do
    arr.group_by(&:itself).none? do |_key, values|
      values.count > 2
    end
  end

  _sagarpandya82 do
    arr.sort_by(&:to_i).each_cons(3).none? { |a,b,c| a == b && b == c }
  end

  _tadman do
    arr.sort.slice_when { |a,b| a != b }.map(&:length).max.to_i <= 2
  end

  _cary do
    arr.difference(arr.uniq*2).empty?
  end

  _akuhn do
    count = Hash.new(0)
    arr.none? { |each| (count[each] += 1) > 2 }
  end

  _oneneptune do
    arr.each_with_object(Hash.new(0)) { |element,counts|
      counts[element] += 1
    }.values.max < 3
  end

  _glykyo do
    arr.uniq.map{ |element| arr.count(element) }.max <= 2
  end

  _vlasiak do
    arr.none? { |el| arr.count(el) > 2 }
  end

end
  • This is the best case scenario for _akuhn, since it stops after 20 elements or so, and the others keep iterating 1000 elements. What's interesting is that for `(1..1000).to_a * 2`, _akuhn is still the fastest, though by "only" 10%. – Eric Duminil Jan 14 '17 at 10:42
  • Coincidentally the top three in computational efficiency were also the three I found to be the most interesting algorithms. – Matthew Schuchard Jan 14 '17 at 14:52
5

Try this

count = Hash.new(0)
array.none? { |each| (count[each] += 1) > 2 }
# => true or false

How does this work?

  • Hash.new(0) creates a hash with default value 0
  • none? checks a block for all elements and returns whether no element matches
  • count[each] += 1 increases the count (no nil case since default value is 0)

This is an optimal solution since it breaks as soon as the first offending element is found. All other solution posted here either scan the entire array or have even worse complexity.

NB, if you want to know which elements appear more than twice (for example to print an error message) use find or find_all instead of none?.

akuhn
  • 27,477
  • 2
  • 76
  • 91
  • Nice. It is faster all-around, even for `(1..1000).to_a * 2`. It looks like you'll soon get your well-deserved Ruby silver badge! – Eric Duminil Jan 14 '17 at 10:43
2

Here's another way, using the method Array#difference:

def twice_at_most?(arr)
  arr.difference(arr.uniq*2).empty?
end

where Array#difference is defined as follows:

class Array
  def difference(other)
    h = other.each_with_object(Hash.new(0)) { |e,h| h[e] += 1 }
    reject { |e| h[e] > 0 && h[e] -= 1 }
  end
end

After having found many uses for Array#difference, I have proposed that it be adopted as a core method. The document at this link explains how the method works and provides examples of its use.

Let's try it.

twice_at_most? [1, 4, 2, 4, 1, 3, 4]
  #=> false

Here

arr.uniq*2
  #=> [1, 4, 2, 3, 1, 4, 2, 3] 
arr.difference(arr.uniq*2)
  #=> [4] 

Another example:

twice_at_most? [1, 4, 2, 4, 1, 3, 5]
  #=> true
Cary Swoveland
  • 106,649
  • 6
  • 63
  • 100
1

To avoid a lot of temporary overhead, just sort the array and then split it up into chunks of similar elements. You can then find the longest chunk:

def max_count(arr)
  arr.sort.slice_when { |a,b| a != b }.map(&:length).max.to_i
end

max_count(%w[ 1 3 3 55 3 2 ])
# => 3

max_count(%w[ 1 3 55 3 2 ])
# => 2

max_count([ ])
# => 0
tadman
  • 208,517
  • 23
  • 234
  • 262
1

Just for fun here's one way using each_cons and using none? as used by Wayne Conrad in his answer.

 arr.sort_by(&:to_i).each_cons(3).none? { |a,b,c| a == b && b == c }
Sagar Pandya
  • 9,323
  • 2
  • 24
  • 35
1

In my point of view, that could be a pretty simple solution:

def no_more_than_twice_occur?(array)
  array.none? { |el| array.count(el) > 2 }
end


no_more_than_twice_occur?(["1", "3", "3", "55", "3", "2"]) # => false
no_more_than_twice_occur?(["20", "10", "20", "10"]) # => true
vlasiak
  • 348
  • 4
  • 10
0

Here's an all in one method for you.

def lessThanThree(arr)
  arr.each_with_object(Hash.new(0)) { |element,counts| counts[element] += 1 }.values.max < 3
end

Basically, takes the array, iterates through creating the hash and counting each occurrence, then the values method just produces an array of all the counts (values) and then max finds the highest value. We check if that's less than three, if it is, return true, else return false. You could replace true or false with a code block.

OneNeptune
  • 883
  • 11
  • 20
  • 1
    Isn't the ternary redundant? `x < 3 ? true : false` is the exact same as `x < 3` – Glyoko Jan 14 '17 at 01:08
  • @Glyoko it is - I added the sentence at the end to explain that they could replace it with a code block because I assumed they were going to be doing something else... my mistake, will fix this – OneNeptune Jan 14 '17 at 01:09
-1

For each unique item in the array, count how many times that element appears in the array. Of those values, check if the max is <= 2.

def max_occurence_at_most_2?(array)
  array.uniq.map{ |element| array.count(element) }.max <= 2
end

Not optimized for speed.

Glyoko
  • 2,071
  • 1
  • 14
  • 28
  • this is a good answer - however one edit you can make so it fits the original question is < 3, as written this would return true when he wants false, and false when he wants true. Otherwise good work. – OneNeptune Jan 14 '17 at 01:08
  • Might work, but technically very messy since it slows down exponentially the larger the array gets. – tadman Jan 14 '17 at 01:12
  • Thanks @OneNeptune. That's true tadman, but imo it reads a bit more clearly this way. It's something of a question of simplicity vs speed, and depends on the requirements of the tool. – Glyoko Jan 14 '17 at 01:18