18

I need a very fast way to determine if an array consits only of integers with the value of 9. Here is my current solution:

input = [9,9,9,9,9,9,9,9,9,9,9,9]
input.uniq == [9]

Can you do it faster?

astropanic
  • 10,800
  • 19
  • 72
  • 132
  • 4
    To properly answer this question, one would need to know how large the actual array will be, and what the probability of non-9 values would be. Solutions that are faster for one type of input may be slower for others. – Mark Thomas May 23 '11 at 15:35
  • Good point, the array can be very big, over 2M elements, the probablity of a nine is 1/10 ;) – astropanic May 23 '11 at 15:49
  • 1
    Then it sounds like rejection of non-9 values is what you want to optimize, a la @steenslag's solution. A full array scan is expensive. – Mark Thomas May 23 '11 at 15:54
  • 1
    @bashman: Do you really expect 2M elements to be the same with a 10% probability any element will have a given value? :) – Olivier L. May 23 '11 at 16:45
  • of course not :) The probablity 1/10 is of one single element in the array to be a nine :D – astropanic May 23 '11 at 17:20
  • 1
    Looks as though people are overlooking my answer because it has 0 votes, despite giving the best performance. Putting this comment here to invite people to take a look. – Olivier L. May 23 '11 at 18:02
  • @Olivier L., your solution does essentially the same thing as the ruby built-in `all?` method. – Mark Thomas May 24 '11 at 00:19
  • @Mark Thomas: Indeed, except that it's faster and allows the implementation of other requirements (as stated in my comment to my own question). – Olivier L. May 24 '11 at 00:33
  • @Olivier L.: In jruby 1.6.1, `all?` is faster :) I guess we're splitting hairs now, because they're both "fast enough". – Mark Thomas May 24 '11 at 00:48
  • Interestingly, the fastest jruby solutions are faster than the same solutions in 1.9.2. – Mark Thomas May 24 '11 at 01:02

8 Answers8

31
require 'benchmark'

n = 50000
Benchmark.bm do |x|
  x.report "uniq  " do
    n.times do 
      input = [9,9,9,9,9,9,9,9,9,9,9,9]
      input.uniq == [9]
    end
  end
  x.report "delete" do
    n.times do 
      input = [9,9,9,9,9,9,9,9,9,9,9,9]
      input.delete 9
      input == []
    end  
  end
  x.report "count " do
    n.times do
      input = [9,9,9,9,9,9,9,9,9,9,9,9]
      input.count(9)==input.size
    end
  end
  x.report "select" do
    n.times do
      input = [9,9,9,9,9,9,9,9,9,9,9,9]
      input.select{|x| x != 9}.empty?
    end
  end  
  x.report "detect" do
    n.times do
      input = [9,9,9,9,9,9,9,9,9,9,9,9]
      input.detect { |i| i != 9 }.nil?
    end
  end 

  x.report "all?  " do
    n.times do
      input = [9,9,9,9,9,9,9,9,9,9,9,9]
      input.all?{|x| x == 9} 
    end
  end 

end

it a benchmark for the answers above and some mine

        user       system      total        real
uniq    0.313000   0.000000   0.313000 (  0.312500)
delete  0.140000   0.000000   0.140000 (  0.140625)
count   0.079000   0.000000   0.079000 (  0.078125)
select  0.234000   0.000000   0.234000 (  0.234375)
detect  0.234000   0.000000   0.234000 (  0.234375)
all?    0.219000   0.000000   0.219000 (  0.218750)

if input = [1]+[9]*9:

        user     system      total        real
uniq    0.328000   0.000000   0.328000 (  0.328125)
delete  0.188000   0.000000   0.188000 (  0.203125)
count   0.187000   0.000000   0.187000 (  0.218750)
select  0.281000   0.016000   0.297000 (  0.296875)
detect  0.203000   0.000000   0.203000 (  0.203125)
all?    0.204000   0.000000   0.204000 (  0.203125)

if input = [9]*9 + [1]:

        user     system      total        real
uniq    0.313000   0.000000   0.313000 (  0.328125)
delete  0.187000   0.000000   0.187000 (  0.187500)
count   0.172000   0.000000   0.172000 (  0.187500)
select  0.297000   0.000000   0.297000 (  0.312500)
detect  0.313000   0.000000   0.313000 (  0.312500)
all?    0.281000   0.000000   0.281000 (  0.281250)

if input = [1,2,3,4,5,6,7,8,9]:

        user     system      total        real
uniq    0.407000   0.000000   0.407000 (  0.406250)
delete  0.125000   0.000000   0.125000 (  0.125000)
count   0.125000   0.000000   0.125000 (  0.125000)
select  0.218000   0.000000   0.218000 (  0.234375)
detect  0.110000   0.000000   0.110000 (  0.109375)
all?    0.109000   0.000000   0.109000 (  0.109375)
Ray Baxter
  • 3,181
  • 23
  • 27
Vasiliy Ermolovich
  • 24,459
  • 5
  • 79
  • 77
  • Can you also produce results for the case where the elements are all distinct? The "all?" solution, is faster than "uniq" in that case. – Wayne Conrad May 23 '11 at 15:16
  • 2
    This would be a really great answer if you would label the reports (e.g. `x.report "detect" do`), and also test against arrays that do not pass the test (e.g. `[1]+[9]*9` and `[9]*9+[1]`). – Phrogz May 23 '11 at 15:25
  • It would be a good idea to average these results, so people don't need to comb through four tables of benchmark times. Aside from that, very good! – Aaa May 23 '11 at 21:57
16

You have a few options:

>> input.count(9)==input.size
=> true

or

>> input.select{|x| x != 9}.empty?
=> true

or the solution you had above.

yan
  • 20,644
  • 3
  • 38
  • 48
4

This loops the array and breaks (returning false} when something non-nine is found.

[9,9,9,9,9,9,9,9,9,9,9,9].all?{|x| x == 9} # => true
steenslag
  • 79,051
  • 16
  • 138
  • 171
3

EDIT: find the full source code here. Props to @nash for the original idea.


Iterate and return false as soon as you find an element != match.

def all_matches(arr, match)
  arr.each do |element|
    return false if element != match
  end
  true
end

With 2M random integers from 0 to 9, 50 loops (n=50):

        user       system     total       real
uniq    5.230000   0.010000   5.240000 (  5.219444)
count   2.680000   0.010000   2.690000 (  2.677923)
select  7.580000   0.060000   7.640000 (  7.634620)
detect  0.000000   0.000000   0.000000 (  0.000068)
all?    0.000000   0.000000   0.000000 (  0.000046)
mine    0.000000   0.000000   0.000000 (  0.000032)
delete  5.090000   0.020000   5.110000 (  5.101290)
any?    0.000000   0.000000   0.000000 (  0.000041)

Code used to generate the array:

input = []
2000000.times { input << (rand*10).to_i }

With 2M 9's (all 9's), 50 loops:

        user       system     total       real
uniq    4.900000   0.000000   4.900000 (  4.890030)
count   0.350000   0.000000   0.350000 (  0.351340)
select  5.400000   0.010000   5.410000 (  5.393489)
detect  6.720000   0.000000   6.720000 (  6.685539)
all?    6.070000   0.000000   6.070000 (  6.061914)
mine    5.510000   0.010000   5.520000 (  5.500186)
delete  1.080000   0.010000   1.090000 (  1.084125)
any?    6.200000   0.000000   6.200000 (  6.197529)
Olivier L.
  • 2,575
  • 1
  • 17
  • 11
  • Note that the `mine` benchmark result is the one for my algorithm. – Olivier L. May 23 '11 at 17:56
  • Also note that if you add `return false if arr.empty?` to my method (which might or might not be a requirement), it still performs better than the `all?` method which doesn't do this. – Olivier L. May 23 '11 at 18:07
  • 3
    I don't understand why the count thing is so fast in the last benchmark. – steenslag May 23 '11 at 21:57
  • @steenslag: must have something to do with [branch prediction](http://en.wikipedia.org/wiki/Branch_predictor). – Olivier L. May 23 '11 at 22:23
0

Here's another one that's faster (the count method above is still the fastest):

arr = [9,9,9,9,9,9,9,9,9,9,9,9]
arr.reject { |i| i==9 }.count == 0

and one that's a little slower:

arr.inject(:&) == 9

Here's the 'fruity' gem comparison:

require 'fruity'
compare do
  count { arr.count(9) == arr.size }
  uniq { arr.uniq == [9] }  
  bitwise_and { arr.inject(:&) == 9 }  
  reject { arr.reject { |i| i==9 }.count == 0 }
end  


Running each test 8192 times. Test will take about 3 seconds.
count is faster than reject by 39.99999999999999% ± 10.0%
reject is faster than uniq by 10x ± 1.0
uniq is faster than bit_and by 30.000000000000004% ± 1.0%
Abdo
  • 13,549
  • 10
  • 79
  • 98
0

This works well:

> array = ['apple', 'banana']
> includes = array.uniq.include? 'banana'
> => true

Also, by extension, we can check if all values are the same without knowing what they are:

> array = ['apple', 'banana', 'apple']
> all_same_values = (array.uniq.length > 1) ? false : true
> => false

A related answer here: https://stackoverflow.com/a/1986398/1886534

Community
  • 1
  • 1
Tom
  • 739
  • 7
  • 7
0

I assume you mean input.uniq == [9] as what you have actually returns false for me. What do you mean by faster? Does this code need to run very quickly? I imagine detect is faster as it will return the first element matching the test:

input = [9,9,9,9,9,9,9,9,9,9,9,9]
input.detect { |i| i != 9 }.nil?
muffinista
  • 6,676
  • 2
  • 30
  • 23
0

Maybe the slowest, but this is what came to my mind

input = [9,9,9,9,9,9,9,9,9,9,9,9]
!(input.any { |a| a != 9 })
rubyprince
  • 17,559
  • 11
  • 64
  • 104
  • Yet another method! :) See my answer for the comparative benchmark (I just added your method, labeled `any?`). – Olivier L. May 23 '11 at 22:36