16

I read in the documentation for the String class that eql? is a strict equality operator, without type conversion, and that == is an equality operator which tries to convert its second argument to a String. The C source code for these methods confirms that:

The eql? source code:

static VALUE
rb_str_eql(VALUE str1, VALUE str2)
{
    if (str1 == str2) return Qtrue;
    if (TYPE(str2) != T_STRING) return Qfalse;
    return str_eql(str1, str2);
}

The == source code:

VALUE
rb_str_equal(VALUE str1, VALUE str2)
{
    if (str1 == str2) return Qtrue;
    if (TYPE(str2) != T_STRING) {
        if (!rb_respond_to(str2, rb_intern("to_str"))) {
            return Qfalse;
        }
        return rb_equal(str2, str1);
    }
    return str_eql(str1, str2);
}
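
To make the difference between the two C functions concrete, here is a small Ruby sketch. The StringLike class is hypothetical and exists only for illustration: it responds to to_str and defines its own ==, which is the case handled by the extra branch in rb_str_equal.

```ruby
# Hypothetical string-like class, used only to illustrate the to_str branch.
class StringLike
  def initialize(text)
    @text = text
  end

  # Responding to to_str makes String#== delegate to this object's ==.
  def to_str
    @text
  end

  def ==(other)
    other.respond_to?(:to_str) && @text == other.to_str
  end
end

wrapped = StringLike.new("Woooooha")

# String#== sees a non-String that responds to to_str and calls wrapped == "Woooooha".
loose = ("Woooooha" == wrapped)

# String#eql? rejects anything that is not a String, without any conversion.
strict = "Woooooha".eql?(wrapped)

# An object without to_str is simply not equal.
plain_object = ("Woooooha" == Object.new)

puts "==   with StringLike: #{loose}"
puts "eql? with StringLike: #{strict}"
puts "==   with Object:     #{plain_object}"
```

When both operands are real Strings, as in the benchmark, both functions end up in the same str_eql call.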

But when I tried to benchmark these methods, I was surprised that == is faster than eql? by up to 20%! My benchmark code is:

require "benchmark"

RUN_COUNT = 100000000
first_string = "Woooooha"
second_string = "Woooooha"

time = Benchmark.measure do
  RUN_COUNT.times do |i|
    first_string.eql?(second_string)
  end
end
puts time

time = Benchmark.measure do
  RUN_COUNT.times do |i|
    first_string == second_string
  end
end
puts time

And results:

Ruby 1.9.3-p125:

26.420000   0.250000  26.670000 ( 26.820762)
21.520000   0.200000  21.720000 ( 21.843723)

Ruby 1.9.2-p290:

25.930000   0.280000  26.210000 ( 26.318998)
19.800000   0.130000  19.930000 ( 19.991929)

So, can anyone explain why the simpler eql? method is slower than the == method when I compare two strings with the same content?

the Tin Man
sharipov_ru
  • 3
    Micro benchmarks are not easy to do, and there could be a lot of things that influence your output. Are you sure that the processors are up to speed when you start your benchmark? Have you tried to change the order of the benchmarks? Have you tried to do them multiple times, every time switching between `==` and `eql?`? At the end, `eql?` should be faster than `==` (if the C code is the right one). – mliebelt Apr 21 '12 at 10:17
  • 2
    I was able to confirm these results. Tried switching the order, tried alternating between the two, etc. The results are extremely consistent, `==` appears to be faster than `eql?`. – robbrit Apr 21 '12 at 14:16
  • @mliebelt I agree with you that `eql?` should be faster, or at least not slower, than `==`, but I tried changing the order of the benchmarks and the results were the same. I didn't try switching between `==` and `eql?` every time; can you provide an example of that kind of benchmark? – sharipov_ru Apr 22 '12 at 04:35
  • Ouch. I'm not surprised you didn't get an answer for over a year now. It *looks* like an easy question, but it's actually *very tough*! Glad I stumbled on this question by pure chance. – Marc-André Lafortune Apr 24 '13 at 23:09

3 Answers

4

The reason you are seeing a difference is not related to the implementation of == vs eql? but is due to the fact that Ruby optimizes operators (like ==) to avoid going through the normal method lookup when possible.

We can verify this in two ways:

  • Create an alias for == and call that instead. You'll get similar results to eql? and thus slower results than ==.

  • Compare using send :== and send :eql? instead and you'll get similar timings; the speed difference disappears because Ruby only applies the optimization to direct calls to the operators, not to calls made through send or __send__.

Here's code that shows both:

require 'fruity'
first = "Woooooha"
second = "Woooooha"
class String
  alias same_value? ==
end

compare do
  with_operator   { first == second }
  with_same_value { first.same_value? second }
  with_eql        { first.eql? second }
end

compare do
  with_send_op    { first.send :==, second }
  with_send_eql   { first.send :eql?, second }
end

Results:

with_operator is faster than with_same_value by 2x ± 0.1
with_same_value is similar to with_eql
with_send_eql is similar to with_send_op

If you're curious, the optimizations for operators are in insns.def.

Note: this answer applies only to Ruby MRI; I would be surprised if there was a speed difference in JRuby or Rubinius, for instance.
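
One way to see the optimization without reading insns.def is to disassemble both calls. This is a rough sketch that assumes MRI, where RubyVM::InstructionSequence is available; the exact instruction names and layout of the listing vary between Ruby versions, so only the presence of the specialized opt_eq instruction is checked here.

```ruby
# RubyVM is specific to MRI (CRuby); other implementations may not provide it.
if defined?(RubyVM::InstructionSequence)
  setup = 'first = "Woooooha"; second = "Woooooha"; '

  operator_asm = RubyVM::InstructionSequence.compile(setup + 'first == second').disasm
  method_asm   = RubyVM::InstructionSequence.compile(setup + 'first.eql?(second)').disasm

  puts operator_asm
  puts method_asm

  # The operator call is compiled to a specialized instruction,
  # while eql? goes through a generic method-call instruction.
  uses_opt_eq_for_operator = operator_asm.include?("opt_eq")
  uses_opt_eq_for_eql      = method_asm.include?("opt_eq")

  puts "== uses opt_eq:   #{uses_opt_eq_for_operator}"
  puts "eql? uses opt_eq: #{uses_opt_eq_for_eql}"
end
```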

Marc-André Lafortune
2
equal? is reference equality
== is value equality
eql? is value and type equality

The third method, eql?, is normally used to test whether two objects have the same value as well as the same type. For example:

puts "integer == to float: #{25 == 25.0}"
puts "integer eql? to float: #{25.eql? 25.0}"

gives:

integer == to float: true
integer eql? to float: false
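
The three kinds of equality listed above can be put side by side in a short sketch. The variable names are only for illustration; dup is used so that the two strings are guaranteed to be distinct objects with the same content.

```ruby
a = "Woooooha"
b = a.dup  # same content, but a different object

same_object         = a.equal?(b)  # reference equality
same_value          = (a == b)     # value equality
same_value_and_type = a.eql?(b)    # value and type equality

int_vs_float_loose  = (25 == 25.0)     # == compares across numeric types
int_vs_float_strict = 25.eql?(25.0)    # eql? requires the same type

# Hash lookup relies on eql? (together with hash),
# so the Float 25.0 does not find the Integer key 25.
lookup = { 25 => :found }[25.0]

puts "equal?: #{same_object}, ==: #{same_value}, eql?: #{same_value_and_type}"
puts "25 == 25.0: #{int_vs_float_loose}, 25.eql?(25.0): #{int_vs_float_strict}"
puts "hash lookup with 25.0: #{lookup.inspect}"
```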

So I thought that since eql? does more checking it would be slower, and for strings it is, at least on my Ruby 1.9.3. I then figured the difference must be type dependent and ran some tests. When an integer and a float are compared, eql? is a bit faster. When two integers are compared, == is much faster, by up to 2x. Wrong theory, back to the start.

The next theory, that comparing two values of the same type is faster with one of the two methods, proved to be true: when the values are of the same type, == is always faster, and when the types differ, eql? is faster, again by up to 2x.

I don't have the time to compare all types, but I'm sure you'll get varying results, although the same kind of comparison always gives similar results. Can somebody prove me wrong?

Here are my results from the test of the OP:

 16.863000   0.000000  16.863000 ( 16.903000) 2 strings with eql?
 14.212000   0.000000  14.212000 ( 14.334600) 2 strings with ==
 13.213000   0.000000  13.213000 ( 13.245600) integer and floating with eql?
 14.103000   0.000000  14.103000 ( 14.200400) integer and floating with ==
 13.229000   0.000000  13.229000 ( 13.410800) 2 same integers with eql?
  9.406000   0.000000   9.406000 (  9.410000) 2 same integers with ==
 19.625000   0.000000  19.625000 ( 19.720800) 2 different integers with eql?
  9.407000   0.000000   9.407000 (  9.405800) 2 different integers with ==
 21.825000   0.000000  21.825000 ( 21.910200) integer with string with eql?
 43.836000   0.031000  43.867000 ( 44.074200) integer with string with ==
the Tin Man
peter
  • "So i thought since eql ? does more checking it would be slower" - except it **doesn't** do more checking; look at the C source code that the OP posted (or look it up on rubydoc.org: [==](http://www.ruby-doc.org/core-1.9.3/String.html#method-i-3D-3D) vs. [eql?](http://www.ruby-doc.org/core-1.9.3/String.html#method-i-eql-3F)). – Abe Voelker Apr 24 '12 at 21:09
  • I know, Abe; I edited my answer a bit to make that clear. I was just forming a theory based on the documentation and tested it; it proved to be wrong, perhaps for the reason you're mentioning. My second theory still stands until there is counter-evidence. I agree it doesn't really answer WHY, but it's a start, isn't it? – peter Apr 24 '12 at 21:18
  • Fair enough, I wasn't trying to sound like a jerk, just pointing out that the C source doesn't match the == vs. eql? assumption – Abe Voelker Apr 24 '12 at 21:21
  • BTW, there is no relationship between `Integer#eql?` and `String#eql?`, but you might experience the same kind of strange speed difference. My answer explains why. – Marc-André Lafortune Apr 24 '13 at 23:14
2

When doing benchmarks, don't use times, because that invokes a block RUN_COUNT times. The extra time this takes affects all benchmarks equally in absolute terms, which makes a relative difference harder to notice:

require "benchmark"

RUN_COUNT = 10_000_000
FIRST_STRING = "Woooooha"
SECOND_STRING = "Woooooha"

def times_eq_question_mark
  RUN_COUNT.times do |i|
    FIRST_STRING.eql?(SECOND_STRING)
  end
end

def times_double_equal_sign
  RUN_COUNT.times do |i|
    FIRST_STRING == SECOND_STRING
  end
end

def loop_eq_question_mark
  i = 0
  while i < RUN_COUNT
    FIRST_STRING.eql?(SECOND_STRING)
    i += 1
  end
end

def loop_double_equal_sign
  i = 0
  while i < RUN_COUNT
    FIRST_STRING == SECOND_STRING
    i += 1
  end
end

1.upto(10) do |i|
  method_names = [:times_eq_question_mark, :times_double_equal_sign, :loop_eq_question_mark, :loop_double_equal_sign]
  method_times = method_names.map {|method_name| Benchmark.measure { send(method_name) } }
  puts "Run #{i}"
  method_names.zip(method_times).each do |method_name, method_time|
    puts [method_name, method_time].join("\t")
  end
  puts
end

gives

Run 1
times_eq_question_mark    3.500000   0.000000   3.500000 (  3.578011)
times_double_equal_sign   2.390000   0.000000   2.390000 (  2.453046)
loop_eq_question_mark     3.110000   0.000000   3.110000 (  3.140525)
loop_double_equal_sign    2.109000   0.000000   2.109000 (  2.124932)

Run 2
times_eq_question_mark    3.531000   0.000000   3.531000 (  3.562386)
times_double_equal_sign   2.469000   0.000000   2.469000 (  2.484295)
loop_eq_question_mark     3.063000   0.000000   3.063000 (  3.109276)
loop_double_equal_sign    2.109000   0.000000   2.109000 (  2.140556)

Run 3
times_eq_question_mark    3.547000   0.000000   3.547000 (  3.593635)
times_double_equal_sign   2.437000   0.000000   2.437000 (  2.453047)
loop_eq_question_mark     3.063000   0.000000   3.063000 (  3.109275)
loop_double_equal_sign    2.140000   0.000000   2.140000 (  2.140557)

Run 4
times_eq_question_mark    3.547000   0.000000   3.547000 (  3.578011)
times_double_equal_sign   2.422000   0.000000   2.422000 (  2.437422)
loop_eq_question_mark     3.094000   0.000000   3.094000 (  3.140524)
loop_double_equal_sign    2.140000   0.000000   2.140000 (  2.140557)

Run 5
times_eq_question_mark    3.578000   0.000000   3.578000 (  3.671758)
times_double_equal_sign   2.406000   0.000000   2.406000 (  2.468671)
loop_eq_question_mark     3.110000   0.000000   3.110000 (  3.156149)
loop_double_equal_sign    2.109000   0.000000   2.109000 (  2.156181)

Run 6
times_eq_question_mark    3.562000   0.000000   3.562000 (  3.562386)
times_double_equal_sign   2.407000   0.000000   2.407000 (  2.468671)
loop_eq_question_mark     3.109000   0.000000   3.109000 (  3.124900)
loop_double_equal_sign    2.125000   0.000000   2.125000 (  2.234303)

Run 7
times_eq_question_mark    3.500000   0.000000   3.500000 (  3.546762)
times_double_equal_sign   2.453000   0.000000   2.453000 (  2.468671)
loop_eq_question_mark     3.031000   0.000000   3.031000 (  3.171773)
loop_double_equal_sign    2.157000   0.000000   2.157000 (  2.156181)

Run 8
times_eq_question_mark    3.468000   0.000000   3.468000 (  3.656133)
times_double_equal_sign   2.454000   0.000000   2.454000 (  2.484296)
loop_eq_question_mark     3.093000   0.000000   3.093000 (  3.249896)
loop_double_equal_sign    2.125000   0.000000   2.125000 (  2.140556)

Run 9
times_eq_question_mark    3.563000   0.000000   3.563000 (  3.593635)
times_double_equal_sign   2.453000   0.000000   2.453000 (  2.453047)
loop_eq_question_mark     3.125000   0.000000   3.125000 (  3.124900)
loop_double_equal_sign    2.141000   0.000000   2.141000 (  2.156181)

Run 10
times_eq_question_mark    3.515000   0.000000   3.515000 (  3.562386)
times_double_equal_sign   2.453000   0.000000   2.453000 (  2.453046)
loop_eq_question_mark     3.094000   0.000000   3.094000 (  3.140525)
loop_double_equal_sign    2.109000   0.000000   2.109000 (  2.156181)
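
As a possible variation on the script above, Benchmark.bmbm from the standard library runs a rehearsal pass before the measured pass, which is one way to reduce warm-up and ordering effects. This is only a sketch: the iteration count is an arbitrary, smaller value chosen for illustration, and the timings will differ between machines and Ruby versions.

```ruby
require "benchmark"

run_count = 1_000_000  # arbitrary value for this sketch, smaller than RUN_COUNT above
first  = "Woooooha"
second = "Woooooha"

# bmbm runs every report once as a rehearsal, then again for the real measurement.
# It returns an array of Benchmark::Tms objects, one per report.
results = Benchmark.bmbm(6) do |x|
  x.report("eql?") do
    i = 0
    while i < run_count
      first.eql?(second)
      i += 1
    end
  end

  x.report("==") do
    i = 0
    while i < run_count
      first == second
      i += 1
    end
  end
end

eql_time, op_time = results.map(&:real)
puts "eql? wall time: #{eql_time}"
puts "==   wall time: #{op_time}"
```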
Andrew Grimm
  • I agree that micro benchmarking is difficult, but using an `each` or a `while`, as long as it's consistent, won't change the relative speed of things. In any case, I've given the answer as to why there is a speed difference. – Marc-André Lafortune Apr 24 '13 at 23:15