How do I compare two hashes?

Question

I am trying to compare two Ruby Hashes using the following code:

#!/usr/bin/env ruby

require "yaml"
require "active_support"

file1 = YAML::load(File.open('./en_20110207.yml'))
file2 = YAML::load(File.open('./locales/en.yml'))

arr = []

file1.select { |k,v|
  file2.select { |k2, v2|
    arr << "#{v2}" if "#{v}" != "#{v2}"
  }
}

puts arr

The output to the screen is the full file from file2. I know for a fact that the files are different, but the script doesn't seem to pick it up.

possible duplicate of [Comparing ruby hashes](http://stackoverflow.com/questions/1766741/comparing-ruby-hashes) — Geoff Lanotte, Jul 25 '12 at 03:22

the Tin Man · Accepted Answer · 2015-03-05T18:13:29.400

187

You can compare hashes directly for equality:

hash1 = {'a' => 1, 'b' => 2}
hash2 = {'a' => 1, 'b' => 2}
hash3 = {'a' => 1, 'b' => 2, 'c' => 3}

hash1 == hash2 # => true
hash1 == hash3 # => false

hash1.to_a == hash2.to_a # => true
hash1.to_a == hash3.to_a # => false

You can convert the hashes to arrays, then get their difference:

hash3.to_a - hash1.to_a # => [["c", 3]]

if (hash3.size > hash1.size)
  difference = hash3.to_a - hash1.to_a
else
  difference = hash1.to_a - hash3.to_a
end
Hash[*difference.flatten] # => {"c"=>3}

Simplifying further:

Assigning difference via a ternary structure:

  difference = (hash3.size > hash1.size) \
                ? hash3.to_a - hash1.to_a \
                : hash1.to_a - hash3.to_a
=> [["c", 3]]
  Hash[*difference.flatten] 
=> {"c"=>3}

Doing it all in one operation and getting rid of the difference variable:

  Hash[*(
  (hash3.size > hash1.size)    \
      ? hash3.to_a - hash1.to_a \
      : hash1.to_a - hash3.to_a
  ).flatten] 
=> {"c"=>3}
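
The size check above only picks one direction, so it misses pairs when neither hash contains the other, and `flatten` would also flatten values that are themselves arrays. A minimal sketch that reports both directions and uses `to_h` (Ruby 2.1 or later) instead of `Hash[*... .flatten]`; the helper name `hash_difference` and the sample data are made up for illustration:

```ruby
# Hypothetical helper: key/value pairs that appear on one side only.
# A key whose value changed shows up on both sides with its respective value.
def hash_difference(left, right)
  {
    only_in_left:  (left.to_a - right.to_a).to_h,
    only_in_right: (right.to_a - left.to_a).to_h
  }
end

hash1 = { 'a' => 1, 'b' => 2 }
hash3 = { 'a' => 1, 'b' => 5, 'c' => [3, 4] }

result = hash_difference(hash1, hash3)
# result[:only_in_left]  == { 'b' => 2 }
# result[:only_in_right] == { 'b' => 5, 'c' => [3, 4] }
```

Both hashes here have different sizes and neither is a subset of the other, yet each side of the difference is reported.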

answered Feb 08 '11 at 01:51

Is there any way to get the differences between the two? – dennismonsewicz Feb 08 '11 at 01:57

Hashes can be of the same size but contain different values. In such a case, both `hash1.to_a - hash3.to_a` and `hash3.to_a - hash1.to_a` may return nonempty values even though `hash1.size == hash3.size`. The part after **EDIT** is valid only if the hashes are of different sizes. – ohaleck Oct 16 '14 at 19:26

Nice, but should have quit while ahead. A.size > B.size doesn't necessarily mean A includes B. Still need to take the union of symmetric differences. – Gene Mar 05 '15 at 03:20
Directly comparing the output of `.to_a` will fail when equal hashes have keys in a different order: `{a:1, b:2} == {b:2, a:1}` => true, `{a:1, b:2}.to_a == {b:2, a:1}.to_a` => false – aidan Jan 27 '17 at 05:53
what's the purpose of `flatten` and `*`? Why not just `Hash[A.to_a - B.to_a]`? – JeremyKun Feb 22 '17 at 01:41
or `difference.to_h` – Chen Kinnrot Sep 24 '17 at 09:42
@ohaleck You are right! That's why I prefer to use: `hash1.to_a - hash2.to_a | hash2.to_a - hash1.to_a`. Please take a look at my answer => https://stackoverflow.com/questions/4928789/how-do-i-compare-two-hashes/57862282#57862282 – Victor Sep 09 '19 at 23:34

score 40 · Answer 2 · edited Nov 15 '16 at 23:15

You can try the hashdiff gem, which allows deep comparison of hashes and arrays in the hash.

The following is an example:

a = {a:{x:2, y:3, z:4}, b:{x:3, z:45}}
b = {a:{y:3}, b:{y:3, z:30}}

diff = HashDiff.diff(a, b)
diff.should == [['-', 'a.x', 2], ['-', 'a.z', 4], ['-', 'b.x', 3], ['~', 'b.z', 45, 30], ['+', 'b.y', 3]]

answered Jun 22 '12 at 17:53

liu fengyun

I had some fairly deep hashes causing test failures. By replacing the `got_hash.should eql expected_hash` with `HashDiff.diff(got_hash, expected_hash).should eql []` I now get output which shows exactly what I need. Perfect! – davetapley Jul 24 '12 at 19:29
Wow, HashDiff is awesome. Made quick work of trying to see what has changed in a huge nested JSON array. Thanks! – Jeff Wigal Oct 28 '14 at 16:32
Your gem is awesome! Super helpful when writing specs involving JSON manipulations. Thx. – Alain Jun 23 '15 at 18:31

My experience with HashDiff has been that it works really well for small hashes but the diff speed doesn't seem to scale well. Worth benchmarking your calls to it if you expect it may get fed two large hashes and making sure that the diff time is within your tolerance. – David Bodow Jul 18 '18 at 23:04
Using the `use_lcs: false` flag can significantly speed up comparisons on large hashes: `Hashdiff.diff(b, a, use_lcs: false)` – Eric Walker May 07 '20 at 13:59
For anyone (like me) who might get tripped up by this, it should (now?) be Hashdiff, not HashDiff. – Travis Kriplean Aug 08 '22 at 15:27

score 21 · Answer 3 · answered Feb 08 '11 at 01:58

If you want to find the differences between two hashes, you can do this:

h1 = {:a => 20, :b => 10, :c => 44}
h2 = {:a => 2, :b => 10, :c => "44"}
result = {}
h1.each {|k, v| result[k] = h2[k] if h2[k] != v }
p result #=> {:a => 2, :c => "44"}
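
The loop above only visits the keys of `h1` and compares top-level values, while hashes loaded from YAML locale files are usually nested. A minimal sketch of a recursive variant that walks the keys of both hashes; the name `deep_diff` and the sample data are hypothetical, and a key missing on one side is reported as `nil` on that side:

```ruby
# Hypothetical helper: returns { key => [left_value, right_value] } for every
# key whose values differ, descending into values that are hashes on both sides.
def deep_diff(left, right)
  (left.keys | right.keys).each_with_object({}) do |key, result|
    l = left[key]
    r = right[key]
    # Skip keys that exist on both sides with equal values.
    next if left.key?(key) && right.key?(key) && l == r

    result[key] = if l.is_a?(Hash) && r.is_a?(Hash)
                    deep_diff(l, r)
                  else
                    [l, r]
                  end
  end
end

h1 = { en: { hello: 'Hello', bye: 'Bye' }, version: 1 }
h2 = { en: { hello: 'Hi', bye: 'Bye' }, version: 1 }

result = deep_diff(h1, h2)
# result == { en: { hello: ['Hello', 'Hi'] } }
```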

Guilherme Bernal

score 11 · Answer 4 · answered Mar 05 '14 at 15:31

Rails is deprecating the diff method.

For a quick one-liner:

hash1.to_s == hash2.to_s
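
One caveat: the string form depends on insertion order, whereas `Hash#==` does not. A small sketch of the difference, with sample hashes made up for illustration; sorting the pairs first is one possible workaround, assuming the keys are mutually comparable, and it only normalizes the top level:

```ruby
h1 = { a: 1, b: 2 }
h2 = { b: 2, a: 1 }

same_content = (h1 == h2)            # true: == ignores key order
same_string  = (h1.to_s == h2.to_s)  # false: to_s keeps insertion order

# Sort the key/value pairs before converting, so that order no longer matters.
normalized = (h1.sort.to_h.to_s == h2.sort.to_h.to_s) # true
```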

Evan

I always forget about this. There are a lot of equality checks that are made easy using `to_s`. – the Tin Man Nov 15 '16 at 23:16

It will fail when equal hashes have keys in a different order: `{a:1, b:2} == {b:2, a:1}` => true, `{a:1, b:2}.to_s == {b:2, a:1}.to_s` => false – aidan Jan 27 '17 at 05:54

Which is a feature! :D – Dave Morse Jun 12 '18 at 12:15

score 10 · Answer 5 · answered Nov 17 '16 at 16:13

You could use a simple array intersection. This way you can see which elements are shared and which are exclusive to each hash.

    hash1 = { a: 1 , b: 2 }
    hash2 = { a: 2 , b: 2 }

    overlapping_elements = hash1.to_a & hash2.to_a # => [[:b, 2]]

    exclusive_elements_from_hash1 = hash1.to_a - overlapping_elements # => [[:a, 1]]
    exclusive_elements_from_hash2 = hash2.to_a - overlapping_elements # => [[:a, 2]]

Victor · Answer 6 · 2020-05-27T10:37:38.970

I wrote this to check whether two hashes are equal:

def hash_equal?(hash1, hash2)
  array1 = hash1.to_a
  array2 = hash2.to_a
  (array1 - array2 | array2 - array1) == []
end

The usage:

> hash_equal?({a: 4}, {a: 4})
=> true
> hash_equal?({a: 4}, {b: 4})
=> false

> hash_equal?({a: {b: 3}}, {a: {b: 3}})
=> true
> hash_equal?({a: {b: 3}}, {a: {b: 4}})
=> false

> hash_equal?({a: {b: {c: {d: {e: {f: {g: {h: 1}}}}}}}}, {a: {b: {c: {d: {e: {f: {g: {h: 1}}}}}}}})
=> true
> hash_equal?({a: {b: {c: {d: {e: {f: {g: {marino: 1}}}}}}}}, {a: {b: {c: {d: {e: {f: {g: {h: 2}}}}}}}})
=> false

score 2 · Answer 7 · answered May 25 '20 at 21:07

Here is an algorithm to deeply compare two hashes; it also compares nested arrays:

    HashDiff.new(
      {val: 1, nested: [{a:1}, {b: [1, 2]}] },
      {val: 2, nested: [{a:1}, {b: [1]}] }
    ).report

# Output:
val:
- 1
+ 2
nested > 1 > b > 1:
- 2

Implementation:

class HashDiff

  attr_reader :left, :right

  def initialize(left, right, config = {}, path = nil)
    @left  = left
    @right = right
    @config = config
    @path = path
    @conformity = 0
  end

  def conformity
    find_differences
    @conformity
  end

  def report
    @config[:report] = true
    find_differences
  end

  def find_differences
    if hash?(left) && hash?(right)
      compare_hashes_keys
    elsif left.is_a?(Array) && right.is_a?(Array)
      compare_arrays
    else
      report_diff
    end
  end

  def compare_hashes_keys
    combined_keys.each do |key|
      l = value_with_default(left, key)
      r = value_with_default(right, key)
      if l == r
        @conformity += 100
      else
        compare_sub_items l, r, key
      end
    end
  end

  private

  def compare_sub_items(l, r, key)
    diff = self.class.new(l, r, @config, path(key))
    @conformity += diff.conformity
  end

  def report_diff
    return unless @config[:report]

    puts "#{@path}:"
    puts "- #{left}" unless left == NO_VALUE
    puts "+ #{right}" unless right == NO_VALUE
  end

  def combined_keys
    (left.keys + right.keys).uniq
  end

  def hash?(value)
    value.is_a?(Hash)
  end

  def compare_arrays
    l, r = left.clone, right.clone
    l.each_with_index do |l_item, l_index|
      max_item_index = nil
      max_conformity = 0
      r.each_with_index do |r_item, i|
        if l_item == r_item
          @conformity += 1
          r[i] = TAKEN
          break
        end

        diff = self.class.new(l_item, r_item, {})
        c = diff.conformity
        if c > max_conformity
          max_conformity = c
          max_item_index = i
        end
      end or next

      if max_item_index
        key = l_index == max_item_index ? l_index : "#{l_index}/#{max_item_index}"
        compare_sub_items l_item, r[max_item_index], key
        r[max_item_index] = TAKEN
      else
        compare_sub_items l_item, NO_VALUE, l_index
      end
    end

    r.each_with_index do |item, index|
      compare_sub_items NO_VALUE, item, index unless item == TAKEN
    end
  end

  def path(key)
    p = "#{@path} > " if @path
    "#{p}#{key}"
  end

  def value_with_default(obj, key)
    obj.fetch(key, NO_VALUE)
  end

  module NO_VALUE; end
  module TAKEN; end

end

Ev Dolzhenko · Answer 8 · 2013-10-04T15:03:08.767

1

If you need a quick and dirty diff between hashes that correctly supports nil values, you can use something like:

def diff(one, other)
  (one.keys + other.keys).uniq.inject({}) do |memo, key|
    unless one.key?(key) && other.key?(key) && one[key] == other[key]
      memo[key] = [one.key?(key) ? one[key] : :_no_key, other.key?(key) ? other[key] : :_no_key]
    end
    memo
  end
end

answered Oct 04 '13 at 14:50

Benjamin Crouzier · Answer 9 · 2014-10-10T12:42:01.067

If you want a nicely formatted diff, you can do this:

# Gemfile
gem 'awesome_print' # or gem install awesome_print

And in your code:

require 'ap'

def my_diff(a, b)
  as = a.ai(plain: true).split("\n").map(&:strip)
  bs = b.ai(plain: true).split("\n").map(&:strip)
  ((as - bs) + (bs - as)).join("\n")
end

puts my_diff({foo: :bar, nested: {val1: 1, val2: 2}, end: :v},
             {foo: :bar, n2: {nested: {val1: 1, val2: 3}}, end: :v})

The idea is to use awesome_print to format both hashes and then diff the resulting lines. The diff won't be exact, but it is useful for debugging purposes.

score 1 · Answer 10 · answered Dec 31 '14 at 03:35

... and now in module form to be applied to a variety of collection classes (Hash among them). It's not a deep inspection, but it's simple.

# Enable "diffing" and two-way transformations between collection objects
module Diffable
  # Calculates the changes required to transform self to the given collection.
  # @param b [Enumerable] The other collection object
  # @return [Array] The Diff: A two-element change set representing items to exclude and items to include
  def diff( b )
    a, b = to_a, b.to_a
    [a - b, b - a]
  end

  # Consume return value of Diffable#diff to produce a collection equal to the one used to produce the given diff.
  # @param to_drop [Enumerable] items to exclude from the target collection
  # @param to_add  [Enumerable] items to include in the target collection
  # @return [Array] New transformed collection equal to the one used to create the given change set
  def apply_diff( to_drop, to_add )
    to_a - to_drop + to_add
  end
end

if __FILE__ == $0
  # Demo: Hashes with overlapping keys and somewhat random values.
  Hash.send :include, Diffable
  rng = Random.new
  a = (:a..:q).to_a.reduce(Hash[]){|h,k| h.merge! Hash[k, rng.rand(2)] }
  b = (:i..:z).to_a.reduce(Hash[]){|h,k| h.merge! Hash[k, rng.rand(2)] }
  raise unless a == Hash[ b.apply_diff(*b.diff(a)) ] # change b to a
  raise unless b == Hash[ a.apply_diff(*a.diff(b)) ] # change a to b
  raise unless a == Hash[ a.apply_diff(*a.diff(a)) ] # change a to a
  raise unless b == Hash[ b.apply_diff(*b.diff(b)) ] # change b to b
end

score 0 · Answer 11 · answered Jul 29 '19 at 16:59

What about converting both hashes with `to_json` and comparing the results as strings? Keep in mind that values of different types still compare as unequal:

require "json"
h1 = {a: 20}
h2 = {a: "20"}

h1.to_json==h1.to_json
=> true
h1.to_json==h2.to_json
=> false
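
Two further things are worth checking before relying on this. A short sketch with made-up sample data: JSON has only string keys, so symbol and string keys serialize identically, and the output follows insertion order.

```ruby
require 'json'

symbol_keys = { a: 1 }
string_keys = { 'a' => 1 }

hashes_equal = (symbol_keys == string_keys)                 # false: :a and "a" are different keys
json_equal   = (symbol_keys.to_json == string_keys.to_json) # true: both serialize to {"a":1}

ordered   = { a: 1, b: 2 }
reordered = { b: 2, a: 1 }

order_hashes_equal = (ordered == reordered)                 # true: == ignores key order
order_json_equal   = (ordered.to_json == reordered.to_json) # false: JSON keeps insertion order
```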

stbnrivas

score 0 · Answer 12 · answered Oct 11 '22 at 08:49

In my case I wanted the changed attributes merged, like `{ status: [:collecting, :out_for_delivery] }`, so I did:

    before = attributes.without(*IGNORED_ATTRIBUTES)
    after = replacement.attributes
    diff = before.map do |key, _|
      [key, [before[key], after[key]]] if before[key] != after[key]
    end
    diff.compact.to_h

score 0 · Answer 13 · edited May 23 '17 at 12:10

This was answered in "Comparing ruby hashes". Rails adds a diff method to hashes. It works well.

answered Dec 27 '11 at 03:55

Wolfram Arnold

[Diff method](http://apidock.com/rails/Hash/diff) is deprecated starting from Rails versions newer than v4.0.2. – Andres Apr 24 '15 at 11:53

score -5 · Answer 14 · answered Feb 08 '11 at 01:50

How about another, simpler approach:

require 'fileutils'
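# FileUtils.cmp takes file paths, not the loaded hashes, and compares the file contents.
FileUtils.cmp('./en_20110207.yml', './locales/en.yml')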

Mike

That only is meaningful if you need the hashes to be identical on the disk. Two files that are different on disk because the hash elements are in different orders, can still contain the same elements, and will be equal as far as Ruby is concerned once they are loaded. – the Tin Man Dec 27 '11 at 05:55
