229

I'll admit that I'm a bit of a Ruby newbie (writing rake scripts, now). In most languages, copy constructors are easy to find, but half an hour of searching didn't turn one up in Ruby. I want to create a copy of a hash so that I can modify it without affecting the original instance.

Some expected methods that don't work as intended:

h0 = {  "John"=>"Adams","Thomas"=>"Jefferson","Johny"=>"Appleseed"}
h1=Hash.new(h0)
h2=h1.to_hash
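
For context, here is a short sketch of why neither attempt yields an independent copy, going by the documented behavior of these methods: `Hash.new(obj)` builds an empty hash that uses `obj` as its default value, and `to_hash` called on a Hash returns the receiver itself. The variable names follow the question.

```ruby
h0 = { "John" => "Adams", "Thomas" => "Jefferson", "Johny" => "Appleseed" }

# Hash.new(h0) is an empty hash; h0 only becomes the default for missing keys.
h1 = Hash.new(h0)
p h1.empty?                   # => true
p h1["anything"].equal?(h0)   # => true

# to_hash on a Hash returns the very same object, not a copy.
h2 = h0.to_hash
p h2.equal?(h0)               # => true
```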

In the meantime, I've resorted to this inelegant workaround:

def copyhash(inputhash)
  h = Hash.new
  inputhash.each do |pair|
    h.store(pair[0], pair[1])
  end
  return h
end
Timur Shtatland
Precipitous
  • If you are dealing with plain `Hash` objects, the provided answer is good. If you are dealing with Hash-like objects that come from places you don't control you should consider whether you want the singleton class associated with the Hash duplicated or not. See http://stackoverflow.com/questions/10183370/whats-the-difference-between-rubys-dup-and-clone-methods – Sim Sep 12 '14 at 23:23

13 Answers

257

The clone method is Ruby's standard, built-in way to do a shallow copy:

h0 = {"John" => "Adams", "Thomas" => "Jefferson"}
# => {"John"=>"Adams", "Thomas"=>"Jefferson"}
h1 = h0.clone
# => {"John"=>"Adams", "Thomas"=>"Jefferson"}
h1["John"] = "Smith"
# => "Smith"
h1
# => {"John"=>"Smith", "Thomas"=>"Jefferson"}
h0
# => {"John"=>"Adams", "Thomas"=>"Jefferson"}

Note that the behavior may be overridden:

This method may have class-specific behavior. If so, that behavior will be documented under the #initialize_copy method of the class.
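
As an illustration of that class-specific hook, here is a minimal sketch. The `Registry` class is hypothetical and exists only for this example; it overrides `initialize_copy` so that a clone gets its own internal hash instead of sharing the original's.

```ruby
class Registry
  attr_reader :entries

  def initialize
    @entries = {}
  end

  # Ruby calls this on the new object during clone and dup,
  # passing the original as the argument.
  def initialize_copy(source)
    super
    @entries = source.entries.dup
  end
end

a = Registry.new
a.entries["John"] = "Adams"

b = a.clone
b.entries["John"] = "Smith"

p a.entries["John"]   # => "Adams"
p b.entries["John"]   # => "Smith"
```

Without the override, `a` and `b` would reference the same `@entries` hash, and the assignment through `b` would show up in `a`.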

stevec
Mark Rushakoff
  • Clone is a method on Object, BTW, so everything has access to it. See the API details [here](http://ruby-doc.org/core-1.9.3/Object.html#method-i-clone) – Dylan Lacey Aug 28 '12 at 02:55
  • 33
    Adding a more explicit comment here, for those who aren't reading the other answers, that this does a shallow copy. – grumpasaurus Nov 17 '12 at 16:00
  • #initialize_copy documentation does not seem to exist for Hash, even though there is a link to it on the Hash doc page http://www.ruby-doc.org/core-1.9.3/Hash.html#method-i-initialize_copy – philwhln Jan 10 '13 at 23:42
  • 17
    And for other Ruby beginners, "shallow copy" means that every object below the first level is still a reference. – RobW Jul 01 '13 at 18:51
  • 10
    Note this did not work for nested hashes for me (as mentioned in other answers). I used `Marshal.load(Marshal.dump(h))`. – bheeshmar Sep 24 '13 at 15:28
  • Why use `clone` as opposed to `dup`? Do you need to copy the singleton class in this case? http://stackoverflow.com/questions/10183370/whats-the-difference-between-rubys-dup-and-clone-methods – Sim Sep 12 '14 at 23:27
  • @Sim As a notice, Hash#deep_dup does not clone contained arrays: http://apidock.com/rails/Hash/deep_dup#1534-This-method-does-not-correctly-dup-arrays – Manuel Franco Oct 23 '14 at 14:45
199

As others have pointed out, clone will do it. Be aware that clone of a hash makes a shallow copy. That is to say:

h1 = {:a => 'foo'} 
h2 = h1.clone
h1[:a] << 'bar'
p h2                # => {:a=>"foobar"}

What's happening is that the hash's references are being copied, but not the objects that the references refer to.
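
A small sketch that makes the sharing visible with `equal?`, which tests object identity. `String.new` is used here only to guarantee a mutable string regardless of frozen-string-literal settings.

```ruby
h1 = { :a => String.new('foo') }
h2 = h1.clone

p h2.equal?(h1)           # => false  (the hashes are distinct objects)
p h2[:a].equal?(h1[:a])   # => true   (both point at the same string)

# Mutating the shared string in place is visible through both hashes.
h1[:a] << 'bar'
p h2[:a]                  # => "foobar"

# Reassigning the key points h1 at a new string; h2 keeps the old one.
h1[:a] = 'baz'
p h2[:a]                  # => "foobar"
```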

If you want a deep copy then:

def deep_copy(o)
  Marshal.load(Marshal.dump(o))
end

h1 = {:a => 'foo'}
h2 = deep_copy(h1)
h1[:a] << 'bar'
p h2                # => {:a=>"foo"}

deep_copy works for any object that can be marshalled. Most built-in data types (Array, Hash, String, &c.) can be marshalled.

Marshalling is Ruby's name for serialization. With marshalling, the object--with the objects it refers to--is converted to a series of bytes; those bytes are then used to create another object like the original.
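
To make the round trip concrete, here is a sketch that keeps the intermediate bytes in a variable and checks that nested objects are rebuilt rather than shared. It also shows one case that cannot be marshalled: a hash with a default proc.

```ruby
h1 = { :a => String.new('foo'), :b => [1, 2, { :c => 'nested' }] }

bytes = Marshal.dump(h1)        # serialize to a binary String
p bytes.class                   # => String

h2 = Marshal.load(bytes)        # rebuild a brand-new object graph
p h2 == h1                      # => true   (same contents)
p h2[:b].equal?(h1[:b])         # => false  (nested array is a new object)
p h2[:b][2].equal?(h1[:b][2])   # => false  (so is the hash inside it)

# Objects that cannot be serialized raise TypeError.
with_default_proc = Hash.new { |hash, key| hash[key] = [] }
begin
  Marshal.dump(with_default_proc)
rescue TypeError => e
  puts "cannot marshal: #{e.message}"
end
```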

Wayne Conrad
  • It's nice that you've provided the information about deep copying, but it should come with a warning that this can cause unintended side effects (for example, modifying either hash modifies both). The main purpose of cloning a hash is preventing modification of the original (for immutability, etc). – Kevin McCarpenter Feb 10 '15 at 02:24
  • 6
    @K.Carpenter Isn't it a _shallow_ copy that shares parts of the original? Deep copy, as I understand it, is a copy that shares no part of the original, so modifying one won't modify the other. – Wayne Conrad Feb 10 '15 at 13:01
  • 1
    How exactly is `Marshal.load(Marshal.dump(o))` deep copying? I can't really understand what happens behind the scenes – Muntasir Alam Aug 24 '16 at 14:14
  • 1
    What this highlights as well is that if you do `h1[:a] << 'bar'` you modify the original object (the string pointed to by h1[:a]) but if you were to do `h1[:a] = "#{h1[:a]}bar"` instead, you would create a new string object, and point `h1[:a]` at that, while `h2[:a]` is still pointing to the old (unmodified) string. – Max Williams Jun 02 '17 at 11:44
  • @MuntasirAlam I added a few words about what marshalling does. I hope that helps. – Wayne Conrad Jun 02 '17 at 13:36
  • 1
    Note: cloning via the Marshal method can lead to remote code execution. https://ruby-doc.org/core-2.2.0/Marshal.html#module-Marshal-label-Security+considerations – Jesse Aldridge Jun 01 '18 at 23:11
  • 1
    @JesseAldridge True, if the input to `Marshal.load` is untrusted, and a good warning to keep in mind. In this case, the input to it comes from `Marshal.dump` in our own process. I think that `Marshal.load` is safe in this context. – Wayne Conrad Jun 02 '18 at 00:23
90

If you're using Rails you can do:

h1 = h0.deep_dup

http://apidock.com/rails/Hash/deep_dup
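
Outside Rails, a rough equivalent can be sketched in plain Ruby. The helper name `deep_dup_plain` is hypothetical, and this is not how ActiveSupport implements `deep_dup`. It only recurses through hashes and arrays, duplicates other values with `dup`, shares keys with the original, and drops any default value or default proc. It assumes Ruby 2.4 or later, where `dup` on integers, symbols and `nil` returns the object itself.

```ruby
# Hypothetical helper: recursive copy of nested hashes and arrays.
def deep_dup_plain(obj)
  case obj
  when Hash
    obj.each_with_object({}) { |(key, value), copy| copy[key] = deep_dup_plain(value) }
  when Array
    obj.map { |element| deep_dup_plain(element) }
  else
    obj.dup
  end
end

h0 = { "John" => { "last" => String.new("Adams") }, "tags" => [String.new("president")] }
h1 = deep_dup_plain(h0)

h1["John"]["last"] << " Jr."
h1["tags"] << "lawyer"

p h0["John"]["last"]   # => "Adams"
p h0["tags"]           # => ["president"]
```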

the Tin Man
lmanners
13

Hash can create a new hash from an existing hash:

irb(main):009:0> h1 = {1 => 2}
=> {1=>2}
irb(main):010:0> h2 = Hash[h1]
=> {1=>2}
irb(main):011:0> h1.object_id
=> 2150233660
irb(main):012:0> h2.object_id
=> 2150205060
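
A quick check suggests that `Hash[...]` behaves like the other shallow approaches: the new hash has its own set of keys, but the value objects are shared with the original.

```ruby
h1 = { 1 => String.new("two") }
h2 = Hash[h1]

p h2.equal?(h1)         # => false  (a different Hash object)
p h2[1].equal?(h1[1])   # => true   (same value object)

h2[3] = "four"          # adding a key affects only h2
p h1.key?(3)            # => false

h2[1] << "!"            # in-place mutation is seen by both
p h1[1]                 # => "two!"
```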
James Moore
7

As mentioned in the Security Considerations section of the Marshal documentation:

If you need to deserialize untrusted data, use JSON or another serialization format that is only able to load simple, ‘primitive’ types such as String, Array, Hash, etc.

Here is an example of how to clone a hash using JSON in Ruby:

require "json"

original = {"John"=>"Adams","Thomas"=>"Jefferson","Johny"=>"Appleseed"}
cloned = JSON.parse(JSON.generate(original))

# Modify original hash
original["John"] << ' Sandler'
p original 
#=> {"John"=>"Adams Sandler", "Thomas"=>"Jefferson", "Johny"=>"Appleseed"}

# cloned remains intact as it was deep copied
p cloned  
#=> {"John"=>"Adams", "Thomas"=>"Jefferson", "Johny"=>"Appleseed"}
Wand Maker
  • 2
    This works most of the time, but do take care if your keys are integers rather than strings. The keys will turn into strings when you go to and back from JSON. – SDJMcHattie Dec 17 '20 at 20:26
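
The key conversion described in the comment above can be seen in a short sketch. Symbol keys and symbol values become strings as well; the `symbolize_names: true` option of `JSON.parse` turns keys back into symbols, but it cannot restore integer keys or symbol values.

```ruby
require "json"

original = { 1 => "one", :two => :second, "three" => nil }
cloned = JSON.parse(JSON.generate(original))

p cloned.keys     # => ["1", "two", "three"]
p cloned["two"]   # => "second"

# symbolize_names only affects keys, and makes every key a symbol.
symbolized = JSON.parse(JSON.generate({ :a => 1 }), symbolize_names: true)
p symbolized.keys # => [:a]
```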
4

I am also new to Ruby and faced a similar problem duplicating a hash. The following works, though I have no idea how fast it is:

copy_of_original_hash = Hash.new.merge(original_hash)
Mateusz Piotrowski
2

Use Object#clone:

h1 = h0.clone

(Confusingly, the documentation for clone says that initialize_copy is the way to override this, but the link for that method in Hash directs you to replace instead...)

Josh Lee
1

This is a special case, but if you're starting with a predefined hash that you want to grab and make a copy of, you can create a method that returns a hash:

def johns 
    {  "John"=>"Adams","Thomas"=>"Jefferson","Johny"=>"Appleseed"}
end

h1 = johns

In my particular scenario, I had a collection of JSON-schema hashes where some hashes built off others. I was initially defining them as class variables and ran into this copy issue.
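
A brief sketch of why this works: the hash literal inside the method is evaluated on every call, so each caller receives a fresh Hash object and changes to one do not leak into another.

```ruby
def johns
  { "John" => "Adams", "Thomas" => "Jefferson", "Johny" => "Appleseed" }
end

h1 = johns
h2 = johns

h1["John"] = "Smith"

p h1.equal?(h2)   # => false  (two separate Hash objects)
p h2["John"]      # => "Adams"
```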

the Tin Man
grumpasaurus
1

Since the standard cloning method preserves the frozen state, it is not suitable for creating new immutable objects based on the original when you want the new objects to differ slightly from it (as you would in stateless programming).
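
A sketch of the behavior in question, along with some ways around it that are not part of the answer above: `dup` does not carry over the frozen state, `clone(freeze: false)` is available from Ruby 2.4, and non-mutating methods such as `merge` work on a frozen hash and return a new one.

```ruby
original = { "John" => "Adams" }.freeze

p original.clone.frozen?                  # => true   (clone keeps the frozen state)
p original.dup.frozen?                    # => false  (dup does not)
p original.clone(freeze: false).frozen?   # => false  (Ruby 2.4+)

# Build a slightly different immutable hash without touching the original.
variant = original.merge("Thomas" => "Jefferson").freeze
p variant.frozen?            # => true
p original.key?("Thomas")    # => false
```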

kuonirat
1

In the benchmark below, clone is slower than starting with a blank hash and merging, so for performance you should probably prefer the latter. This doesn't cover the case of nested hashes...

require 'benchmark'

def bench
  Benchmark.bm do |b|
    test = {'a' => 1, 'b' => 2, 'c' => 3, 4 => 'd'}
    b.report 'clone' do
      1_000_000.times do |i|
        h = test.clone
        h['new'] = 5
      end
    end
    b.report 'merge' do
      1_000_000.times do |i|
        h = {}
        h['new'] = 5
        h.merge! test
      end
    end
    b.report 'inject' do
      1_000_000.times do |i|
        h = test.inject({}) do |n, (k, v)|
          n[k] = v;
          n
        end
        h['new'] = 5
      end
    end
  end
end

bench

         user      system      total        real
  clone  1.960000   0.080000    2.040000    (  2.029604)
  merge  1.690000   0.080000    1.770000    (  1.767828)
  inject 3.120000   0.030000    3.150000    (  3.152627)
  
Justin
0

Since Ruby has a million ways to do it, here's another way using Enumerable:

h0 = {  "John"=>"Adams","Thomas"=>"Jefferson","Johny"=>"Appleseed"}
h1 = h0.inject({}) do |new, (name, value)| 
    new[name] = value;
    new 
end
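
A slightly shorter variant of the same idea, offered as a sketch: `each_with_object` passes the accumulator along automatically, so the block does not need to return it. Like the `inject` version, this is a shallow copy.

```ruby
h0 = { "John" => "Adams", "Thomas" => "Jefferson", "Johny" => "Appleseed" }

# The accumulator (copy) is returned by each_with_object itself.
h1 = h0.each_with_object({}) { |(name, value), copy| copy[name] = value }

p h1 == h0        # => true   (same contents)
p h1.equal?(h0)   # => false  (different Hash object)
```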
the Tin Man
Rohit
-1

You can use the following to deep copy Hash objects:

deeply_copied_hash = Marshal.load(Marshal.dump(original_hash))
ktsujister
-3

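An alternative way to copy a hash that worked for me:

h1 = {:a => 'foo'}
h2 = Hash[h1.to_a]

This builds h2 from an array representation of h1, so h2 is a new Hash object. Note, however, that it is still a shallow copy, not a deep copy: the arrays returned by to_a hold references to the same key and value objects, so mutating a value in place (for example, h1[:a] << 'bar') is visible through h2 as well.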