
I have a class where I was using the Array#shift instance method on an instance variable. I thought I had made a "copy" of my instance variable, but in fact I hadn't, and shift was actually changing the instance variable itself.

For example, before I would have expected to get ["foo", "bar", "baz"] both times given the following:

class Foo
  attr_reader :arr
  def initialize arr
    @arr = arr
  end

  def some_method
    foo = arr
    foo.shift
  end
end

foo = Foo.new %w(foo bar baz)
p foo.arr #=> ["foo", "bar", "baz"]
foo.some_method
p foo.arr #=> ["bar", "baz"]

result:

["foo", "bar", "baz"]
["bar", "baz"]

But as shown my "copy" wasn't really a copy at all. Now, I'm not sure if I should be calling what I want a "copy", "clone", "dup", "deep clone", "deep dup", "frozen clone", etc...

I was really confused about what to search for and found a bunch of crazy attempts to do what seems like "making a copy of an array".

Then I found another answer with literally one line that solved my problem:

class Foo
  attr_reader :arr
  def initialize arr
    @arr = arr
  end

  def some_method
    foo = [].replace arr
    foo.shift
  end
end

foo = Foo.new %w(foo bar baz)
p foo.arr #=> ["foo", "bar", "baz"]
foo.some_method
p foo.arr #=> ["foo", "bar", "baz"]

output:

["foo", "bar", "baz"]
["foo", "bar", "baz"]

I understand that Array#replace is an instance method being called on an instance of Array that happens to be an empty array (so for example foo = ["cats", "and", "dogs"].replace arr will still work) and it makes sense that I get a "copy" of the instance variable @arr.

But how is that different than:

foo = arr
foo = arr.clone
foo = arr.dup
foo = arr.deep_clone
Marshal.load # something something
# etc...

Or any of the other crazy combinations of dup and map and inject that I'm seeing on SO?

mbigras
  • Many of the differences between those methods come down to whether the objects inside the new array are copies of the original objects or whether both arrays point to the same objects. – Max Jan 15 '17 at 03:10
  • One thing to note: assigning a variable in Ruby never makes a copy of the value, it just creates another reference to it. So, if `arr = [1,3,4]` and you assign `x = arr`, you have only created another 'name' for the original array. – Mereghost Aug 07 '18 at 15:17

2 Answers


This is the tricky concept of mutability in Ruby. In terms of core objects, this usually comes up with arrays and hashes. Strings are mutable as well, but string literals can be frozen with a magic comment at the top of the file. See "What does the comment 'frozen_string_literal: true' do?".

In this case, dup, clone, and deep_dup (which needs ActiveSupport) all have the same effect as replace, as does a Marshal round trip:

['some', 'array'].dup
['some', 'array'].deep_dup
['some', 'array'].clone
Marshal.load Marshal::dump(['some', 'array'])
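
To make "the same effect" concrete, here is a minimal sketch using only core Ruby. The helper name `shallow_copies` is made up for illustration. It compares plain assignment with replace, dup, and clone by checking object identity with `equal?`:

```ruby
# Hypothetical helper: builds the same shallow copy three different ways.
def shallow_copies(arr)
  { replace: [].replace(arr), dup: arr.dup, clone: arr.clone }
end

original  = %w(foo bar baz)
alias_ref = original            # plain assignment: no new array is created

p alias_ref.equal?(original)    # => true

shallow_copies(original).each do |name, copy|
  # Each copy is a distinct Array object with equal contents...
  p [name, copy.equal?(original), copy == original]
  # ...so mutating the copy leaves the original alone.
  copy.shift
end

p original                      # => ["foo", "bar", "baz"]
```

Only `foo = arr` shares the array itself, which is why shift on it changed the instance variable.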

In terms of differences, dup and clone are the same except for some nuanced details - see What's the difference between Ruby's dup and clone methods?
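
As a rough illustration of two of those details (not a complete list): clone keeps the frozen state and any singleton methods of the receiver, while dup does not. The method name `label` below is made up for the example.

```ruby
frozen = %w(foo bar).freeze

p frozen.clone.frozen?   # => true  (clone keeps the frozen state)
p frozen.dup.frozen?     # => false (dup returns an unfrozen copy)

tagged = %w(foo bar)
# Hypothetical singleton method defined on this one array only.
def tagged.label
  "original"
end

p tagged.clone.respond_to?(:label)   # => true  (singleton methods are copied)
p tagged.dup.respond_to?(:label)     # => false (singleton methods are dropped)
```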

The difference between these and deep_dup is that deep_dup works recursively. For example, if you dup or clone a nested array, the inner array is not copied, so both outer arrays still share it:

  a = [[1]]
  b = a.clone
  b[0][0] = 2
  a # => [[2]]

The same thing happens with hashes.
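
A small sketch of the hash case, with made-up keys and values: the copy gets its own set of keys, but the values are still shared with the original.

```ruby
settings = { tags: ["a"] }
copy = settings.dup

copy[:tags] << "b"     # mutates the one array that both hashes point to
copy[:extra] = true    # adds a key to the copy only

p settings[:tags]                     # => ["a", "b"]
p settings.key?(:extra)               # => false
p copy[:tags].equal?(settings[:tags]) # => true
```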

Marshal.load Marshal::dump <object> is a general approach to deep cloning objects, which, unlike deep_dup, is in ruby core. Marshal::dump returns a string so it can be handy in serializing objects to file.
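
A minimal sketch of that round trip, with sample data invented for the example. Note that Marshal.dump raises a TypeError for objects it cannot serialize, such as procs and IO objects.

```ruby
nested = [[1], { "k" => "v" }]

bytes = Marshal.dump(nested)   # a binary String, which could be written to a file
deep  = Marshal.load(bytes)    # rebuilds a completely separate object graph

deep[0][0] = 2
deep[1]["k"] << "!"

p bytes.class       # => String
p nested[0]         # => [1]
p nested[1]["k"]    # => "v"
p deep[1]["k"]      # => "v!"
```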

If you want to avoid unexpected errors like this, keep a mental index of which methods have side effects and only call them when it makes sense to. By convention, an exclamation point at the end of a method name marks a method that modifies its receiver, but many mutating methods carry no such marker, including shift, unshift, push, concat, delete, and pop. A big part of functional programming is avoiding side effects. See https://www.sitepoint.com/functional-programming-techniques-with-ruby-part-i/
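
One way to apply this to the class from the question is to reach for non-mutating methods instead of shift. This is only a sketch: `first_item` and `rest` are hypothetical names, and whether it fits depends on what some_method is meant to return.

```ruby
class Foo
  attr_reader :arr

  def initialize(arr)
    @arr = arr
  end

  # Hypothetical replacement for `foo = arr; foo.shift`:
  # reads the first element without removing it.
  def first_item
    arr.first
  end

  # Array#drop returns a new array and leaves the receiver untouched.
  def rest
    arr.drop(1)
  end
end

foo = Foo.new(%w(foo bar baz))
p foo.first_item   # => "foo"
p foo.rest         # => ["bar", "baz"]
p foo.arr          # => ["foo", "bar", "baz"]
```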

max pleaner
  • +1. Also note that it's wise to keep an eye on how the class in question implements the methods. As the [Ruby docs](http://ruby-doc.org/core-2.3.0/Object.html#method-i-dup-label-on+dup+vs+clone) say, "In general, `clone` and `dup` may have different semantics in descendant classes." I don't think arrays or hashes do anything different from the docs for `Object`, but ActiveRecord records do, for example. – Max Jan 15 '17 at 15:31
  • thanks for the response @maxple I'm getting a no method error: `['some', 'array'].deep_dup #NoMethodError: undefined method `deep_dup'` – mbigras Jan 17 '17 at 03:41
  • @mbigras yeah, i mentioned that in my answer. `deep_dup` is not in Ruby core. If you `require 'active_support/all'` you will get that method. – max pleaner Jan 17 '17 at 03:43
  • Why did you choose to run `Marshal.load` and `Marshal::dump` differently when they are both class methods? – mbigras Jan 17 '17 at 04:00
  • @mbigras that is a good question, I can't keep the details straight sometimes – max pleaner Jan 17 '17 at 04:00
  • @maxple what's weird to me is it still works, is it because `Marshal` is a module and a class? – mbigras Jan 17 '17 at 04:01
  • @mbigras I was just testing that out myself. Seems like `::` can be used instead of `.` everywhere. – max pleaner Jan 17 '17 at 04:03

The preferred method is dup

  • use array.dup whenever you need to copy an array
  • use array.map(&:dup) whenever you need to copy a 2D array

Don't use the marshalling trick unless you really want to deep copy an entire object graph. Usually you want to copy the arrays only but not the contained elements.
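
A short sketch of the two bullets above, using a made-up 2D array, to show what each level of copying protects:

```ruby
grid = [[1, 2], [3, 4]]

shallow     = grid.dup         # new outer array, same inner rows
rows_copied = grid.map(&:dup)  # new outer array and new inner rows

rows_copied[0][0] = :x
p grid[0]   # => [1, 2]   (the row was copied, so the original is untouched)

shallow[1][1] = :y
p grid[1]   # => [3, :y]  (the row is shared, so the original changes)
```

map(&:dup) copies exactly one extra level: the elements inside each row are still shared, which is usually what you want for rows of numbers or symbols.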

akuhn