Suppose I pass the string "abcd" through Ruby's scan method (String#scan) using the regular expression /(a)|(b)/. This returns an array of arrays:

>> results_orig = "abcd".scan(/(a)|(b)/)
#=> [["a", nil], [nil, "b"]]

Now, if I duplicate (.dup) or clone (.clone) this array,

>> results_copy = results_orig.dup
#=> [["a", nil], [nil, "b"]]

and modify any element of this copy, the original array also gets modified!

>> results_copy[0][0]="hello"
#=> "hello"
>> results_copy
#=> [["hello", nil], [nil, "b"]]
>> results_orig
#=> [["hello", nil], [nil, "b"]]

This is strange, since, first, the arrays have different object IDs (results_orig.object_id == results_copy.object_id returns false), and, second, it does not happen if the array was not the product of the Scan method. To see the latter, consider the following example.

>> a = [1, 2, 3]
>> b = a.dup
>> b[0] = "hello"
>> a
#=> [1, 2, 3]
>> b
#=> ["hello", 2, 3]

My current solution is to run scan twice and store each result in a separate variable, that is, r_orig = "abca".scan(/(a)|(b)/); r_copy = "abca".scan(/(a)|(b)/). But this is going to be very inefficient when I have to scan hundreds of strings.

Is there a proper way to duplicate the array from Scan's results that I can then modify whilst leaving the original results array unharmed?

Edit #1: I am running Ruby 2.0.0-p353 on Mac OS X 10.9.2.

Edit #2: It appears the issue exists when the array structure is nested... simple (single-level) arrays don't seem to have this problem. Corrected my example to reflect this.
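
The nested case from Edit #2 can be reproduced without scan. The following is only a minimal sketch with a hand-built array (the variable names are illustrative): dup returns a new outer array, but its elements are still the same inner array objects.

```ruby
# Nested array built by hand; scan is not involved.
orig = [["a", nil], [nil, "b"]]
copy = orig.dup

# dup creates a new outer array...
outer_shared = orig.equal?(copy)        # false
# ...but the inner arrays are the very same objects in both.
inner_shared = orig[0].equal?(copy[0])  # true

# Writing through the copy therefore changes the shared inner array.
copy[0][0] = "hello"
p orig  # [["hello", nil], [nil, "b"]]
```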

John

1 Answer


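You need to make a deep copy. dup and clone only make a shallow copy: you get a new outer array, but it still holds references to the same inner arrays, so changing an inner array through the copy also changes the original. Check out this article for more information. Essentially, you need to do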

copied_array = Marshal.load(Marshal.dump(complex_array))

Code source: http://thingsaaronmade.com/blog/ruby-shallow-copy-surprise.html. Marshalling works for arrays of plain values such as strings, numbers, and nil, but not for every object (procs and IO objects, for example, cannot be dumped). A more robust method to perform a deep copy is in the answer to this question.
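
Applied to the scan example from the question, this might look like the sketch below. The map(&:dup) variant is an alternative I am adding for illustration, not something from the linked article; it assumes the result is exactly two levels deep, as scan with capture groups returns.

```ruby
results_orig = "abcd".scan(/(a)|(b)/)   # [["a", nil], [nil, "b"]]

# Deep copy via a Marshal round trip: every nested array and string is rebuilt.
results_copy = Marshal.load(Marshal.dump(results_orig))
results_copy[0][0] = "hello"

# Alternative for exactly two levels: copy each inner array.
# The strings inside are still shared, so in-place string mutation
# (for example <<) would still show up in the original.
results_rows = results_orig.map(&:dup)
results_rows[1][1] = "world"

p results_orig  # [["a", nil], [nil, "b"]]
p results_copy  # [["hello", nil], [nil, "b"]]
p results_rows  # [["a", nil], [nil, "world"]]
```

Either way scan runs only once per string; the Marshal version is the safer choice if the copied strings themselves will be modified in place.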

Guru