Why can we modify complex parameters but not scalars in a ruby function?

Question

The question Is Ruby pass by reference or by value? has attracted a lot of helpful answers and also a lot of disagreement. What I don't see in any of the answers so far is anything that explains the following:

ruby -e "def f(x) x=7 end; a=3; f(a); print a" prints 3.

ruby -e "def f(x) x[0]=7 end; a=[3]; f(a); print a[0]" prints 7.

Empirically, this looks to me like there is some kind of distinction between scalar objects and more complex objects such as hashes and arrays, with scalars being passed by value and complex objects by reference. That would be similar to the semantics of C.

My understanding is that everything in ruby is an object, and none of the answers to the earlier question mention a distinction between scalars and complex types. So is my description wrong, and if so, what is a better description?

The best explanation I found is here : http://stackoverflow.com/a/18069011/6419007 — Eric Duminil, Jan 24 '17 at 17:14
Everything is pass-by-value, and the values passed are object references. It's easier to understand your example then. — Eric Duminil, Jan 24 '17 at 17:15
@EricDuminil: *Everything is pass-by-value, and the values passed are object references. It's easier to understand your example then.* Hmm...but this doesn't clarify for me why I get *different* behavior in the two cases. Given your description, why does the scalar example print 3? I would expect it to pass a reference to an object containing the value 3. Then I would expect the function to modify the object by substituting the value 7, so that it would print 7. Is the difference in behavior a difference between immutable and mutable types, rather than scalars and complex types? — , Jan 24 '17 at 17:21
@BenCrowell: There is no concept of "mutable" or "immutable" at the language level. The semantics apply equally to all types. The difference is that in one case you are assigning to the variable to have it point to a different object but never sent any message to any object to do anything to its internal state, whereas in the other case you are sending a message to an object to do something to its internal state. — newacct, Jan 24 '17 at 22:34
@BenCrowell: why would you expect the object to be mutated in the first case? You never call any methods, but calling methods is how you mutate things (or do anything at all, really in OO). — Jörg W Mittag, Jan 24 '17 at 23:07

tadman · Accepted Answer · 2017-01-25T03:53:28.247

The trouble with the terminology here is that Ruby is "pass by object reference", which is a way of saying "pointer to object" in other languages. The line between pointer and reference in Ruby is blurred because there are no actual pointers, and objects are kept alive by the garbage collector for as long as something still refers to them. So they're pointers that are references to objects, but not references in the conventional sense of being hard-linked to the same variable.

Every variable, by definition, always represents an object, even when it's not defined: nil itself is an object, as are numbers, even floating-point ones. This makes the very term "scalar" almost irrelevant. There are no fundamental types in Ruby like you have in other languages, and the difference between a boolean value, a number, a string and a class instance is heavily blurred.

The general rule is you're never able to back-propagate changes to variables, but changes made through methods do propagate. To understand why, here's how Ruby sees your code:

def f(x)
  # Change value of local variable x to 7
  x = 7
end

That just changes which object the local variable x points to, as even 7 is an object. The object the caller passed in is never touched.
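As a minimal sketch of that point (the method name rebind is made up for illustration), the caller's variable still refers to the original object after the call, while the method itself returns the new one:

```ruby
# Hypothetical helper: rebinding a parameter only affects the local name.
def rebind(x)
  x = 7   # x now refers to a different object; the caller's variable is unaffected
  x
end

a = 3
result = rebind(a)

puts a       # prints 3
puts result  # prints 7
```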

The other code is radically different in how it's perceived by Ruby:

def f(x)
  # Send the []= method call to x with the arguments 0 and 7
  x.send(:[]=, 0, 7)
end

This sends a message (method call) to x to trigger the []= method. That method can do anything it wants with the arguments, but in the case of arrays, hashes and strings it has a specific meaning: it updates the internal state of the object x references.
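A short sketch of the equivalence, assuming nothing beyond core Array and Hash (the method name set_first is hypothetical): the index assignment and the explicit send with an index and a value do the same thing to the receiver.

```ruby
# Hypothetical helper: the explicit form of x[0] = 7.
def set_first(x)
  x.send(:[]=, 0, 7)
end

a = [3]
b = [3]
set_first(a)   # message sent explicitly
b[0] = 7       # same message, sugared syntax

h = {}
h.send(:[]=, :key, 1)   # same as h[:key] = 1

puts a.inspect  # prints [7]
puts b.inspect  # prints [7]
puts h.inspect
```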

You can see how this plays out in other scenarios:

def f(x)
  x += 'y'
end

This expands to x = x + 'y', which does a variable reassignment with the intermediate result. The original x value is not modified.

def f(x)
  x << 'y'
end

In this case it's x.send(:<<, 'y') which does an in-place modification on x, so the original is modified.
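The two cases can be put side by side in a small sketch. The method names are invented for illustration, and String.new is used only so the strings are certain to be mutable regardless of frozen-string-literal settings:

```ruby
# Hypothetical helpers contrasting reassignment with in-place mutation.
def append_with_plus(x)
  x += 'y'   # builds a new String and rebinds the local x
  x
end

def append_with_shovel(x)
  x << 'y'   # mutates the String object that x refers to
end

s1 = String.new('x')
s2 = String.new('x')

r1 = append_with_plus(s1)
append_with_shovel(s2)

puts s1  # prints x
puts r1  # prints xy
puts s2  # prints xy
```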

Being able to recognize method calls is an important thing when writing and understanding Ruby code. Sometimes they're not even all that obvious. You'd think that the presence of = means "variable assignment" but that's not strictly the case:

def f(x)
  x.y = 'z'
end

This looks like it's assigning to the y property of x, but it's not. It's just calling the y= method, equivalent to x.send(:y=, 'z'), which is something x can interpret in any number of ways. That may modify the value, or it might do something completely different. There's no way of knowing without understanding x more closely.
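To illustrate that y= is an ordinary method, here is a sketch with a made-up Recorder class whose "setter" stores nothing under y at all and simply logs every value it receives:

```ruby
# Hypothetical class: y= is just a method and is free to do anything.
class Recorder
  attr_reader :log

  def initialize
    @log = []
  end

  # Looks like an attribute setter, but only records the call.
  def y=(value)
    @log << value
  end
end

r = Recorder.new
r.y = 'z'          # sugar for the method call below
r.send(:y=, 'w')   # explicit form of the same call

puts r.log.inspect  # prints ["z", "w"]
```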

score 2 · Answer 2 · edited May 23 '17 at 11:45

Empirically, this looks to me like there is some kind of distinction between scalar objects and more complex objects such as hashes and arrays, with scalars being passed by value and complex objects by reference.

There is no such thing as a "scalar object" or a "complex object" in Ruby. Everything is an object. Period. And everything is pass-by-value, always, no exceptions. There is never any pass-by-reference going on, ever.

More precisely, Ruby is what is commonly called call-by-object-sharing, call-by-sharing, or call-by-object. This is a special case of pass-by-value, where the value being passed is always a pointer to an object.

Free variables in closures are captured by reference, but that is a different question and has nothing to do with this one.

That would be similar to the semantics of C.

No, actually, it wouldn't. There is no pass-by-reference in C, C is always pass-by-value, just like Ruby.

In C, everything is passed by value. ints are passed by value. chars are passed by value. And pointers are passed by value. Ruby is like C, except there are only pointers; every value that is passed is a pointer to an object.

def f(x)
  x = 7
end

a = 3
f(a)
a #=> 3

def f(x)
  x[0] = 7
end

a = [3]
f(a)
a[0] #=> 7

These two cases are fundamentally different: in the first case, you bind a new value to the parameter x inside the method. This re-binding is only visible inside the method body. Method parameters essentially behave like local variables. (In fact, if you reflect upon the local variables of a method body, you will see that the parameters show up.)
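The remark about reflection can be checked with Kernel#local_variables. This is a rough sketch; the method name show_locals and the variable doubled are invented for the example:

```ruby
# Hypothetical helper: parameters appear among the method's local variables.
def show_locals(x)
  doubled = x * 2
  [local_variables, doubled]
end

names, doubled = show_locals(3)

puts names.inspect  # the list includes :x alongside :doubled
puts doubled        # prints 6
```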

In the second case, you call a method that mutates the receiver. There is no assignment going on. Yes, there is an equals sign, but that is just part of Ruby's syntactic sugar for index assignment. What you are really doing is calling the method []=, passing 0 and 7 as arguments. It is completely equivalent to calling x.[]=(0, 7); in fact, you can write it that way if you want. (Try it!) Maybe you would be less confused if you used the insert method instead of []=, or another method whose name more obviously screams "I am changing the array", such as clear or replace?

The array is still the exact same array that you passed into the method. The reference was not modified. Only the array was. Arrays would be pretty useless if we couldn't insert stuff into them, and that stuff then stayed in there!
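A sketch of that claim, with hypothetical method names: the explicit x.[]=(0, 7) form and the more obviously named replace both change the contents, while object_id shows that the caller still holds the very same array.

```ruby
# Hypothetical helpers: both mutate the receiver; neither rebinds anything.
def set_explicit(x)
  x.[]=(0, 7)      # identical to x[0] = 7
end

def replace_all(x)
  x.replace([7])   # swaps the contents of the existing array
end

a = [3]
a_id = a.object_id
set_explicit(a)

b = [3]
b_id = b.object_id
replace_all(b)

puts a.inspect             # prints [7]
puts a.object_id == a_id   # prints true
puts b.inspect             # prints [7]
puts b.object_id == b_id   # prints true
```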

So, the difference between the two cases is that in the first case, you assigned a new value, i.e. you mutated the reference, which doesn't work, because Ruby is pass-by-value. In the second case, you mutated the value, which does work, because Ruby is not a purely functional language with purely immutable objects. Ruby is impure, and it does have mutable objects, and if you mutate an object, well, the object mutates.

My mom and my hairdresser refer to me by different names, but if my hairdresser cuts my hair, my mom will also observe that fact.

Note: there are objects which don't have methods that mutate them. These objects are immutable. Integers are such immutable objects, so you can never demonstrate something like the above with Integers, but that is purely a result of the fact that Integers don't have mutating methods, it has nothing to do with them being "scalar". You can have complex, compound objects that don't have any mutating methods, if you want: here is a question about implementing a linked list in Ruby, the two answers contain three implementations of linked lists, all of them immutable. (Disclaimer: one answer with two implementations is from me.)
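As a small illustration (not taken from the linked-list answers mentioned above), here is a made-up compound Point class with no mutating methods. A method that receives one has nothing to call that could change it, so all it can do is build a new object:

```ruby
# Hypothetical immutable compound object: readers only, frozen on creation.
class Point
  attr_reader :x, :y

  def initialize(x, y)
    @x = x
    @y = y
    freeze
  end

  # Returns a new Point instead of changing this one.
  def with_x(new_x)
    Point.new(new_x, y)
  end
end

def try_to_change(point)
  point.with_x(10)
end

p1 = Point.new(1, 2)
p2 = try_to_change(p1)

puts p1.x       # prints 1
puts p2.x       # prints 10
puts 3.frozen?  # prints true
```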

akuhn · Answer 3 · 2017-01-24T23:51:15.217

Ruby is "pass by pointer to an object."

So now what is the difference?

def f(x)
  x = 7
end

Assigns a new value to the local variable x—this change is local since you reassigned a local variable.

def f(x)
  x[0] = 7
end

Assigns a new value to the first element of the array referenced by x. This change is visible to the caller, since you modified the object itself rather than the local variable.

The distinction between pass by value and pass by reference does not apply to Ruby. It comes from other programming languages and does not make sense in the context of Ruby.
