
I am trying to get a full understanding of how Ruby locates methods/symbols but am struggling when it involves multiple levels, especially the global/file scope.

When calling methods explicitly on a class, there are lots of illustrations on the order in which the classes, and modules included by them are searched (and thus exactly what super calls in each case). But when not explicitly calling a method, e.g. a plain func args rather than self.func args what is the search order?

Why, in my example below, does the member method calling func find the member method before the global, but func2 finds the global without method_missing being called? And when the global is instead a module/class/type, why is the member with the same name not found at all?

Is there any official documentation as to exactly what the language does when it encounters "func, func(), func arg" etc. in a method? There are a lot of third-party blogs, but they only really talk about single instances with include Module and class Type < BaseType.

def func; "global func" end
def func2; "global func 2" end

class Test
  def x; func end
  def y; func2 end
  def z; Math end
  def w
    func = "local_var"
    [func(), func]
  end


  def func(arg=nil); "method func" end
  def func=(x); puts "assign func=" end
  def Math; "method Math" end

  def method_missing(sym, *args, &block)
    puts "method_missing #{sym}"
    super(sym, *args, &block)
  end
end

x = Test.new
puts x.x.inspect # "method func", member overrides global
puts x.y.inspect # "global func 2", "method_missing" was not called
puts x.z.inspect # "Math" module, member did not override global
puts x.w.inspect # ["method func", "local_var"], local variables are always considered before anything else
Fire Lancer
  • You can't define methods with capital letters, can you? It won't raise an error at definition, but it will when you try to call it (try `def Something` instead of an already defined module/class) – Ninigi Sep 09 '16 at 09:54
  • @Ninigi : You can define methods with capital letters, see `Array()` in [`Kernel`](https://ruby-doc.org/core-2.3.1/Kernel.html) – spickermann Sep 09 '16 at 10:03
  • @spickermann uh, well the source is C Code though, not ruby... It's not `def Array`. To be a little clearer with what I meant: You can DEFINE the method, but not call it, because ruby will always assume it is a constant, not a method. – Ninigi Sep 09 '16 at 10:06
  • @Ninigi: Just run `def Foo(string); puts "foo #{string}"; end; Foo('bar')` in IRB. It is possible, it works and there are no errors or warnings - and it is plain Ruby. – spickermann Sep 09 '16 at 10:22
  • @spickermann ah ok, rubys unanticipated smartness strikes again xD of course you can call the capital letter methods, you just need the brackets :D learned something today – Ninigi Sep 09 '16 at 10:28
  • While I agree that naming a method as such is against conventions, i didn't think Ruby itself actually cared about names in any context? Isn't that just Rails library magic for autoloading (method_missing override?), and mapping things like ActiveRecord to SQL table schemas and other such library code that cares? – Fire Lancer Sep 09 '16 at 11:06

1 Answer

Ruby's method lookup algorithm is actually really simple:

  • retrieve the class pointer of the receiver
  • if the method is there, invoke it
  • otherwise retrieve the superclass pointer, and repeat

That's it.

If the algorithm comes to a point where there is no more superclass, but it still hasn't found the method yet, it will restart the whole process again, with method_missing as the message and the name of the original message prepended to the arguments. But that's it. That's the whole algorithm. It is very small and very simple, and it has to be very small and very simple because method lookup is the single most often executed operation in an object-oriented language.
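The method_missing fallback described above can be observed directly. This is only a small sketch: Ghost is a hypothetical class invented for the illustration, and it shows the original message name arriving as the first argument, followed by the original arguments.

```ruby
# Ghost is a hypothetical class, used only for illustration.
class Ghost
  # Invoked when the normal lookup finds no method with the given name.
  # The original message name is prepended to the original arguments.
  def method_missing(name, *args)
    "missing: #{name.inspect} with #{args.inspect}"
  end

  # Keeps respond_to? consistent with the method_missing override.
  def respond_to_missing?(name, include_private = false)
    true
  end
end

puts Ghost.new.anything(1, 2) # prints: missing: :anything with [1, 2]
```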

Note: I am completely ignoring Module#prepend / Module#prepend_features since I just don't know enough about how it works. I only know what it does, and that's good enough for me.

Also note: I am ignoring performance optimizations such as caching the result of a method lookup in something like a Polymorphic Inline Cache.

Okay, but here's the trick: where exactly do those class and superclass pointers point to? Well, they do not point to what the Object#class and Class#superclass methods return. So, let's step back a little.

Every object has a class pointer that points to the class of the object. And every class has a superclass pointer that points to its superclass.

Let's start a running example:

class Foo; end

Now, we have class Foo, and its superclass pointer points to Object.

foo = Foo.new

And our object foo's class pointer points to Foo.

def foo.bar; end

Now things start to get interesting. We have created a singleton method. Well, actually, there is no such thing as a singleton method, it's really just a normal method in the singleton class. So, how does this work? Well, now the class pointer points to foo's singleton class and foo's singleton class's superclass pointer points to Foo! In other words, the singleton class was inserted in between foo and its "real" class Foo.

However, when we ask foo about its class, it still responds Foo:

foo.class #=> Foo

The Object#class method knows about singleton classes, and simply skips over them, following the superclass pointer until it finds a "normal" class, and returns that.
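A quick way to see this is to ask the object for its singleton class explicitly. The sketch below reuses Foo and foo from the running example and uses only standard reflection methods (Object#singleton_class, Class#superclass, Module#instance_methods).

```ruby
class Foo; end

foo = Foo.new
def foo.bar; :bar; end

# Object#class skips the singleton class ...
puts foo.class                                   # Foo
# ... but the singleton class sits between foo and Foo:
puts foo.singleton_class.superclass              # Foo
# The "singleton method" is an ordinary instance method of the singleton class.
p foo.singleton_class.instance_methods(false)    # [:bar]
p Foo.instance_methods(false).include?(:bar)     # false
```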

Next complication:

module Bar; end

class Foo
  include Bar
end

What happens here? Ruby creates a new class (let's call it Barʹ), called an include class. This class's method table pointer, class variable table pointer, and constant table pointer point to Bar's method table, class variable table, and constant table. Then, Ruby makes Barʹ's superclass pointer point to Foo's current superclass, and then makes Foo's superclass pointer point to Barʹ. In other words, including a module creates a new class that gets inserted as the superclass of the class the module is included into.

There's a slight complication here: you can also include modules into modules. How does that work? Well, Ruby simply keeps track of the modules that were included into a module. And then, when the module is included into a class, it will recursively repeat the steps above for every included module.
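Include classes themselves are not visible from Ruby code, but their effect on the lookup order shows up in Module#ancestors. In this sketch Baz is a hypothetical extra module, added to the running example only to show a module included into a module.

```ruby
module Baz; end          # hypothetical module, for illustration only

module Bar
  include Baz            # a module included into a module
end

class Foo
  include Bar            # Bar (and, recursively, Baz) are inserted above Foo
end

p Foo.ancestors.take(3)  # [Foo, Bar, Baz]
# Class#superclass, like Object#class, skips the inserted include classes:
p Foo.superclass         # Object
```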

And that's all you need to know about the Ruby method lookup:

  • find the class
  • follow the superclass
  • singleton classes insert above objects
  • include classes insert above classes
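The four points above can be approximated in plain Ruby. This is a sketch of the idea, not how the interpreter is implemented: find_owner is a hypothetical helper, the Walkable/Animal/Dog names are invented for the example, and the chain is taken from singleton_class.ancestors (which also reflects Module#prepend, a feature this answer otherwise ignores).

```ruby
# Hypothetical helper: returns the first class or module in the receiver's
# lookup chain that defines a method called `name`, or nil if none does.
def find_owner(receiver, name)
  receiver.singleton_class.ancestors.find do |mod|
    mod.instance_methods(false).include?(name) ||
      mod.private_instance_methods(false).include?(name)
  end
end

module Walkable
  def walk; :walking; end
end

class Animal
  include Walkable
  def eat; :eating; end
end

class Dog < Animal; end

rex = Dog.new
def rex.bark; :woof; end

p find_owner(rex, :bark) == rex.singleton_class # true
p find_owner(rex, :eat)                         # Animal
p find_owner(rex, :walk)                        # Walkable
p find_owner(rex, :fly)                         # nil
```

For methods that exist, the result should agree with what `rex.method(:eat).owner` reports.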

Now let's look at some of your questions:

When calling methods explicitly on a class, there are lots of illustrations on the order in which the classes, and modules included by them are searched (and thus exactly what super calls in each case). But when not explicitly calling a method, e.g. a plain func args rather than self.func args what is the search order?

The same. self is the implicit receiver: if you don't specify a receiver, the receiver is self. And parentheses are optional. In other words:

func args

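is the same as the following (the only difference concerns private methods, which Ruby versions before 2.7 refuse to call with an explicit receiver, even self)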

self.func(args)
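As a small check of that equivalence for a public method, the sketch below calls the same method both ways. Greeter and its method names are hypothetical and exist only for this illustration.

```ruby
# Greeter is a hypothetical class, used only for illustration.
class Greeter
  def func(arg)
    "func got #{arg}"
  end

  def implicit
    func 1          # implicit receiver, no parentheses
  end

  def explicit
    self.func(1)    # explicit receiver, with parentheses
  end
end

g = Greeter.new
puts g.implicit                # func got 1
puts g.implicit == g.explicit  # true
```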

Why, in my example below, does the member method calling func find the member method before the global, but func2 finds the global without method_missing being called?

There is no such thing as a "global method" in Ruby. There is also no such thing as a "member method". Every method is an instance method. Period. There are no global, static, class, singleton, member methods, procedures, functions, or subroutines.

A method defined at the top-level becomes a private instance method of class Object. Test inherits from Object. Run the steps I outlined above, and you will find exactly what is going on:

  • Retrieve x's class pointer: Test
  • Does Test have a method called func: Yes, so invoke it.

Now again:

  • Retrieve x's class pointer: Test
  • Does Test have a method called func2: No!
  • Retrieve Test's superclass pointer: Object
  • Does Object have a method called func2: Yes, so invoke it.
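The walk above can be confirmed with reflection. The sketch reuses func2 and Test from the question (trimmed to the one relevant method) and assumes the code runs as a plain script file, where top-level definitions become private methods of Object.

```ruby
def func2; "global func 2"; end

class Test
  def y; func2; end
end

# The top-level method lives in Object, and it is private there
# (assuming this runs as a plain script, not inside another wrapper).
puts Object.private_method_defined?(:func2) # true
puts Test.new.method(:func2).owner          # Object

# Test does not define func2, so the lookup continues to Object.
puts Test.new.y                             # global func 2
```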

And when the global is instead a module/class/type, why is the member with the same name not found at all?

Again, there is no global here, there are no members here. This also doesn't have anything to do with modules or classes. And Ruby doesn't have (static) types.

Math

is a reference to a constant. If you want to call a method with the same name, you have to ensure that Ruby can tell that it's a method. There are two things that only methods can have: a receiver and arguments. So, you can either add a receiver:

self.Math

or arguments:

Math()

and now Ruby knows that you mean the method Math and not the constant Math.
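The three spellings can be compared side by side. This sketch reuses Test and its Math method from the question; the helper method names (constant_reference, with_receiver, with_parentheses) are hypothetical.

```ruby
class Test
  def Math; "method Math"; end

  def constant_reference; Math; end      # bare name: constant lookup
  def with_receiver;      self.Math; end # receiver: method call
  def with_parentheses;   Math(); end    # argument list: method call
end

t = Test.new
p t.constant_reference  # Math (the module)
p t.with_receiver       # "method Math"
p t.with_parentheses    # "method Math"
```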

The same applies to local variables, by the way. And to setters. If you want to call a setter instead of assigning a local variable, you need to say

self.func = 'setter method'
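To make the setter case concrete, the sketch below counts how often the setter actually runs. SetterDemo and its calls counter are hypothetical names introduced for the illustration.

```ruby
# SetterDemo is a hypothetical class, used only for illustration.
class SetterDemo
  attr_reader :calls

  def initialize
    @calls = 0
  end

  # The setter only records that it was invoked.
  def func=(value)
    @calls += 1
  end

  def assign_local
    func = "local variable"      # no receiver: creates a local variable
    func
  end

  def call_setter
    self.func = "setter method"  # explicit receiver: calls func=
  end
end

demo = SetterDemo.new
demo.assign_local
puts demo.calls # 0, the setter was not called
demo.call_setter
puts demo.calls # 1
```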
Jörg W Mittag
  • `def x; self.func2 end` calls `method_missing`, and `def x; func2 end` does not, so there is clearly some difference. And if it was private, wouldn't calling `func2` be disallowed? – Fire Lancer Sep 09 '16 at 13:38
  • `func2` is private, private methods can only be called without an explicit receiver. So, in `self.func2`, `Object#func2` is not found because it is private, therefore `method_missing` is invoked. I don't get your second question: why would calling private methods be disallowed? What would be the point of private methods if calling them weren't allowed? – Jörg W Mittag Sep 09 '16 at 13:40
  • First off, kudos for that answer @JörgWMittag, I think it's a nice read for beginners and intermediates like me. – Ninigi Sep 09 '16 at 13:42
  • Second, to @FireLancer's question about the "private" function: maybe we should back off a little from the word private here, since private in Ruby is not as private as in other languages ;) Other than that... I don't really get the question either – Ninigi Sep 09 '16 at 13:44
  • Yeah, you're right about private, my bad. I'm used to privates not being available to subtypes (they can only be invoked directly by a method of the same type), but in Ruby that is not exactly the case – Fire Lancer Sep 09 '16 at 13:44
  • So, the thing special about "Math" is just the capitalization of the first letter? Is that the distinction at the technical level (and that classes/modules and constants are generally named with such conventions)? – Fire Lancer Sep 09 '16 at 13:46
  • Pretty much, yes. Whenever you send a message starting with a capital letter, Ruby will assume it is a constant, unless (as stated in the answer) you give Ruby a clue it's a method – Ninigi Sep 09 '16 at 13:50
  • No, it's not a convention. That's the *definition* of a constant in Ruby. A variable that starts with a lowercase letter is a local variable. A variable that starts with an uppercase letter is a constant. A variable that starts with a `$` is a global variable. A variable that starts with two `@@` is a class hierarchy variable. A variable that starts with `@` is an instance variable. Methods aren't allowed to start with `@` or `$`, so there's no problem there, but they are allowed to start with letters, and so you can get ambiguities with constants and local variables which you can resolve … – Jörg W Mittag Sep 09 '16 at 13:50
  • … by adding a receiver or an argument list. – Jörg W Mittag Sep 09 '16 at 13:51
  • @JörgWMittag ah thanks... I deleted my comment, since it was not different from what you said :) – Ninigi Sep 09 '16 at 13:53
  • Yes you are right about "CONSTANTS", I guess i just never really considered "Module" as being one, but rather its own thing. Knowing the right keyword I found the correct doc page now, I think just a deficiency from when i learned that stuff / resources i used – Fire Lancer Sep 09 '16 at 13:54
  • @Ninigi By the way, I wouldn't say "private is not as private as in other languages". It is, in fact, *more private*. In Java, for example, one object can access another object's privates if they are both of the same type. In Ruby, an object can only access its own privates, and no others. (Of course, in both languages, once you allow reflection, all of those restrictions fly out the window.) – Jörg W Mittag Sep 09 '16 at 13:55
  • I'd say it's just a different type of private, since in Ruby a subtype can access a private of the current instance, but in Java/C++ style it can't (which personally I find useful, e.g. to force the person subtyping my thing to use the setter/manipulator and not play directly with the attribute, but I can also see why not being able to do `other_instance.some_private` is good; guess you can't have everything) – Fire Lancer Sep 09 '16 at 14:01