I'm looking for some insight into Scala internals. We've just come out the other side of a painful debug session, and found out our problem was caused by an unexpected null value that we had thought would be pre-initialised. We can't fathom why that would be the case.

Here is an extremely cut-down example of the code which illustrates the problem (if it looks convoluted, that's because the real code is much more complicated, but I've left the basic structure alone in case it's significant).

trait A {
  println("in A")

  def usefulMethod

  def overrideThisMethod = {
    //defaultImplementation
  }

  val stubbableFunction = {
    //do some stuff

    val stubbableMethod = overrideThisMethod

    //do some other stuff with stubbableMethod
  }
}

class B extends A {
  println("in B")

  def usefulMethod = {
    //do something with stubbableFunction
  }
}

class StubB extends B {
  println("in StubB")

  var usefulVar = "super useful"   //<<---this is the var that ends up being null

  override def overrideThisMethod {
    println("usefulVar = " + usefulVar)
  }
}

If we kick off the chain of initialisation, this is what is printed to the console:

scala> val stub = new StubB
in A
usefulVar = null
in B
in StubB

My assumptions

I assume that in order to instantiate StubB, first we instantiate trait A, then B and finally StubB: hence the printing order of ("in A", "in B", "in StubB"). I assume stubbableFunction in trait A is evaluated on initialisation because it's a val, and the same goes for stubbableMethod.

From here on is where I get confused.

My question

When overrideThisMethod is called in trait A, I would expect the classloader to follow the chain downwards to StubB (which it does, you can tell because "usefulVar = null" is printed) but... why is the value null here? How can overrideThisMethod in StubB be evaluated without first initialising the StubB class and therefore setting usefulVar? I didn't know you could have "orphaned" methods being evaluated this way - surely methods have to belong to a class which has to be initialised before you can call the method?

We actually solved the problem by changing the val stubbableFunction = to def stubbableFunction = in trait A, but we'd still really like to understand what was going on here. I'm looking forward to learning something interesting about how Scala (or maybe Java) works under the hood :)

Edit: I changed usefulVar from a val to a var and the same thing happens. Question updated for clarity in response to m-z's answer.

moncheery

1 Answer

I stripped down the original code even more leaving the original behavior intact. I also renamed some methods and vals to express the semantics better (mostly function vs value):

trait A {
  println("in A")

  def overridableComputation = {
    println("A::overridableComputation")
    1
  }

  val stubbableValue = overridableComputation

  def stubbableMethod = overridableComputation
}

class StubB extends A {
  println("in StubB")

  val usefulVal = "super useful" //<<---this is the val that ends up being null

  override def overridableComputation = {
    println("StubB::overridableComputation")
    println("usefulVal = " + usefulVal)
    2
  }
}

When run it yields the following output:

in A
StubB::overridableComputation
usefulVal = null
in StubB
super useful

Here are some Scala implementation details to help us understand what is happening:

  1. the primary constructor is intertwined with the class definition, i.e. most of the code between the curly braces (everything except method definitions) is put into the constructor;
  2. each val of the class is implemented as a private field plus a getter method, and both the field and the method are named after the val (the JavaBean convention is not followed);
  3. the value for the val is computed within the constructor and is used to initialize the field.
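
The second point can be checked with plain Java reflection. This is only a minimal sketch: `Holder`, `greeting` and `ValLayout` are hypothetical names made up for the illustration and are not part of the question's code.

```scala
// Hypothetical class with a single val, used only to inspect what the compiler emits.
class Holder {
  val greeting: String = "hello"
}

object ValLayout {
  private val cls = classOf[Holder]

  // Names of the fields declared directly on Holder.
  def fieldNames: List[String] = cls.getDeclaredFields.toList.map(_.getName)

  // Names of the methods declared directly on Holder.
  def methodNames: List[String] = cls.getDeclaredMethods.toList.map(_.getName)
}
```

Both lists should contain the name `greeting`: once as the backing field and once as the getter that every read of the val goes through.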

As m-z already noted, initialization runs top down, i.e. the constructor of the parent class or trait is called first and the child's constructor is called last. So here's what happens when you call new StubB():

  1. A StubB object is allocated in heap, all its fields are set to default values depending on their types (0, 0.0, null, etc);
  2. A::A is invoked first as the top-most constructor;
    1. "in A" is printed;
    2. to compute the value for stubbableValue, overridableComputation is called. The catch is that the overriding method, StubB::overridableComputation, is the one that actually runs (see "What's wrong with overridable method calls in constructors?" for more details);
      1. "StubB::overridableComputation" is printed;
      2. since usefulVal has not yet been initialized by StubB::StubB, its default value is used, so "usefulVal = null" is printed;
      3. 2 is returned;
    3. stubbableValue is initialized with the computed value of 2;
  3. StubB::StubB is invoked as the next constructor in chain;
    1. "in StubB" is printed;
    2. the value for usefulVal is computed, in this case just the literal "super useful" is used;
    3. usefulVal is initialized with the value of "super useful".
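
The sequence above can be reproduced without reading console output. The following is a minimal sketch, not the original code: `InitLog`, `Base`, `Stub` and the extra `count` field are hypothetical names introduced for the illustration, and events are appended to a buffer instead of being printed.

```scala
import scala.collection.mutable.ListBuffer

// Hypothetical helper: collects initialization events in order.
object InitLog {
  val events: ListBuffer[String] = ListBuffer.empty[String]
}

trait Base {
  InitLog.events += "in Base"

  def overridableComputation: Int = 1

  // Evaluated in Base's initializer, before the body of any subclass constructor runs.
  val stubbableValue: Int = overridableComputation
}

class Stub extends Base {
  InitLog.events += "in Stub"

  val usefulVal: String = "super useful"
  val count: Int = 42

  // Called from Base's initializer, so the fields above still hold their JVM defaults.
  override def overridableComputation: Int = {
    InitLog.events += s"usefulVal = $usefulVal, count = $count"
    2
  }
}
```

After `new Stub`, `InitLog.events` holds "in Base", "usefulVal = null, count = 0" and "in Stub", in that order. The reference-typed field is seen as null and the Int field as 0, while `stubbableValue` ends up as 2 because the overriding method did run.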

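Since the value for stubbableValue is computed while A's constructor runs, the overriding method observes usefulVal before StubB's constructor has assigned it. A def such as stubbableMethod is only evaluated when it is called, which is why changing the val to a def makes the problem go away.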

To check these assumptions, the Fernflower Java decompiler can be used. Here's how the above Scala code looks when decompiled to Java, with A declared as a class (I removed irrelevant @ScalaSignature annotations):

import scala.collection.mutable.StringBuilder;

public class A {
   private final int stubbableValue;

   public int overridableComputation() {
      .MODULE$.println("A::overridableComputation");
      return 1;
   }

   public int stubbableValue() {
      return this.stubbableValue;
   }

   public int stubbableMethod() {
      return this.overridableComputation();
   }

   public A() {
      .MODULE$.println("in A");
      // Note, that overridden method is called below!
      this.stubbableValue = this.overridableComputation();
   }
}

public class StubB extends A {
   private final String usefulVal;

   public String usefulVal() {
      return this.usefulVal;
   }

   public int overridableComputation() {
      .MODULE$.println("StubB::overridableComputation");
      .MODULE$.println(
        (new StringBuilder()).append("usefulVal = ")
                             .append(this.usefulVal())
                             .toString()
      );
      return 2;
   }

   public StubB() {
      .MODULE$.println("in StubB");
      this.usefulVal = "super useful";
   }
}

If A is a trait instead of a class, the code is a bit more verbose, but the behavior is consistent with the class A variant. Since the JVM doesn't support multiple inheritance, the Scala compiler splits a trait into an interface and an abstract helper class which contains only static members:

import scala.collection.mutable.StringBuilder;

public abstract class A$class {
   public static int overridableComputation(A $this) {
      .MODULE$.println("A::overridableComputation");
      return 1;
   }

   public static int stubbableMethod(A $this) {
      return $this.overridableComputation();
   }

   public static void $init$(A $this) {
      .MODULE$.println("in A");
      $this.so32501595$A$_setter_$stubbableValue_$eq($this.overridableComputation());
   }
}

public interface A {
   void so32501595$A$_setter_$stubbableValue_$eq(int var1);

   int overridableComputation();

   int stubbableValue();

   int stubbableMethod();
}

public class StubB implements A {
   private final String usefulVal;
   private final int stubbableValue;

   public int stubbableValue() {
      return this.stubbableValue;
   }

   public void so32501595$A$_setter_$stubbableValue_$eq(int x$1) {
      this.stubbableValue = x$1;
   }

   public String usefulVal() {
      return this.usefulVal;
   }

   public int overridableComputation() {
      .MODULE$.println("StubB::overridableComputation");
      .MODULE$.println(
        (new StringBuilder()).append("usefulVal = ")
                             .append(this.usefulVal())
                             .toString()
      );
      return 2;
   }

   public StubB() {
      A$class.$init$(this);
      .MODULE$.println("in StubB");
      this.usefulVal = "super useful";
   }
}

Remember that a val is rendered into a field and a method? Since several traits can be mixed into a single class, a trait cannot be implemented as a class. Therefore, the method part of a val is put into an interface, while the field part is put into the class that the trait gets mixed into.

The abstract class contains the code of all the trait's methods. Access to the member fields is provided by passing $this explicitly.
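
For completeness, here are two ways to avoid reading the field before it is assigned. This is a sketch under assumptions: all names (`DeferredBase`, `DeferredStub`, `EagerBase`, `LazyFieldStub`) are hypothetical, and the overriding method returns the string length so that a premature call would fail loudly.

```scala
// Variant 1: defer the computation in the parent (the fix from the question uses def).
trait DeferredBase {
  def overridableComputation: Int = 1

  // A def is evaluated on every call, so nothing runs during construction.
  def stubbableMethod: Int = overridableComputation

  // A lazy val is evaluated on first access, which here happens after construction.
  lazy val stubbableValue: Int = overridableComputation
}

class DeferredStub extends DeferredBase {
  val usefulVal: String = "super useful"
  override def overridableComputation: Int = usefulVal.length
}

// Variant 2: keep the eager val in the parent, make the field in the subclass lazy.
trait EagerBase {
  def overridableComputation: Int = 1
  val stubbableValue: Int = overridableComputation
}

class LazyFieldStub extends EagerBase {
  // Initialized on first access, even if that access happens inside EagerBase's initializer.
  lazy val usefulVal: String = "super useful"
  override def overridableComputation: Int = usefulVal.length
}
```

In both variants the length of "super useful" (12) is returned instead of a NullPointerException being thrown. Which one fits depends on whether the parent or the subclass is easier to change. In either case the value is computed on demand rather than at a fixed point during construction.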

Ihor Kaharlichenko
  • Wow Ihor, it's going to take me a little while to digest this. Just a comment to say that whether usefulVal is a val or a var, I see the same behaviour. I don't know whether this changes any of your answer - I mention it because you added this answer while I was making the edit. Thanks, I'll read your full reply once I get a cup of tea :) – moncheery Sep 10 '15 at 15:04
  • `var` vs `val` doesn't really change anything. The only thing that changes is that for `var x` a setter named `x_$eq` is generated, but this doesn't affect anything in your case. – Ihor Kaharlichenko Sep 10 '15 at 15:06
  • Awesome. Perfect answer. I knew it would be something interesting. Thank you Ihor – moncheery Sep 10 '15 at 15:48