27

Below is the source for the reduceLeft method of Scala's TraversableOnce trait. What is happening with the line that reads var acc: B = 0.asInstanceOf[B]?

To me, it seems that if I call this on a list of Strings, such as List("a", "b", "c"), this would result in something like 0.asInstanceOf[String]. However, 0.asInstanceOf[String] throws a ClassCastException at run time if I try it directly.

What is happening with that line, and why is it different than calling 0.asInstanceOf[String] directly when called on a list of Strings?

def reduceLeft[B >: A](op: (B, A) => B): B = {
  if (isEmpty)
    throw new UnsupportedOperationException("empty.reduceLeft")

  var first = true
  var acc: B = 0.asInstanceOf[B]

  for (x <- self) {
    if (first) {
      acc = x
      first = false
    }
    else acc = op(acc, x)
  }
  acc
}

Bonus question: why is acc even initialized to that value? It looks like the code in the first iteration of the for loop will just overwrite that value with the first element in the TraversableOnce object.

ceedubs
  • 412
  • 3
  • 7

2 Answers2

15

Well, the reason this is not failing with a ClassCastException at runtime is either because:

  1. The compiler removes the initialization reasoning that it is never used (I doubt this)
  2. B is erased to Object (or AnyRef), hence the cast is really 0.asInstanceOf[Object]

You could test the first possibility by checking the bytecode but it seems a pointless piece of code. If it is not elided, then this has the unnecessary overhead of boxing an Int on every call to the method (although not an object creation - see below)!

Figuring out what the compiler produces: 1

Welcome to Scala version 2.9.1.final (Java HotSpot(TM) 64-Bit Server VM, Java 1.6.0_29).
Type in expressions to have them evaluated.
Type :help for more information.

scala> trait X[A] {
 | def foreach[U](f: A => U): U
 | def isEmpty = {
 | var b = false
 | foreach(_ => b = true)
 | !b
 | }
 | def reduceLeft[B >: A](op: (B, A) => B): B = {
 | if (isEmpty) sys.error("Bad")
 | var first = true
 | var acc: B = 0.asInstanceOf[B]
 | for (x <- this) {
 | if (first) { acc = x; first = false } else acc = op(acc, x)
 | }
 | acc
 | }
 | }
defined trait X

So now:

scala> :javap -v X
Compiled from "<console>"
public interface X extends scala.ScalaObject
  SourceFile: "<console>"
  Scala: length = 0x

  Signature: length = 0x2
   00 0D 
  InnerClass: 
   public abstract #19= #16 of #18; //X=class X of class 
   public final #21; //class X$$anonfun$isEmpty$1
   public final #23; //class X$$anonfun$reduceLeft$1
  minor version: 0
  major version: 49
  Constant pool:
const #1 = Asciz    SourceFile;
const #2 = Asciz    <console>;
const #3 = Asciz    foreach;
const #4 = Asciz    (Lscala/Function1;)Ljava/lang/Object;;
const #5 = Asciz    <U:Ljava/lang/Object;>(Lscala/Function1<TA;TU;>;)TU;;
const #6 = Asciz    Signature;
const #7 = Asciz    isEmpty;
const #8 = Asciz    ()Z;
const #9 = Asciz    reduceLeft;
const #10 = Asciz   (Lscala/Function2;)Ljava/lang/Object;;
const #11 = Asciz   <B:Ljava/lang/Object;>(Lscala/Function2<TB;TA;TB;>;)TB;;
const #12 = Asciz   Scala;
const #13 = Asciz   <A:Ljava/lang/Object;>Ljava/lang/Object;Lscala/ScalaObject;;
const #14 = Asciz   InnerClasses;
const #15 = Asciz   X;
const #16 = class   #15;    //  X
const #17 = Asciz   ;
const #18 = class   #17;    //  
const #19 = Asciz   X;
const #20 = Asciz   X$$anonfun$isEmpty$1;
const #21 = class   #20;    //  X$$anonfun$isEmpty$1
const #22 = Asciz   X$$anonfun$reduceLeft$1;
const #23 = class   #22;    //  X$$anonfun$reduceLeft$1
const #24 = Asciz   java/lang/Object;
const #25 = class   #24;    //  java/lang/Object
const #26 = Asciz   scala/ScalaObject;
const #27 = class   #26;    //  scala/ScalaObject

{
public abstract java.lang.Object foreach(scala.Function1);
  Signature: length = 0x2
   00 05 

public abstract boolean isEmpty();

public abstract java.lang.Object reduceLeft(scala.Function2);
  Signature: length = 0x2
   00 0B 

}

Make of that what you will!

Figuring out what the compiler produces: 2

One other possibility is to put this in a source file and compile it, printing out the intermediate code phase:

./scalac -Xprint:icode X.scala

then you get...

def reduceLeft($this: X, op$1: Function2): java.lang.Object = {
  if ($this.isEmpty())
    scala.sys.`package`.error("Bad")
  else
    ();
  var first$1: scala.runtime.BooleanRef = new scala.runtime.BooleanRef(true);
  var acc$1: scala.runtime.ObjectRef = new scala.runtime.ObjectRef(scala.Int.box(0));
  $this.foreach({
    (new anonymous class X$$anonfun$reduceLeft$1($this, op$1, first$1, acc$1): Function1)
  });
  acc.elem
};

It looks distinctly like the unnecessary boxing is in there! One point is that this won't involve an object creation, merely a lookup, because the boxed values of -127 to 127 are cached.

You can check the erased line by changing the compiler phase in the above print command to "erasure". Hey presto:

var acc: java.lang.Object = scala.Int.box(0).$asInstanceOf[java.lang.Object]();
oxbow_lakes
  • 133,303
  • 56
  • 317
  • 449
  • Did you check that the compiler remove the initialization? `-Xprint:typer` may suggest it is initialized with `scala.Int.box(0)`. But then I don't know what happens at byte code generation. – huynhjl Dec 11 '11 at 16:35
  • Indeed - I haven't checked yet. I suspect that the line is not elided – oxbow_lakes Dec 11 '11 at 16:36
  • I can't claim that the byte code enlightened me, but the `./scalac -Xprint:icode X.scala` output was very helpful, along with your confirmation that it showed unnecessary boxing. Thanks; I'll remember that compiler switch when I run into future questions! – ceedubs Dec 11 '11 at 17:00
  • 1
    The 3 useful phases I find to look at are `icode`, `typer` and `erasure` – oxbow_lakes Dec 11 '11 at 17:04
  • 1
    I am familiar enough with type erasure in Java that I'm a little ashamed that I didn't think about the fact that B would be type erased. I think I overlooked this because I was in the mentality that it had to be doing _something_ useful. Pride is a small price to pay for the useful tips. Thanks for teaching me to fish instead of just giving me a fish. – ceedubs Dec 11 '11 at 19:15
5

The answer to the bonus question can be deduced by attempting to propose an alternative. The accumulator is of type B. What B will you assign to it? (There is a better answer, namely null.asInstanceOf[B] which is what is normally done, but I assume that would leave your question in place.)

psp
  • 12,138
  • 1
  • 41
  • 51
  • +1 for answering bonus question and providing better alternative. In response to "but I assume that would leave your question in place", I actually never would have asked this question if the code were `null.asInstanceOf[B]`. What initially threw me off was thinking that somehow 0 was being cast to any arbitrary type – ceedubs Dec 11 '11 at 19:31
  • Only by this implementation - the method could have simply been implemented as `(self.foldLeft(None: Option[B])(op)) getOrElse throw new UOE()` – oxbow_lakes Dec 11 '11 at 19:49
  • ...and I wonder what analysis was done to decide that the current implementation is more performant (and how much more performant)? – oxbow_lakes Dec 11 '11 at 19:53
  • I meant of course: `(self.foldLeft(None: Option[B]) { (ob, a) => ob map (b => op(b, a)) orElse Some(a: B) } getOrElse throw new UOE`. I guess in the non-empty case this involves rather a lot of object creation. – oxbow_lakes Dec 11 '11 at 20:21
  • 1
    If we were going to provide an elegant but poorly performing implementation for reduceLeft (Options, really?) it'd make more sense to just skip it and let everyone write in terms of folds. The collections have these weird-looking implementations so you don't have to. – psp Dec 11 '11 at 21:04
  • I love the "Options, really" comment, Paul. It is a logical implementation which I would always use myself, unless it was proven that it was suboptimal in practice (I prefer readability over performance, which is why I'm using scala in the first place). I'd be interested to see what analysis was done as to the performance overhead. – oxbow_lakes Dec 12 '11 at 00:03