
My other question got closed as a duplicate, so I'll try this again. I have also read this question, and what I'm asking is different. I'd like to understand how a call-by-name parameter (=> Type) differs from a function parameter (() => Type) in its internal implementation.

My confusion comes from the javap and CFR output, which shows no difference between the two cases.

e.g. ParamTest.scala:

object ParamTest {
  def bar(x: Int, y: => Int) : Int = if (x > 0) y else 10
  def baz(x: Int, f: () => Int) : Int = if (x > 0) f() else 20
}

javap output javap ParamTest.class:

public final class ParamTest {
  public static int baz(int, scala.Function0<java.lang.Object>);
  public static int bar(int, scala.Function0<java.lang.Object>);
}

CFR Decompiled output java -jar cfr_0_118.jar ParamTest$.class:

import scala.Function0;

public final class ParamTest$ {
    public static final ParamTest$ MODULE$;

    public static {
        new ParamTest$();
    }

    public int bar(int x, Function0<Object> y) {
        return x > 0 ? y.apply$mcI$sp() : 10;
    }

    public int baz(int x, Function0<Object> f) {
        return x > 0 ? f.apply$mcI$sp() : 20;
    }

    private ParamTest$() {
        MODULE$ = this;
    }
}

EDIT 1: Scala Syntax Tree: scalac -Xprint:parse ParamTest.scala

package <empty> {
  object ParamTest extends scala.AnyRef {
    def <init>() = {
      super.<init>();
      ()
    };
    def bar(x: Int, y: _root_.scala.<byname>[Int]): Int = if (x.$greater(0))
      y
    else
      10;
    def baz(x: Int, f: _root_.scala.Function0[Int]): Int = if (x.$greater(0))
      f()
    else
      20
  }
}

EDIT 2: Mailing list research:

Read this interesting post on the mailing list which essentially states that => T is implemented as () => T. Quote:

First, look at

f: => Boolean

Although this is called a "by-name parameter", it is actually implemented as a Function0,

f: () => Boolean

just with different syntax used on both ends.

Now I'm even more confused by this answer which explicitly states that the two are different.

Questions:

  • How is Scala distinguishing bar from baz? The method signatures (not implementation) for both are identical in the decompiled code.
  • Does the difference in the two scenarios not get persisted into the compiled bytecode?
  • Is the decompiled code inaccurate?
  • Added after Edit 1: I found that the scalac syntax tree does show a difference, bar has the second argument of type _root_.scala.<byname>[Int]. What does it do? Any explanation, pointers in scala source or equivalent pseudo code will be helpful.
  • See EDIT 2 above: Is the quoted block correct? As in, is => T a special subclass of Function0?
vsnyc
  • Why is it important how these types are represented in JVM bytecode? They are different in Scala code, it's what matters (unless you are trying to call Scala from Java, but it's not what you are asking). – Victor Moroz Oct 26 '16 at 00:43
  • Because Scala works that way. It compiles scala code to .class files and executes in the JVM. Hence the .class file should have necessary and sufficient information. Scala source files are not important at all. See the [linked question](http://stackoverflow.com/questions/40246137/what-does-double-right-arrow-type-with-no-lhs-mean-in-function-argument/), you can reproduce the error calling `foo(baz,100)` after deleting `ParamTest.scala` source file. – vsnyc Oct 26 '16 at 02:11
  • Another motivation for this question is [this answer](http://stackoverflow.com/a/13337382/2063026) where the most upvoted comment (160) asserts, I quote: "Furthermore, "call-by-name" has nothing to do with names. `=> Int` is a different type from `Int`; it's "function of no arguments that will generate an Int" vs just Int". But as we see above, that is not the case. call-by-name has to do with lazy evaluation, not that it is a function of no arguments ... – vsnyc Oct 26 '16 at 02:15
  • You are confusing programming language with bytecode. If you translate C++ you will get a bunch of instructions like `mov bx, ax`, which will tell you very little about original code. Call-by-name represents lazy evaluation, it's not a function in Scala terms, but bytecode won't tell you this. – Victor Moroz Oct 26 '16 at 02:30
  • I'm trying to understand how Scala is able to distinguish between `() => T` and `=> T` in the two methods in `ParamTest.class`. This was tested after the source file was deleted. I am seeing some references in the Scala source, see [this](https://github.com/scala/scala/blob/05016d9035ab9b1c866bd9f12fdd0491f1ea0cbb/src/reflect/scala/reflect/api/StandardDefinitions.scala#L140) or [this](https://github.com/scala/scala/blob/05016d9035ab9b1c866bd9f12fdd0491f1ea0cbb/src/scalap/scala/tools/scalap/scalax/rules/scalasig/ScalaSigPrinter.scala#L347), will dig more. – vsnyc Oct 26 '16 at 02:40
  • As I recently pointed out in a comment on a similar question: `javap` and `cfr` are decompilers for Java, not Scala. They don't know anything about Scala. So, *of course*, they show "nonsense" when it comes to Scala. If you want to see an even more extreme example, try compiling your code with Scala-native and then use a C decompiler on the generated machine code! – Jörg W Mittag Oct 26 '16 at 08:05
  • I wanted to learn how the .class file contains additional information not available to Java bytecode but persisted in the binary file so it can be used by scala to advise at compile time of difference between `() => T` and `=> T`. Because they seem similar to me, which I now know is truly the case. Yuval's answer has been very helpful, it led me in the right direction. See this [excellent answer](http://stackoverflow.com/a/3312036/2063026) from [VonC](http://stackoverflow.com/users/6309/vonc). Now I know the internals, but took me a very long time due to indirect comments and answers :-( – vsnyc Oct 26 '16 at 14:22

3 Answers


How is Scala distinguishing bar from baz? The method signatures (not implementation) for both are identical in the decompiled code.

Scala doesn't need to distinguish between the two. From its perspective, these are two different methods. What's interesting (to me at least) is that if we rename baz to bar and try to create an overload with a "call-by-name" parameter, we get:

Error:(12, 7) double definition:
method bar:(x: Int, f: () => Int)Int and
method bar:(x: Int, y: => Int)Int at line 10
have same type after erasure: (x: Int, f: Function0)Int
  def bar(x: Int, f: () => Int): Int = if (x > 0) f() else 20

This is a hint that, under the covers, the by-name parameter is being translated to Function0.

Does the difference in the two scenarios not get persisted into the compiled bytecode?

Before Scala emits JVM bytecode, it has additional phases of compilation. An interesting one in this case is to look at the "uncurry" stage (-Xprint:uncurry):

[[syntax trees at end of uncurry]] 
package testing {
  object ParamTest extends Object {
    def <init>(): testing.ParamTest.type = {
      ParamTest.super.<init>();
      ()
    };
    def bar(x: Int, y: () => Int): Int = if (x.>(0))
      y.apply()
    else
      10;
    def baz(x: Int, f: () => Int): Int = if (x.>(0))
      f.apply()
    else
      20
  }
}

Even before bytecode is emitted, bar's by-name parameter y is translated into a Function0 (() => Int), and the use of y becomes y.apply().
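
To make that translation concrete, here is a minimal sketch in plain Scala. The names barDesugared, viaByName and viaThunk are made up for illustration; the code only mimics by hand what the uncurry output above shows and is not the compiler's actual output.

```scala
object UncurrySketch {
  // The by-name version from the question.
  def bar(x: Int, y: => Int): Int = if (x > 0) y else 10

  // Hypothetical hand-written counterpart of the uncurry output:
  // the parameter becomes a Function0 and its use becomes an apply() call.
  def barDesugared(x: Int, y: () => Int): Int = if (x > 0) y.apply() else 10

  // The call site differs as well: with a by-name parameter the compiler
  // wraps the argument expression in a thunk; here we wrap it ourselves.
  def viaByName(x: Int): Int = bar(x, 40 + 2)
  def viaThunk(x: Int): Int = barDesugared(x, () => 40 + 2)
}
```

Both variants should behave identically for any x; the only difference is who writes the () => wrapper, you or the compiler.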

Is the decompiled code inaccurate?

No, it's definitely accurate.

I found that the scalac syntax tree does show a difference, bar has the second argument of type _root_.scala.<byname>[Int]. What does it do?

Scala compilation is done in phases, where each phase's output is the input to the next. In addition to the parsed AST, Scala phases also create symbols, so that if one stage relies on a particular implementation detail, it will have it available. <byname> is a compiler symbol which shows that this method uses "call-by-name", so that one of the phases can see that and do something about it.

Yuval Itzchakov

Because Scala works that way. It compiles scala code to .class files and executes in the JVM. Hence the .class file should have necessary and sufficient information.

It does. This information is stored in an annotation called @ScalaSignature. javap -v should show its presence, but it isn't human-readable.

This is necessary because there is a lot of information in Scala signatures which can't be represented in JVM bytecode: not just by-name vs Function0 parameters, but access qualifiers, parameter names, etc.
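
If you would rather check for the annotation from code than with javap, something like the following may work. This is a sketch assuming a Scala 2 runtime where scala.reflect.ScalaSignature is on the classpath; SignatureProbe and hasScalaSignature are made-up names. Which class file actually carries the annotation (ParamTest or ParamTest$) depends on the compiler, so verify that for your own version rather than taking it from this sketch.

```scala
import scala.reflect.ScalaSignature

object SignatureProbe {
  // True if the class file behind `cls` carries the @ScalaSignature
  // annotation, looked up through ordinary Java reflection.
  def hasScalaSignature(cls: Class[_]): Boolean =
    cls.isAnnotationPresent(classOf[ScalaSignature])
}
```

For example, SignatureProbe.hasScalaSignature(classOf[String]) is false, since java.lang.String was not compiled by scalac.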

Alexey Romanov
  • Thank you for confirming. Yuval's answer has been very helpful, it led me in the right direction. I am now reading more e.g. [this answer](http://stackoverflow.com/a/3312036/2063026) from [VonC](http://stackoverflow.com/users/6309/vonc). – vsnyc Oct 26 '16 at 14:24
  • Note that it applies directly only to quite ancient Scala versions: the `ScalaSig` attribute was replaced by `@ScalaSignature` in Scala 2.8. – Alexey Romanov Oct 26 '16 at 14:27
  • Thanks, I'll make a note of that. – vsnyc Oct 26 '16 at 14:28

Scala code is analyzed by the compiler and turned into JVM bytecode. At the Scala level you have implicits, a very strong type system, call-by-name parameters and other features like these. In bytecode it is all gone: no curried parameters, no implicits, just plain methods. The runtime doesn't need to distinguish () => A from => A; it just executes the bytecode. All the checks, verifications and errors you get come from the compiler analyzing Scala code, not from the bytecode. During compilation a by-name parameter is simply replaced with Function0, and every use of such a parameter gets an apply call. This doesn't happen in the parse phase but later on, which is why you see <byname> in the compiler output. Try looking at later phases.
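
The observable effect of that replacement can be sketched by counting how often the argument expression runs. LazyArgumentSketch, twice and the two evaluations... helpers are hypothetical names used only for this illustration.

```scala
object LazyArgumentSketch {
  def bar(x: Int, y: => Int): Int = if (x > 0) y else 10

  // Uses its by-name parameter twice, so the argument is evaluated twice.
  def twice(y: => Int): Int = y + y

  // Counts evaluations of the argument when bar never touches y (x <= 0).
  def evaluationsWhenUnused(): Int = {
    var count = 0
    bar(0, { count += 1; 1 })
    count
  }

  // Counts evaluations of the argument when it is referenced twice.
  def evaluationsWhenUsedTwice(): Int = {
    var count = 0
    twice { count += 1; 1 }
    count
  }
}
```

The first helper returns 0 and the second returns 2, which is what you would expect if every use of the parameter is an apply call on a Function0.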

Łukasz