What Scala statements or code can produce bytecode that cannot be translated to Java?

Question

I have read an answer to a question about converting Scala code to Java code. It says:

I don't think it's possible to convert from scala back to standard java since Scala does some pretty low-level byte-code manipulation. I'm 90% sure they do some things that can't exactly be translated back into normal Java code.

So what Scala statements or code can produce bytecode that cannot be translated to Java?

P.S. I generally agree with that answer, but want a concrete example for learning purposes.

It also depends a lot on how hard you try to do the conversion. — Antimony, Jul 04 '14 at 04:23
Maybe you should reformulate your question as something like "Given a Java compiler, does a terminating function exist that, for a class's bytecode, produces Java code that the compiler compiles to the same bytecode?" — SpaceTrucker, Jul 04 '14 at 09:42
Ignoring checked exceptions is one thing you can’t convert to simple Java source code… — Holger, Jul 04 '14 at 14:02
See [this question](http://stackoverflow.com/questions/24061672/verifyerror-uninitialized-object-exists-on-backward-branch-jvm-spec-4-10-2-4) which shows a bug in the JVM due to scala generating bytecode that Java can't. This [paper](http://www.dcs.gla.ac.uk/~jsinger/pdfs/pppj13.pdf) covers bytecode n-grams used by different JVM languages. _58.5% of 4-grams executed by Scala are not found in bytecode executed by Java_ — ggovan, Jul 04 '14 at 15:37

score 8 · Answer 1 · answered Jul 04 '14 at 04:39

The answer really depends on how hard you want to try to convert the code.

Since Java and Scala are both Turing complete, any program in one can trivially be converted to the other, but this isn't really interesting or useful.

What you really want is to convert the results to readable, idiomatic code. From this perspective, even Java code can't automatically be converted to Java because compilation loses information (though relatively little compared to C) and machines aren't as good as humans at writing human readable code anyway.
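
One familiar case of information that does not survive to run time is the type argument of a generic local variable, which javac erases. The following is a minimal sketch of that effect, not a statement about what any particular decompiler can recover; the class name `ErasureDemo` is a hypothetical name chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class ErasureDemo {
    // Returns true when two differently parameterized lists share one runtime class,
    // which is what type erasure predicts.
    static boolean sameRuntimeClass() {
        List<String> strings = new ArrayList<>();
        List<Integer> numbers = new ArrayList<>();
        return strings.getClass() == numbers.getClass();
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass());
    }
}
```

A tool working only from the running objects cannot tell which of the two declarations the source contained, so any reconstructed source has to guess or fall back to raw types.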

If you hired an expert in both Java and Scala, they could probably rewrite your Scala codebase in Java and end up with reasonably idiomatic Java code. But it wouldn't be as readable as the Scala, for the simple reason that Scala is a language designed to improve on Java. Scala tries to remove the warts from Java and provide powerful high-level programming features, removing the need for all the classic Java boilerplate. So the equivalent Java codebase will not be as readable.

From this perspective, the answer is "any feature in Scala that is not in Java".

You should say it "*probably* would not be as readable as Java". There is sometimes a price to pay for being terse, and that can often hurt readability. — Adam Gent, Jul 04 '14 at 13:54

score 6 · Answer 2 · edited May 23 '17 at 12:09

Scala's nested blocks do not have a Java equivalent.

Nested block in Scala (taken from this question):

def apply(x: Boolean) = new Tuple2(null, {
  while (x) { }
  null
})

Produces the bytecode

 0: new           #12                 // class scala/Tuple2
 3: dup           
 4: aconst_null   
 5: iload_1       
 6: ifne          5
 9: aconst_null   
10: invokespecial #16                 // Method scala/Tuple2."<init>":(Ljava/lang/Object;Ljava/lang/Object;)V
13: areturn

At instruction 0 an uninitialised object is pushed onto the stack, and it is then initialised at instruction 10. Between these two points there is a backwards jump from 6 to 5. This actually reveals a bug in the OpenJDK bytecode verifier, which rejects this code even though it is acceptable under the JVM specification. The bug probably got through testing because this bytecode can't be generated from Java.

Because nested blocks in Java are not expressions that evaluate to a value, the closest Java equivalent would be

public Tuple2 apply(boolean x){
  while(x){}
  return new Tuple2(null,null);
}

Which would compile to something akin to

 0: iload_1
 1: ifne          0
 4: new           #12                 // class scala/Tuple2
 7: dup
 8: aconst_null
 9: dup
10: invokespecial #16                 // Method scala/Tuple2."<init>":(Ljava/lang/Object;Ljava/lang/Object;)V
13: areturn

Note that this doesn't have the uninitialised object on the stack at the time of the backwards jump. (N.B. bytecode was written by hand, do not execute!)
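
For comparison, here is a hedged sketch of a Java variant that keeps Scala's evaluation order (allocate first, then run the loop) by moving the loop into a helper method. `Pair` is a hypothetical stand-in for `scala.Tuple2`, and `loopThenNull` is an invented helper name. javac emits `new` and `dup` before evaluating constructor arguments, so the uninitialised object is still on the caller's stack during the call, but the backward branch now sits in the helper's own frame rather than next to it.

```java
final class NestedBlockDemo {
    // Hypothetical stand-in for scala.Tuple2 so the sketch is self-contained.
    static final class Pair {
        final Object first;
        final Object second;

        Pair(Object first, Object second) {
            this.first = first;
            this.second = second;
        }
    }

    // The loop lives in its own method, so its backward branch belongs to a separate frame.
    static Object loopThenNull(boolean x) {
        while (x) { }
        return null;
    }

    // Allocation happens first, then the second argument is evaluated via the helper.
    static Pair apply(boolean x) {
        return new Pair(null, loopThenNull(x));
    }

    public static void main(String[] args) {
        Pair p = apply(false);
        System.out.println(p.first + ", " + p.second);
    }
}
```

Calling `apply(true)` never returns, exactly like the original; only `apply(false)` is safe to run.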

This paper from Li, White, and Singer shows differences between JVM languages, including the bytecode that they compile to. In an N-gram analysis of bytecodes, it finds that 58.5% of 4-grams executed by Scala are not found in bytecode executed by Java. This is not to say that Java can't produce these bytecodes, but that they weren't present in the Java corpus.

Oracle seems to treat the verifier spec as only a general guideline. There are numerous places where it differs from the stated behavior, but then again the spec is ambiguous and incomplete, and I doubt there are any implementations out there that strictly conform anyway. When in doubt, you need to check the source code. — Antimony, Jul 04 '14 at 17:42

score 2 · Answer 3 · edited May 23 '17 at 12:24

As you noted, Scala eventually compiles to JVM bytecode. An obvious instruction from the JVM instruction set, that has no equivalent in the Java language, is goto.

A Scala compiler might use goto for instance to optimize loops or tail-recursive methods. In this case, in Java you would have to emulate the behavior of a goto.
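
As a rough illustration of the tail-recursion case, the sketch below shows a self-recursive method next to the loop a compiler could turn it into by rebinding the parameters and jumping back to the start. It is hand-written Java, not scalac output, and the names `TailCallDemo`, `sumTo` and `sumToLoop` are hypothetical.

```java
final class TailCallDemo {
    // Recursive form: the self-call is the last action, i.e. a tail call.
    // Java does not eliminate it, so deep inputs can overflow the stack.
    static long sumTo(long n, long acc) {
        if (n == 0) {
            return acc;
        }
        return sumTo(n - 1, acc + n);
    }

    // Loop form: updating the "parameters" and branching back replaces the call.
    static long sumToLoop(long n, long acc) {
        while (n != 0) {
            acc += n;
            n -= 1;
        }
        return acc;
    }

    public static void main(String[] args) {
        System.out.println(sumTo(1000, 0));
        System.out.println(sumToLoop(1_000_000, 0));
    }
}
```

In this simple shape the jump maps directly onto a `while` loop, so no emulation is needed; the harder cases are jumps that do not follow a structured pattern.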

As Antimony hinted, a Turing complete language can at least emulate another Turing complete language. However the resulting program may be heavyweight and suboptimal.

As a final note, decompilers may help. I'm not familiar with the internals of decompilers, but I assume that they rely a lot on patterns. I mean, for example, the Java source pattern f(x) compiles to the bytecode pattern f'(x), so with a lot of hard work and experience, some manage to decompile bytecode f'(y) to Java source f(y).

However, I've not heard of Scala decompilers yet (maybe someone's working on that).

[EDIT] About what I originally meant by emulating the behavior of a goto:

I had in mind switch/case statements inside a loop, and cdshines showed another way by using labeled break/continue in a loop (though I believe that using "disregarded and condemned" features is not standard).

In either of these cases, in order to jump back to an earlier instruction, an idiomatic Java loop (for/while/do-while) is required (any other suggestion?). An endless loop makes it easy to implement, a conditional loop would require more work, but I assume this is doable.
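
A minimal sketch of the switch-inside-a-loop idea, assuming a tiny three-label program (Euclid's algorithm written with jumps). The `label` variable plays the role of a program counter; the class and method names are hypothetical.

```java
final class GotoEmulationDemo {
    // Emulates:
    //   L0: if (b == 0) goto L2
    //   L1: t = a % b; a = b; b = t; goto L0
    //   L2: return a
    static int gcd(int a, int b) {
        int label = 0; // which label runs next
        while (true) {
            switch (label) {
                case 0: // L0: conditional forward jump to L2, otherwise continue at L1
                    label = (b == 0) ? 2 : 1;
                    break;
                case 1: { // L1: one Euclid step, then a backward jump to L0
                    int t = a % b;
                    a = b;
                    b = t;
                    label = 0;
                    break;
                }
                case 2: // L2: exit
                    return a;
                default:
                    throw new IllegalStateException("unknown label " + label);
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(gcd(48, 18));
    }
}
```

Every jump becomes an assignment to `label` plus one trip around the loop, which works for arbitrary control flow but reads nothing like the code a person would write.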

Also, goto isn't limited to loops. In order to jump forward, Java would require other constructs.
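
One such construct is a labeled block with `break`, sketched below under the assumption that a single forward jump is all that is needed. The names `ForwardJumpDemo`, `classify` and the label `skip` are hypothetical.

```java
final class ForwardJumpDemo {
    static String classify(int n) {
        String result = "non-negative";
        skip: {
            if (n >= 0) {
                // Forward jump: control continues just past the closing brace of the labeled block.
                break skip;
            }
            result = "negative";
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(classify(5));
        System.out.println(classify(-1));
    }
}
```

Note that `break skip` ends the labeled statement rather than transferring control to the label itself, so it can only jump forward, out of an enclosing block.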

A counterexample: in C, there are limitations, but you don't have to go to such great lengths, because there's a goto statement.

As a related topic, if you're interested in non-idiomatic jumps in Scala, see this old Q&A of mine. My point is that not only might a Scala compiler emit goto in a way that's not natural in Java, but a developer can also have tight control over that with the help of Scala macros.

LabelDef: A labelled expression. Not expressible in language syntax, but generated by the compiler to simulate while/do-while loops, and also by the pattern matcher. In my past tests, it could be used for forward jumps as well. In Scala Internals, developers wrote about removing LabelDef, but I don't know if and when they will.

Therefore, yes you can reproduce the behavior of goto in Java, but because of the complexity involved in so doing, that is not what I would call standard Java, IMHO. Maybe my wording is incorrect, but in my mind the reproduction of an elementary behavior by complex means is an "emulation" of that behavior.

Cheers.

Java uses the `goto` bytecode instruction as well for implementing its control structures. And, after collapsing branch targets during compilation you will have a hard time reconstructing valid control structure for plain Java as well. But in the end, you can transform every valid byte code to a valid Java source code construct. — Holger, Jul 04 '14 at 14:01
Of course Java compiles to patterns that use goto, since the instruction set was designed to support the language. However there is no one to one transformation from one JVM instruction to one Java (or Scala) construct. Some instructions, yes. Some others, no. Goto is one of them. Unless you call "transform to a valid Java source code construct" what I call "emulation". — eruve, Jul 04 '14 at 14:19
It’s not emulation as for valid byte code it’s always possible to find a combination of loops and conditionals with the same behavior, however it might turn out to have not the slightest in common with the original source code… — Holger, Jul 04 '14 at 14:24
@Holger Not really. There are some things you can do in bytecode that are impossible to do in Java, short of emulating the entire JVM. — Antimony, Jul 04 '14 at 17:35
`break` and `continue`, though widely disregarded and condemned when used in conjunction with labels, are Java's GOTO, despite the difference in names: http://pastebin.com/rBnCeJJw — tkroman, Jul 04 '14 at 22:32
@cdshines Am I too tired or you just discovered a bug in your java compiler? i.e. the `lol:` label in your java source code seems to be ignored: `22: goto 25` -> `25: return` — eruve, Jul 05 '14 at 06:17
NOTE: Got it: _The break statement terminates the labeled statement; it does not transfer the flow of control to the label_. That's confusing (to me at least, as I didn't remember that) — eruve, Jul 05 '14 at 06:38
@Antimony: you are right, there *are* such features and that’s what the question was about. But `goto` is not such a feature. — Holger, Jul 28 '14 at 08:48

om-nom-nom · Answer 4 · 2014-07-04T07:28:39.150

It really depends on how you define "what Scala statements or code can produce bytecode which cannot be translated to Java".

Ultimately, some Scala features are backed by the so-called ScalaSignature (Scala signature), which stores meta-information. As of 2.10, it may be regarded as a private API that is abstracted by the Scala reflection mechanisms (which are radically different from Java reflection). The documentation is scarce, but you can check out this pdf to get the details (there could have been major changes since then). There is no way to produce identical structures in plain Java, unless you fall back on bytecode manipulation tools.

In a more relaxed sense, there are macros and implicits, which interact solely with scalac and have no direct analog in Java. Yes, you can write Java code identical to the result produced by scalac, but you can't write the dynamic instructions that direct the compiler.
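
To make the implicit case concrete: an implicit parameter ends up as an ordinary parameter in the compiled method, so the Java counterpart has to receive and pass it by hand. The sketch below assumes a Scala signature along the lines of `def max[T](xs: List[T])(implicit ord: Ordering[T])`; the Java names are hypothetical, and `Comparator` merely stands in for `Ordering`.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class ExplicitEvidenceDemo {
    // The comparator that Scala would resolve implicitly is a plain argument here.
    // Assumes a non-empty list.
    static <T> T max(List<T> xs, Comparator<? super T> ord) {
        T best = xs.get(0);
        for (T x : xs) {
            if (ord.compare(x, best) > 0) {
                best = x;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // The caller must name the ordering itself; no compiler search supplies it.
        System.out.println(max(Arrays.asList(3, 9, 4), Comparator.<Integer>naturalOrder()));
    }
}
```

The resulting behaviour is the same, but the lookup that selects the argument happens in the programmer's head instead of in the compiler.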

score 1 · Answer 5 · edited May 23 '17 at 12:24

I happen to work with a lot of byte code, and I once wrote a summary of byte code features that are not reproducible by writing Java code. However, all of these features are conventions for composing byte code instructions rather than individual instructions. As of Java 8, every existing opcode is used by class files compiled from Java. This is not too surprising, as the Java language more or less drives the evolution of the Java byte code format. An exception might have been the INVOKEDYNAMIC instruction, which was introduced to better support dynamic languages on the JVM, but even this instruction is used in Java 8 for implementing lambda expressions. Thus, there might be combinations or orderings of byte code instructions that are not produced by the javac compiler, but there is no specific instruction that is used only by another JVM language.
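
A small way to observe the INVOKEDYNAMIC point for yourself, assuming a Java 8 or later javac: compile the sketch below and inspect it with `javap -c LambdaIndyDemo`. The lambda method should show an `invokedynamic` call site, while the anonymous class version shows `new` and `invokespecial`. The class and method names are hypothetical.

```java
import java.util.function.IntUnaryOperator;

final class LambdaIndyDemo {
    // Lambda form: javac emits an invokedynamic call site for this expression.
    static IntUnaryOperator viaLambda() {
        return x -> x + 1;
    }

    // Anonymous class form: compiled to a separate class that is instantiated directly.
    static IntUnaryOperator viaAnonymousClass() {
        return new IntUnaryOperator() {
            @Override
            public int applyAsInt(int x) {
                return x + 1;
            }
        };
    }

    public static void main(String[] args) {
        System.out.println(viaLambda().applyAsInt(41));
        System.out.println(viaAnonymousClass().applyAsInt(41));
    }
}
```

Both forms behave identically at run time; only the emitted instructions differ.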

Of the byte code features that I named in the summary, the most notable is throwing undeclared checked exceptions without catching them, which is supported by Scala but not by Java. Otherwise, I would say that there is no low-level byte code manipulation by scalac that is unknown to javac. In my experience, most Scala classes can also be written explicitly in Java.
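
The checked-exception rule is enforced by javac, not by the JVM. Writing `throw new IOException()` in a method without a `throws` clause is a compile error in Java, but a commonly cited workaround that relies on an unchecked generic cast gets past the compiler, which hints at why other compilers are free to skip the check. This is a sketch of that workaround, not a recommendation; the names `SneakyThrowDemo`, `sneakyThrow` and `noThrowsClause` are hypothetical.

```java
import java.io.IOException;

final class SneakyThrowDemo {
    // The cast to E is unchecked and erased, so no check happens at run time.
    @SuppressWarnings("unchecked")
    static <E extends Throwable> void sneakyThrow(Throwable t) throws E {
        throw (E) t;
    }

    // No throws clause, yet a checked IOException escapes from this method.
    static void noThrowsClause() {
        SneakyThrowDemo.<RuntimeException>sneakyThrow(new IOException("undeclared"));
    }

    public static void main(String[] args) {
        try {
            noThrowsClause();
        } catch (Throwable t) {
            // catch (IOException e) would not compile here, because javac
            // believes the try body cannot throw it.
            System.out.println(t.getClass().getName());
        }
    }
}
```

The need for a generic helper and a suppressed warning is what separates this from simple Java source: the direct form that Scala allows still has no plain Java spelling.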

score -4 · Answer 6 · answered Jul 04 '14 at 04:25

I think there is no such code. AFAIK there is only one JVM instruction that Java cannot generate: invokedynamic. This instruction is for dynamic languages, and Scala is a statically typed language, which means it cannot generate it either. So it is possible to translate Scala code to Java code, though probably to unreadable Java code.

cloud

1) invokedynamic was meant to add better support for dynamic languages, but nothing specific stops you from using it in a static language. In fact, it is used extensively in Java 8. 2) invokedynamic is currently not used by Scala, but there are plans to use it in newer versions of Scala. – om-nom-nom Jul 04 '14 at 07:06
