19

The Java bytecode language has the JSR instruction.

None of the code I've compiled with the Java 7 compiler uses this instruction.

However, sometimes Java binaries I've downloaded do use it, although rarely.

I'd be interested to know what compilers do use the instruction, and what Java code constructs would cause them to use it.

Edit: this is not a duplicate, as it refers to the JSR bytecode instruction and not a Java Specification Request.

paj28
  • 3
    As has been stated, it was designed for implementing the `finally` mechanism (to allow the same code segment to be "called" from multiple places in the method), but has been largely eliminated in later versions of `javac` where the job is done by having the compiler simply duplicate code inline. May also be used by some of the other languages that target the JVM. – Hot Licks Jan 15 '14 at 23:16
  • Which binaries were these? Were they obfuscated? – Antimony Jan 15 '14 at 23:41
  • I think Java 7 had to stop using `jsr` because it conflicted with the "stack map" internal structure that was introduced with that version. But earlier versions of `javac` were already eliminating the use of `jsr`. Probably if one encounters a `jsr` it's in a .class created prior to Java 7, and most likely several versions prior. – Hot Licks Jan 16 '14 at 01:37
  • Closely related: [Why are JSR/RET deprecated Java bytecode?](http://stackoverflow.com/questions/5871190/why-are-jsr-ret-deprecated-java-bytecode) – Ciro Santilli OurBigBook.com Apr 30 '15 at 20:34

2 Answers

20

The JSR instruction is actually not even allowed in Java 7 classfiles. The JVM specification forbids `jsr` and `jsr_w` in classfiles of version 51.0 or above, so they can only appear in version 50.0 or earlier classfiles, corresponding to Java 6 or earlier. In practice, it fell out of use long before that.
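To see which classfile version a downloaded binary carries, the header can be read directly. The following is a minimal sketch, not a full classfile parser: the class and method names are made up for illustration, and it only looks at the first eight bytes (magic number, minor version, major version).

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

public class ClassVersion {
    /** Reads the major_version field from the start of a classfile stream. */
    static int readMajorVersion(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        if (data.readInt() != 0xCAFEBABE) {
            throw new IOException("not a class file");
        }
        data.readUnsignedShort();        // minor_version, ignored here
        return data.readUnsignedShort(); // major_version
    }

    /** Maps a major version to a release name: 48 is Java 1.4, 49 is Java 5, and so on. */
    static String javaRelease(int major) {
        int n = major - 44;
        return major >= 49 ? "Java " + n : "Java 1." + n;
    }

    public static void main(String[] args) throws IOException {
        // Hand-built header standing in for a real .class file (illustrative only).
        byte[] header = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0, 0, 0, 49};
        int major = readMajorVersion(new ByteArrayInputStream(header));
        System.out.println(major + " -> " + javaRelease(major)); // 49 -> Java 5
    }
}
```

For a real file, passing a `FileInputStream` for the `.class` file in place of the byte array should work the same way.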

The JSR/RET mechanism was originally used to implement finally blocks. However, they decided that the code size savings weren't worth the extra complexity and it got gradually phased out.
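As an illustration of the kind of source construct involved, here is a small sketch of a `finally` block that is reached from two different exits of its `try` body. The class name and log strings are invented for the example. A compiler using JSR/RET could emit the cleanup code once and "call" it from each exit, whereas a compiler that inlines emits a copy of the cleanup code on each path; either way the observable behaviour is the same.

```java
import java.util.ArrayList;
import java.util.List;

public class FinallyDemo {
    static List<String> run(boolean fail) {
        List<String> log = new ArrayList<>();
        try {
            try {
                log.add("body");
                if (fail) {
                    // Exceptional exit: the finally body must run before unwinding.
                    throw new IllegalStateException("boom");
                }
                // Normal exit: the finally body must also run on this path.
                log.add("done");
            } finally {
                log.add("cleanup");
            }
        } catch (IllegalStateException e) {
            log.add("caught");
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run(false)); // [body, done, cleanup]
        System.out.println(run(true));  // [body, cleanup, caught]
    }
}
```

Compiling this and inspecting the result with `javap -c FinallyDemo` should show which strategy a given compiler chose.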

I don't know the exact versions since I can't find any compilers that old, but based on discussions I found online, it seems that the transition happened in the Java 1.2-1.3 era, with different compilers switching at different times. I have never seen a legitimate classfile from one of these old compilers, but you never know when it could happen.

In practice, the only use of JSR I've seen in the wild is for obfuscation. For example, Zelix Klassmaster used to use it for its string decryption code. I've also used it in several of my own Java crackmes.

Antimony
  • 4
    When I was working on the IBM AS/400 class file verifier I hated code that was obfuscated using `jsr`, since it got infinitely more complex (and slow and storage hogging) to verify. (And using techniques stolen from obfuscated code I wrote BigUglyMethod, a class file that broke Sun's verifier by tying it in knots. But ours, of course, handled it with flying colors.) – Hot Licks Jan 16 '14 at 01:42
  • 3
    @Hot Licks Yeah verification is annoying. I had to read through the Hotspot source several times to figure out how jsrs were actually being verified since the specification was so vague. On the bright side, I also discovered all sorts of cool undocumented tricks in the verifier. For example, I constructed a method which verifies fine but simply rearranging the branches of an if statement causes it to fail verification thanks to a subtle interaction between three different parts of the verifier code. – Antimony Jan 16 '14 at 05:29
  • Sounds like you were using Sun's sub-standard verifier. (Unfortunately, however, it's about the only game in town, since AS/400 native Java was canned in favor of the Sun-based Power version, to save on maintenance costs.) – Hot Licks Jan 16 '14 at 12:23
  • @Hot Licks For my purposes I'm more concerned with what will actually run, not what should pass verification in a more sensible implementation. Hotspot is the de facto standard, so it's what Sun's doing that really matters. – Antimony Jan 16 '14 at 14:27
  • Yeah, I know. Just 5 years later I'm still peeved. – Hot Licks Jan 16 '14 at 14:29
  • @HotLicks Thanks for the answer, explains a lot. I am writing a static code analyser and JSR is a bit of a PITA for generating a control flow graph. When there are multiple JSRs I end up duplicating all the blocks that can be reached by them, as the RET target may be different for each. – paj28 Jan 16 '14 at 23:00
  • @paj28 - I managed to get it to work by making the return point be one of the "data" items in the data flow analysis. So it was just like figuring out that "x1" could be either 5 or 7 at a given point. – Hot Licks Jan 16 '14 at 23:03
  • @HotLicks good idea! I'd be interested to chat to you in more detail about this. Drop me a message at paj@pajhome.org.uk if you want to chat. – paj28 Jan 16 '14 at 23:12
  • @paj28 Note that when duplicating the bodies, you have to be careful to make sure you're handling nested subprocedures correctly. Also, you don't need to duplicate all the blocks, only the ones that can reach the RET instruction. But it is a pain. Krakatau has around 250 lines of code just for handling JSRs, and it inlines them before doing any of the complicated analysis. – Antimony Jan 16 '14 at 23:35
  • 1
    @paj28: Would the `jsr` have posed such difficulties if it could only call declared "subroutines", each of which could only be entered at the start, would only be callable from one other subroutine, and could only exit via `ret` or abrupt completion (return or throw)? Duplicating code in `finally` blocks seems really ugly, especially since `finally`-related cleanup may often invoke nested `try` blocks. – supercat Jan 28 '14 at 23:18
  • 1
    @supercat, Most of that is already true. Subroutines can only be entered from the start, can only return from one place, and can only be called from within a single subroutine. The complexity comes from the fact that you have to maintain a stack of dirty bits for the local variables and when returning, you have to do a three way merge. – Antimony Jan 28 '14 at 23:40
  • @Antimony: Is what's necessary for a subroutine any different from what would be required if each "jsr" were replaced with a load of a local variable with a small number, and each "ret" did a table-jump to the instruction following each thing-that-would-be-a-jsr? – supercat Jan 28 '14 at 23:50
  • @Supercat - yes, the difference is that locals which aren't touched by the subroutine don't get merged whereas normal control flow always merges everything. – Antimony Jan 29 '14 at 04:21
  • @Antimony: Thanks for the explanation--I can see why that would be necessary, and why it would make `jsr` more complicated. Still, duplicating all `finally`-branch code would seem like it could get pretty ugly in cases involving nested try-with-resources statements. How does a compiler prevent an exponential explosion of code in such cases? – supercat Jan 29 '14 at 16:16
  • @supercat It doesn't. – Antimony Jan 29 '14 at 19:28
  • 1
    `... they decided that the code size savings weren't worth the extra complexity and it got gradually phased out` I know this question is old, but I couldn't find references to this change elsewhere. Was this decision documented anywhere? – default locale Oct 11 '17 at 10:56
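The suggestion in the comments above, treating the return point as just another data item with a set of possible values, can be sketched very roughly as follows. This is a deliberately simplified illustration, not how any of the verifiers or tools mentioned actually work: `JsrSite` and `possibleReturns` are hypothetical names, the jsr sites are assumed to have been found by an earlier scan, `jsr_w` (which is 5 bytes long) is ignored, and nested subroutines are not handled.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

public class RetTargets {
    // jsr is one opcode byte plus a 16-bit branch offset, so the return
    // address it pushes is its own offset plus 3.
    static final int JSR_LENGTH = 3;

    /** Hypothetical record of one jsr found by an earlier scan of the code array. */
    static final class JsrSite {
        final int offset;          // bytecode offset of the jsr itself
        final int subroutineEntry; // offset the jsr jumps to

        JsrSite(int offset, int subroutineEntry) {
            this.offset = offset;
            this.subroutineEntry = subroutineEntry;
        }
    }

    /** For each subroutine entry, collects every address its ret might return to. */
    static Map<Integer, Set<Integer>> possibleReturns(List<JsrSite> sites) {
        Map<Integer, Set<Integer>> byEntry = new TreeMap<>();
        for (JsrSite site : sites) {
            byEntry.computeIfAbsent(site.subroutineEntry, k -> new TreeSet<>())
                   .add(site.offset + JSR_LENGTH);
        }
        return byEntry;
    }

    public static void main(String[] args) {
        // Two call sites targeting the same subroutine at offset 20.
        List<JsrSite> sites = Arrays.asList(new JsrSite(0, 20), new JsrSite(10, 20));
        System.out.println(possibleReturns(sites)); // {20=[3, 13]}
    }
}
```

With such a set in hand, a control flow graph builder could add one edge from the `ret` to each candidate address instead of duplicating blocks, at the cost of the less precise merging of locals discussed above.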
6

According to the JVM specification:

> In Oracle's implementation of a compiler for the Java programming language prior to Java SE 6, the jsr instruction was used with the ret instruction in the implementation of the finally clause.

assylias
  • 3
    This is consistent with what a search of the OpenJDK source responsible for bytecode generation shows: `jsr` appears in the bytecode generation of a `try` block where `finally` is present. – Louis Wasserman Jan 15 '14 at 23:07