3

I'm looking for a fail-safe way to round-trip between a JVM class file and a text representation and back again.

One strict requirement is that the resulting round-tripped JVM class file is exactly functionally equivalent to the original JVM class file as long as the text representation is left unchanged.

Furthermore, the text representation must be human-readable and editable. It should be possible to make small changes to the text representation (such as changing a text string or a class name) and have them reflected in the resulting class file.

The simplest solution would be to use a Java decompiler such as JAD to generate the text representation (in this case, the re-created Java source code) and then use javac to regenerate the byte-code. However, given the state of the free Java decompilers, this approach does not work under all circumstances. It is rather easy to create obfuscated byte-code that does not survive a full class-file/java-source/class-file round trip (in part because there simply isn't a 1:1 mapping between JVM byte-code and Java source code).

Is there a fail-safe way to achieve JVM class-file/text-representation/class-file round-tripping given the requirements above?

Update: Before answering - save time and effort by reading all the requirements above, and note specifically:

  • "Text-representation of JVM bytecode" does not necessarily mean "Java source-code".
knorv
  • Yes, there is a fail-safe way: create your own app. Although if you do that, a text representation is probably not the most useful. Perhaps if you explained your actual problem, people could point you to a better solution. – kdgregory Sep 21 '09 at 12:02
  • kdgregory: The problem is simply that bytecode does not map 1:1 to Java source code, so to convert Java byte-code to something human readable/editable/assemblable you need something other than a Java decompiler. Please let me know if you need any further clarifications. – knorv Sep 22 '09 at 20:25
  • You were sufficiently clear in your original question. Perhaps I was not sufficiently clear in my response: if you want such a tool, you will have to write it yourself. The reasons that people use decompilers are not to make changes to the decompiled code, therefore the output of a decompiler is not meant to be recompiled. – kdgregory Sep 22 '09 at 20:32
  • kdgregory: Ah, sorry - thought you meant that I should write my own Java apps rather than trying to decompile/disassemble existing byte-code :-) – knorv Sep 22 '09 at 20:48
  • `Furthermore, the text representation must be human-readable and editable.` This particular requirement essentially makes it a decompiler to source code, and you want a decompiler that can represent all possible programs representable by Java byte code. As kdgregory says, I don't think such a program exists. A slight compromise might be to store a textual representation of the bytecode (i.e., assembly), though I doubt that is what you actually want. If you explain why you want this tool, maybe there is another way to achieve the result. – Chii Sep 25 '09 at 11:18
  • Chii: No, I don't want a Java source decompiler. A Java source code decompiler would be insufficient since there are bytecode instructions that cannot be represented using standard Java constructs (the simplest example being "goto"). So what I'm looking for is probably best described as a bytecode disassembler which is able to generate a text-representation that can be turned into byte-code again. The round-tripping is hence extremely important. And the use-case is simply to edit code for which the only available representation is bytecode. – knorv Sep 25 '09 at 13:21
  • Also see http://stackoverflow.com/questions/1109353/java-bytecode-compilation – James Moore Sep 16 '11 at 15:26
  • [This related question](http://stackoverflow.com/questions/791600/java-bytecode-equivalents-for-ilasm-ildasm) might show what you're looking for. – erikkallen Sep 22 '09 at 20:50

5 Answers

6

The BCEL project provides a JasminVisitor which will convert class files into jasmin assembly.

This can be modified and then reassembled into class files. If no edits are made and the versions are kept compatible, the round trip should result in identical class files, except that line number mapping may be lost. If you require a bit-for-bit identical copy for the round-trip case, you will likely need to alter the tool to carry over the aspects of the class file that are pure metadata as well.

Jasmin is rather old and was not designed to make writing full-blown programs in assembly easy, but for modifying string constant tables and constants it should be more than adequate.

ShuggyCoUk
  • jasmin-2.4 is timestamped 2010-05-07, and the release before that was jasmin-2.3 2008-01-28, 20 months before your answer. Seems like it's updated pretty regularly. – James Moore Sep 16 '11 at 14:50
  • @James this is true, but then again the JVM byte code is exceptionally static. The new invokedynamic stuff in Java 7 should be the only thing that needs updating in recent years, and [here it is](http://svn.apache.org/viewvc?view=revision&revision=826332) – ShuggyCoUk Sep 16 '11 at 15:35
  • @James, even if it is updated, the problem with Jasmin is that it can't represent many of the more obscure classfile features. – Antimony Apr 07 '13 at 00:51
2

Jasmin and Kimera?

Tom Hawtin - tackline
  • Can Kimera be used to create a .j (Jasmin file) representation of a class file? Please elaborate. – knorv Sep 20 '09 at 17:15
  • "We have developed a Java disassembler that generates jasmin compatible output from Java class files." http://www.cs.cornell.edu/people/egs/kimera/disassembler.html But I can't find where to download it. kimera.cs.washington.edu is gone. – James Moore Sep 16 '11 at 15:06
  • Try Jasper, too: http://www.angelfire.com/tx4/cus/jasper/ Its default output is supposed to be usable by jasmin. – James Moore Sep 16 '11 at 15:17
1

I've written a tool that's designed for exactly this.

The Krakatau disassembler and assembler is designed to handle any valid classfile, no matter how bizarre. It uses an assembly format based on the Jasmin format, but extended to support all the classfile features that Jasmin can't handle. It even supports some of the obscure or undocumented 'features' of Hotspot, such as pre-45.3 classfiles using smaller widths for the Code attribute fields.

It can roundtrip any classfile I know of. The result won't be byte-for-byte identical, but it will have the same functionality (constant pool entries may be rearranged, for instance).

Update: Krakatau now supports exact binary roundtripping of classfiles. Passing the -roundtrip flag will preserve the order of constant pool entries, etc.
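Whichever tool is used, a round trip can be sanity-checked by comparing the original and reassembled class files directly. The following is only a minimal sketch, not part of Krakatau or any other tool mentioned here: the class `RoundTripCheck`, its `Header` type, and the result strings are hypothetical names chosen for illustration. It relies only on the documented start of every class file (a 4-byte magic number `0xCAFEBABE`, then 2-byte minor version, major version, and constant pool count). It can confirm a byte-identical round trip, but it cannot prove functional equivalence when the bytes differ.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;

public class RoundTripCheck {
    /** The fixed-size fields at the start of every class file (hypothetical helper type). */
    public static final class Header {
        public final int minor;
        public final int major;
        public final int constantPoolCount;

        Header(int minor, int major, int constantPoolCount) {
            this.minor = minor;
            this.major = major;
            this.constantPoolCount = constantPoolCount;
        }
    }

    /** Reads magic, minor/major version and constant_pool_count (all big-endian). */
    public static Header readHeader(byte[] classBytes) {
        ByteBuffer buf = ByteBuffer.wrap(classBytes); // ByteBuffer is big-endian by default
        if (buf.getInt() != 0xCAFEBABE) {
            throw new IllegalArgumentException("not a class file: bad magic number");
        }
        int minor = buf.getShort() & 0xFFFF;
        int major = buf.getShort() & 0xFFFF;
        int cpCount = buf.getShort() & 0xFFFF;
        return new Header(minor, major, cpCount);
    }

    /** Coarse comparison: exact byte equality first, then header fields only. */
    public static String compare(byte[] original, byte[] roundTripped) {
        if (Arrays.equals(original, roundTripped)) {
            return "identical";
        }
        Header a = readHeader(original);
        Header b = readHeader(roundTripped);
        if (a.major != b.major || a.minor != b.minor) {
            return "version changed";
        }
        if (a.constantPoolCount != b.constantPoolCount) {
            return "constant pool size changed";
        }
        // Same header but different bytes, e.g. reordered constant pool entries.
        return "differs after header";
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java RoundTripCheck <original.class> <roundtripped.class>");
            return;
        }
        byte[] original = Files.readAllBytes(Paths.get(args[0]));
        byte[] roundTripped = Files.readAllBytes(Paths.get(args[1]));
        System.out.println(compare(original, roundTripped));
    }
}
```

A result of "differs after header" is what a reordered constant pool would typically look like. In that case a deeper comparison (for example, diffing the disassembled text of both files) would be needed to judge whether the two files still behave the same.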

Antimony
1

Looks like ASM does this. (This is the same sort of answer as ShuggyCoUk's, but with a different tool.) Jarjar says it uses ASM for exactly the sort of thing you're talking about.

James Moore
-2

No. There exists valid byte-code without a corresponding Java program.

The Soot project has a quite sophisticated decompiler (http://www.sable.mcgill.ca/dava/) which may be useful for byte code produced by a Java compiler. It is, however, not perfect.

Your best bet is still getting the source code for the class files.

Thorbjørn Ravn Andersen