2

I am developing an application that analyzes Java applications (Windup). I'd like to be able to recognize programmatically whether a .class file was generated by a tool rather than written by a programmer and compiled.

As a human, I can tell because the decompiled code doesn't make much sense. It looks a bit like some kind of Java-ish mix with C.

I could implement recognition of the resulting decompiled code. However, decompilation takes time, so I'd like to skip decompiling generated classes altogether.

For more information on what is a generated .class file, see here.

Is there a way to recognize generated .class just from headers? Or perhaps some specific bytecode sequence?

Ondra Žižka
  • 6
    [detect the CAFE BABE](http://stackoverflow.com/q/2808646/1065197) – Luiggi Mendoza Sep 22 '15 at 16:14
  • 6
    What do you mean by "generated"? All .class files are generated, by the Java compiler. Do you mean you want to detect if the Java code that is the source of the .class file was generated by a tool instead of a developer? There is no way to make such a distinction. – JB Nizet Sep 22 '15 at 16:20
  • 2
    It's hard to understand what you mean. Do you want to detect if the .class file was recompiled in the last compilation and contains the latest changes? – João Neves Sep 22 '15 at 16:28
  • No, not CAFE BABE. That recognizes any .class file. Compiler doesn't generate, it compiles. Generators generate. – Ondra Žižka Sep 23 '15 at 14:36
  • Updated the question. – Ondra Žižka Sep 23 '15 at 14:37
  • This sounds like an [X-Y problem](http://meta.stackexchange.com/questions/66377/what-is-the-xy-problem). – yshavit Sep 23 '15 at 15:39

2 Answers

2

As mentioned, any legal class file starts with the magic number 0xCAFEBABE, as specified by JVMS §4.1. However, the mere fact that a file starts with this magic number does not guarantee that it represents a compiled Java class, since anybody can create such a file.
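
The magic-number check itself can be sketched as follows. This is a minimal illustration, not Windup code; the class name `ClassFileMagic` and both method names are hypothetical. It only tells you that a file claims to be a class file, nothing about how it was produced.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: checks only the first four bytes of a file.
class ClassFileMagic {

    // True if the array starts with the bytes CA FE BA BE.
    static boolean hasClassMagic(byte[] data) {
        return data.length >= 4
                && (data[0] & 0xFF) == 0xCA
                && (data[1] & 0xFF) == 0xFE
                && (data[2] & 0xFF) == 0xBA
                && (data[3] & 0xFF) == 0xBE;
    }

    // Reads at most four bytes from the file and checks them.
    static boolean hasClassMagic(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            byte[] head = new byte[4];
            int read = 0;
            while (read < 4) {
                int n = in.read(head, read, 4 - read);
                if (n < 0) {
                    return false; // file is shorter than four bytes
                }
                read += n;
            }
            return hasClassMagic(head);
        }
    }
}
```

A positive result is necessary but not sufficient: the rest of the file still has to parse as a `ClassFile` structure.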

Reading between the lines of your question, I assume that you want to find out whether a class was generated by javac or by another compiler or runtime code generator. This is not possible to determine, because anybody can imitate javac as closely as they like. As a matter of fact, many runtime code generators deliberately imitate javac's output, because doing so can improve performance: the JIT compiler recognizes some patterns that javac typically emits.

If the code that you want to analyze deviates a lot from "usual Java code", you can look for bytecode patterns that are not representable in the Java language. This way, you can prove that a class was not generated by javac, but you cannot in general prove that it was.
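
Detecting bytecode patterns requires parsing method bodies. A much cheaper, header-level signal (a different and weaker technique than the pattern matching described above, and my own assumption rather than something this answer proposes) is the class-level `ACC_SYNTHETIC` flag (0x1000), which the JVMS defines as "not present in the source code". The sketch below skips the constant pool to reach `access_flags`; the class and method names are hypothetical. Treat the result as a hint only: javac itself marks some compiler-created classes as synthetic, and a generator is free not to set the flag.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical helper: reads the class-level access_flags without decompiling.
class ClassAccessFlags {
    static final int ACC_SYNTHETIC = 0x1000;

    static int readAccessFlags(byte[] classBytes) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(classBytes));
            if (in.readInt() != 0xCAFEBABE) {
                throw new IOException("missing 0xCAFEBABE magic number");
            }
            in.readUnsignedShort();             // minor_version
            in.readUnsignedShort();             // major_version
            int count = in.readUnsignedShort(); // constant_pool_count
            // Constant pool entries are numbered 1 .. count-1.
            for (int i = 1; i < count; i++) {
                int tag = in.readUnsignedByte();
                int size;
                switch (tag) {
                    case 1:  size = in.readUnsignedShort(); break; // Utf8: u2 length + bytes
                    case 3: case 4:                                // Integer, Float
                    case 9: case 10: case 11: case 12:             // Field/Method/InterfaceMethod ref, NameAndType
                    case 17: case 18: size = 4; break;             // Dynamic, InvokeDynamic
                    case 5: case 6:  size = 8; i++; break;         // Long, Double occupy two slots
                    case 7: case 8: case 16:                       // Class, String, MethodType
                    case 19: case 20: size = 2; break;             // Module, Package
                    case 15: size = 3; break;                      // MethodHandle
                    default: throw new IOException("unknown constant pool tag " + tag);
                }
                in.readFully(new byte[size]);   // skip the entry payload
            }
            return in.readUnsignedShort();      // access_flags
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static boolean isSynthetic(byte[] classBytes) {
        return (readAccessFlags(classBytes) & ACC_SYNTHETIC) != 0;
    }
}
```

Because only the header and constant pool are read, this is far cheaper than decompilation, but it cannot replace the pattern analysis described above.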

Rafael Winterhalter
1

Java class files always begin with the magic bytes CAFEBABE. If you want to recognize class files, that's the best way to do it.

If that's not what you want, you'll have to clarify the question.

Antimony