12

I would like to write a compiler for a toy language that generates runnable Java .class files. What is the best library or tool available for this? I know I could learn the binary format of every instruction and build my own constant pool and so on, but that seems like work that ought to have been done already: no point reinventing the wheel, right?

Searching online, I've found two Java assembly languages, Jasmin and Jamaica, but only Jasmin looks somewhat maintained.

Is there a Java library for writing bytecode to a stream? Is that what Apache BCEL is?

Is there a tool for this that is the "standard" for bytecode generation, the way ANTLR is for parsing?


PS: The toy language is Brainf***. I wanted something with a simple "grammar" so I could focus on code generation rather than parsing; that will come later, in the next step.

Sled

4 Answers

9

ASM and BCEL do broadly similar things. I'd recommend ASM, as it's much better supported, much smaller, and up to date with current JDK releases.

MeBigFatGuy
3

It sounds like you're looking for Apache BCEL:

The Byte Code Engineering Library (Apache Commons BCEL™) is intended to give users a convenient way to analyze, create, and manipulate (binary) Java class files (those ending with .class).

Jon Skeet
  • I looked at that and 1) I can't tell if this does what I want, and 2) if this is the preferred tool. I'd really like to hear from someone that has actually done this and can vouch for a tool. – Sled Nov 19 '11 at 16:46
  • @ArtB: It would have helped if you'd said that you'd seen it... I haven't used it myself, but I've certainly heard good things about it. You might also want to look at [cglib](http://cglib.sourceforge.net/) – Jon Skeet Nov 19 '11 at 16:50
  • I had looked at it and at CGLib. CGLib looks like it's focused on run-time manipulation. I'm looking through BCEL now. – Sled Nov 19 '11 at 17:08
2

JDK 1.6 can compile Java classes dynamically (see ToolProvider.getSystemJavaCompiler in the javax.tools package). This can be used to compile Java from source without bytecode manipulation or temporary files. We're doing this to improve the performance of some reflection API code, but it will serve your purpose just as easily.

Create a Java source file from a string containing the code:

    import java.net.URI;
    import java.util.ArrayList;
    import java.util.List;
    import javax.tools.JavaCompiler;
    import javax.tools.JavaFileObject;
    import javax.tools.SimpleJavaFileObject;
    import javax.tools.StandardJavaFileManager;
    import javax.tools.ToolProvider;

    // A source "file" whose contents come from a String instead of the disk
    public class JavaSourceFromString extends SimpleJavaFileObject {
        final String code;

        JavaSourceFromString(String name, String code) {
            super(URI.create("string:///"
                             + name.replace('.', '/')
                             + Kind.SOURCE.extension),
                  Kind.SOURCE);
            this.code = code;
        }

        @Override
        public CharSequence getCharContent(boolean ignoreEncodingErrors) {
            return code;
        }
    }

    // Use your favorite template language here, like FreeMarker
    static final String sourceCode = ""
            + "import org.example.MySomethingObject;\n"
            // DynamicStringGetter would define getString as a standard way to get
            // a String from an object
            + "public class GetStringDynamic implements DynamicStringGetter {\n"
            + "    public String getString(Object o) {\n"
            + "        MySomethingObject obj = (MySomethingObject) o;\n"
            // Call the method on the cast reference; Object has no getSomething()
            + "        return obj.getSomething();\n"
            + "    }\n"
            + "}\n";

    // The following statements belong inside a method
    JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
    StandardJavaFileManager fileManager =
        compiler.getStandardFileManager(null, null, null);

    List<JavaFileObject> files = new ArrayList<JavaFileObject>();
    // The name must match the public class declared in sourceCode
    files.add(new JavaSourceFromString("GetStringDynamic", sourceCode));

    compiler.getTask(null, fileManager, null, null, null, files).call();

Then you load the newly created class files dynamically.
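The loading step could look roughly like the sketch below. It assumes the compiler is told to write its output to a directory via the standard -d option, and that a URLClassLoader pointed at that directory is acceptable for your use case. The names CompileAndLoad, StringSource, GeneratedHello, and run() are made up for illustration and are not part of the code above.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import javax.tools.JavaCompiler;
import javax.tools.JavaFileObject;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;

public class CompileAndLoad {

    /** In-memory source file (hypothetical name; same idea as JavaSourceFromString). */
    static final class StringSource extends SimpleJavaFileObject {
        private final String code;

        StringSource(String className, String code) {
            super(URI.create("string:///" + className.replace('.', '/') + Kind.SOURCE.extension),
                  Kind.SOURCE);
            this.code = code;
        }

        @Override
        public CharSequence getCharContent(boolean ignoreEncodingErrors) {
            return code;
        }
    }

    /**
     * Compiles one class into a fresh temp directory, loads it from there,
     * and calls its public static String run() method (an assumed convention).
     */
    static String compileLoadAndRun(String className, String source) {
        // Returns null when running on a runtime that ships without the compiler
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) {
            throw new IllegalStateException("No system Java compiler available; run on a JDK");
        }
        try {
            Path outDir = Files.createTempDirectory("generated-classes");
            List<JavaFileObject> units = List.of(new StringSource(className, source));
            List<String> options = List.of("-d", outDir.toString());

            // Passing null for the file manager uses the compiler's standard one
            Boolean ok = compiler.getTask(null, null, null, options, null, units).call();
            if (!ok) {
                throw new IllegalStateException("Compilation failed");
            }

            // Load the freshly written .class file from the output directory
            try (URLClassLoader loader = new URLClassLoader(new URL[] { outDir.toUri().toURL() })) {
                Class<?> generated = loader.loadClass(className);
                return (String) generated.getMethod("run").invoke(null);
            }
        } catch (IOException | ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String source = "public class GeneratedHello {"
                + " public static String run() { return \"hello from generated code\"; } }";
        System.out.println(compileLoadAndRun("GeneratedHello", source));
    }
}
```

Writing to a directory keeps the sketch short; keeping the bytes in memory instead would need a custom JavaFileManager, which is omitted here.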

Alternatively, use byte code manipulation (such as ASM) to create classes on the fly.

As another alternative, there is the Scala CAFEBABE bytecode compilation library. I have not used it personally, but it seems geared more towards creating a new JVM language.

As far as the parsing portion goes, ANTLR should serve.

Scott A
  • Can this even be used to generate standalone class files, or does it only work for in-memory compilation? Also, the toy language is much more low-level than Java making the translation awkward. – Sled Nov 19 '11 at 17:55
  • @ArtB Yes, it can generate standalone class files, or the classes can be saved in memory depending on what you do with the FileManager. We just store the classes in memory here because for our purposes they are ephemeral. An example of the toy language might help determine suitability. Typically generating Java source will be easier than determining byte codes, but if you're doing something more along the lines of adding a new language to the VM it might be overkill. On the other hand, it's easy to refer to the Java libraries using source code. – Scott A Nov 19 '11 at 18:14
  • The whole purpose of this exercise is to learn the tools, so determining byte codes it is. – Sled Nov 19 '11 at 19:22
  • @ArtB Fairy nuff. I suggest ASM or the Scala one I linked above then. – Scott A Nov 19 '11 at 19:24
  • Thanks a lot for the code example above! It seems to work for me, but it's generating a .class file in the root of my project. Is there a way to retrieve the generated class as a byte[] rather than writing it to a file? Or, alternatively, to specify where the .class file should be written? – Alex Averbuch Jul 19 '12 at 09:23
  • @AlexAverbuch http://blogs.helion-prime.com/2008/06/13/on-the-fly-compilation-in-java6.html has an example showing how to store the resulting classes in a byte array. – Scott A Jul 19 '12 at 18:48
0

The easiest approach would be to translate your toy language into valid .java source code with a preprocessor and then compile it with javac. That is also how Processing works.
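For the Brainf*** case mentioned in the question, such a preprocessor could be as small as the sketch below. This is only an illustration under a few assumptions: a fixed 30,000-cell byte tape, no bounds checking on the pointer, and end of input stored as whatever (byte) -1 yields. The class name BfToJava and the method translate are hypothetical.

```java
public class BfToJava {

    /** Maps one Brainf*** command to a Java statement; returns null for comment characters. */
    static String statementFor(char c) {
        switch (c) {
            case '>': return "p++;";
            case '<': return "p--;";
            case '+': return "tape[p]++;";
            case '-': return "tape[p]--;";
            case '.': return "System.out.write(tape[p]);";
            case ',': return "tape[p] = (byte) System.in.read();";
            case '[': return "while (tape[p] != 0) {";
            case ']': return "}";
            default:  return null;
        }
    }

    /** Translates a Brainf*** program into the source of a Java class with a main method. */
    static String translate(String className, String program) {
        StringBuilder out = new StringBuilder();
        out.append("public class ").append(className).append(" {\n");
        out.append("    public static void main(String[] args) throws java.io.IOException {\n");
        out.append("        byte[] tape = new byte[30000];\n");
        out.append("        int p = 0;\n");

        int depth = 0; // current loop nesting, used for indentation and bracket checks
        for (char c : program.toCharArray()) {
            String stmt = statementFor(c);
            if (stmt == null) {
                continue; // every other character is a comment in Brainf***
            }
            if (c == ']') {
                if (depth == 0) {
                    throw new IllegalArgumentException("Unmatched ']'");
                }
                depth--;
            }
            out.append("        ").append("    ".repeat(depth)).append(stmt).append('\n');
            if (c == '[') {
                depth++;
            }
        }
        if (depth != 0) {
            throw new IllegalArgumentException("Unmatched '['");
        }

        out.append("        System.out.flush();\n");
        out.append("    }\n");
        out.append("}\n");
        return out.toString();
    }

    public static void main(String[] args) {
        // Prints the generated Java source for a tiny program
        System.out.print(translate("Prog", "++[>+<-]>."));
    }
}
```

The generated source can be saved as Prog.java and compiled with javac. Note that everything ends up in a single main method, so very large programs may run into the JVM's per-method code size limit.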

Mark Jeronimus
  • This depends on whether the toy language can be easily translated to Java. There are compilers for many languages that target bytecode, but translating those languages to Java source would be a challenging task. – Matteo Nov 19 '11 at 17:30
  • Java isn't a particularly expressive language. One of the reasons to use other languages, even JVM-based ones, is to get around this limitation. – skaffman Nov 19 '11 at 18:41