I'm replacing some code generation components in a Java program with Scala macros, and am running into the Java Virtual Machine's limit on the size of the generated byte code for individual methods (64 kilobytes).

For example, suppose we have a large-ish XML file that represents a mapping from integers to integers that we want to use in our program. We want to avoid parsing this file at run time, so we'll write a macro that will do the parsing at compile time and use the contents of the file to create the body of our method:

import scala.language.experimental.macros
import scala.reflect.macros.Context

object BigMethod {
  // For this simplified example we'll just make some data up.
  val mapping = List.tabulate(7000)(i => (i, i + 1))

  def lookup(i: Int): Int = macro lookup_impl
  def lookup_impl(c: Context)(i: c.Expr[Int]): c.Expr[Int] = {
    import c.universe._

    val switch = reify(new scala.annotation.switch).tree
    val cases = mapping map {
      case (k, v) => CaseDef(c.literal(k).tree, EmptyTree, c.literal(v).tree)
    }

    c.Expr(Match(Annotated(switch, i.tree), cases))
  }
}

In this case the compiled method would be just over the size limit, but instead of a nice error saying that, we're given a giant stack trace with a lot of calls to TreePrinter.printSeq and are told that we've slain the compiler.

I have a solution that involves splitting the cases into fixed-sized groups, creating a separate method for each group, and adding a top-level match that dispatches the input value to the appropriate group's method. It works, but it's unpleasant, and I'd prefer not to have to use this approach every time I write a macro where the size of the generated code depends on some external resource.
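
For readers who want to picture that grouping workaround, here is a hand-written analogue of the code such a macro would generate. It is only a sketch: `ChunkedLookup`, `groupSize`, `lookup0`, and `lookup1` are hypothetical names, the group size is tiny for readability, and a real macro would build these methods as trees rather than source text.

```scala
object ChunkedLookup {
  // Hypothetical group size; a real macro would choose one that keeps
  // each generated method comfortably under the 64 KB bytecode limit.
  final val groupSize = 3

  // One small method per group of cases (keys 0 to 2 here).
  private def lookup0(i: Int): Int = i match {
    case 0 => 1
    case 1 => 2
    case 2 => 3
  }

  // Second group (keys 3 to 5).
  private def lookup1(i: Int): Int = i match {
    case 3 => 4
    case 4 => 5
    case 5 => 6
  }

  // Top-level dispatch: pick the group's method from the key.
  // Dividing by groupSize only works because the keys are dense;
  // sparse keys would need range checks instead.
  def lookup(i: Int): Int = (i / groupSize) match {
    case 0 => lookup0(i)
    case 1 => lookup1(i)
    case _ => throw new MatchError(i)
  }
}
```

Each method stays small no matter how many cases there are in total, at the cost of one extra dispatch per call.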

Is there a cleaner way to tackle this problem? More importantly, is there a way to deal with this kind of compiler error more gracefully? I don't like the idea of a library user getting an unintelligible "That entry seems to have slain the compiler" error message just because some XML file that's being processed by a macro has crossed some (fairly low) size threshold.

Travis Brown
  • This question has been marked as ["already answered"](http://stackoverflow.com/q/6570343/334519), but what I'm asking is completely different from what is being asked in that question. I know that it's not possible to change the JVM's method size limit; I'm asking about workarounds and error handling in the context of Scala's new (2.10) macro system. – Travis Brown Jun 10 '13 at 10:05
  • Naively tried -optimise and 2.11 just sat there musing. Because my time on this earth is finite, I ctl-c'd. Maybe it will become obvious to me why this had to end badly. – som-snytt Jun 10 '13 at 16:11
  • @som-snytt: Interesting—same here, and no idea what that means. Without `-optimize`, 2.11.0-M3 does give a reasonable error message, at least. – Travis Brown Jun 10 '13 at 16:28
  • Rex comment or gauntlet on unadvisedness of general solution to method limit: https://groups.google.com/forum/#!topic/scala-internals/f6PwUxc8K7I – som-snytt Jun 15 '13 at 20:48
  • Note that your method will not be compiled by JIT (HotSpot) after 32kb. According to JIT-compiler overview by Vladimir Ivanov, HotSpot developer. [Slides (English)](http://www.slideshare.net/iwanowww/jitcompiler-in-jvm-by), [Video (Russian)](http://youtu.be/oYu3HuIYDhI?t=1h20m). Slide N58. Vladimir mentioned that "too large" means approximately 32kb. – senia Jun 16 '13 at 20:43
  • What were you thinking of for "more graceful error handling?" Also, the other answer implies we're talking only about data repr, but really the problem applies to code gen generally. A macro can decide to emit slower code if it knows the faster code can't compile. – som-snytt Jun 23 '13 at 01:38

2 Answers

IMO, putting data into .class files isn't really a good idea. Class files are parsed as well; they're just binary. But storing the data in the JVM as code may have a negative impact on the performance of the garbage collector and the JIT compiler.

In your situation, I would pre-compile the XML into a binary file in a suitable format and parse that. Eligible formats with existing tooling are, e.g., FastRPC or good old DBF. Or maybe pre-fill an ElasticSearch repository if you need quick advanced lookups and searches. Some implementations of the latter may also provide basic indexing, which could even leave the parsing out: the app would just read from the respective offset.
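
As a rough illustration of the "read from the respective offset" idea, here is a minimal sketch using a made-up fixed-width format rather than FastRPC or DBF. The object name `BinaryMapping` and its two methods are hypothetical, and the layout assumes the keys are exactly 0 until n, so the value for key k can sit at byte offset k * 4.

```scala
import java.io.{BufferedOutputStream, DataOutputStream, File, FileOutputStream, RandomAccessFile}

object BinaryMapping {
  // Build step: write each value as a 4-byte big-endian Int, so the
  // value for key k sits at byte offset k * 4. Assumes dense keys 0 until n.
  def write(file: File, values: Seq[Int]): Unit = {
    val out = new DataOutputStream(new BufferedOutputStream(new FileOutputStream(file)))
    try values.foreach(v => out.writeInt(v)) finally out.close()
  }

  // Run time: seek straight to the record and read it; nothing is parsed.
  def lookup(file: File, key: Int): Int = {
    val in = new RandomAccessFile(file, "r")
    try {
      in.seek(key.toLong * 4)
      in.readInt()
    } finally in.close()
  }
}
```

The build would call `BinaryMapping.write` once on the values extracted from the XML, and the application would call `BinaryMapping.lookup` at run time. A real version would likely keep the file open or memory-map it instead of reopening it on every lookup.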

Ondra Žižka

Since somebody has to say something, I followed the instructions at Importers to try to compile the tree before returning it.

If you give the compiler plenty of stack, it will correctly report the error.

(It didn't seem to know what to do with the switch annotation, left as a future exercise.)

apm@mara:~/tmp/bigmethod$ skalac bigmethod.scala ; skalac -J-Xss2m biguser.scala ; skala bigmethod.Test
Error is java.lang.RuntimeException: Method code too large!
Error is java.lang.RuntimeException: Method code too large!
biguser.scala:5: error: You ask too much of me.
  Console println s"5 => ${BigMethod.lookup(5)}"
                                           ^
one error found

as opposed to

apm@mara:~/tmp/bigmethod$ skalac -J-Xss1m biguser.scala 
Error is java.lang.StackOverflowError
Error is java.lang.StackOverflowError
biguser.scala:5: error: You ask too much of me.
  Console println s"5 => ${BigMethod.lookup(5)}"
                                           ^

where the client code is just that:

package bigmethod

object Test extends App {
  Console println s"5 => ${BigMethod.lookup(5)}"
}

My first time using this API, but not my last. Thanks for getting me kickstarted.

package bigmethod

import scala.language.experimental.macros
import scala.reflect.macros.Context

object BigMethod {
  // For this simplified example we'll just make some data up.
  //final val size = 700
  final val size = 7000
  val mapping = List.tabulate(size)(i => (i, i + 1))

  def lookup(i: Int): Int = macro lookup_impl
  def lookup_impl(c: Context)(i: c.Expr[Int]): c.Expr[Int] = {

    def compilable[T](x: c.Expr[T]): Boolean = {
      import scala.reflect.runtime.{ universe => ru }
      import scala.tools.reflect._
      //val mirror = ru.runtimeMirror(c.libraryClassLoader)
      val mirror = ru.runtimeMirror(getClass.getClassLoader)
      val toolbox = mirror.mkToolBox()
      val importer0 = ru.mkImporter(c.universe)
      type ruImporter = ru.Importer { val from: c.universe.type }
      val importer = importer0.asInstanceOf[ruImporter]
      val imported = importer.importTree(x.tree)
      val tree = toolbox.resetAllAttrs(imported.duplicate)
      try {
        toolbox.compile(tree)
        true
      } catch {
        case t: Throwable =>
          Console println s"Error is $t"
          false
      }
    }
    import c.universe._

    val switch = reify(new scala.annotation.switch).tree
    val cases = mapping map {
      case (k, v) => CaseDef(c.literal(k).tree, EmptyTree, c.literal(v).tree)
    }

    //val res = c.Expr(Match(Annotated(switch, i.tree), cases))
    val res = c.Expr(Match(i.tree, cases))

    // before returning a potentially huge tree, try compiling it
    //import scala.tools.reflect._
    //val x = c.Expr[Int](c.resetAllAttrs(res.tree.duplicate))
    //val y = c.eval(x)
    if (!compilable(res)) c.abort(c.enclosingPosition, "You ask too much of me.")

    res
  }
}
som-snytt