
I'm really interested in the PyPy project, but for the first (and less well-known) of its two purposes listed below:

  • A set of tools for implementing interpreters for interpreted languages
  • An implementation of Python using this toolchain

In the following blog posts, http://morepypy.blogspot.com/2011/04/tutorial-writing-interpreter-with-pypy.html and http://morepypy.blogspot.com/2011/04/tutorial-part-2-adding-jit.html, there's a detailed tutorial on how to implement a brainfudge interpreter with RPython and add a JIT.
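To give a flavor of what that looks like, here is a minimal sketch in the spirit of those posts (not taken from them; it uses the current rpython.rlib.jit module path, whereas the posts import from the older pypy.rlib.jit, and it omits loops, I/O and the bracket map a real BF interpreter needs):

    # Sketch of the tutorial's structure: an interpreter loop plus a JitDriver
    # hint telling the translator which variables identify a position in the
    # interpreted program (greens) and which are plain runtime state (reds).
    from rpython.rlib.jit import JitDriver

    jitdriver = JitDriver(greens=['pc', 'program'], reds=['pos', 'tape'])

    def mainloop(program):
        tape = [0] * 30000
        pc = 0    # position in the interpreted program (green)
        pos = 0   # data pointer into the tape (red)
        while pc < len(program):
            jitdriver.jit_merge_point(pc=pc, program=program, pos=pos, tape=tape)
            op = program[pc]
            if op == '+':
                tape[pos] += 1
            elif op == '-':
                tape[pos] -= 1
            elif op == '>':
                pos += 1
            elif op == '<':
                pos -= 1
            pc += 1
        return tape

    def entry_point(argv):
        mainloop('+++>++<')
        return 0

    def target(driver, args):
        # Hook the RPython translator looks for.
        return entry_point, None

Translated with the JIT option (--opt=jit in the second post), that interpreter loop becomes a tracing JIT with no further work on your part.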

However I've read elsewhere that RPython can be troublesome to work with--syntax created for dynamic typing suddenly restricted to inferred static typing leads to hard-to-understand compile errors.

So my question is, are there any other projects that would allow you to write a brainfudge interpreter/JIT like in the tutorial above? Or is PyPy the only option for doing so as succinctly?

(Aside): If one exists, what's the point of RPython in general? Is it just to show that a subset of Python can be made type-safe, and Python implemented in that subset? Would it have made more sense just to do "PyPy" in an existing interpreter-creation tool?

lobsterism

1 Answer


However I've read elsewhere that RPython can be troublesome to work with--syntax created for dynamic typing suddenly restricted to inferred static typing leads to hard-to-understand compile errors.

It's less about syntax (the only thing Python syntax has to do with typing is that it has no place for type annotations, and that can be changed - and was, in 3.0), and more about:

  1. Good error messages are seriously hard, and the code for them almost inevitably changes as the rest of the compiler changes. Therefore, it's a lot of effort, and one of the first corners to cut when you're working on highly complicated code that compiles code written by a handful of experts (quite often the same people who wrote the translator) rather than by the general public. The fact that the translator operates quite differently from usual compilers does not help either.
  2. The fact that everything is inferred means the only insight into types (for you as a reader) comes from understanding and mentally applying the process by which the translator infers types. And that's quite different from usual compiler techniques and not exactly trivial (a small example follows this list).
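As a concrete (made-up) illustration of point 2: the function below is perfectly ordinary Python, but the RPython annotator has to give x a single static type, and when it cannot, the error it reports talks about annotations and flow graphs rather than pointing you at an obvious line of source:

    # Valid Python, but not valid RPython: `x` would have to be both an int
    # and a string, and the annotator cannot unify those two types, so
    # translation fails with an error phrased in terms of its own flow graphs.
    def describe(flag):
        if flag:
            x = 42
        else:
            x = 'forty-two'
        return x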

So my question is, are there any other projects that would allow you to write a brainfudge interpreter/JIT like in the tutorial above? Or is PyPy the only option for doing so as succinctly?

I am not aware of any other project which attempts to create a JIT compiler from an interpreter. I'm pretty confident the idea was new when the PyPy guys did it, so the odds that something else like this (if it exists) is more mature than RPython are slim. There are numerous projects which aid individual aspects. There are also a few which tackle many or "all" of these aspects together, such as Parrot. But AFAIK none of them have success stories anywhere near as compelling as PyPy's.

Parrot is a VM for dynamic languages and features several backends (no JIT since v1.7 as I just learned, but the architecture permits re-introducing one transparently), and apparently grew a rich set of tools for language implementers. The CLR and JVM offer similar services for static object-oriented languages, though I do not know of tools quite as sophisticated as Parrot's.

But instead of you writing an interpreter, it defines an IR (several, in fact) and your job is compiling the language to that IR (and defining built-in functionality in terms the VM can understand). In this regard, it's different from the RPython approach of writing an interpreter. Also, as with other VMs, you are screwed should some aspect of your language map badly to the IR. Need something radically different from the VM's services? Have fun emulating it (and suffering awful performance). Need a language-specific optimization (one that is not valid for arbitrary IR and cannot be done ahead of time)? Say goodbye to those performance improvements. I'm not aware of a complete language implementation on Parrot other than toy languages. And since they are not constantly bragging about performance, I fear that they are currently weak in this regard.

LLVM (mentioned by others), as well as many other code generators/backends, is just one ingredient. You'd have to write a full-blown static compiler lowering your language to the level of abstraction of machine code, rather than an interpreter. That may be feasible, but it is certainly quite different.
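To make the contrast concrete, here is a tiny sketch using the llvmlite Python binding (my choice of binding, not something mentioned in this answer): LLVM hands you an IR builder, and emitting every low-level operation yourself is your job; nothing about interpreter loops or tracing comes for free.

    # With an LLVM binding you construct low-level IR by hand; compare this
    # with handing the RPython toolchain a whole interpreter and getting a JIT.
    import llvmlite.ir as ir

    i32 = ir.IntType(32)
    module = ir.Module(name='demo')
    fnty = ir.FunctionType(i32, [i32])
    fn = ir.Function(module, fnty, name='add_one')

    block = fn.append_basic_block(name='entry')
    builder = ir.IRBuilder(block)
    result = builder.add(fn.args[0], ir.Constant(i32, 1), name='result')
    builder.ret(result)

    print(module)  # textual LLVM IR; actually JIT-compiling it needs llvmlite.binding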

If one exists, what's the point of RPython in general?

"Writing JIT compilers is hard, let's go shoppingwrite interpreters." Well, it probably started out as "we want to do Python in Python, but it'd be too slow and we don't want to make a programming language from scratch". But these days, RPython is a very interesting programming language in its own right, by virtue of being the world's first programming language with JIT compilers as first-class (not really in the sense of first-class functions, but close enough) language construct.

Would it have made more sense just to do "PyPy" in an existing interpreter-creation tool?

Just for the sake of being meta, doing research, and showing it works, I favor the current approach. Up until the point where the JIT generator worked, you could have had the same in any language with static compilation, C-ish performance potential and macros (or another way to add what they call "translation aspects") -- although those are rare. But writing a well-performing compiler (JIT or not) for the entire Python language has repeatedly proven too hard for humans to do. I'd say it would not have made more sense to write an interpreter, then struggle all over again to get it right in a separate hand-written JIT compiler codebase, and still manage to optimize anything.

  • Awesome, great answer! So, what I was thinking is: if "RPython" had instead been ML syntax (or any other language that's already statically typed), would that have made it easier to implement a Python (or BF) interpreter/JIT on top of it? Or is there something special about RPython? – lobsterism Aug 26 '12 at 00:22
  • Or since, say, C++ already exists, wouldn't it have been easier just to create a JitDriver class in C++ and do the same thing? Or doesn't it work that way, i.e., would you have to write the C++ compiler/"translator" from scratch if you did it that way? (Why?) – lobsterism Aug 26 '12 at 00:31
  • @lobsterism Yes, you'd have to write essentially a compiler for the language you want to replace RPython with. The only reason the RPython translator can add a JIT to an RPython program is that it controls the entire process going from Python bytecode to control flow graphs to type inference to lowering into C-level abstractions to insertion of "translation aspects" (including features like Stackless and sandboxing). Read up on how the JIT works; it is essentially a JIT compiler *for RPython code* with additional features that allow interpreter-like programs to be handled efficiently. –  Aug 26 '12 at 01:17
  • @lobsterism That said, there may be some value in inheriting a type system and a syntax for type annotations for some RLanguage. But even ignoring social aspects, there's a significant reason to use Python as base language: You get to re-use PyPy's Python parser and bytecode compiler for the frontend, and its interpreter for control flow graph generation (via the "flow object space"), so you can focus on type inference, transformations and the back ends. –  Aug 26 '12 at 01:23
  • One more thing--is there any reason translate.py isn't compiled? Wouldn't that make the translation process faster? – lobsterism Aug 26 '12 at 02:05
  • @lobsterism: Yes. The translation framework itself (the "RPython compiler") is written in normal Python, not RPython. So it can't be compiled without rewriting it in RPython, and my understanding is that the translation framework is a very complex piece of software that would have been much, much harder to write in RPython. – Ben Aug 26 '12 at 02:37