3

As a self project I'm creating an interpreter for a programming language I "made up" (It's really just a tiny extension of the JavaScript programming language) but I'm a little lost as to how an interpreter really works and how I should execute the programs that are written in my language. My questions are:

1: Because this is an interpreter, should I be executing statements as I walk my parse tree, or should I instead be generating code in a different language - say, python - and then using a subprocess call to compile and run that generated python file?

2: Am I suppose to execute each statement as I read it? or should I be constructing the entire program's parse tree -> AST in memory before walking the tree and generating / executing code? (Depending on what the answer to question 1 is)

kjh
  • 3,407
  • 8
  • 42
  • 79
  • 1
    See my SO answer showing how to execute a constructed AST: http://stackoverflow.com/a/10555114/120163 – Ira Baxter Dec 26 '13 at 02:39
  • 1
    I swear, you are the most helpful user on SO. – kjh Dec 26 '13 at 02:59
  • @kjh, JavaScript is too sweet to be easily evaluated by an ad hoc AST walker. It worth lowering it into some simpler language first, or, better, into a flat bytecode which is then executed by a trivial interpreter. – SK-logic Dec 26 '13 at 10:38
  • @SK-logic, what does this entail? Do you mean that I should leave out certain features of the language in my interpreter implementation? – kjh Dec 26 '13 at 15:00
  • @kjh, no, I mean that you can *compile* or *translate* your source language first into something simpler before interpreting it. You can remove syntax sugar, lower multiple structured programming statements into a simple common one (e.g., a `goto`), enumerate variables where possible instead of using their symbolic names, etc. – SK-logic Dec 27 '13 at 00:04

1 Answers1

2

one intuitive way to implement an interpreter is to create an executable AST:

  1. parse the source file: you can either write the parser yourself or use one of the many available compiler generators (e.g. Cocoa). If you use a compiler generator, you will have much less trouble with the "low-level stuff" and can concentrate more on implementing the language semantics. however, you will also miss an opportunity to learn how to implement a parser yourself.
  2. generate the executable AST: you implement each operation locally in one node in an execute function/method. an addition node would for example execute its left child, its right child, and then return a result itself. an if node would first execute its condition, and then depending on the condition either execute its if branch child, or its right branch child. you also have to think, where you want to store your run-time data such as for variables.
  3. execute the AST: if you want to execute dynamically typed languages like JavaScript, you have to consider that a variable can have values of different types. you can either have nodes, that adapt to the state of execution or implement a single node that handles all the cases.

one way is to implement it on a top of the Java Truffle framework. if you implement your interpreter in the right way, you will get decent performance. there are also several language implementations available; a simple example language is also included that uses the Cocoa compiler generator. some papers (1) (2) explain, how you can implement things such as local variables or handle values of different types for dynamically typed languages.

i would recommend you to first have a look at the truffle.sl implementation (link how to get it) and then see where you can get from there.

box
  • 3,156
  • 3
  • 26
  • 36
  • 1
    Very interesting answer, I'll definitely look into this. We went over this concept today in my operating systems class when our professor was talking about how Virtual machines work. Pretty cool subject. – kjh Feb 04 '14 at 00:31