0

I've read some SO answers concerning difference between compilers and interpreters. Most of them, when explaining how interpreters work, tells following:

It takes the program, one line at a time, and translates each line before running it: It translates the first line and runs it, then translates the second line and runs it etc.

(from How does an interpreter/compiler work)

And that's what confuses me - translation from high-level language to machine code is called compiling. By definition (https://en.wikipedia.org/wiki/Compiler).

So, is it correct to say that interpreter of high-level language consists of compiler, not just translator?

Community
  • 1
  • 1
kolobok
  • 3,835
  • 3
  • 38
  • 54
  • 1
    I think the answer is no, but it's not clear what you mean by "consists of compiler, not just translator". A compiler need not compile to machine code (eg: the java compiler), and an interpreter may generate machine code as it interprets. But all this is better explained in http://stackoverflow.com/questions/475223/what-is-the-difference-between-implementing-a-compiler-and-an-interpreter?rq=1 for example. – Paul Hankin Jun 26 '16 at 07:30
  • 1
    For the vast majority of interpreters, it's not true that they work line by line. Take any real world programming language with an interpreter (JavaScript, Python, Ruby) and write a program with a syntax error on the last line. You'll notice that that none of the lines will execute. That's because the whole program is parsed before any of it is executed. That whole answer you linked seems pretty inaccurate. – sepp2k Jun 26 '16 at 12:40
  • 1
    There is a compiler hiding in any interpreter, and you can extract it - see Futamura projections: https://en.wikipedia.org/wiki/Partial_evaluation – SK-logic Jun 29 '16 at 11:30

2 Answers2

1

No, an interpreter is not a compiler, especially not "By definition". Essentially an interpreter executes while a compiler translates. An interpreter takes code, reads through it, and then mutates the interpreter's state based on code. It calls the libraries of the language the interpreter is written in to do side-effects like network transactions. Compilers on the other hand output a set of instructions that do what the code describes in a different "language", such as x86 machine code. It drops in pre-written bits of machine code to execute side-effects in the code being compiled.

However, there is a massive overlap of these ideas in practice, especially in more recent languages like Python, Java, and .NET. Often code will be compiled to bytecode and then executed by a bytecode interpreter or JIT compiled again to machine code. For instance, in Python the interpreter contains all this, making it an interpreter that memoizes compilation work in .pyc files.

robot1208
  • 191
  • 1
  • 7
0

Actually the distinction between compiler and interpreter is in type of inputs and outputs, not the way they work. In short, interpreter takes some program and inputs, and yields some output, while compiler takes some program, and yields a program, which given inputs yields some output.

Your question is very interesting and the short answer is "it can but does not have to, and the reverse can be true too, but does not have to", but in order to get there we need to establish some notions.

A programming language L is some means of making sense of it's programs, i.e. of interpreting program P as "a function from inputs to outputs". Let's write [[P]]_L for this function, i.e. [[P]]_L(In) = Out. (Think of this [[.]]_L as a "meaning in L" function, i.e. an interpretation for L).

Now suppose you have 2 languages, L1 and L2 (and let L1 be different than L2 though in general that does not need to be the case).

If you only have means of interpreting L1 programs (e.g. L1 is the machine language of your computer), there are [among others?] those 2 possibilities to compute [[P]]_L2 for any L2-program P -- i.e. to compute [[P]]_L2(In) for any L2-program and any input In:

(1) you do have a L1-program INT, such that [[INT(P,In)]]_L1 = Out = [[P]]_L2(In), or

(2) you do have a L1-program COMP, such that [[COMP(P)]]_L1 = P', such that [[P']]_L1 = [[P]]_L2, i.e. [[P']]_L1(In) = [[P]]_L2(In) for any input In.

The main simplification of this idea is that both L1 and L2 work on the same datatypes, which might not be the case (but introducing interpreting one datatype in some other would make things even more twisted). Also note that the output may contain side-effects (like I/O, or some db update/delete, or whatever) -- but this is not important right now.

Then INT is an interpreter for L2 in L1, and COMP is a compiler from L2 to L1, and I believe this is the only difference that makes sense in general.

There is no problem with INT working like this: take P and In, compile P with COMP, execute the result on In, and return it's value. That's how JIT-compiling interpreters work. INT can as well compile P to some yet other language L3, and then execute it with some L3-interpreter -- that's how VM-based interpreters work.

Now, the exciting part is this: you can generate a compiler COMP (from L2 to L1) having an interpreter INT (for L2 in L1) by means of partial evaluation, as SK-logic mentioned in comments. In great simplification (read: LIES!), notice that the L1 "operations" needed to get from In to Out are present in INT's code. Imagine some mechanism which instead of executing these parts of INT only "writes them down". The final "pile of written-down operations" is a L1-program P' which given In yields Out, just as the original L2 program P would. In other words "compilation is postponed interpretation". That's what the Futamura Projections are about. You can find a lot of papers (many of them freely available) here http://readscheme.org/partial-eval/index.html

So in short, you can make an interpreter from the compiler (e.g. by jit-compiling and executing), and you can make a compiler from the interpreter (e.g. by IInd Futamura projection).

Once again: the difference is only what the interpreters/compilers consume, and what they produce. And how they work is "their own business".

dercz
  • 962
  • 6
  • 9
  • PS Note that a perfectly legit compiler can procude the code of interpreter with the program P inserted in it -- the result will do just what it needs, i.e. consume In and yeild Out. Also think about the cases when L1 and L2 are the same language, or L1 is a subset of L2, or... aren't computers awesome?! – dercz Jul 02 '16 at 17:17