
I’ve read about bootstrapped compilers and how Java, Go, TypeScript, etc. compile themselves, but one thing seems off. Let’s assume that we use Node.js to write a compiler for a new programming language, X.

According to Wikipedia, stage 2 comes next: we create a brand new compiler written in language X, and we use the old compiler written in Node.js to compile this new compiler. This makes sense, but what I have a hard time understanding is what comes after that.

After this first compiler in language X is written, the old compiler can be discarded (quoting), but how is this even possible? Doesn’t the compiler need to produce JavaScript files for Node to understand, and a Node environment for those JavaScript files to run in? To simplify, I think the process has to look something like this, which would still need the source language:

Code in language X -> Compiler written in X -> X code to Node.js code -> Executed in Node.js

Basically, what I’m asking is how a language can compile itself without the source language. Ultimately, won’t the compiler written in that language still need the source language?

marc_s
snowy1000
  • "Doesn’t the compiler needs to produce Javascript files": no. Programmers produce Javascript files. Compilers compile them. Your example is poorly chosen, as Javascript is not a compiled language. Think about Java, C, etc. – user207421 Aug 31 '22 at 05:14
  • Even Node is written in Cpp. Now imagine if you could rewrite all that Cpp in JS, to produce a new version of Node executable. If it works well, you are now free from Cpp. – NPras Aug 31 '22 at 05:27
  • See [Writing a compiler in its own language](https://stackoverflow.com/questions/193560/writing-a-compiler-in-its-own-language?rq=1), [How are GCC and g++ bootstrapped?](https://stackoverflow.com/questions/9429491/how-are-gcc-and-g-bootstrapped?rq=1), [Programming language and compiler](https://stackoverflow.com/q/1173780/207421), [How to create a C compiler for custom CPU?](https://stackoverflow.com/questions/8696229/how-to-create-a-c-compiler-for-custom-cpu?rq=1), etc. – user207421 Aug 31 '22 at 05:33
  • First of all, thanks, but I already looked at those topics. They talk about the concept rather than the part I can’t get. The question is not about Node; it’s just an example. But you are right, so let’s say C, for example; my question is still the same. Doesn’t the second compiler (which is written in the X language) need a C environment to compile itself? – snowy1000 Aug 31 '22 at 10:06
  • Say you have the compiler code written in the X language and you have another code block which is also written in X. Doesn’t the compiler need to output C executables? How are you not dependent on C in this case? – snowy1000 Aug 31 '22 at 10:10
  • The X compiler and its runtime library *are* the 'C environment'. Nothing further required. The compiler and the RTL were both compiled by the prior compiler, which can now be discarded. – user207421 Aug 31 '22 at 10:25
  • @snowy1000 -- Not at all. Many compilers produce assembly language, and some have assemblers built in so they produce machine language binaries directly. – Tim Roberts Aug 31 '22 at 20:09

1 Answer


A "compiler" is just a program that translates (another) program from one language to another -- from the source language to the target language.

To actually run the compiled program, you (just) need an interpreter for the target language [1]. You can "chain" compilers to translate through multiple steps and have multiple layers of interpreters running on top of each other (an interpreter for language A written in language B, running on an interpreter for B written in C, etc.), but ultimately it must end up with something in "machine language", which is just the language that is interpreted by the silicon CPU in your computer. So "at the bottom" there is just an interpreter implemented directly in transistors and wires.
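As a rough illustration of that split, here is a minimal Python sketch. It assumes a made-up source language (plain arithmetic expressions) and a made-up target language (a list of stack-machine instructions); the names `compile_expr` and `interpret` are hypothetical and only serve the example. Python's `ast` module is borrowed for parsing so the sketch stays short.

```python
import ast

# Hypothetical target language: a list of (opcode, argument) tuples.
BINARY_OPS = {ast.Add: "ADD", ast.Sub: "SUB", ast.Mult: "MUL"}


def compile_expr(source: str) -> list:
    """Compiler: translate an arithmetic expression into stack instructions."""

    def emit(node):
        if isinstance(node, ast.Constant):
            return [("PUSH", node.value)]
        if isinstance(node, ast.BinOp) and type(node.op) in BINARY_OPS:
            # Operands first, then the operator (postfix order).
            opcode = BINARY_OPS[type(node.op)]
            return emit(node.left) + emit(node.right) + [(opcode, None)]
        raise ValueError("unsupported expression")

    return emit(ast.parse(source, mode="eval").body)


def interpret(program: list):
    """Interpreter for the target language; it never sees the source text."""
    stack = []
    for opcode, argument in program:
        if opcode == "PUSH":
            stack.append(argument)
            continue
        right = stack.pop()
        left = stack.pop()
        if opcode == "ADD":
            stack.append(left + right)
        elif opcode == "SUB":
            stack.append(left - right)
        elif opcode == "MUL":
            stack.append(left * right)
        else:
            raise ValueError(f"unknown opcode {opcode!r}")
    return stack.pop()


target_code = compile_expr("2 + 3 * 4")
result = interpret(target_code)
```

Once `target_code` exists, running it needs only `interpret`; neither the source string nor `compile_expr` is involved any more.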

The point of a compiler is that:

  • once it is run, you don't need the source code for the program any more, just the compiled code in the target language. If you run that through a second compiler (to a 3rd language) you just need the final output, not the intermediate, etc.
  • (hopefully) the target language runs faster than an interpreter for the source language would have.

You can actually combine a compiler from X to Y with an interpreter for Y and get something that looks a lot like an interpreter for X. The combined package first uses the compiler to translate the program to Y and then immediately interprets that translation (and then throws it away, so you never really see the translation, just the result). This is sometimes called a JIT compiler, and Node.js is an example of this.
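A small sketch of that "compile, run immediately, discard" packaging, using Python's built-in `compile()` and `exec()` as stand-ins for the X-to-Y compiler and the Y interpreter. The wrapper name `run_source` is hypothetical, and this only shows the shape of the idea; it is not how Node.js is implemented.

```python
def run_source(source: str) -> dict:
    """Compile the source to a code object, execute it, and drop the code."""
    code_object = compile(source, "<example>", "exec")  # the translation step
    namespace = {}
    exec(code_object, namespace)                        # the interpretation step
    # The code object goes out of scope here; the caller only sees results.
    return namespace


results = run_source("answer = 6 * 7")
```

From the caller's point of view `run_source` behaves like an interpreter for the source language, even though a translation step happens inside it.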

Looping back to your original question -- you have a compiler for X, but what is the target language? If the target language is JavaScript, you'll need a JavaScript interpreter to run the compiled programs. When you bootstrap, you'll use the JS version of the compiler to compile the X version of the compiler, but once you've done that, you have a new JS version of the compiler (the output of that compilation) and no longer need the original JS version.
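The bootstrap steps can be simulated in a few lines. This is a toy sketch under loud assumptions: "X" here is an invented language that is just Python with `fn` instead of `def` and `give` instead of `return`, the target language is Python (playing the role JavaScript plays above), and the "compiler" is a naive text substitution. All names are hypothetical.

```python
def bootstrap_compile(x_source: str) -> str:
    """Stage 0: an X-to-Python compiler written by hand in Python."""
    return x_source.replace("fn ", "def ").replace("give ", "return ")


# The same compiler, written in X itself. The keywords are assembled from
# pieces so that the compiler does not rewrite its own string literals.
X_COMPILER_SOURCE = '''
fn compile_x(x_source):
    func_kw = "f" + "n "
    ret_kw = "gi" + "ve "
    give x_source.replace(func_kw, "def ").replace(ret_kw, "return ")
'''


def load(python_source: str):
    """Run compiled (Python) code and return the compile_x function it defines."""
    namespace = {}
    exec(python_source, namespace)
    return namespace["compile_x"]


# Stage 1: the hand-written compiler compiles the compiler written in X.
stage1_python = bootstrap_compile(X_COMPILER_SOURCE)
stage1_compile = load(stage1_python)

# Stage 2: the compiled X compiler compiles its own source code.
stage2_python = stage1_compile(X_COMPILER_SOURCE)
stage2_compile = load(stage2_python)

# From here on bootstrap_compile is not needed: stage2_compile can rebuild
# itself from X_COMPILER_SOURCE and compile any other X program.
program = '''
fn double(n):
    give n * 2
'''
compiled_program = stage2_compile(program)
```

Note what was and was not discarded: the hand-written compiler is gone, but a Python interpreter is still required to run `stage2_compile` and its output, because Python is the target language. To drop that dependency as well, the compiler would have to target machine code instead.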


[1] A compiler might also introduce dependencies on some libraries written in the target language, which you'll need along with the target-language interpreter.

Chris Dodd