-1

I read about bootstrapping but didn't quite got it yet. I would like to explain how I see it and you to point out if it isn't true. In order to create a compiler which compiles it self it should follow those steps:

Let there be X and Y programming languages.

  1. Write a compiler f from X langauge to Y. for every code A in Y, the compiler gives f(A).
  2. Write a compiler g from Y to Y. because compiler g is written in Y language, then compiler f can compile it. because of that, we can understand that f(c)=g is a compiler from Y to Y.
  3. You can compiler C with g again and then g(C)=g - compiler compiles itself.

I'm feeling that my explanation is lack of for important detailes.
I will be glad if someone could explain with math symbols like I tried to do.

  • I'm confused by your usage of "from" and "to". I feel like one of those prepositions should be "in" instead. A compiler from Y to Y doesn't really make sense (unless Y is JavaScript, I suppose) - that'd just be the identity function. A compiler from Y to Z (e.g. Assembly) *in* Y would make a lot more sense. – sepp2k Apr 05 '18 at 19:20
  • This link leads to a complete description and personal experience (yours!) on building self-compiling compilers. There's nothing like doing this once to really understand it. https://stackoverflow.com/a/7375884/120163 – Ira Baxter Apr 14 '18 at 03:38

2 Answers2

1

For the most part your steps seem correct except that you seem to keep mixing up the source and target language. I'm not sure whether those are just typos or you're genuinely confused. So here's my attempt at a formal description of bootstrapping, which is close to yours, but with a clearer distinction between the source, target and implementation language and a bit more detail:

Let X be the source language (i.e. the language you newly invented), Y the target language (e.g. assembly or machine language) and Z another language, for which at least one implementation already exists. Let T : X -> Y be a translation scheme to translate valid programs written in language X to equivalent programs written in language Y. In other words for any valid program x written in X, T(x) should produce a program y written in Y, which behaves equivalently to the defined behaviour of x.

Now the first thing we do is to implement this function T in the programming language Z. Let's call this implementation C_0. Since an implementation for Z already exists we can now start compiling X programs. This is your step 1 except you seem to have switched X and Y around in your second sentence.

After we did that, we can now implement T again, but this time in language X. So we write a program C_1 that is equivalent to C_0, but written in X instead of Z. We're still compiling from X to Y, but we're doing it in X. We can now use C_0 to apply T(C_1) and we get a(nother) working X compiler. This is similar to your step self, except we still translate X to Y. Translating X to itself would make no sense because that's just doing nothing. And translating Y to itself would make even less sense because Y isn't even our source language.

We can now also apply T(C_1) by using C_1 instead of C_0, but at this point that's only useful for testing purposes (making sure the compiler works). Ideally the result of C_1(C_1) should be the exact same Y code as C_0(C_1). Once we start adding features to X, we might only implement them in C_1, so at that point we'd want to stop using C_0, so we can make use of the new features.

sepp2k
  • 363,768
  • 54
  • 674
  • 675
0

Just imagine that you want to create a new language, called XYZPDQ. You know C, and so you write the compiler for XYZPDQ in C. It works, so now you can write code in the language XYZPDQ and compile it -- using the compiler you wrote in C, perhaps called XYZPDQc.

In fact, you like it so much that you write the compiler a second time -- but this time you write it in the XYZPDQ language. You compile that XYZPDQ compiler code (written in XYZPDQ itself) using the compiler that you wrote in C (the XYQPDQc program). The result of that is a new program that you can run, perhaps called XYZPDQc2.

Now you can compile the XYZPDQ compiler that you wrote in XYZPDQ itself using the XYZPDQc2 program, so the compiler is (in effect) compiling itself.

At this point: You have boot-strapped a compiled language.

cpurdy
  • 1,177
  • 5
  • 12