
I am not aware of any self-improving compiler, but then again I am not much of a compiler-guy.

Is there ANY self-improving compiler out there?

Please note that I am talking about a compiler that improves itself - not a compiler that improves the code it compiles.

Any pointers appreciated!

Side-note: in case you're wondering why I am asking have a look at this post. Even if I agree with most of the arguments I am not too sure about the following:

We have programs that can improve their code without human input now — they’re called compilers.

... hence my question.

JohnIdol
  • Self-improving in what aspect? How would the compiler be improving itself? Is it adding a new language on its own, so gcc decides it wants to compile Ruby and learns how? Or improving how it compiles C by adding a new optimization level? – James Black Oct 18 '09 at 15:22
  • What about a compiler that compiles itself? – ChrisInEdmonton Oct 18 '09 at 15:23
  • @JamesBlack self-improving in ANY aspect :) - I am just trying to understand if there's such a thing – JohnIdol Oct 18 '09 at 15:25
  • Until you can say specifically what you mean this is "Not a Real Question". I mean, do you think that the compiler can comprehend its own performance and decide what to do about it? – dmckee --- ex-moderator kitten Oct 18 '09 at 23:21
  • I mean ANY self-improving aspect, including the one you mention, which sounds like a question to me. It's not about what I think, I am just reaching out to understand if there's any current research on the topic at ANY level. – JohnIdol Oct 19 '09 at 09:02
  • Does profile-guided optimization count? I know the Intel and TI compilers support this, but have never tried it out – Gautham Ganapathy Oct 21 '09 at 22:02
  • They need to optimize their own source-code (not the code they compile) and possibly recompile themselves in order for that to count! :) – JohnIdol Oct 22 '09 at 12:38
  • A self-hosted compiler would "improve itself" by compiling itself, wouldn't it? – mdm Oct 24 '09 at 01:54
  • What do you mean by self-hosted? If it compiles itself it has the potential to improve itself (even in trivial ways) but it might not actually be doing it – JohnIdol Oct 24 '09 at 12:43
  • Just so you know, self-hosting compilers are not rare. SBCL is self-hosting, gcc is self-hosting, etc. – jrockway Nov 07 '09 at 22:49
  • @jrockway - self-hosting compilers compile themselves, meaning that if they optimize the code they compile ... they optimize themselves. Am I onto something or is it just a case of dog chasing its tail? – JohnIdol Nov 07 '09 at 23:28
  • Until you rigorously define "improve" in terms of compilers, any discussion on this topic is a dance of ignorance. – xofz May 19 '10 at 20:13
  • @Sam Rigor is not required - I am good with whatever people think is an improvement, as long as it's a self-improvement to the compiler source code made by the compiler itself. – JohnIdol Jun 16 '10 at 10:46

8 Answers

While it is true that compilers can improve code without human interference, the claim that "compilers are self-improving" is rather dubious. The "improvements" that compilers make are merely based on a set of rules written by humans (cyborgs, anyone?). So the answer to your question is: no.

On a side note, if there were anything like a self-improving compiler, we'd know... first the thing would improve the language, then its own code, and finally it would modify its code to become a virus and make all developers use it... and then we'd have one of those classic computers-versus-humans, last-hope-for-humanity kind of things... so... no.

aviraldg

MilepostGCC is a machine-learning compiler, which improves itself over time in the sense that it is able to change itself in order to become "better". A simpler iterative-compilation approach is able to improve pretty much any compiler.
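The iterative-compilation idea can be sketched in a few lines: try different flag combinations, measure each result, and keep the best. The sketch below is a toy: the flag names are real GCC options, but `measure()` is a made-up cost model standing in for an actual compile-and-benchmark step.

```python
# Sketch of iterative compilation over flag combinations.
# measure() is simulated; a real setup would invoke the compiler with the
# given flags and time the resulting binary on a benchmark.
from itertools import combinations

FLAGS = ["-funroll-loops", "-finline-functions", "-ftree-vectorize"]

def measure(flags):
    # Invented cost model: each flag helps a little, and unrolling plus
    # vectorizing combine especially well.
    runtime = 10.0 - 1.0 * len(flags)
    if "-funroll-loops" in flags and "-ftree-vectorize" in flags:
        runtime -= 2.0
    return runtime

best_flags, best_time = (), measure(())
for r in range(1, len(FLAGS) + 1):
    for combo in combinations(FLAGS, r):
        t = measure(combo)
        if t < best_time:
            best_flags, best_time = combo, t

print(best_flags, best_time)
```

Note the search needs no human in the loop, which is exactly why it fits this question: only the metric is chosen by a person.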

Davide

25 years of programming and I have never heard of such a thing (unless you're talking about compilers that auto-download software updates!).

dicroce

Not yet practically implemented, to my knowledge, but yes, the theory is there:

  • Goedel machines: self-referential universal problem solvers making provably optimal self-improvements.
MaD70
  • so the answer is: NO - there's no such thing out there. Time travel is possible in theory but I don't see anyone from the future :) - All jokes aside, thanks for the link. Any other examples you can think of, or are Goedel machines the only theorized example of such a thing? – JohnIdol Nov 07 '09 at 23:26
  • They are state-of-the-art to my knowledge; previous work was done by Marcus Hutter (http://www.hutter1.net/), a student of Schmidhuber. But I need to be clear: Goedel machines are not "theorized" in the sense of vague statements that they are possible; no, in Schmidhuber's work there is a clear plan to build (program) one. What is left is engineering, which is not necessarily easy: in particular it requires an axiomatic model of the performance of the hardware it runs on. With current crappy CPUs a deterministic performance model is unrealistic, so I was thinking of letting the program... – MaD70 Nov 08 '09 at 00:32
  • ... monitor itself (via hardware performance counters) and incrementally learn and perfect an axiomatic probabilistic performance model. But this is pure speculation. – MaD70 Nov 08 '09 at 00:35
  • From the linked page - "The searcher systematically and efficiently tests computable proof techniques (programs whose outputs are proofs) until it finds a provably useful, computable self-rewrite. We show that such a self-rewrite is globally optimal - no local maxima! - since the code first had to prove that it is not useful to continue the proof search for alternative self-rewrites." How long does that typically take to run? That sounds a lot like it would be a race against the heat death of the universe, especially since existing compiler theory is pretty advanced already. – Adam Luchjenbroers Dec 01 '09 at 08:02
  • No: time is equally divided between doing the actual task and searching for better solutions. Imagine a robot executing a plan - if there is a stringent time limit, at worst it will complete the task executing a less efficient plan. So, if there is a time limit the actual task will be completed anyway (if the plan is both computable and doable within that limit, of course). Also, if one has a good starting point there is nothing that prevents starting from there - possibly a rewrite will discover a better version, not (yet) computable by traditional means. – MaD70 Dec 03 '09 at 18:46
  • @MaD70: Didn't find any reference to Goedel machines on Hutter's page (I may not have known exactly what to look for), but it looks impossible to me. It isn't possible to algorithmically find the simplest program equivalent to program X. If it were, we could, say, write a program that would test Goldbach's conjecture and either print a counterexample or never stop, and reduce it to either a simple infinite loop or a print statement. Any sort of exhaustive search would race against proton decay and lose. – David Thornley May 19 '10 at 21:10
  • @David Thornley: the link points directly to the Goedel machine home page on Schmidhuber's site: http://www.idsia.ch/~juergen/goedelmachine.html Read a paper on *that* page: he does not write that a Goedel machine will find "the simplest program equivalent to program X". A Goedel machine will rewrite itself incrementally, and it only assures, by proof, that each rewrite is better than the preceding version. Eventually it may find by chance, in the time allotted, a globally optimal rewrite, but this is not guaranteed, of course, and you can never be sure that the current rewrite is optimal. – MaD70 Jun 07 '10 at 17:43
  • @MaD70: Thanks, found a preprint there. It appears to me that the machine will limit itself to provably correct improvements, which has its own issues. Note that a deterministic CPU performance model is almost certainly slower than a real model, which means that it's got a performance deficit to begin with. It doesn't look like it'll be practical to me, although we're likely to learn stuff by trying it. – David Thornley Jun 07 '10 at 17:48
  • @David Thornley: you are welcome. On an *axiomatic* deterministic CPU performance model: you need to understand it NOT as "emulation" but as instruction-cycle counting, as in the past. So given a rewrite, you can calculate its performance by simply adding up the cycle counts of each instruction in the rewritten program. But see next comment. – MaD70 Jun 07 '10 at 21:27
  • @youngsters: in the past, CPU datasheets declared how many cycles each instruction needed to execute (1 cycle = 1 tick of the clock). CPU performance was much more predictable back then. Of course, nowadays we have a completely different picture. For that reason, in the second comment on this answer I alluded to an axiomatic **probabilistic** performance model, automatically constructed (machine learning again). – MaD70 Jun 07 '10 at 21:28
  • So your argument in favor is a non-existent machine based on a vague, unimplementable theory? Sorry, the answer is NO. – Cerin Mar 15 '11 at 01:22

A self-improving compiler would, by definition, have to have self-modifying code. If you look around, you can find examples of people doing this (self-modifying code). However, it's very uncommon to see, especially on projects as large and complex as a compiler. And it's uncommon for the very good reason that it's ridiculously hard (i.e., close to impossible) to guarantee correct functionality. A lot of coders who think they're smart (especially Assembly coders) play around with this at one point or another. The ones who actually are smart mostly move out of this phase. ;)
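For a flavor of what self-modifying code looks like in a high-level setting (a toy illustration only, nothing like patching machine code in place), here is a function that replaces its own binding with a faster version the first time it is called:

```python
def fib(n):
    # First call: replace ourselves in the module namespace with a
    # memoized implementation -- a (very tame) self-modification.
    memo = {}

    def fast_fib(k):
        if k < 2:
            return k
        if k not in memo:
            memo[k] = fast_fib(k - 1) + fast_fib(k - 2)
        return memo[k]

    globals()["fib"] = fast_fib  # rewrite the name 'fib' itself
    return fast_fib(n)

print(fib(30))  # → 832040; later calls go straight to fast_fib
```

Even in this tame form the hazard is visible: after the first call, the name `fib` no longer refers to the code you see above, which is exactly the kind of reasoning burden that makes correctness guarantees so hard.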

Russell Newquist

In some situations, a C compiler is run several times without any human input, getting a "better" compiler each time. Fortunately (or unfortunately, from another point of view) this process plateaus after a few steps -- further iterations generate exactly the same compiler executable as the last.

  1. We have all the GCC source code, but the only C compiler available on this machine is not GCC. Alas, parts of GCC use "extensions" that can only be built with GCC. Fortunately, this machine does have a functional "make" executable, and some random proprietary C compiler. The human goes to the directory with the GCC source, and manually types "make".

  2. The make utility finds the MAKEFILE, which directs it to run the (proprietary) C compiler to compile GCC, using the "-D" option so that all the parts of GCC that use "extensions" are #ifdef'ed out. (Those bits of code may be necessary to compile some programs, but not the next stage of GCC.) This produces a very limited cut-down binary executable, that barely has enough functionality to compile GCC (and the people who write GCC code carefully avoid using functionality that this cut-down binary does not support).

  3. The make utility runs that cut-down binary executable with the appropriate option so that all the parts of GCC are compiled in, resulting in a fully-functional (but relatively slow) binary executable.

  4. The make utility runs the fully-functional binary executable on the GCC source code with all the optimization options turned on, resulting in the actual GCC executable that people will use from now on, and installs it in the appropriate location.

  5. The make utility tests to make sure everything is working OK: it runs the GCC executable from the standard location on the GCC source code with all the optimization options turned on. It then compares the resulting binary executable with the GCC executable in the standard location, and confirms that they are identical (with the possible exception of irrelevant timestamps).
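The fixed-point behavior in the steps above can be modeled in a few lines. This is a toy simulation (the dictionaries and the "opt" arithmetic are invented, not anything GCC actually does), but it shows why the process plateaus: once the compiler is as good as its source allows, recompiling reproduces the same compiler.

```python
# Toy model of the bootstrap plateau: a "compiler" is a dict of
# capabilities, and compiling the (fixed) source with a given compiler
# yields a binary no better than the source permits.
def compile_gcc(source, with_compiler):
    opt = min(source["max_opt"], with_compiler["opt"] + 1)
    return {"opt": opt, "features": set(source["features"])}

source = {"max_opt": 3, "features": {"c89", "c99", "extensions"}}
compiler = {"opt": 0, "features": {"c89"}}  # limited proprietary cc

stages = []
while True:
    new = compile_gcc(source, compiler)
    stages.append(new)
    if new == compiler:  # plateau: identical compiler produced
        break
    compiler = new

print(len(stages))  # → 4 stages until the fixed point
```

The final equality check mirrors step 5 above: the last stage is byte-for-byte the same as the one before it, so there is no point iterating further.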

After the human types "make", the whole process runs automatically, each stage generating an improved compiler (until it plateaus and generates an identical compiler). http://gcc.gnu.org/wiki/Top-Level_Bootstrap and http://gcc.gnu.org/install/build.html and Compile GCC with Code Sourcery have a few more details.

I've seen other compilers that have many more stages in this process -- but they all require some human input after each stage or two. Example: "Bootstrapping a simple compiler from nothing" by Edmund Grimley Evans 2001 http://homepage.ntlworld.com/edmund.grimley-evans/bcompiler.html And there is all the historical work done by the programmers who have worked on GCC, who use previous versions of GCC to compile and test their speculative ideas on possibly improved versions of GCC. While I wouldn't say this is without any human input, the trend seems to be that compilers do more and more "work" per human keystroke.

David Cary

I'm not sure if it qualifies, but the Java HotSpot compiler improves code at runtime using statistics.

But at compile time? How will that compiler know what's deficient and what's not? What's the measure of goodness?

There are plenty of examples of people using genetic techniques to pit algorithms against each other to come up with "better" solutions. But these are usually well-understood problems that have a metric.

So what metrics could we apply at compile time? Minimum size of compiled code, cyclomatic complexity, or something else? Which of these is meaningful at runtime?
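As a concrete (and entirely toy) instance of the genetic-style search mentioned above: pick a metric, randomly toggle optimization "flags", and keep a candidate only when the metric improves. Everything here is invented for illustration; the flag names and cost numbers don't correspond to any real compiler.

```python
import random

random.seed(0)  # deterministic for the example
FLAGS = ["inline", "unroll", "vectorize", "tailcall"]

def cost(flags):
    # Made-up stand-in for "size of compiled code": each flag helps,
    # except that unrolling without vectorizing bloats the output.
    size = 100 - 10 * len(flags)
    if "unroll" in flags and "vectorize" not in flags:
        size += 25
    return size

def mutate(flags):
    # Toggle one randomly chosen flag.
    return flags ^ {random.choice(FLAGS)}

best = frozenset()
for _ in range(200):
    candidate = frozenset(mutate(set(best)))
    if cost(candidate) < cost(best):
        best = candidate

print(sorted(best), cost(best))
```

The point is the one duffymo makes: the search itself is mechanical, and everything interesting lives in the choice of `cost()` - a metric that is easy to state at compile time may say nothing about behavior at runtime.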

duffymo
  • Firstly: that's not self-improving, is it? Secondly: if someone thinks that is an application of AI, it is not. It simply checks for performance bottlenecks and tries to improve them by compromising on other things. – aviraldg Oct 18 '09 at 15:02
  • Thanks, I see your point but I am talking about a compiler that improves itself - not the code that it compiles. – JohnIdol Oct 18 '09 at 15:04

Well, there are JIT (just-in-time) techniques. One could argue that a compiler with some JIT optimizations might re-adjust itself to be more efficient with the program it is compiling?

SystematicFrank