
In recent months, I have been seeing mentions of "LLVM" all over the place. I've looked it up, but the description of a "modern compiler infrastructure" doesn't really tell me anything. I can't find much about it, other than some mention of a C compiler that comes along with it (which doesn't seem to be any different from any other C compiler out there).

Is there some difference between this LLVM thing and any other compiler, say, GCC? Or is it an over-hyped replacement benefiting from being newer than the competition?

Ethan McTague
  • 3
    Have you heard of Apple? Have you ever seen an obnoxious 10-year-old with white earphones? Thanks, LLVM. – Kerrek SB Jan 05 '17 at 02:01
  • 11
    Possibly answered here: http://stackoverflow.com/questions/2354725/what-exactly-is-llvm - though in reality, when a lot of people talk about LLVM, they actually mean `clang`, the C and C++ compiler that made a massive impact thanks to Apple (and thanks to pretty useful compiler error messages). Simplified, `clang` compiles C/C++ to an intermediate representation for LLVM, and LLVM then compiles machine code from that. The advantage is that if you have a new programming language, you only write 1 compiler (Language -> LLVM) and not worry about x86/x64/arm/other platforms since LLVM does that. – Michael Stum Jan 05 '17 at 02:02
  • @MichaelStum that seems to have cleared it up a bit, but I'm still unclear on why `clang` itself is suddenly more popular – Ethan McTague Jan 05 '17 at 02:04
  • @EthanMcTague I don't know for sure (only did C and C++ on the side), but Apple made it the compiler for macOS and iOS, so that helped a lot. I remember people praising the speed and useful error messages as well, but a more seasoned C or C++ developer would have to answer that. Some answers to http://stackoverflow.com/questions/4885903/why-is-clang-not-used-more might be enlightening. – Michael Stum Jan 05 '17 at 02:08
  • @MichaelStum I think I've read that `gcc` is still the primary compiler on macOS (though, the speed would definitely be a good explanation!) – Ethan McTague Jan 05 '17 at 02:09
  • 1
    @EthanMcTague clang is, and has been, the default compiler on all Apple operating systems for more than 5 years. :) – echristo Jan 05 '17 at 18:11

1 Answer


There is some academic literature on the matter; I recommend the chapter on LLVM in The Architecture of Open Source Applications (the AOSA book), written by LLVM's principal author, Chris Lattner.

LLVM is a collection of libraries built to support compiler development and related tasks. Each library supports a particular component in a typical compiler pipeline (lexing, parsing, optimizations of a particular type, machine code generation for a particular architecture, etc.). What makes it so popular is that its modular design allows its functionality to be adapted and reused very easily. This is handy in three situations: when you're developing a compiler for an existing language to target a new hardware architecture (you only have to write the hardware-specific components; the lexing, parsing, machine-independent optimization, etc. are handled for you), when you're developing a compiler for a new language (all the back-end work is handled for you), and when you're doing something compiler-adjacent (like analyzing source code or embedding a language in a larger application).
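To make the reuse argument concrete, here is a minimal toy sketch in Python. Nothing in it is an LLVM API: the two front ends, the two back ends, and the tuple-based "IR" are all hypothetical names invented for illustration. The point it tries to show is that once every front end produces the same intermediate form, any front end can be paired with any back end.

```python
# Toy model only: none of these names are LLVM APIs.
from typing import List, Tuple

# A toy "IR": a list of (opcode, operand) tuples.
Instr = Tuple[str, int]
IR = List[Instr]

def frontend_infix(src: str) -> IR:
    """Hypothetical front end for source like '1 + 2'."""
    left, right = src.split("+")
    return [("push", int(left)), ("push", int(right)), ("add", 0)]

def frontend_prefix(src: str) -> IR:
    """Hypothetical front end for source like '(+ 1 2)'."""
    _, left, right = src.strip("()").split()
    return [("push", int(left)), ("push", int(right)), ("add", 0)]

def backend_stack(ir: IR) -> List[str]:
    """Hypothetical back end for an imaginary stack machine."""
    return [f"PUSH {arg}" if op == "push" else "ADD" for op, arg in ir]

def backend_register(ir: IR) -> List[str]:
    """Hypothetical back end for an imaginary register machine."""
    out, regs = [], []
    for op, arg in ir:
        if op == "push":
            reg = f"r{len(regs)}"   # allocate a fresh register
            regs.append(reg)
            out.append(f"mov {reg}, {arg}")
        else:                       # "add": combine the top two registers
            src = regs.pop()
            out.append(f"add {regs[-1]}, {src}")
    return out

def compile_source(src, frontend, backend):
    """Any front end composes with any back end through the shared IR."""
    return backend(frontend(src))

print(compile_source("1 + 2", frontend_infix, backend_register))
print(compile_source("(+ 1 2)", frontend_prefix, backend_stack))
```

With two front ends and two back ends you get four working combinations from four components. Adding a third language or a third target means writing one more function, not one more compiler per pairing.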

In order to support this, LLVM employs a pretty sophisticated intermediate representation (called the LLVM IR, creatively enough) that is basically assembly language for a theoretical hardware architecture designed to make targeting it with a compiler very easy. Most of the LLVM libraries take the IR in, operate on it, and output the modified IR, supporting the project's aim of modularity. This is in contrast to GCC, which (historically; I haven't checked recently) has a less complete IR, and thus the separate phases of compilation are very tightly coupled because they have to share a lot of information.
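The "IR in, IR out" idea can be sketched with a toy example in Python. This is not LLVM IR and these are not LLVM passes; the instruction format and the function names `fold_constants`, `remove_dead_code`, and `run_passes` are assumptions made up for illustration. What it is meant to show is that when every pass consumes and produces the same representation, passes can be written independently and chained in any order.

```python
# Toy model only: this is not LLVM IR and these are not LLVM passes.
from typing import List, Tuple

# Each instruction is (dest, op, args); args are names (str) or constants (int).
Instr = Tuple[str, str, tuple]
IR = List[Instr]

def fold_constants(ir: IR) -> IR:
    """Replace an 'add' of two integer constants with a single 'const'."""
    out = []
    for dest, op, args in ir:
        if op == "add" and all(isinstance(a, int) for a in args):
            out.append((dest, "const", (args[0] + args[1],)))
        else:
            out.append((dest, op, args))
    return out

def remove_dead_code(ir: IR) -> IR:
    """Drop instructions whose result is never used; 'ret' is always kept."""
    used = set()
    out = []
    for dest, op, args in reversed(ir):
        if op == "ret" or dest in used:
            used.update(a for a in args if isinstance(a, str))
            out.append((dest, op, args))
    return list(reversed(out))

def run_passes(ir: IR, passes) -> IR:
    """Each pass takes IR and returns IR, so they chain freely."""
    for p in passes:
        ir = p(ir)
    return ir

program = [
    ("a", "add", (2, 3)),
    ("b", "add", (10, 20)),   # result is never used
    ("", "ret", ("a",)),
]
optimized = run_passes(program, [fold_constants, remove_dead_code])
print(optimized)
```

Neither pass knows the other exists; they only agree on the shape of the instructions. That shared contract is the property the answer attributes to LLVM's design.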

Clang is the flagship compiler built on the LLVM framework.

Irisshpunk
  • 4
    clang source (haven't looked at LLVM) is also much easier to understand than GCC source, so it's good for learning and experimenting with compilers. Likely partially due to said modularity. – Erik Nyquist Jan 05 '17 at 04:42