
Would JIT-compiling a C++ template at compile time be a viable strategy for faster compile times? Is this maybe already done in large compilers like LLVM, and if not, what are the (maybe obvious) downsides making this non-viable?

For clarification, what I mean is that one takes the C++ template language not as an interpreted system for generating a C++ AST, but as a JIT-compilable language that one passes to e.g. LLVMJit or similar systems, which emit binary blobs that in turn generate the resulting AST of the template application, given the template arguments.

Would this theoretically speed up some compilation times? AFAIK, the speedup of JIT compilation over interpretation depends heavily on how frequently the code in question is called, but I can imagine some templates being applied many times.

asked by Fl0wless; edited by Peter Mortensen
  • Can you explain what you expect `#define foo bar` to get compiled into, if `bar` is not even defined at this point, but a few thousand lines later in the code, after a completely different header file gets `#include`d? And it can be declared either as a function, a template, or a completely different macro? – Sam Varshavchik Dec 05 '21 at 14:49
  • I think I am not really getting your point. However, in my mind the whole idea of C++ (template) compilation is to do a lot of compile-time checking so that runtime code doesn't have to. So it is the last thing I would like to see JIT-ed. Also, the philosophy is to generate code with predictable runtime behavior (unlike garbage-collected languages), and a JIT would break that too – Pepijn Kramer Dec 05 '21 at 15:06
  • You are describing the behavior of Java generics. C++ templates [don't work this way](https://stackoverflow.com/questions/36347/what-are-the-differences-between-generic-types-in-c-and-java) – Igor Tandetnik Dec 05 '21 at 15:44
  • Templates *are* (instantiated and) compiled at compile-time, and the "template language" is not an interpreted system generating ASTs. Also, templates are not macros. – molbdnilo Dec 05 '21 at 15:49
  • @molbdnilo the word 'macro' was an unfortunate typo. I am talking about the template as a function from input parameters to the C++ class/expression being compiled. If it helps, my mental model of C++ templates is that of a glorified substitution calculus, like the lambda calculus. – Fl0wless Dec 05 '21 at 18:47
  • @SamVarshavchik I was trying to talk exclusively about templates, not macros. Sorry for the confusion. – Fl0wless Dec 05 '21 at 18:48
  • @IgorTandetnik would you mind elaborating? Are C++ templates not working on an AST level, or what exactly are you referring to? – Fl0wless Dec 05 '21 at 18:49
  • I believe C++ templates are fundamentally working on AST level and are not amenable to preprocessing into any data structure that's not effectively AST. There is no missed opportunity to speed up template instantiation that could be captured by slowing down processing of the template definition. In contrast, Java generics are designed to be compilable to Java bytecode, with minimal work at instantiation time. – Igor Tandetnik Dec 05 '21 at 18:53
  • `template <typename T> auto whathappens(T a, T b) { return a+b; }` -- what kind of JIT code do you propose this to be compiled into? When the template gets instantiated you will get numerical addition here if `T` is a numeric type. `T` could be a `std::string`, in which case this becomes string concatenation. If `T` is some custom type with an overloaded `+` operator, the return value can be some other object entirely. How do you propose to JIT-compile a template like this, keeping everything in mind? – Sam Varshavchik Dec 05 '21 at 19:11
  • @IgorTandetnik Thanks, is that because most templates are only instantiated sparsely/rarely or because the maximal speedup of such an approach is just not that high? Or am I just wrong that template instantiation is the bottleneck in C++ compilation? – Fl0wless Dec 05 '21 at 19:12
  • @SamVarshavchik no, I am not talking about JIT'ing the resulting code (in this case `whathappens(T a, T b) { return a+b; }`), but the template itself, as a function taking in the AST/type information of a type `T` and generating the AST for the instantiated function `whathappens(T a, T b) {...}`, which is then passed on to code generation as normal, so the actual semantics are a non-issue for this approach. – Fl0wless Dec 05 '21 at 19:16
  • It's because the same syntax may produce wildly different implementations at instantiation time. There's not enough information in template definition to progress much farther than capturing AST. – Igor Tandetnik Dec 05 '21 at 19:16
  • @IgorTandetnik but would it not be advantageous to push out the AST as fast as possible? That said, you are probably right that the available scope of the work is too small for many gains. Upon further thought, my initial assumption that instantiation, rather than the subsequent compilation of the instantiated ASTs, is a significant bottleneck may not be right. – Fl0wless Dec 05 '21 at 19:24
  • I don't understand the question. Push out of where to where else? – Igor Tandetnik Dec 05 '21 at 19:26
  • @IgorTandetnik so as far as I understand the C++ template system, we get a .cpp, do lexical analysis, run the C preprocessor, then from the token stream generate an AST, _then_ look at the AST of all templates and apply the substitution rules of the template language, and then compile the AST down to IR/binaries. So we basically have an AST-level interpreter for all template code that we run to generate the instantiated code. That is where I would want to apply JIT techniques, basically dumping (x86, etc.) binary code that takes AST pointers and generates AST blobs that generate the code. – Fl0wless Dec 05 '21 at 19:32
  • So if we have a template or chain of templates that is/are instantiated regularly, we could gain the usual JIT speedup for that step. At least in my theory. – Fl0wless Dec 05 '21 at 19:35
  • And I'm saying there's not enough information in the template definition's AST to generate anything remotely close to x86 machine code. Only after the actual template arguments are substituted at the point of instantiation, is there enough information. I'm still not sure I understand where you think time saving opportunities are to be found. – Igor Tandetnik Dec 05 '21 at 20:43
  • @IgorTandetnik again I am _not_ talking about JIT'ing the resulting code generated _by_ the template. I am talking about JIT'ing the specific code generator defined _by_ the template. After all templates are just another language on top of C++ and are, to my knowledge, currently interpreted and not (JIT) compiled. – Fl0wless Dec 06 '21 at 10:37

1 Answer


Would JIT-compiling a C++ template at compile time be a viable strategy for faster compile times?

Certainly yes, but implementing that idea will take you several years of full-time work. Consider making it the topic of your PhD thesis.

In practice, every modern C++ compiler (e.g. GCC or Clang/LLVM) evolved from a simpler C (or older C++) compiler.

Another (perhaps related) research topic is to build a C++ compiler that combines JIT-compilation techniques with multi-threading.

Both GCC and Clang/LLVM are (as of the end of 2021) single-threaded compilers. You might consider adding several pthreads inside them.

The issue is to find a good qualified PhD advisor. I am not qualified enough for that role.

A related book is, of course, Jacques Pitrat's *Artificial Beings: The Conscience of a Conscious Machine*.

On Linux, one possibility could be to generate GCC plugins tailored to compiling a known set of C++ templates, and `dlopen`-ed at compile time. My old Bismon project could be a starting point, or the RefPerSys project.

answered by Basile Starynkevitch; edited by Peter Mortensen