I am considering the question of transpiling a language (home-grown DSL) to C vs to C++.

I haven't done any 'native' programming for over 15 years, so I want to check my assumptions.

Am I right in assuming that transpiling to the newest C++ version (C++17) would enable the native compiler to use a much wider range of 'modern' Intel/AMD CPU instructions, resulting in a more efficient executable? (This is apart from the multi-threading / memory-model part of C++, which by itself already seems a good enough reason to go for C++.)

Put another way, are many of the 'more recent' CPU instructions never generated by a C compiler, simply because the simpler syntax of C gives it too little information about the programmer's intent? I know I could access all CPU instructions with assembler, but that is precisely what I don't want to do. Ideally, I would want the generated code to still be as platform-independent as possible.

Sebastien Diot
  • Any particular reason as to why you're transpiling to C? If you're going as far as caring about performance this much, it means that it must be a rather advanced DSL (in one way or another). At that point, why not simply use a proper backend (I'd recommend LLVM)? You can even emit LLVM IR in text form, too. – Tim Čas Dec 30 '16 at 13:00
  • @TimČas I'll have to look into that too. But as I used to program in C, a long time ago, I assumed it would be easier to transpile to C. – Sebastien Diot Dec 30 '16 at 13:43
  • You'd be bound to C semantics, though (e.g. undefined overflow, no guarantees w.r.t. some operators [for a signed x, is `x >> 5` an arithmetic or logical shift?], et cetera). Still, many languages *do* compile to C with some success, but be careful about the gotchas! – Tim Čas Dec 30 '16 at 13:56
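
To illustrate the gotchas mentioned in the comment above, here is a minimal sketch of helpers a code generator could emit instead of raw `+` and `>>` on signed values. The function names `wrapping_add_i32` and `arith_shr_i32` are hypothetical, and the sketch assumes the DSL wants 32-bit wrap-around addition and an arithmetic right shift; a different DSL might want trapping or saturating behaviour instead.

```c
#include <stdint.h>

/* Wrapping 32-bit addition (hypothetical helper name).
   Signed overflow is undefined in C, so the sum is computed in
   unsigned arithmetic, which is defined to wrap modulo 2^32. */
int32_t wrapping_add_i32(int32_t a, int32_t b)
{
    uint32_t u = (uint32_t)a + (uint32_t)b;
    /* Map the unsigned result back to signed without relying on an
       implementation-defined out-of-range conversion. */
    if (u <= (uint32_t)INT32_MAX)
        return (int32_t)u;
    return -(int32_t)(UINT32_MAX - u) - 1;
}

/* Arithmetic (sign-extending) right shift (hypothetical helper name).
   Right-shifting a negative signed value is implementation-defined,
   so only non-negative values are ever shifted here.
   Assumes 0 <= n <= 31. */
int32_t arith_shr_i32(int32_t x, unsigned n)
{
    if (x >= 0)
        return x >> n;
    /* For negative x, ~x is non-negative; shift it, then flip back. */
    return ~(~x >> n);
}
```

An optimizing compiler will typically reduce helpers like these to a single add or shift instruction on common targets, so the portability should cost little or nothing at run time.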

1 Answer

All of your assumptions about the relationship between programming language and "modern CPU instructions" are incorrect.

Let's consider the GNU Compiler Collection.

The choice of language here doesn't much matter, as the language front-ends all end up generating the same intermediate form called GIMPLE. The optimizing passes then work on that.

The range of CPU instructions which can be emitted is controlled by the -march option (-mtune only adjusts scheduling and tuning choices, not which instructions are available). For x86, GCC is capable of emitting modern AVX-512 instructions when optimizing some very plain-looking C code. Automatic loop vectorisation is a powerful thing. Try it out: implement memcpy and look at the generated assembly.
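
As a rough sketch of that experiment, here is a plain element-wise loop of the kind the vectoriser handles well. The function name `add_arrays` is hypothetical. An arithmetic loop is used rather than a literal byte copy because GCC may recognise a hand-written copy loop and simply replace it with a call to the library `memcpy`, which hides the effect.

```c
#include <stddef.h>

/* Hypothetical example function: out[i] = a[i] + b[i].
   Deliberately plain C with no intrinsics and no inline assembly.
   `restrict` promises the compiler that the arrays do not overlap,
   which makes vectorisation easier to prove safe. */
void add_arrays(float *restrict out,
                const float *restrict a,
                const float *restrict b,
                size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = a[i] + b[i];
}
```

Assuming the file is saved as `vec.c`, compiling with something like `gcc -O3 -march=native -S vec.c -o vec.s` and searching the output for `xmm`, `ymm` or `zmm` registers should show which vector width was chosen. The exact instructions depend on the GCC version and on the CPU that `-march` selects.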

My advice: generate clean, un-clever C code, and crank up the optimization level. Just like you would do if writing code by hand.

You might also consider implementing your language directly as a front-end to GCC or LLVM, without transpiling to C or C++. LLVM was designed for this purpose: it is intended to make implementing new languages easy while still taking advantage of modern optimization approaches.

Jonathon Reinhart
  • Small note, implementing a new front-end to GCC is notoriously painful. – Dietrich Epp Dec 30 '16 at 11:52
  • Interesting. But I still need to use some 'libraries' to do multi-threading in C (idk yet if that would make things more difficult for me or not), or am I wrong on this too? – Sebastien Diot Dec 30 '16 at 11:53
  • Yes, threading would be a feature that may push you to C++, although I haven't much experience with `std::thread`. C has POSIX threads but Windows of course doesn't support them. I'm assuming `std::thread` works on MSVC. Will your DSL have threading primitives? – Jonathon Reinhart Dec 30 '16 at 11:55
  • Threads were added to both the C and C++ language standards, in sync, in 2011. That said, unfortunately not all C libraries implement C11 threads yet. – Jens Gustedt Dec 30 '16 at 12:12
  • Thanks @Jens, I forgot about `<threads.h>`, as I haven't used it yet. This question is from 2012, but perhaps things are getting better? http://stackoverflow.com/questions/8876043/multi-threading-support-in-c11 – Jonathon Reinhart Dec 30 '16 at 12:24
  • @JonathonReinhart Yes, I will definitely support threads (somehow) as it would be an actor-based language. – Sebastien Diot Dec 30 '16 at 13:39
  • @JensGustedt I totally missed that standard support for threads was added to C. I think this settles the question, as generating C code is much easier than generating C++ code. – Sebastien Diot Dec 30 '16 at 13:41