
I have written my lexer and parser with flex and bison. My project is based on C++ and I would like to stick with it. My interpreter is currently written in C++, but once it works I want to achieve faster execution by compiling to bytecode (some form of VM-level bytecode). I know this can be achieved with LLVM, but I had problems using it on an x64 OS while developing in Visual Studio 2012 (32-bit); some of them are described in "LLVM linker errors on VS". The other tool I came across is ANTLR, but if I understand correctly, the latest release does not yet integrate easily with C++. There are many references on this; a quick one is "ANTLR integration with C++ issue". I also do not want to discard my lexer and parser written in flex and bison. What are my options if I want to generate bytecode from my AST?

EDIT: My aim is to generate bytecode from my AST (for the target architecture) so that the code can be executed at the virtual machine level. Currently I have an interpreter that executes the AST using a C++ library and generates bytecode. I want to generate bytecode straight from my AST and execute that bytecode instead of the AST.

Any suggestions would be appreciated.

Community
Segmented

1 Answer


Generating native machine code directly from your AST is possible in principle but extremely difficult in practice. You need an intermediate step, such as emitting LLVM bytecode or code in a programming language of your choice. Note that LLVM bytecode is not the same as native machine code for the target: it still has to be compiled to a native binary, which is done by the respective LLVM backend. So you could just as well generate C++ code from your AST using a handwritten code emitter that traverses your syntax tree, and then use a C++ compiler for the target platform to compile it to the desired native binary.
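A handwritten emitter of this kind could look roughly like the sketch below. It is only an illustration under stated assumptions: the `Node`, `Number`, and `Binary` classes and the `emit_expression` and `emit_program` functions are hypothetical names, not taken from the question, and an AST built by real bison actions will look different. It handles integer arithmetic expressions only.

```cpp
#include <cassert>
#include <memory>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical AST for integer arithmetic; a real bison-built AST will differ.
struct Node {
    virtual ~Node() = default;
    // Writes this subtree to `out` as C++ source text.
    virtual void emit(std::ostream& out) const = 0;
};

struct Number : Node {
    long value;
    explicit Number(long v) : value(v) {}
    void emit(std::ostream& out) const override { out << value; }
};

struct Binary : Node {
    char op;  // one of + - * /
    std::unique_ptr<Node> lhs, rhs;
    Binary(char o, std::unique_ptr<Node> l, std::unique_ptr<Node> r)
        : op(o), lhs(std::move(l)), rhs(std::move(r)) {
        assert(lhs && rhs);
    }
    void emit(std::ostream& out) const override {
        // Fully parenthesize so the emitted code keeps the tree's structure.
        out << '(';
        lhs->emit(out);
        out << ' ' << op << ' ';
        rhs->emit(out);
        out << ')';
    }
};

// Emits just the expression, e.g. "(1 + (2 * 3))".
std::string emit_expression(const Node& root) {
    std::ostringstream out;
    root.emit(out);
    return out.str();
}

// Wraps the expression in a complete translation unit that prints its value.
std::string emit_program(const Node& root) {
    std::ostringstream out;
    out << "#include <iostream>\nint main() { std::cout << ";
    root.emit(out);
    out << " << '\\n'; }\n";
    return out.str();
}
```

The string returned by `emit_program` would be written to a `.cpp` file and passed to the C++ compiler for the target platform. A real language would need further node types (variables, statements, calls), each with its own `emit` override.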

jasal
  • Thanks. As I mentioned in my question, I ran into problems while trying to use LLVM. I understand that LLVM bytecode is not the same as native target machine code, but it does provide a VM. I am looking for an alternative to LLVM that can do the same for me. – Segmented Jun 05 '14 at 10:51
  • I don't think that there is an alternative to `LLVM` that provides the functionality you are looking for. At least I've never heard of any. – jasal Jun 05 '14 at 10:53
  • Why not use C++ as an _intermediate language_? You don't need `LLVM`'s VM functionality for generating native binaries anyway. Otherwise it would still be an interpreter, just one that interprets another representation of the same script code. – jasal Jun 05 '14 at 10:55
  • My current compiler uses C++11 as an intermediate language. I am a beginner in this field, but from what I have read, LLVM can achieve much faster execution because it executes bytecode. Given my current situation, I would like to use a similar tool to speed up my compiler. – Segmented Jun 05 '14 at 10:59
  • I'm a bit confused now. When you say you are using C++11 as an intermediate language, does that mean you are already _generating_ C++ from your AST (i.e. writing C++ code to a file) that you compile afterwards, or does it mean that you use C++ for _interpreting_ the script? – jasal Jun 05 '14 at 11:03
  • Sorry for the confusion. I am using C++ for interpreting the script. – Segmented Jun 05 '14 at 13:44
  • There are some other projects that could serve as alternatives to LLVM, but I don't know how well maintained they are. This one, for example: https://people.eecs.berkeley.edu/~necula/cil/ – Jonathan Apodaca Feb 02 '18 at 21:01
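
For the goal stated in the question (bytecode produced from the AST and executed by a VM, without LLVM), one option that nobody in this thread spells out is a small custom instruction set with a stack-based interpreter loop. The sketch below is a minimal illustration under stated assumptions: `Expr`, `Instr`, `Op`, `compile`, and `run` are hypothetical names, only integer arithmetic is covered, and division by zero is not checked. Whether this runs faster than walking the AST depends on the language and the implementation.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical instruction set for a tiny stack machine.
enum class Op : std::uint8_t { Push, Add, Sub, Mul, Div };

struct Instr {
    Op op;
    long operand;  // used only by Push
};

// Hypothetical AST node; a real bison-built AST will differ.
struct Expr {
    char kind;  // 'n' for a number literal, otherwise one of + - * /
    long value;
    std::unique_ptr<Expr> lhs, rhs;
    explicit Expr(long v) : kind('n'), value(v) {}
    Expr(char op, std::unique_ptr<Expr> l, std::unique_ptr<Expr> r)
        : kind(op), value(0), lhs(std::move(l)), rhs(std::move(r)) {}
};

// Post-order traversal: operands are emitted first, then the operator.
void compile(const Expr& e, std::vector<Instr>& code) {
    if (e.kind == 'n') {
        code.push_back({Op::Push, e.value});
        return;
    }
    assert(e.lhs && e.rhs);
    compile(*e.lhs, code);
    compile(*e.rhs, code);
    switch (e.kind) {
        case '+': code.push_back({Op::Add, 0}); break;
        case '-': code.push_back({Op::Sub, 0}); break;
        case '*': code.push_back({Op::Mul, 0}); break;
        case '/': code.push_back({Op::Div, 0}); break;
        default:  assert(false && "unknown operator");
    }
}

// Executes the bytecode on an operand stack and returns the final value.
long run(const std::vector<Instr>& code) {
    std::vector<long> stack;
    for (const Instr& in : code) {
        if (in.op == Op::Push) {
            stack.push_back(in.operand);
            continue;
        }
        assert(stack.size() >= 2);
        long rhs = stack.back(); stack.pop_back();
        long lhs = stack.back(); stack.pop_back();
        switch (in.op) {
            case Op::Add: stack.push_back(lhs + rhs); break;
            case Op::Sub: stack.push_back(lhs - rhs); break;
            case Op::Mul: stack.push_back(lhs * rhs); break;
            case Op::Div: stack.push_back(lhs / rhs); break;
            case Op::Push: break;  // handled before the switch
        }
    }
    assert(stack.size() == 1);
    return stack.back();
}
```

Calling `compile` on the tree for `1 + 2 * 3` yields five instructions (three pushes, a multiply, an add), and `run` evaluates them on the operand stack. The existing flex/bison front end stays unchanged; only the step after AST construction is replaced.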