Possible Duplicate:
How to code a compiler in C?

How would I start writing a compiler from scratch (no Flex or Bison or Lex or Yacc) in C? I have a language that I wrote an interpreter for, and it's kind of like Forth. Sort of. It takes in symbols and interprets them one at a time, using a stack.

How would I make a compiler?

The link below isn't meant as spam; it's just there to show people the syntax and how simple the language is.

http://github.com/tekknolagi/StackBased

tekknolagi
  • Start with the Dragon Book - http://dragonbook.stanford.edu/ – Nikolai Fetissov Aug 15 '11 at 21:17
  • @Hans - that question wasn't particularly helpful. I'm looking at a far simpler problem (considering my language) and want to roll my own, without parser/lexer tools. – tekknolagi Aug 15 '11 at 21:20
  • @tekknolagi: That **is** pretty helpful. If you're not going to use any existing tools, then you're going to have to read a book. The theory is not simple, and it's not going to be possible to explain the entire procedure in a SO post. – Oliver Charlesworth Aug 15 '11 at 21:25
  • Also see this question (lots of references in the answer): http://stackoverflow.com/questions/1669/learning-to-write-a-compiler – Kaos Aug 24 '11 at 09:50

1 Answer

Simple!

  1. You tokenize the input.
  2. You build a structured representation of it. This is generally an Abstract Syntax Tree, but that is not required.
  3. You perform any tree transformations you may require (optional).
  4. You generate the code by walking the tree.
  5. You link any disparate portions together (optional).

Flex and Bison help with stages 1 and 2; everything else is up to you. If you're still stuck, I suggest going through "Programming Language Pragmatics" or the Dragon Book.
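As a rough illustration of stage 1 without Flex, here is a minimal hand-written tokenizer sketch for a whitespace-separated, Forth-like language. The `Token` type, the `TokenKind` names, and the `tokenize` function are all hypothetical and not taken from the linked repository; the sketch assumes that every token is either a decimal integer or a word. For a stack language like this, the flat token array can probably also serve as the stage 2 representation, since there is no nesting to capture in a tree.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical token kinds for a Forth-like language. */
typedef enum { TOK_NUMBER, TOK_WORD } TokenKind;

typedef struct {
    TokenKind kind;
    char text[32];   /* token spelling, truncated to 31 characters */
    long value;      /* only meaningful when kind == TOK_NUMBER */
} Token;

/* Split src on whitespace into at most max tokens; returns the count. */
size_t tokenize(const char *src, Token *out, size_t max)
{
    size_t n = 0;
    while (*src != '\0' && n < max) {
        while (isspace((unsigned char)*src)) src++;   /* skip blanks */
        if (*src == '\0') break;

        const char *start = src;
        while (*src != '\0' && !isspace((unsigned char)*src)) src++;

        size_t len = (size_t)(src - start);
        if (len >= sizeof out[n].text) len = sizeof out[n].text - 1;
        memcpy(out[n].text, start, len);
        out[n].text[len] = '\0';

        /* Treat the token as a number only if strtol consumes all of it. */
        char *end;
        long v = strtol(out[n].text, &end, 10);
        if (end != out[n].text && *end == '\0') {
            out[n].kind = TOK_NUMBER;
            out[n].value = v;
        } else {
            out[n].kind = TOK_WORD;
            out[n].value = 0;
        }
        n++;
    }
    return n;
}
```

Calling `tokenize("2 3 + .", toks, 8)` on an array of eight `Token`s should yield four tokens: two numbers followed by the words `+` and `.`.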

Yann Ramin
  • I'm not sure where to go after interpreting it. How do I make something that can take input and then compile it to a binary? – tekknolagi Aug 15 '11 at 21:19
  • @tekknolagi: Start by thinking of "how would I transform this to C code?" at each step of your program structure. Remember, you are not running it, but rather transforming your application to another instruction set (be it a CPU's or even C). – Yann Ramin Aug 15 '11 at 21:20
  • Each of the commands corresponds to C code... Would it write all the C to a file and use `gcc`? – tekknolagi Aug 15 '11 at 21:21
  • @tekknolagi: If you don't want to generate assembly, it's a perfectly legitimate approach. Other options are emitting actual assembly code, or using LLVM. – Yann Ramin Aug 15 '11 at 21:23
  • I'm not familiar with assembly. Is that a good place to start? – tekknolagi Aug 15 '11 at 21:24
  • @tekknolagi: I would suggest transforming your source code to a well-formed C program first. The code generator can often be adapted readily to other outputs at a later point. Instead of actual processor assembly, I would suggest you emit LLVM IR (a pseudo-assembly), which then allows you to use all of the LLVM optimizers and code generators. – Yann Ramin Aug 15 '11 at 21:29