19

Recently, I was going around looking for ideas on what I can build using C this summer and I came across this post: Interesting project to learn C?

Implement a programming language. This doesn't have to be terribly hard - I did the language that must not be named - but it will force you to learn a lot of the important parts of C. If you don't want to write a lexer and/or parser yourself, you can use lex/flex and yacc/bison, but if you plan on that you might want to start with a somewhat smaller project.

I was kinda intrigued about the implementing a programming language answer and I'm wondering how do I go about starting this? I've gone through the whole K&R book and I've done some of the exercises as well. I also have a bit of experience in C++ and Java if that matters. Any tips? Thanks!

Community
  • 1
  • 1
Rex Homming
  • 191
  • 1
  • 2
  • 3

14 Answers14

7

I'd start with a simple desk calculator program that can read things like:

5 + 10 * 3

and print the answer. Then you can progress it to add variables, control flow, even functions.

Walter Bright
  • 4,277
  • 1
  • 23
  • 28
  • 3
    Although implementing a compiler to evaluate `5 10 + 3 *` is easier :) http://en.wikipedia.org/wiki/Reverse_Polish_notation. Still a simple desk calculator – Merlyn Morgan-Graham Aug 25 '10 at 02:20
6

Can I just say, I have seen many people asking questions like "How do I make a programming language?" or "How hard is it to make a programming language" and most of the answers just tell them that you have to go through years of university and read books that are 1000 pages long. I am here to tell everyone that you may post those answers, but it doesn't help them at all in their journey to make a programming language. I am 16 and have been doing programming for almost 2 years and I write programming languages. Quite advanced object orientated ones as well, but I haven't read any books, no have I done 8 years of university. To get people started, here's a simple programming language written in C#:

string code = "print Hello World";
foreach (string a in code.Split('\n'))
{
    if (a.StartsWith("print "))
    {
        Console.WriteLine(a.Substring(6));
    }
}

anyone who knows basic C# should be able to understand this. You can't start making programming languages without having some programming experience. Make sure you learn a programming language and make sure you know a lot about it, then just start writing simple little bits of code, like I've posted, and with experimentation and practice, you'll start writing some complex programming languages in no time :)

  • 3
    Hey, I'm 15(was 14 not long ago). And I've created JavaScript, and Lua interpreters. The problem with this approach is that it is very static. I would recommend that people first learn about Lexers, Parser's, and (optionally) ASTs. This is however, a good demonstration of how many different ways there are to make a language. In my (limited) experience the concepts are a lot simpler when you know them, i.e. making a Lexer sounds difficult, but usually you can simplify it to the core concepts. – Coolq B Aug 17 '17 at 23:59
  • 2
    That's a transpiler... – Jordan Baron Mar 09 '18 at 14:22
  • I like your approach so so so much. – Khashayar Motarjemi Nov 05 '20 at 00:22
6

Learn about regular expressions, grammars, and a good parser generator.

Even if you end up implementing your own parser, these are the fundamental concepts to implementing any programming language.

Uri
  • 88,451
  • 51
  • 221
  • 321
6

Start with a very simple (toy) language; later you can create a more complex syntax.

You could write an interpreter to parse strings like,

integer x
integer y
set x, 2
set y, 5
add x, y // x = x + y
print x

and evaluate each line immediately. If you store the lines in a vector it'd be easy to implement loops with goto command.


An example, Another World (vintage game)
Script editor:

alt text

Community
  • 1
  • 1
Nick Dandoulakis
  • 42,588
  • 16
  • 104
  • 136
  • 1
    Rather than worrying about how the compiler will be written and what tools will be used, the primary concern is the language design. There are no useful details to get into until what you are designing is obvious. I completely agree with the concept of this answer. – erisco Jun 09 '10 at 22:56
3

Well, I think something like that is really hard to do but also it would be a great pet project. You should have notions of parsers, lexers, flow control, paradigms (imperative, functional, OO) and many other things.

Many people says the Dragon Book is one of the best books for this. Maybe you can take a look at it :)

Good Luck!

Marcote
  • 2,977
  • 1
  • 25
  • 24
  • 3
    Don't recommend the Dragon book for a "very simple" toy project. Just don't. – dmckee --- ex-moderator kitten Jun 09 '10 at 18:06
  • @dmckee what is the problem if it's for learning purposes? – Marcote Jun 09 '10 at 18:09
  • The problem is that offering a comprehensive---indeed encyclopedic---reference written in fully abstract form as a tutorial is silly and counter productive. Compilers are deep, complicated magic to do well, but they are simple to do simply. The dragon book buries that fact. – dmckee --- ex-moderator kitten Jun 09 '10 at 18:34
  • 3
    @Markust: The Dragon book is a great reference if you are into heavy computer science theory about compilers and programming languages. Its not a great "getting started" book. – Yann Ramin Jun 09 '10 at 18:40
  • @dmckee: I do recommend the dragon book for the small projects. There's no reason to ignore complexity and pretend its not there. – Paul Nathan Jun 09 '10 at 18:53
  • @Paul: All I can say is I disagree strongly. I don't recommend K&R as a c book for first time programmers, and it's an easier read. – dmckee --- ex-moderator kitten Jun 10 '10 at 01:37
  • Well I think that knowing where the dragons live is important, if only so you can avoid visiting their lairs sooner than you have to. If you're starting down this road, shouldn't you read this book? I think so. – Warren P Jul 30 '10 at 17:34
1

you can read some well-written papers by Niklaus Wirth:

  • "Compiler Construction" (available here) is a short, concise introduction to the art of building a compiler.
  • "Algorithms + Data Structure = Programs" (unfortunately out of print), presents a simpler language (named PL/0) in his last chapter.

although those papers are mainly written in Pascal, the concepts exposed are easily translated to C.

Adrien Plisson
  • 22,486
  • 6
  • 42
  • 73
1

If you speak French you may be interested in one of my colleagues courses (freely available) http://matthieuamiguet.ch/scientifique/enseignement/langages-et-compilateurs although he uses Python to explain the concepts of language construction and compilation.

English PDF from PyCon 2010 http://matthieuamiguet.ch/assets/files/scientifique/publis/TeachingCompilersWithPython_Paper.pdf

I may have to speak to him about translating his info to English 8)

Reuben Mallaby
  • 5,740
  • 4
  • 47
  • 45
1

I've made a simple language parser in Java some time ago, basically evaluated mathematical expressions, replaced constants and variables and provided some feedback on syntax/type errors.

The easiest way I found to do such a thing was to make a parse tree. This can be done easily by using two stacks, an operator stack and a result stack. Afterwards you could just parse it recursively using a DFS, maybe use the visitor pattern if you decide to implement this in a object oriented language.

There is a lot to say about these things and if you want to I can explain them more in-depth, I didn't because I thought you'd want to try implementing the above mentioned yourself, but if you do, just notify me and we can talk.

Radu Chivu
  • 1,057
  • 7
  • 17
1

One old compiler tutorial is this one. Though it is in Pascal it is a very good source of information. If you want something more recent you should have a look at ANTLR.

INS
  • 10,594
  • 7
  • 58
  • 89
1

Scheme from Scratch is a nice series of blog posts about implementing Scheme in C. The code is very readable, and each version builds on the previous one in a way that's easy to follow.

Here is the first installment: v0.1 - Integers.

namin
  • 37,139
  • 8
  • 58
  • 74
0

Another alternative is to build a language without looking at anything else. Figure out what you can do easily, and go from there. For example, you could parse expressions into a list of tokens, separating with whitespace, and use prefix notation (which is quite simple to deal with). This sort of thing is a huge amount of fun, and you can learn a lot from experimenting.

  • Experimentation can be good. I'd say don't *only* experiment, and definitely look up answers to problems you run into after you've taken an honest crack at them yourself, whether you managed to solve them or not. However, completely unstructured experimentation (dinking around) isn't terribly useful – Merlyn Morgan-Graham Aug 25 '10 at 02:15
0

To keep things simple, I recommend implementing a simple postfix language. FORTH or the core part of PostScript would be great choices.

Norman Ramsey
  • 198,648
  • 61
  • 360
  • 533
0

Read through posts on the usenet newsgroup comp.compilers, it is accessible through Google Groups. It has many discussions related to building a language, building a compiler, lex/yacc, grammars and the like. Of course, you'd have to have good familiarity with the classics such as the dragon book, the tiger book among many books on compilers and, good books on algorithms and data structures.

The Original C Compiler is being given a new life. Most of it is being rewritten, and its code base is small enough to be read and understood in a summer vacation. Consider reading the code along with the papers that were used to write the code of this or any working compiler and I'm sure you'd get ideas about where to start, etc.

vpit3833
  • 7,817
  • 2
  • 25
  • 25
0

Let someone else do the dirty work for you, namely, the lexer and the parser. Use cup, yacc, or bison to handle the syntax. This will let you focus on the more important language design decisions. There are even example parser definitions for many languages that you can use as a template for yours.

Chris
  • 1,303
  • 8
  • 6