Possible Duplicates:
Methodologies for designing a simple programming language
Learning to write a compiler

I would like to write a programming language with a syntax similar to QBasic's, but even simpler. I want it to be for beginning programmers: its simplicity should encourage aspiring programmers not to give up and get them interested in programming. For example, instead of QBasic's PRINT "Hello World!"

I would use

Write "Hello World!"

or a little more like VB

Write ("Hello World")

How would I go about adapting the basic syntax to make my language?

RCProgramming
  • Possible duplicate of http://stackoverflow.com/q/1208338/246069 – YWE Nov 03 '10 at 21:37
  • "Its simplicity will encourage aspiring programmers not to give up and get them interested in programming." -- Not to discourage you, but simplistic languages tend to get in one's way after a short time. I prefer languages that are *simple* but don't fall short for larger/more sophisticated tasks (in particular, Python). Apart from that, are you asking for input on what the syntax should look like, or do you want hints on how to actually parse it? –  Nov 03 '10 at 21:38
  • You're right. I started learning BASIC last year and really enjoyed it, so now I'm attempting to learn C so I can do some deeper programming. I also learned Visual Basic. I do think, however, that if I can somehow modify the BASIC syntax, I could also add my own commands, making it a more powerful version of BASIC but with a simpler syntax. – RCProgramming Nov 03 '10 at 21:41
  • Also, what do you mean by parse? I want to make a language that I can write a compiler for (or modify an existing one) and adapt QBasic syntax to my language. – RCProgramming Nov 03 '10 at 21:43
  • It seems like you are asking two questions at once - which is as lethal as trying to tackle two problems at once instead of solving them separately. Are you (1) asking how to write a compiler for a programming language? There are a few questions on this topic on SO, and unless you have a specific question not covered by those, this question is a duplicate. Or are you (2) asking for ideas on a QBasic-like syntax? –  Nov 03 '10 at 21:47
  • I am asking 1. How would I write a compiler for a QBasic-like syntax? and 2. If possible, could I modify an existing QBasic compiler to compile my language? I would gladly accept any ideas on syntax though! – RCProgramming Nov 03 '10 at 21:52
  • Well, you could modify an existing compiler. But even if it is a very well-written piece of code, hacking it will require the same knowledge needed to write a compiler out of thin air (even more if the original author(s) felt like being clever and wrote messy code). –  Nov 03 '10 at 21:54
  • How would I modify an existing compiler, and which one? (If possible I'd prefer QBasic because I like the programming environment.) – RCProgramming Nov 03 '10 at 21:58

2 Answers

This is not a simple task. Language parsing and compiler theory are pretty hefty subjects. Lots o' math. You also have to decide what platform you want to target, which will also determine whether your language is fully compiled (e.g. C/C++, Pascal), compiled into bytecode (e.g. Python, Java), or interpreted at runtime (e.g. VBScript, JavaScript). For specifying the language itself, brush up on Backus-Naur Form (BNF).
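To give a feel for the smallest possible version of this, here is a rough sketch in Python of an interpreter for only the two Write forms from the question. The grammar in the comment, and the names `tokenize`, `parse_statement`, and `run`, are assumptions made up for this illustration; a real language would need variables, expressions, control flow, and much better error reporting.

```python
import re

# Grammar (informal BNF), assumed for this sketch only:
#   <statement> ::= "Write" <string> | "Write" "(" <string> ")"
#   <string>    ::= '"' any characters except '"' '"'

# One token is a word, a double-quoted string, or a parenthesis.
TOKEN_RE = re.compile(
    r'\s*(?:(?P<word>[A-Za-z]+)|(?P<string>"[^"]*")|(?P<paren>[()]))'
)


def tokenize(line):
    """Split one source line into (kind, text) tokens."""
    tokens = []
    pos = 0
    line = line.rstrip()
    while pos < len(line):
        match = TOKEN_RE.match(line, pos)
        if match is None:
            raise SyntaxError(f"unexpected character near column {pos + 1}")
        kind = match.lastgroup  # name of the alternative that matched
        tokens.append((kind, match.group(kind)))
        pos = match.end()
    return tokens


def parse_statement(tokens):
    """Check the tokens against the grammar and return the text to write."""
    if not tokens or tokens[0] != ("word", "Write"):
        raise SyntaxError("expected a Write statement")
    rest = tokens[1:]
    # Optional parentheses: Write ("...")
    if len(rest) == 3 and rest[0] == ("paren", "(") and rest[2] == ("paren", ")"):
        rest = rest[1:2]
    if len(rest) != 1 or rest[0][0] != "string":
        raise SyntaxError("expected a string literal after Write")
    return rest[0][1][1:-1]  # drop the surrounding quotes


def run(source):
    """Interpret a program line by line and return the lines it writes."""
    output = []
    for line in source.splitlines():
        if line.strip():  # skip blank lines
            output.append(parse_statement(tokenize(line)))
    return output


if __name__ == "__main__":
    for text in run('Write "Hello World!"\nWrite ("Hello World")'):
        print(text)
```

This version interprets the source directly; compiling to bytecode or machine code would replace `run` with a code generator, which is where most of the extra work lies.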

To help you along, there are several robust parser generators out there, including:

  • Lex/Yacc (Flex and Bison are the widely used free implementations; Bison is the GNU one) - The old-school industry standard. For developing a compiler in C/C++.
  • ANTLR - If you're interested in creating a compiler using Java
  • Boost.Spirit - A different approach, allowing specification of the language using C++ itself.

And many more. A comparison can be found here, while yet another list can be found here

If you're really interested in the full theory, you want to check out The Dragon Book.

But I must reiterate: This is a big subject. There are many, many tools to help you along the way, but the rabbit hole goes pretty deep.

rossipedia
  • This is a big subject, but does *not* involve much of (at least what most people would think of as) math. – Jerry Coffin Nov 03 '10 at 21:49
  • Thank you very much for this thorough answer. I have a few questions about your answer. What does parsing mean? What category would a language like BASIC fall under? – RCProgramming Nov 03 '10 at 21:49
  • A few remarks: (1) No, it's not math - but highly abstract stuff nonetheless, yeah. (2) JavaScript hasn't been purely interpreted for ages (most implementations even JIT compile now). These days, no serious language solely interprets the source code or even the AST directly (older Ruby implementations did, languages with compile-time metaprogramming probably do). (3) (E)BNF is useful to know, but the DSLs that parser generators use either differ from it or are completely unrelated, so it's not the most important thing. Not to mention that that's only the grammar; you still have to build an AST and make it run. –  Nov 03 '10 at 21:52
  • Parsing is the process of taking a sequence of "tokens" (which are usually the words, symbols, etc. of a language), and finding out what that sequence logically represents. In computer languages, parsing is what happens when a compiler breaks up source code into its parts, and then analyzes what those parts are supposed to mean. – rossipedia Nov 03 '10 at 21:53
  • @delnan: I guess you're right. I haven't studied this stuff in about 7 years or so, so I guess I might not be up to snuff on the current state of things. – rossipedia Nov 03 '10 at 21:54
  • Thank you. I have a few questions: 1. What is a DSLs parser generator? 2. What is (E)BNF? 3. What is AST? Thanks – RCProgramming Nov 03 '10 at 21:57
  • @RCProgramming: DSL stands for "domain specific language". Parser generators (like the already mentioned lex/yacc) define their own little language in which one can specify the grammar relatively easily. Of course the language is specific to that domain - it's not useful for anything else, but it does this one thing well. BNF is the Backus-Naur Form Bryan mentioned; the E is for extended BNF - which is, well, an extension of BNF with a few advantages. AST stands for abstract syntax tree, the data structure the compiler builds during/after parsing (this is what all other parts work on). –  Nov 03 '10 at 22:02
  • Thank you very much. Have you ever written a language before? If so, could you give me an example of it? I might be interested in programming in it because I would like to try something non-mainstream. – RCProgramming Nov 03 '10 at 22:05
  • @RCProgramming: Me? No, nothing which is complete enough to count as (usable) programming language. I only play around with various parts of the big picture. If you want to try a great non-mainstream language, there are several. If you got a few years to have your mind blown again and again, try Haskell ;) –  Nov 03 '10 at 22:13
  • I'll look into that. Can you give me an example of your "non usable" language? – RCProgramming Nov 03 '10 at 22:25
  • For the most part, it's a parser without a backend (code generator) - e.g. a Lisp - or a backend without a parser - a minimalistic language for arithmetic. Incidentally, both are in Haskell because Haskell has a neat parsing library, parsec, and is great for handling trees. I also have an incomplete/work-in-progress "compiler" for an utterly limited Pascal-like language I want to extend with a few advanced features, but for the time being, it can't even parse expressions. –  Nov 03 '10 at 22:32
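The terms used in these comments (tokens, parsing, AST, backend) can be made concrete with a small sketch. The following Python example is only an illustration, assuming a tiny arithmetic language with `+ - * /` and parentheses; the names `Num`, `BinOp`, `lex`, `Parser`, `parse`, and `evaluate` are invented here and are not taken from any of the tools mentioned above. It uses a hand-written recursive-descent parser instead of a parser generator.

```python
import operator
import re
from dataclasses import dataclass


@dataclass
class Num:
    """AST leaf: a numeric literal."""
    value: float


@dataclass
class BinOp:
    """AST node: a binary operation on two subtrees."""
    op: str
    left: object
    right: object


def lex(text):
    """Turn '1 + 2 * 3' into the tokens ['1', '+', '2', '*', '3']."""
    tokens = re.findall(r"\d+(?:\.\d+)?|[-+*/()]", text)
    # findall silently skips anything it cannot match, so check for leftovers.
    if "".join(tokens) != "".join(text.split()):
        raise SyntaxError("unexpected character in input")
    return tokens


class Parser:
    # Grammar assumed for this sketch (EBNF-like):
    #   expr   ::= term   { ("+" | "-") term }
    #   term   ::= factor { ("*" | "/") factor }
    #   factor ::= NUMBER | "(" expr ")"
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self):
        token = self.peek()
        self.pos += 1
        return token

    def expr(self):
        node = self.term()
        while self.peek() in ("+", "-"):
            node = BinOp(self.take(), node, self.term())
        return node

    def term(self):
        node = self.factor()
        while self.peek() in ("*", "/"):
            node = BinOp(self.take(), node, self.factor())
        return node

    def factor(self):
        token = self.take()
        if token == "(":
            node = self.expr()
            if self.take() != ")":
                raise SyntaxError("expected ')'")
            return node
        if token is None or not token[0].isdigit():
            raise SyntaxError(f"expected a number, got {token!r}")
        return Num(float(token))


def parse(text):
    """Build the AST for a whole expression."""
    parser = Parser(lex(text))
    tree = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return tree


OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}


def evaluate(node):
    """A minimal 'backend': walk the AST and compute its value."""
    if isinstance(node, Num):
        return node.value
    return OPS[node.op](evaluate(node.left), evaluate(node.right))
```

With these definitions, `parse("1 + 2 * 3")` yields `BinOp("+", Num(1.0), BinOp("*", Num(2.0), Num(3.0)))`, and `evaluate` on that tree returns `7.0`. The tree shape is what encodes operator precedence; a compiler would walk the same tree to emit code instead of computing a value.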

I think the upshot of this is:

  1. Simple to use.
  2. Simple to design/implement.
  3. Strong expressive abilities.

Pick 1.9 of them.

It's very possible to get a reasonable degree of any two of those. Doing any two fully is very hard, and trying to get all three leaves you in a no-man's-land where you don't do any of them well.

P.S. I speak from experience for #1+#3.

BCS