How to prove the correctness of a given grammar?

Question

I am wondering how programming langauge developers validate and prove that their grammar is correct. Suppose that I created a new grammar for a new langauge. I can test my grammar with a unit test tool by providing different kinds of test programs. However, I will never 100% ensure that my grammar is correct. How do language developers ensure that their grammar is correct in real world?

Let's say I created a grammar for a new language using pencil and paper. However, I did a mistake and my grammar accepts the expressions that end with a + like 2+2+. I will implement my language using this incorrect grammar, if I don't find the mistake in it. After implementation and unit testing, I can find the error. Is it possible to find it before starting any implementation?

Definitely, I can try my grammar with some sample inputs using pencil and paper (derivation etc.), but I may miss some corner cases. Is there a better approach or how in the real language developers test their grammar?

What does it mean for a grammar to be "correct"? Or did you mean to ask how to check that a parser correctly recognises the expected grammar? — rici, Apr 10 '17 at 14:43
In theory, you would produce a correctness proof. I don't know whether this is done in the real world, but I doubt it. Without a correctness proof, though, you don't know the grammar is correct. So perhaps people do not know if their grammars are correct - or, rather, the grammars are defined to be correct, and nobody really knows what language they describe! — Patrick87, Apr 10 '17 at 18:59
I updated my question. How can i do a correctness proof for grammar? Any link or explanation? — Barış Akkurt, Apr 11 '17 at 07:52

score 0 · Answer 1 · answered Apr 13 '17 at 20:52

A proof is a logical argument that demonstrates the truth of a claim. There are as many ways to prove something as there are ways of thinking about a problem. A common way to prove things about discrete structures (like grammars) is using mathematical induction. Basically, you show that something is true in base cases - the simplest cases possible - and then show that if it's true for all cases under a certain size, it must therefore be true for cases of the next size.

In our case: suppose we wanted only to prove your grammar didn't generate + at the end of a word. We could do induction on the number of productions used in constructing a string in the language. We would identify all relevant base cases, show the property holds for these strings, and then show that longer strings in the language are constructed in such a way that it is impossible to get a + at the end. Here's an example.

S := S + S | (S) | x

Base case: the shortest string in the language is x, generated as S -> x. It does not end with a +.

Induction hypothesis: assume all strings produced using up to and including k productions do not end with +.

Induction step: we must show strings produced using more than k productions do not end with +. If we apply the rule (S) to any string generated from S, we do not add + so the property holds. If we apply S + S to strings generated from S, the last symbol in S + S is the last symbol of a shorter string (at least 2 symbols shorter) generated by S. By the induction hypothesis, that string did not end in +, so neither does this one. There are no other productions, so no string in the language ends in +. QED

How to prove the correctness of a given grammar?

1 Answers1