16

I'm confused about how context-sensitivity and ambiguity influence each other.

What i think is correct is:

Ambiguity:

An ambiguous grammar leads to the construction of more than one parse-tree using either left or right derivation. A language where all possible grammars are ambiguous is an ambiguous language.

For instance C++ is an ambiguous language because x * y always can mean two different things as discussed in: Why can't C++ be parsed with a LR(1) parser?.

Context-sensitivity:

A context-sensitive grammar has rules where the left hand side of these rules may contain (non)terminal symbols additional to the one nonterminal required within the lhs of all rules of different kinds of grammars. Which means that you cannot just replace a nonterminal while descending. Instead you have to look at the surrounding nonterminals first.


Now what bothers me are statements that more or less say that context-sensitive parsers can parse ambiguities like x * y. For instance in the above linked question it is stated that "... a parser that does [decorating a syntax tree while creating it] isn't context free, and LR parsers (the pure ones) are context free." In my opinion this implies that context-sensitive parsers (the opposite of context-free parsers?) can do this. Another example would be Is any part of C++ syntax context sensitive? where this question is answered with "Yes ...". Same here: what is an ambiguous context free grammar?

I don't see what this C++ ambiguity has to do with context-sensitivity. I don't think that there is any context-sensitive grammar that can deal with this ambiguity. For instance if you take a fictional rule like Typedef, <other>*, PointerDeclaration -> Ident "*" Ident

then you still would not be able to determine (with pure parsing) whether the concrete first Ident (e.g. "x") was used during the Typedef (e.g. typedef double x;).


So is it possible that the term "context-sensitivity" is used within the linked questions although meaning something simple like context-dependency (e.g. more information needed than provided by a simple parser). Or is there any kind of connection between "real" context-sensitivity" and ambiguities.

Edit More specified question: Is there any kind of ambiguities within context-free grammars that can be dealt with by using context-sensitive grammars. This question occurs to me because in the linked questions it sounds like the C++ ambiguity is sometimes referred to as an context-sensitivity problem.

Edit2 Additional Info: The book Computer Systems states on page 346 that requirements such as having the same amount of actual and formal parameters can be expressed by context-sensitive grammars. But this is very cumbersome because you would need lots of complexe rules. But maybe this could also apply to the C++ ambiguity mentioned earlier. So that we have rules like

"Typedef double x", <other>*, PointerDeclaration -> "x" "*" Ident

Of course such rules would be very limited and you would need a huge amount to express every possibility. At least this could be an approach to the answer of the question if (theoratically) context-free free ambiguities can be replaced with the usage of context-sensitive rules

Community
  • 1
  • 1
user764754
  • 3,865
  • 2
  • 39
  • 55
  • 3
    I like the question, but I am waiting for some SO loser to come and shout at you for this not being programming related. – San Jacinto May 22 '11 at 13:11
  • 6
    @San: It's programming related. It's probably not related to the act of development, though. And calling people "losers" for upholding the policies of the website is a bit of a stretch; please don't employ personal epithets on this site. – Lightness Races in Orbit May 22 '11 at 14:11
  • 3
    @tomalak Let me be fair to you; my first inclination is to blow you off, but you make a valid point. However, my complaint or "personal epithet" is much more concerning the increasingly frustrating state that SO is falling in to, whereby questions pertaining to daily programming tasks are often moved to programmers or superuser simply because the crowd here doesn't see the same utility in the question. – San Jacinto May 23 '11 at 19:52
  • 1
    @SanJacinto: Then it would be more appropriate on [meta](http://meta.stackoverflow.com). :) – Lightness Races in Orbit May 23 '11 at 20:27

3 Answers3

5

Context-sensitivity and ambiguity are entirely orthogonal. There exist ambiguous context-free languages and unambiguous context-sensitive languages.

A context-sensitive language is a formal language that can be parsed by a context-sensitive grammar (CSG). Every context-free language is also a context-sensitive language since context-free grammars are simplified context-sensitive languages. Not every formal language is context-sensitive though; there are languages that even a CSG cannot describe.

Fred Foo
  • 355,277
  • 75
  • 744
  • 836
  • This answer could be improved with examples (even links to examples). As it stands, it simply confirms OP's assertions. – OJFord Jan 26 '15 at 03:50
  • Saying that they don't even correlate is doubtful after we have observed that ambiguities are resolved by the context I think that you need to provide more ground for your statement. Saying `they are orthogonal` is too weak. Furthermore, we observe all our ambiguities in the context-free languages and think of context-sensitive as much less ambigous. Saying that there exist ambigous context-free languages does not make sense at all. So, everything is against your statement of orthogonality. – Valentin Tihomirov Dec 26 '15 at 20:30
  • The basic facts from the parsing theory, about context-free being a special case of context-sensitive that you spend 2/3 of your answer does not contribute anything. I know it for 2 decades. How did it help me to avoid the deception that OP falls in? I believe exactly like OP that context saves you from many ambguities that you have in context-free grammars. What the fact that context-free is subset of context sensitive change in this deception? – Valentin Tihomirov Dec 26 '15 at 20:33
4

If you want to parse a context-sensitive language with a context-free parser, you define an a context-free grammar that accepts a superset of a context-sensitive language (because they are less powerful). Because you accept a superset, you might get ambiguities or false-positive results, which must be resolved after the parsing.

Example One: a XML-like language that allows for any tag name. Because context-free grammar cannot parse a sentence ww that consists of two repetitive words w = {a,b}+, it cannot parse <ID></ID> where IDs are equal as well. Thus one defines a context-free grammar that accepts a superset:

start -> elem
elem  -> open elem* close
open  -> '<' ID '>'
close -> '</' ID '>'
ID    -> ('a' / 'b')+

This obviously parses even the sentences that one doesn't want, therefore an extra check for equivalent IDs in open and close has to be done.

Example Two: C-like Typedef in a very simple language. Imagine a language that contains only typedef, pointers and multiplications. It has only two IDs, a and b. An example of such a language:

typedef a;
b * a;                    // multiplication
a * b;                    // b is pointer to type a 

The context-free grammar would be like:

start                     -> typedef multiplication-or-pointer+
typedef                   -> 'typedef' ID ';'
multiplication-or-pointer -> ID '*' ID ';'
ID                        -> 'a'
ID                        -> 'b'

The grammar does not accept superset, but it does not know if it sees multiplication or pointer, thus it is ambiguous. And therefore one has to go through the result and decide, if it is multiplication or pointer, depending what type is defined in typedef.

With context-sensitive grammar one can do much more. Very roughly (and imprecisely):

start                                     -> typedef multiplication-or-pointer+
typedef                                   -> 'typedef' ID ';'
multiplication-or-pointer                 -> pointer / multiplication
'typedef' 'a' ';' WHATEVER pointer        -> 'a' '*' ID   
'typedef' 'b' ';' WHATEVER pointer        -> 'b' '*' ID   
'typedef' 'b' ';' WHATEVER multiplication -> 'a' '*' ID
'typedef' 'a' ';' WHATEVER multiplication -> 'b' '*' ID
ID                                        -> 'a'
ID                                        -> 'b'

Please note, that what I show here is not precise, because I limited number of IDs. In general, there can be an infinite number of IDs. You can write a context-sensitive grammar for a general case (even though it must be absolutely unintuitive), but you can't write a context free grammar.

Regarding your Edit 1: I hope the previous example answers that.

Regarding your Edit 2: There are another tricks how to express that so the rules are not so limited, but they are usually mind-blowing and IMO it is the reason why nobody uses CSG formalism.

NB: context-sensitive grammar is equivalent to a linear bounded automaton, context-free grammar is equivalent to a pushdown automaton. It is not right to say that context-free parser is an opposite of context-sensitive parser.

jk_
  • 5,448
  • 5
  • 25
  • 23
0

Compilers do not use "pure" (whatever that may mean) grammars to do their parsing - they are real-world programs that do what all real-world programs do - apply heuristics in certain situations. This is why C++ compilers (and compilers for most other languages, except for undergrad exercises) are not produced using compiler generators.