
Here is a QML grammar (extracted from https://github.com/kropp/intellij-qml/blob/master/grammars/qml.bnf):

/* identifier, value, integer and float are terminals */

qml ::= object  /* Simplified */

object ::= type body
body ::= '{' (property_definition|signal_definition|attribute_assignment|method_attribute)* '}'
type ::= 'double'|'real'|identifier

attribute_assignment ::= (attribute ':')? attribute_value ';'?
item ::= list|object|string|boolean|number|identifier|value
attribute_value ::= method_call|method_body|item|value+

property_definition ::= 'default'? 'readonly'? 'property' ('alias'|'var'|type) property (':' attribute_value)?
signal_definition ::= 'signal' signal ('(' (signal_parameter ',')* signal_parameter? ')')?
signal_parameter ::= ('var'|type) parameter

method_attribute ::= 'function' method '(' (parameter ',')* parameter? ')' method_body

method_call ::= method '(' (argument ',')* argument? ')'

method_body ::= '{' javascript '}'
javascript ::= ('{' javascript '}'|'var'|'['|']'|'('|')'|','|':'|';'|string|identifier|number|value)*

list ::= '[' item? (',' item)* ']'

property ::= identifier
attribute ::= identifier
signal ::= identifier
parameter ::= identifier
method ::= identifier
argument ::= string|boolean|number|identifier|value

number ::= integer|float
boolean ::= 'true'|'false'

Is it LALR(1)? My program raises a reduce/reduce conflict for the closure I[n] which contains the conflicting items:

// other items here...
[item ::= identifier . , {]  // -> ACTION[n, {] = reduce to item  
[type ::= identifier . , {]  // -> ACTION[n, {] = reduce to type  
// other items here...
rici
Swordow
  • I edited your question to include the grammar you're referring to (or, at least, part of it). It's not clear to me where `identifier` and `value` come from, since they are not defined in the file you link to; I assume they are tokens. It's considered bad style on SO to include essential content for a question as a link. If the Github repo's owner edits the file, which they could do at any time, it could make the question and any answer meaningless. Please avoid such links in the future. – rici Dec 12 '19 at 20:59
  • Thanks!! `identifier` is a token in almost all cases. Also, I assumed that `value` is a token. – Swordow Dec 13 '19 at 00:13

1 Answer


Note:

The following answer was written on the basis of the information provided in the question. As it happens, the actual implementation of QML only accepts user declarations for types whose names start with an upper case letter, while names of properties must start with a lower case letter. (Many built-in types have names which start with lower case letters, too. So it's not quite as simple as just dividing identifiers into two categories in the lexical scan based on their first letter. Built-in types and keywords still need to be recognised as such.)

Unfortunately, I haven't been able to find a definitive QML grammar, or even a formal description of the syntax. The comments above were based on Qt's QML Reference.

Thanks to @mishmashru for bringing the above to my attention.


The grammar is ambiguous, so the parser generator correctly reports a reduce/reduce conflict.

In particular, consider the following simplified productions extracted from the grammar, where most alternatives have been removed to focus on the conflict:

body ::= '{' attribute_assignment* '}'
attribute_assignment ::= attribute_value
attribute_value ::= method_body | item
method_body ::= '{' javascript '}'
item ::= object | identifier
object ::= type body
type ::= identifier

Now, consider the body which starts

{ x {

We'll suppose that the parser has just seen x and is now looking at the second {, to figure out what action(s) to take.

If x is an ordinary identifier (whatever "ordinary" might mean), then it can resolve to item, which is an alternative for attribute_value. Then the second { presumably starts a method_body, which is also an alternative for attribute_value.

If, on the other hand, x is a type, then we're looking at an object, which starts type body. And in that case the second { is the start of the interior body.

So the parser needs to decide whether to make x into an attribute_value directly, or to make it into a type. The decision cannot be made at this point, because the { lookahead token doesn't provide enough information.

So it's clear that the grammar is not LR(1). Since every LALR(1) grammar is also LR(1), it is not LALR(1) either.
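To make the ambiguity concrete, here is a minimal Python sketch that counts the distinct parses of a token sequence under the simplified productions above. It is only an illustration, not how a parser generator detects the conflict. The function names are made up for this example, the token `id` stands for `identifier`, and `javascript` is cut down (an assumption) to nested braces and identifiers.

```python
from collections import Counter

# Each function maps a start position to a Counter {end_position: number_of_parses}.

def tok(toks, i):
    """Return the token at position i, or None past the end."""
    return toks[i] if i < len(toks) else None

def javascript(toks, i):
    # javascript ::= ('{' javascript '}' | id)*   (cut-down assumption)
    ends = Counter({i: 1})                      # zero elements
    if tok(toks, i) == "id":
        for e, n in javascript(toks, i + 1).items():
            ends[e] += n
    elif tok(toks, i) == "{":
        for m, n1 in javascript(toks, i + 1).items():
            if tok(toks, m) == "}":
                for e, n2 in javascript(toks, m + 1).items():
                    ends[e] += n1 * n2
    return ends

def method_body(toks, i):
    # method_body ::= '{' javascript '}'
    ends = Counter()
    if tok(toks, i) == "{":
        for m, n in javascript(toks, i + 1).items():
            if tok(toks, m) == "}":
                ends[m + 1] += n
    return ends

def body(toks, i):
    # body ::= '{' attribute_assignment* '}'
    ends = Counter()
    if tok(toks, i) == "{":
        for m, n in assignments(toks, i + 1).items():
            if tok(toks, m) == "}":
                ends[m + 1] += n
    return ends

def item(toks, i):
    # item ::= object | id ;  object ::= type body ;  type ::= id
    ends = Counter()
    if tok(toks, i) == "id":
        ends[i + 1] += 1                        # reduce id to item
        ends.update(body(toks, i + 1))          # reduce id to type, then parse a body
    return ends

def assignments(toks, i):
    # attribute_assignment* where attribute_value ::= method_body | item
    ends = Counter({i: 1})
    first = method_body(toks, i) + item(toks, i)
    for m, n1 in first.items():                 # m > i, so the recursion terminates
        for e, n2 in assignments(toks, m).items():
            ends[e] += n1 * n2
    return ends

def count_parses(toks):
    """Number of ways the whole token list parses as a body."""
    return body(toks, 0)[len(toks)]

print(count_parses("{ id { } }".split()))       # 2
print(count_parses("{ id }".split()))           # 1
```

The input `{ id { } }` has two parses: `id` as an item followed by an empty method_body, or `id` as the type of an object with an empty body. That is the same choice the parser faces with `{` as its lookahead.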

Without knowing anything more about the problem domain, it's hard to give good advice. If it is possible to distinguish identifier and type, perhaps by consulting a symbol table, then you could solve this problem by using some kind of lexical feedback.
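As a rough sketch of what such lexical feedback could look like, the Python snippet below has the scanner consult a table of known type names and emit a separate token kind for them. Everything here is an assumption made for illustration: the token kinds `TYPE_NAME` and `IDENTIFIER`, the contents of `KNOWN_TYPES` and `KEYWORDS`, and the very crude token pattern.

```python
import re

# Hypothetical tables; a real front end would fill the type table from
# imports and built-in registrations as it goes.
KEYWORDS = {"property", "signal", "function", "default", "readonly", "alias", "var"}
KNOWN_TYPES = {"Rectangle", "Item", "double", "real"}

# A name, or any single non-space character (punctuation such as '{' or '}').
TOKEN_RE = re.compile(r"\s*(?:([A-Za-z_]\w*)|(\S))")

def tokenize(source, known_types=KNOWN_TYPES):
    """Split source into (kind, text) pairs, classifying names via the type table."""
    tokens = []
    for name, punct in TOKEN_RE.findall(source):
        if not name:
            tokens.append((punct, punct))           # punctuation is its own kind
        elif name in KEYWORDS:
            tokens.append(("KEYWORD", name))
        elif name in known_types:
            tokens.append(("TYPE_NAME", name))      # lexical feedback: known type
        else:
            tokens.append(("IDENTIFIER", name))
    return tokens

for token in tokenize("Rectangle { width { } }"):
    print(token)
```

With the grammar changed so that `type` is derived from `TYPE_NAME` and `item` from `IDENTIFIER`, the two reductions in the conflict shown in the question would be triggered by different tokens, so that particular reduce/reduce conflict on `{` would no longer arise. Whether the rest of the grammar is then conflict-free is a separate question.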

Community
rici
  • Thanks! As you said, it is not LR(1), so I think it is not the actual QML grammar used in Qt, and I was misled by the qml.bnf on GitHub. The QML grammar used in Qt is an LALR grammar. QLALR is a parser generator for LALR grammars. It is used to generate front ends for QML (https://code.qt.io/cgit/qt/qlalr.git/tree/), but I could not find the correct grammar for QML. Maybe I could remove `method_call`, `method_body`, `javascript`, `method`, `method_attribute`, `value` and `argument`, which are used to support embedded JavaScript, and the rest would be LR(1). – Swordow Dec 13 '19 at 01:15
  • I rewrote the grammar: I removed the token `value`, whose meaning I could not figure out, added one rule `lambda ::= 'function' method_body`, and changed `attribute_value` to `attribute_value ::= method_call | lambda | item`. The result is LR(1). – Swordow Dec 13 '19 at 02:13
  • To resolve the 'item' vs 'type' conflict, QML uses the following trick: all type names start with a capital letter. So we just need to rewrite the 'type' rule as type: id_with_capital_latter and introduce this nonterminal in the grammar – mishmashru Jan 15 '20 at 00:27
  • @mishmashru: Interesting. That's certainly not evident from the question, nor from the cited grammar. Do you have a reference, by chance? – rici Jan 15 '20 at 00:41
  • @mishmashru: Interesting. Following the URL rici provided, I found this: "Note that in both cases, the type name must begin with an uppercase letter in order to be declared as a QML object type in a QML file" from [QML Object Types](https://doc.qt.io/qt-5/qtqml-typesystem-objecttypes.html) and [Defining Object Types through QML Documents](https://doc.qt.io/qt-5/qtqml-documents-definetypes.html#naming-custom-qml-object-types) – Swordow Feb 02 '20 at 08:41