I am relatively new to Spirit Qi, and am trying to parse an assembler-like language.
For example, I'd like to parse:
Func Ident{
    Mov name, "hello"
    Push 5
    Exit
}
So far, so good. I can parse it properly. However, the error handler sometimes comes up with strange error locations. Take for example the following faulty code:
Func Ident{
    Mov name "hello" ; <-- comma is missing here
    Push 5
    Exit
}
Here are the rules involved in this parsing:
gr_function = lexeme["Func" >> !(alnum | '_')] // Ensure whole words
            > gr_identifier
            > "{"
            > *( gr_instruction
               | gr_label
               | gr_vardecl
               | gr_paramdecl )
            > "}";

gr_instruction = gr_instruction_names
               > gr_operands;

gr_operands = -(gr_operand % ',');
The parser notices the error, but complains about a missing "}" after the Mov. I have a feeling that the issue is in the definition for "Func", but I cannot pinpoint it. I'd like the parser to complain about a missing ",". It would be fine if it also complained about consequential errors, but it should definitely pinpoint the missing comma as the culprit.
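To make the symptom concrete, here is a hand-written model of the control flow in plain C++. It is not Spirit code: the pre-split token stream and the helpers `is_instruction`, `is_operand`, `parse_operands` and `parse_body` are made up for illustration, and the model only covers the instruction alternative of the function body (the tokens start after the opening "{"). If my reading of `-(gr_operand % ',')` is right, the list simply ends when no comma follows, so the first complaint comes from the enclosing `> "}"`.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical token stream: the input is assumed to be split into tokens already.
using Tokens = std::vector<std::string>;

// Stand-in for gr_instruction_names (assumed set, for illustration only).
inline bool is_instruction(const std::string& t) {
    static const std::set<std::string> names{"Mov", "Push", "Exit"};
    return names.count(t) != 0;
}

// Stand-in for gr_operand: anything that is not punctuation or an instruction name.
inline bool is_operand(const std::string& t) {
    return t != "," && t != "{" && t != "}" && !is_instruction(t);
}

// Models -(gr_operand % ','): consumes "op , op , op" and stops quietly as
// soon as the next token is not a comma followed by an operand.
// It never reports an error; it just returns the index where it stopped.
inline std::size_t parse_operands(const Tokens& in, std::size_t pos) {
    if (pos >= in.size() || !is_operand(in[pos])) return pos;  // empty list is allowed
    ++pos;
    while (pos + 1 < in.size() && in[pos] == "," && is_operand(in[pos + 1]))
        pos += 2;
    return pos;
}

// Models *(gr_instruction) > "}" for the tokens after the opening brace.
// Returns true on success; otherwise stores the index of the token at which
// "}" was expected in error_pos and returns false.
inline bool parse_body(const Tokens& in, std::size_t& error_pos) {
    std::size_t pos = 0;
    while (pos < in.size() && is_instruction(in[pos]))
        pos = parse_operands(in, pos + 1);
    if (pos < in.size() && in[pos] == "}") return true;
    error_pos = pos;
    return false;
}
```

With the tokens `Mov name "hello" Push 5 Exit }`, this model stops the operand list after `name` and reports the expected "}" at the `"hello"` token, which matches the location my error handler prints.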
I have tried variations such as:
gr_operands = -( gr_operand
                 >> *( ','
                       > gr_operand )
               );
I tried other variations as well, but they produced other strange errors.
Does anyone have an idea of how to say "Ok, you may have an instruction without operands, but if you find one, and there is no comma before the next, fail at the comma"?
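In hand-written terms, the behaviour I am after looks roughly like the sketch below. Again this is plain C++ rather than Qi, and every name in it (`ExpectationError`, `is_operand`, `parse_operands_strict`) is hypothetical. It assumes that nothing which may legally follow an instruction looks like an operand; whether that holds for gr_label, gr_vardecl and gr_paramdecl is a separate matter.

```cpp
#include <cstddef>
#include <set>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical token stream: the input is assumed to be split into tokens already.
using Tokens = std::vector<std::string>;

// Hypothetical error type carrying the token index at which the parse failed.
struct ExpectationError : std::runtime_error {
    std::size_t pos;
    ExpectationError(const std::string& what, std::size_t p)
        : std::runtime_error(what), pos(p) {}
};

// Stand-in for gr_operand: anything that is not punctuation or an
// instruction name (assumed set, for illustration only).
inline bool is_operand(const std::string& t) {
    static const std::set<std::string> not_operands{",", "{", "}", "Mov", "Push", "Exit"};
    return not_operands.count(t) == 0;
}

// Parses an optional operand list starting at pos and returns the index just
// past it. Zero operands is fine. After the first operand, every further
// operand must be preceded by a comma; an operand that directly follows
// another operand is reported as a missing comma at that position.
inline std::size_t parse_operands_strict(const Tokens& in, std::size_t pos) {
    if (pos >= in.size() || !is_operand(in[pos])) return pos;  // no operands at all
    ++pos;
    while (pos < in.size()) {
        if (in[pos] == ",") {
            // Once a comma is consumed, an operand is mandatory.
            if (pos + 1 >= in.size() || !is_operand(in[pos + 1]))
                throw ExpectationError("expected operand", pos + 1);
            pos += 2;
        } else if (is_operand(in[pos])) {
            throw ExpectationError("expected ','", pos);  // the case I want reported
        } else {
            break;  // something else follows: the list is complete
        }
    }
    return pos;
}
```

For `Mov name "hello"` this throws "expected ','" at the `"hello"` token, while `Exit` with no operands and `Mov name, "hello"` both pass. What I cannot work out is how to express the middle branch in Qi.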
UPDATE
Thank you for your answers so far. The gr_operand is defined as follows:
gr_operand = ( gr_operand_intlit
             | gr_operand_flplit
             | gr_operand_strlit
             | gr_operand_register
             | gr_operand_identifier );

gr_operand_intlit = int_;
gr_operand_flplit = double_;

gr_operand_strlit = '"'
                  > strlitcont
                  > '"'
                  ;

gr_operand_register = gr_register_names;

// TODO: Must also not accept the keywords from the statement grammar
gr_operand_identifier = !(gr_instruction_names | gr_register_names)
                      >> raw[
                             lexeme[(alpha | '_') >> *(alnum | '_')]
                         ];

escchar.name("\\\"");
escchar = '\\' >> char_("\"");

strlitcont.name("String literal content");
strlitcont = *( escchar | ~char_('"') );
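For reference, this is how I understand the string-literal rules to behave, written as a hand-rolled scanner in plain C++ rather than Qi. The function name `find_closing_quote` is made up for illustration, and the sketch ignores any skipper that may be active in the real grammar.

```cpp
#include <cstddef>
#include <string>

// Models strlitcont followed by the closing '"'. `start` is the index just
// after the opening quote. Returns the index of the closing quote, or
// std::string::npos if the input ends before an unescaped quote is found.
inline std::size_t find_closing_quote(const std::string& s, std::size_t start) {
    std::size_t i = start;
    while (i < s.size()) {
        if (s[i] == '\\' && i + 1 < s.size() && s[i + 1] == '"')
            i += 2;       // escchar: a backslash followed by a quote
        else if (s[i] != '"')
            ++i;          // ~char_('"'): any other character, including a lone backslash
        else
            return i;     // an unescaped quote ends the literal
    }
    return std::string::npos;
}
```

Only `\"` counts as an escape in this model; a backslash followed by anything else is consumed as an ordinary character.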