So, I've been playing with the
What I have done is add a rule to parse string literals, so that I can parse and compile programs like this one (functionality that is already built in):
int ret(int x) {
    return x;
}
int main() {
    int x = 5;
    return ret(x)*2;
}
As well as programs like this one (the functionality I want to add):
string print(string s) {
    return s;
}
int main() {
    string foo = "bar";
    print(foo);
    return 0;
}
Whether or not these two examples compile with, say, gcc is inconsequential.
So, the gist of what I added is the following:
Within the file expression_def.hpp (production rule 'quoted_string' has been added):
quoted_string = '"' >> *('\\' >> char_ | ~char_('"')) >> '"'; // ADDED THIS
primary_expr =
        uint_
    |   quoted_string // ADDED THIS
    |   function_call
    |   identifier
    |   bool_
    |   '(' > expr > ')'
    ;
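For anyone not fluent in Qi, here is a rough hand-written sketch of what I understand the `quoted_string` rule to match. It is not part of the tutorial: `parse_quoted` is a hypothetical helper of my own, it uses only the standard library (C++17), and it assumes the literal `'\\'` contributes nothing to the attribute, so only the escaped character is kept.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>

// Hypothetical stand-in for: '"' >> *('\\' >> char_ | ~char_('"')) >> '"'
// On success returns the string contents and advances pos past the closing quote.
// On failure returns std::nullopt and leaves pos untouched.
inline std::optional<std::string> parse_quoted(std::string const& in, std::size_t& pos)
{
    std::size_t i = pos;
    if (i >= in.size() || in[i] != '"')
        return std::nullopt;              // opening quote is required
    ++i;

    std::string out;
    while (i < in.size())
    {
        if (in[i] == '\\' && i + 1 < in.size())
        {
            out.push_back(in[i + 1]);     // '\\' >> char_ : keep the escaped char only
            i += 2;
        }
        else if (in[i] != '"')
        {
            out.push_back(in[i]);         // ~char_('"') : any char except the quote
            ++i;
        }
        else
        {
            break;                        // reached an unescaped quote
        }
    }

    if (i >= in.size() || in[i] != '"')
        return std::nullopt;              // closing quote is required
    pos = i + 1;
    return out;
}
```

No whitespace is skipped anywhere inside the quotes, which is the behaviour I am after by declaring the rule without a skipper.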
Within ast.hpp, the variant type 'std::string' has been added:
typedef boost::variant<
        nil
      , bool
      , unsigned int
      , std::string // ADDED THIS
      , identifier
      , boost::recursive_wrapper<unary>
      , boost::recursive_wrapper<function_call>
      , boost::recursive_wrapper<expression>
    >
operand;
Here is the rule declaration for the addition, as well as the rule it's colliding with:
qi::rule<Iterator, std::string(), skipper<Iterator> > identifier;
qi::rule<Iterator, std::string()> quoted_string; // declaring this without the skipper
// lets us avoid the lexeme[] incantation (thanks @sehe).
The problem now is that the compiler confuses what should be an 'identifier' with a 'quoted_string', or actually just with a std::string.
My guess is that the cause of the problem is that both rules have std::string as their attribute (signature) type, but I don't know a good workaround here. Additionally, the 'identifier' struct has a data member of type std::string that it is initialized with, so the compiler really cannot tell the two apart, and the std::string alternative of the variant ends up being the better match.
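To make the suspected ambiguity concrete, here is a minimal sketch of it outside of Spirit. Everything in it is an assumption for illustration: `std::variant` (C++17) stands in for `boost::variant`, `identifier` is a simplified copy of the tutorial's struct, and `string_literal` is a hypothetical wrapper type that does not exist in the tutorial. It only shows which alternative a plain std::string lands in; whether Spirit's attribute propagation behaves the same way is exactly what I am unsure about.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <variant>

// Simplified stand-in for the tutorial's identifier: implicitly
// constructible from a std::string.
struct identifier
{
    identifier(std::string const& n = "") : name(n) {}
    std::string name;
};

// Hypothetical wrapper giving string literals their own type. The
// constructor is explicit, so a bare std::string never converts to it.
struct string_literal
{
    explicit string_literal(std::string const& v) : value(v) {}
    std::string value;
};

// Mirrors the variant with a raw std::string alternative.
using ambiguous_operand = std::variant<unsigned int, std::string, identifier>;

// Same shape, but the string alternative is the distinct wrapper type.
using tagged_operand = std::variant<unsigned int, string_literal, identifier>;

// Index of the alternative selected when a plain std::string is stored.
inline std::size_t ambiguous_index(std::string const& s)
{
    ambiguous_operand v = s;   // exact match: the std::string alternative wins
    return v.index();
}

inline std::size_t tagged_index(std::string const& s)
{
    tagged_operand v = s;      // only identifier is implicitly constructible
    return v.index();
}
```

With the raw std::string alternative present, a string coming from the identifier rule is stored as a string (index 1) rather than as an identifier (index 2). With a distinct wrapper type in its place, the same string ends up as an identifier. A wrapper like this might be one possible direction for a workaround, but I have not tried it against the actual grammar.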
Now, if I change std::string to char* like so:
typedef boost::variant<
        nil
      , bool
      , unsigned int
      , char* // CHANGED, YET AGAIN
      , identifier
      , boost::recursive_wrapper<unary>
      , boost::recursive_wrapper<function_call>
      , boost::recursive_wrapper<expression>
    >
operand;
it will compile and work with integers, but then I am unable to parse strings (in fact, VS will call abort()). It should be noted that, because each variant type needs an overload, I have something in my code along the lines of:
bool compiler::operator()(std::string const& x)
{
    BOOST_ASSERT(current != 0);
    current->op(op_string, x);
    return true;
}
and
void function::op(int a, std::string const& b)
{
    code.push_back(a);
    code.push_back(b.size());
    for (uintptr_t ch : b)
    {
        code.push_back(ch);
    }
    size_ += 2 + b.size();
}
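To spell out the layout that this overload produces, here is a small standalone sketch of the encoding together with a matching decoder. The names `emit_string` and `read_string` and the opcode value are hypothetical and not from the tutorial. I am assuming the code vector is a `std::vector<int>` and that a string is stored as `[opcode, length, ch0, ch1, ...]`.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical opcode value, chosen only for this sketch.
enum sketch_byte_code { sketch_op_string = 2 };

// Append [opcode, length, ch0, ch1, ...] to the code vector,
// one int per character.
inline void emit_string(std::vector<int>& code, std::string const& s)
{
    code.push_back(sketch_op_string);
    code.push_back(static_cast<int>(s.size()));
    for (unsigned char ch : s)
        code.push_back(ch);
}

// Read a string operand back. pc must point at the opcode; on return it
// points just past the last character.
inline std::string read_string(std::vector<int> const& code, std::size_t& pc)
{
    assert(pc + 1 < code.size() && code[pc] == sketch_op_string);
    std::size_t const len = static_cast<std::size_t>(code[pc + 1]);
    assert(pc + 2 + len <= code.size());

    std::string out;
    for (std::size_t i = 0; i < len; ++i)
        out.push_back(static_cast<char>(code[pc + 2 + i]));
    pc += 2 + len;
    return out;
}
```

Emitting "bar" this way appends five ints (opcode, 3, 'b', 'a', 'r'), which matches the `2 + b.size()` bookkeeping in the overload above.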
These both work swimmingly when I need to parse strings (of course sacrificing the ability to handle integers).
Their integer equivalents (found in compiler.cpp) are:
bool compiler::operator()(unsigned int x)
{
    BOOST_ASSERT(current != 0);
    current->op(op_int, x);
    return true;
}
and of course:
void function::op(int a, int b)
{
    code.push_back(a);
    code.push_back(b);
    size_ += 2;
}
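What I am ultimately after is both overloads living side by side in one visitor. Here is a stripped-down sketch of that goal, again outside of Spirit and with assumptions throughout: `std::variant` and `std::visit` (C++17) stand in for `boost::variant` and `boost::apply_visitor`, and `emitter`, `literal`, `emit` and the opcode values are hypothetical names of my own.

```cpp
#include <cassert>
#include <string>
#include <variant>
#include <vector>

// Hypothetical opcode values, chosen only for this sketch.
enum sketch_op { sk_int = 1, sk_string = 2 };

// One overload per alternative, like the compiler visitor above.
struct emitter
{
    std::vector<int>& code;

    bool operator()(unsigned int x) const
    {
        code.push_back(sk_int);
        code.push_back(static_cast<int>(x));
        return true;
    }

    bool operator()(std::string const& s) const
    {
        code.push_back(sk_string);
        code.push_back(static_cast<int>(s.size()));
        for (unsigned char ch : s)
            code.push_back(ch);
        return true;
    }
};

// Reduced operand type holding only the two literal kinds.
using literal = std::variant<unsigned int, std::string>;

// Dispatch to the overload that matches the stored alternative.
inline bool emit(std::vector<int>& code, literal const& v)
{
    return std::visit(emitter{code}, v);
}
```

In this reduced form an integer and a string can be emitted into the same vector without either overload getting in the way of the other, so I suspect the overloads themselves are fine and the trouble is in how the parsed attribute is matched to a variant alternative.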
If I have to change the variant type from std::string to char*, then I have to update the overloads, and because of C legacies, it gets to look a bit ugly.
I understand it might seem daunting and unappealing to comb through the source, but I assure you it really isn't. This compiler tutorial simply pushes bytecode into a vector, which by design only handles integers. I am trying to modify it to handle strings as well, hence the additions and overloads, as well as the need for uintptr_t. Anyone familiar with the material and/or Boost will likely know exactly what they are looking at (ahem, @sehe, ahem!).