Follow-Up Question

So, I've been playing with the Boost Mini C Tutorial.

What I have done is add a rule to parse string literals. The goal is to be able to parse and compile programs like this (functionality already built in):

int ret(int x) {
    return x;
}

int main() {
   int x = 5;
   return ret(x)*2;
}

as well as this (the functionality I want to add):

string print(string s) {
   return s;
}

int main() {
   string foo = "bar";
   print(foo);
   return 0;
}

Whether or not the last two examples compile with, say, gcc is inconsequential.

So, the gist of what I added is the following:

Within the file expression_def.hpp (production rule 'quoted_string' has been added):

quoted_string = '"' >> *('\\' >> char_ | ~char_('"')) >> '"'; // ADDED THIS

primary_expr =
        uint_
    |   quoted_string // ADDED THIS
    |   function_call
    |   identifier
    |   bool_
    |   '(' > expr > ')'
    ;

Within ast.hpp, the variant alternative 'std::string' has been added:

typedef boost::variant<
        nil
      , bool
      , unsigned int
      , std::string // ADDED THIS
      , identifier
      , boost::recursive_wrapper<unary>
      , boost::recursive_wrapper<function_call>
      , boost::recursive_wrapper<expression>
    >
    operand;

Here is the rule declaration for the addition, as well as the rule it's colliding with:

qi::rule<Iterator, std::string(), skipper<Iterator> > identifier;
qi::rule<Iterator, std::string()> quoted_string; // declaring this without the skipper 
                             // lets us avoid the lexeme[] incantation (thanks @sehe).

The problem now is that the compiler confuses what should be an 'identifier' for a 'quoted_string', or actually just a std::string.

My guess is that the problem arises because both rules expose a std::string attribute, but I don't know a good workaround. Additionally, the 'identifier' struct has a std::string data member that it is initialized from, so the compiler cannot distinguish the two, and the std::string alternative of the variant ends up being the better match.
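To make the suspected mechanism concrete, here is a minimal sketch that uses C++17 std::variant as a stand-in for boost::variant. Their conversion rules are not identical, so treat this as an analogy rather than a reproduction. The `identifier` and `string_literal` structs, the two operand aliases, and the helper functions are hypothetical names, and the implicit constructor on `identifier` is an assumption about the tutorial's struct. The sketch suggests that a bare std::string alternative captures a std::string attribute as an exact match, whereas wrapping the literal in its own type with an explicit constructor leaves `identifier` as the only implicit target.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <variant>

// Hypothetical stand-in for the tutorial's identifier struct; assumed to be
// implicitly constructible from std::string.
struct identifier {
    identifier(std::string n = "") : name(std::move(n)) {}
    std::string name;
};

// Hypothetical wrapper that gives string literals their own AST type.
// The constructor is explicit, so a std::string never converts to it silently.
struct string_literal {
    explicit string_literal(std::string v) : value(std::move(v)) {}
    std::string value;
};

// Mirrors the modified operand: a bare std::string sits next to identifier.
using plain_operand = std::variant<unsigned int, std::string, identifier>;

// Alternative layout: the literal is wrapped in its own type.
using tagged_operand = std::variant<unsigned int, string_literal, identifier>;

// Roughly what happens when a rule exposing std::string feeds the variant.
inline bool lands_as_identifier_plain(const std::string& attr) {
    plain_operand op = attr;  // exact match: the std::string alternative wins
    return std::holds_alternative<identifier>(op);
}

inline bool lands_as_identifier_tagged(const std::string& attr) {
    tagged_operand op = attr;  // only identifier converts implicitly
    return std::holds_alternative<identifier>(op);
}

// A literal must be wrapped explicitly to land in its own alternative.
inline bool literal_is_tagged(const std::string& text) {
    tagged_operand op = string_literal{text};
    return std::holds_alternative<string_literal>(op);
}
```

With this layout the quoted_string rule would have to produce the wrapper type rather than a bare std::string; whether that is the right fix for the Mini C grammar is untested here.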

Now, if I change std::string to char* like so:

typedef boost::variant<
        nil
      , bool
      , unsigned int
      , char* // CHANGED, YET AGAIN
      , identifier
      , boost::recursive_wrapper<unary>
      , boost::recursive_wrapper<function_call>
      , boost::recursive_wrapper<expression>
    >
    operand;

it will compile and work with integers, but then I am unable to parse strings (in fact, VS will call abort()). It should be noted that because each variant alternative needs an overload, I have something in my code along the lines of:

bool compiler::operator()(std::string const& x)
{
    BOOST_ASSERT(current != 0);
    current->op(op_string, x);
    return true;
}

and

void function::op(int a, std::string const& b)
{
    code.push_back(a);
    code.push_back(b.size());
    for (uintptr_t ch : b)
    {
        code.push_back(ch);
    }
    size_ += 2 + b.size();
}

These both work swimmingly when I need to parse strings (of course sacrificing the ability to handle integers).
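For reference, the length-prefixed layout that the string overload of op() produces can be sketched in isolation, together with a matching reader. This is a simplified, hypothetical version: `emit_string`, `read_string`, and the value chosen for `op_string` are assumptions for illustration and are not part of the tutorial. Characters go through unsigned char so that bytes above 127 do not sign-extend into large values.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical opcode value; the tutorial defines its own opcode enum.
enum : int { op_string = 100 };

// Append [op_string, length, ch0, ch1, ...] to the integer code vector.
inline void emit_string(std::vector<int>& code, const std::string& s) {
    code.push_back(op_string);
    code.push_back(static_cast<int>(s.size()));
    for (unsigned char ch : s) {  // unsigned char avoids sign extension
        code.push_back(ch);
    }
}

// Read a string back starting at pc, which must point at an op_string.
// On return, pc has been advanced past the encoded string.
inline std::string read_string(const std::vector<int>& code, std::size_t& pc) {
    assert(pc + 1 < code.size() && code[pc] == op_string);
    const std::size_t len = static_cast<std::size_t>(code[pc + 1]);
    assert(pc + 2 + len <= code.size());
    std::string out;
    out.reserve(len);
    for (std::size_t i = 0; i < len; ++i) {
        out.push_back(static_cast<char>(code[pc + 2 + i]));
    }
    pc += 2 + len;
    return out;
}
```

Emitting "bar" this way occupies five slots (opcode, length, three characters), which matches the `size_ += 2 + b.size()` bookkeeping above.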

Their integer equivalents (found in compiler.cpp) are:

bool compiler::operator()(unsigned int x)
{
    BOOST_ASSERT(current != 0);
    current->op(op_int, x);
    return true;
}

and of course:

void function::op(int a, int b)
{
    code.push_back(a);
    code.push_back(b);
    size_ += 2;
}
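As a point of comparison, the sketch below shows an integer overload and a string overload coexisting on a single visitor, again using C++17 std::variant and std::visit rather than boost::variant and boost::apply_visitor. The `operand` alias, the `emitter` struct, `compile_operands`, and the opcode values are all hypothetical and much simpler than the tutorial's compiler. It is only meant to illustrate that the two overloads themselves do not have to exclude one another.

```cpp
#include <string>
#include <variant>
#include <vector>

// Hypothetical opcode values; the tutorial uses its own enum.
enum : int { op_int = 1, op_string = 2 };

// Reduced operand with just the two alternatives under discussion.
using operand = std::variant<unsigned int, std::string>;

// Visitor with one overload per alternative, writing into a shared code vector.
struct emitter {
    std::vector<int>& code;

    bool operator()(unsigned int x) const {
        code.push_back(op_int);
        code.push_back(static_cast<int>(x));
        return true;
    }

    bool operator()(const std::string& s) const {
        code.push_back(op_string);
        code.push_back(static_cast<int>(s.size()));
        for (unsigned char ch : s) {
            code.push_back(ch);
        }
        return true;
    }
};

// Dispatch every operand to the overload matching the alternative it holds.
inline std::vector<int> compile_operands(const std::vector<operand>& ops) {
    std::vector<int> code;
    for (const operand& o : ops) {
        std::visit(emitter{code}, o);
    }
    return code;
}
```

Dispatch here depends only on which alternative the variant holds at run time, so any misrouting would have to come from how the variant was filled in the first place.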

If I have to change the variant type from std::string to char*, then I have to update the overloads, and because of C legacies, it gets to look a bit ugly.

I understand this might be a bit daunting and not really appealing to comb through the source, but I assure you it really isn't. This compiler tutorial simply pushes bytecode into a vector, which by design only handles integers. I am trying to modify it to handle strings as well, hence the additions and overloads, as well as the need for uintptr_t. Anyone familiar with the material and/or Boost will likely know exactly what they are looking at (ahem, @sehe, ahem!).

Dylan_Larkin
  • It's just not true. Nothing in that code is related to the added `op(....)` overload. It's about the visitors that handle the variant. My stream of ~2 weeks ago showed how to deal with this stage IIRC. It could have been something I did off line. Sorry for having been offline for a while – sehe Dec 09 '15 at 10:10
  • Hi @sehe, yes, last night after debugging I discovered a few things. Namely, when running the compiler on a simple int main() {int x = 5; return x;}, an op_load (op_code #16) should be emitted, but instead the overloaded op(string const&, int) is called and the resultant "op_string" is pushed, so you can only return the integer 1 rather than 5. I will change the post when I get into the office. I am aware that it has to do with the visitors that handle the variant's std::string. If I had a better way to describe it, I'd have solved it by now, so all I can do is reproduce it – Dylan_Larkin Dec 09 '15 at 12:49
  • Keep at it. I'm sorry I won't have time to pick it back up – sehe Dec 09 '15 at 12:50
  • @sehe, I'm really struggling! – Dylan_Larkin Dec 10 '15 at 12:25
  • @Dylan_Larkin I realized I never shared the work in progress I had, you can see what I did here https://gist.github.com/sehe/d06c5587d363ac5316fb - it's raw and unfinished, but at least I can show you something this way that you don't have to wait for. Recorded **[livecoding session](https://www.livecoding.tv/video/answering-boost-questions-daily-8/)** to match (from December 1st) – sehe Dec 10 '15 at 17:17
  • @sehe, that's just the source unedited. I don't see any changes. – Dylan_Larkin Dec 10 '15 at 17:20
  • Checking... hold on; Oh I know I remember now. It was lost and I started over, the changes were trivial enough in my mind, but ... Oh god. Sorry again then. You should be able to pick up the pieces from the stream... :( Don't remember how much I did off stream. – sehe Dec 10 '15 at 17:21
  • @sehe, I incorporated all of those changes last weekend (from the video, that is), and was very appreciative (as always). The problem is, once those changes have been made, we lose the ability to compile functions that return integers (the original intent of the mini compiler). – Dylan_Larkin Dec 10 '15 at 17:22
  • All the changes you made can be seen in the code that I posted with the question... – Dylan_Larkin Dec 10 '15 at 17:26
  • @sehe, I think this question is a great candidate for livecoding! – Dylan_Larkin Dec 14 '15 at 02:32
  • No, not all those changes can be seen. I was way ahead, and wanted to deliver a working thing. It'll be a good candidate once you will have specified a language (grammar + semantics) to begin with. I was having to make up how to declare variables (are variables typed?), how to assign string values or int values, how to handle (heterogeneous) operator overloads etc. (Does `"a" + "b"` mean something? And `"abcde" + 3`? And `"7" - 3`? _What_ do they mean?). Making it work _technically_ is less interesting. – sehe Dec 14 '15 at 07:56
  • Please, if there is something I am missing, point me to where I can find it. As far as integers are concerned, they are typed and this compiler will handle error checking in that respect. Yes, the addition of string literals will add a new layer for which all of those details will have to be addressed. So did you accidentally delete it? I was under the impression that you had moved on. I continue to toil over this ... – Dylan_Larkin Dec 14 '15 at 12:03
  • I did indeed accidentally reboot (wiping my tmpfs). The main thing you need to do is decide what you want to achieve (specifics). I can show you how to achieve it /after/ that. – sehe Dec 14 '15 at 12:50
  • First and foremost, I am interested in introducing strings and functions on strings (nothing too crazy). That being said, they would probably have to be C-strings. Pushing them char by char as we did before was close to what I wanted. As far as error checking, I can worry about that later. – Dylan_Larkin Dec 14 '15 at 13:06
  • A solution to my question would be a good start! – Dylan_Larkin Dec 14 '15 at 13:07
  • I can't answer it without more information. – sehe Dec 14 '15 at 13:26
  • I don't know what else to say. When we added std::string as a variant type, it killed the ability to perform functions on integers (in the example you were originally working on, on livecoding). The problem is matching return types within the variant named 'operand' above – Dylan_Larkin Dec 14 '15 at 13:51
  • @sehe Did you discontinue the work? I have given up. – Dylan_Larkin Dec 17 '15 at 15:50
  • Yeah. I was waiting for your input. I have no interest in making up a language just so I can help you get unstuck (only to find out that I made something up that wasn't what you needed and you'll just get stuck in a different place) – sehe Dec 17 '15 at 15:51
  • To be honest, I would never ask you to take on such a large project. I think I can handle most of that on my own, and I can always consult here if I get stuck. I really just wanted to get the one issue fixed. I don't learn if you do it all, but in this case, I'm out of bullets. – Dylan_Larkin Dec 17 '15 at 15:52
  • You are not. I have asked you very very specific questions. And you came back with "I don't know what else to say (bla...)". That's not out of bullets. That's stubborn or not motivated. Either that or some miscommunication that I am not sure I can fix. – sehe Dec 17 '15 at 15:54
  • Ok, let me refresh. The compiler example as is can compile simple mathematical functions like "return x*2". With the functionality you added on livecoding.tv, (to handle string literals), the compiler loses this functionality because now there is a collision of signatures within our variant 'operand'. In particular, 2 string types - the std::string you added, and of course the member std::string 'name' within the 'identifier' struct. I would like the functionality you created in addition to the original functionality of the Mini C example. – Dylan_Larkin Dec 17 '15 at 15:57
  • There is no collision of those particular overloads. You were confused. The compiler example only knows about 1 type, so either you make the VM type aware, or you need to make all supported operations polymorphic. Depending on the choice you need to adapt your language's grammar. I want you to do that first. I'm not going to convince you why it is required. If you want to find out, you're welcome to do it yourself. (You'll run into it anyways). Look above for which _exact_ questions you need to answer to fill in the blanks. – sehe Dec 17 '15 at 16:00
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/98255/discussion-between-dylan-larkin-and-sehe). – Dylan_Larkin Dec 17 '15 at 16:07

0 Answers