I'm trying to write a lightweight library for parsing C source code.

Here's the way I've considered writing a declaration parser:

Decl CParser::ParseDecl(std::istream& in);

Or something like:

void CParser::Parse(std::istream& in);
// and
virtual void CParser::OnDecl(const Decl& decl);

Where `Decl` is a base class that may be inherited by a `TypedefDecl`, `FunctionDecl`, or `VariableDecl`.

Is it okay that client code will have to cast into a derived class to get more information about the declaration? Or is there a better way to do this?

Edit:

The function itself isn't very well defined yet; it may actually be a callback, like `CParser::OnDecl(const Decl& decl);`, which may be overridden by a derived class like `CFormatter : public CParser` or something. That's not entirely part of the question.

I'm really just curious whether it's okay that a client of the library will have to cast the Decl object. There are a lot of different declaration types in the C language (even more in C++), and it seems like writing a callback or a parser for each one of them would be just as bad as having to derive the base class.
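
For illustration, here's roughly what the per-declaration callback alternative could look like (the names are hypothetical, just to show how the interface grows with every declaration kind):

    #include <istream>

    // Hypothetical stand-ins for the declaration classes mentioned above.
    struct TypedefDecl  {};
    struct FunctionDecl {};
    struct VariableDecl {};

    class CParser {
    public:
        virtual ~CParser() = default;
        void Parse(std::istream& in);  // would invoke the hooks below as declarations are recognized
    protected:
        // One virtual hook per declaration kind, so the client never casts,
        // but a new hook is needed for every new kind of declaration.
        virtual void OnTypedefDecl(const TypedefDecl& decl)   {}
        virtual void OnFunctionDecl(const FunctionDecl& decl) {}
        virtual void OnVariableDecl(const VariableDecl& decl) {}
    };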

tay10r
    You could overload the function for `TypedefDecl`, `FunctionDecl` and `VariableDecl`. That way the user already knows what type the declaration is instead of having to guess every time they call the function. Also by returning by value you will be [slicing the object](http://stackoverflow.com/questions/274626/what-is-object-slicing) which is probably not what you want. – phantom Apr 27 '15 at 19:47
  • @Phantom didn't you mean "override", speaking of a virtual function? – Christophe Apr 27 '15 at 19:51
  • @Phantom I didn't know about object slicing, thanks. The details about the function aren't that important to me because I'm not too sure if I'll stick with it. The main question for me is whether it's a bad design if client code will have to cast the base class. – tay10r Apr 27 '15 at 19:52
  • @Christophe I meant the OP could declare a separate function for each of them. The OP could also declare virtual functions in the `Decl` class that would work. – phantom Apr 27 '15 at 19:54
  • @Phantom generally a parser parses some text input and returns tokens. Generally you don't know in advance which kind of token you'll get. – Christophe Apr 27 '15 at 19:56
  • @Christophe In that case it would be best to add virtual functions that allow the user to get enough information without having to cast to a derived class. – phantom Apr 27 '15 at 19:58

1 Answer

First of all, you have to avoid slicing, for example by returning a pointer:

    Decl*  CParser::ParseDecl(std::istream& in);

Then, generally speaking, forcing a client to cast a return value is a symptom of bad design. What if he casts to the wrong type? How is he supposed to know which type to cast to? If the user makes the wrong cast, it's undefined behaviour (and a source of extremely nasty bugs).

    CParser cp;
    ...
    Decl* token = cp.ParseDecl(ifs);
    FunctionDecl* ftoken;
    VariableDecl* vtoken;
    if (????) {   // <============  how does your user know?
        ftoken = static_cast<FunctionDecl*>(token);
        //... do something with ftoken specific to function declarations
    }
    else if (????) {
        vtoken = static_cast<VariableDecl*>(token);
        //... do something specific for variable declarations
    }

To make things more robust, you should at least make the type polymorphic by giving it one or more virtual functions. Then your client can use the safer `dynamic_cast` (which returns `nullptr` in case of a wrong cast) and make the right decision:

    ...
    if (ftoken = dynamic_cast<FunctionDecl*>(token)) {   // if the type is wrong, ftoken is set to nullptr and you go to the else part
        //... do something with ftoken specific to function declarations
    }
    else if (vtoken = dynamic_cast<VariableDecl*>(token)) {  // attention: it's = and not ==
        //... do something specific for variable declarations
    }

This would be an acceptable design. But if you have polymorphic types, you could rethink your design by making use of this polymorphism instead of forcing the user to take care of casting. One possible way would be, for example, to define class-specific behaviour through virtual functions:

    class Decl {
    public:
        virtual void display_details() { cout << "No detail for this token"; }
        ...
    };
    class VariableDecl : public Decl {
        ...
        void display_details() override { cout << "This is variable " << name; }
    };
    class FunctionDecl : public Decl {
        ...
        void display_details() override { cout << "This is function " << name << " with " << nparam << " parameters"; }
    };

The user could then just refer to the specific building blocks, without worrying too much about the real type of the object:

    Decl* token = cp.ParseDecl(ifs);
    token->display_details();

Another popular design for more complex situations is the visitor design pattern. This is used, for example, by boost::variant; it could be worth looking at their examples, even if you don't intend to use this library.
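
To give a rough idea, here is a minimal, self-contained sketch of what a visitor-based interface could look like (it reuses the class names from the question, but `DeclVisitor` and `accept` are made-up names for illustration, not part of any existing library):

    #include <iostream>

    class FunctionDecl;
    class VariableDecl;

    // One visit overload per declaration kind.
    class DeclVisitor {
    public:
        virtual ~DeclVisitor() = default;
        virtual void visit(const FunctionDecl&) = 0;
        virtual void visit(const VariableDecl&) = 0;
    };

    class Decl {
    public:
        virtual ~Decl() = default;
        virtual void accept(DeclVisitor& v) const = 0;
    };

    class FunctionDecl : public Decl {
    public:
        void accept(DeclVisitor& v) const override { v.visit(*this); }
    };

    class VariableDecl : public Decl {
    public:
        void accept(DeclVisitor& v) const override { v.visit(*this); }
    };

    // The client chooses per-type behaviour by implementing the visitor,
    // so it never casts the Decl it gets back from the parser.
    class DeclPrinter : public DeclVisitor {
    public:
        void visit(const FunctionDecl&) override { std::cout << "function declaration\n"; }
        void visit(const VariableDecl&) override { std::cout << "variable declaration\n"; }
    };

The parser would still return a `Decl*`, and the client would write something like `token->accept(printer);` instead of a chain of casts.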

Christophe
  • So you're saying it's okay as long as the client code can use dynamic casting? – tay10r Apr 27 '15 at 20:07
  • I think that it's not perfect, but acceptable. It sometimes allows flexibility in an easy way. But this comes at the cost of maintainability: what if one day you decide to derive your tokens further (for example, subdivide `TypeDecl` into `EnumDecl`, `StructDecl` and `TypedefDecl`)? – Christophe Apr 27 '15 at 20:40