
I am using a protocol that is basically a request & response protocol over TCP, similar to other line-based protocols (SMTP, HTTP, etc.).

The protocol has about 130 different request methods (e.g. login, user add, user update, log get, file info, files info, ...). These methods do not map well onto the broad verbs used in HTTP (GET, POST, PUT, ...); forcing them into such broad methods would introduce inconsistent twists of their actual meaning.

But the protocol methods can be grouped by type (e.g. user management, file management, session management, ...).

The current server-side implementation uses a class Worker with the methods ReadRequest() (reads a request, consisting of the method plus a parameter list), HandleRequest() (see below) and WriteResponse() (writes the response code & actual response data).

HandleRequest() will call a function for the actual request method - using a hash map of method name to member function pointer to the actual handler.

The actual handler is a plain member function; there is one per protocol method. Each one validates its input parameters, does whatever it has to do, and sets the response code (success yes/no) and response data.

Example code:

class Worker {
    typedef bool (Worker::*CommandHandler)();
    typedef std::map<UTF8String,CommandHandler> CommandHandlerMap;

    // handlers will be initialized once
    //   e.g. m_CommandHandlers["login"] = &Worker::Handle_LOGIN;
    static CommandHandlerMap m_CommandHandlers;

    bool HandleRequest() {
        CommandHandlerMap::const_iterator ihandler;
        if( (ihandler=m_CommandHandlers.find(m_CurRequest.instruction)) != m_CommandHandlers.end() ) {
            // call actual handler
            return (this->*(ihandler->second))();
        }
        // error case:
        m_CurResponse.success = false;
        m_CurResponse.info = "unknown or invalid instruction";
        return true;
    }

    //...


    bool Handle_LOGIN() {
        const UTF8String username = m_CurRequest.parameters["username"];
        const UTF8String password = m_CurRequest.parameters["password"];

        // ....

        if( success ) {

            // initialize some state...
            m_Session.Init(...);
            m_LogHandle.Init(...);
            m_AuthHandle.Init(...);

            // set response data
            m_CurResponse.success = true;
            m_CurResponse.Write( "last_login", ... );
            m_CurResponse.Write( "whatever", ... );
        } else {
            m_CurResponse.success = false;
            m_CurResponse.Write( "error", "failed, because ..." );
        }
        return true;
    }


};

So. The problem is: My worker class now has about 130 "command handler methods". And each one needs access to:

  1. request parameters
  2. response object (to write response data)
  3. different other session-local objects (like a database handle, a handle for authorization/permission queries, logging, handles to various sub-systems of the server etc.)

What is a good strategy for a better structuring of those command handler methods?

One idea was to have one class per command handler, initializing each with references to the request, the response object, etc. - but the overhead is IMHO not acceptable (it would add an indirection for every single access to anything the handler needs: request, response, session objects, ...). The overhead could be acceptable if it bought an actual advantage, but that doesn't seem to be the case:

class HandlerBase {
protected:
    Request &request;
    Response &response;
    Session &session;
    DBHandle &db;
    FooHandle &foo;
    // ...
public:
    HandlerBase( Request &req, Response &rsp, Session &s, ... )
    : request(req), response(rsp), session(s), ...
    {}
    //...
    virtual bool Handle() = 0;
};

class LoginHandler : public HandlerBase {
public:
    LoginHandler( Request &req, Response &rsp, Session &s, ... )
    : HandlerBase(req,rsp,s,..)
    {}
    //...
    virtual bool Handle() {
        // actual code for handling "login" request ...
    }
};

Okay, HandlerBase could just take a reference (or pointer) to the worker object itself (instead of references to request, response, etc.). But that would add yet another indirection (this->worker->session instead of this->session). Again, that indirection would be OK if it bought some advantage after all.

Some info about the overall architecture

The worker object represents a single worker thread for an actual TCP connection to some client. Each thread (so, each worker) needs its own database handle, authorization handle etc. These "handles" are per-thread-objects that allow access to some sub-system of the server.

This whole architecture is based on some kind of dependency injection: e.g. to create a session object, one has to provide a "database handle" to the session constructor. The session object then uses this database handle to access the database. It will never call global code or use singletons. So, each thread can run undisturbed on its own.

But the cost is, that - instead of just calling out to singleton objects - the worker and its command handlers must access any data or other code of the system through such thread-specific handles. Those handles define its execution context.
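For illustration, a stripped-down sketch of this style (the names and signatures here are simplified stand-ins, not my real code):

```cpp
#include <string>

// Hypothetical per-thread handles; the real ones wrap server sub-systems.
struct DBHandle   { /* per-thread database connection */ };
struct AuthHandle { /* per-thread authorization queries */ };

class Session {
public:
    // Dependencies are passed in explicitly -- no globals, no singletons.
    Session(DBHandle &db, AuthHandle &auth) : m_db(db), m_auth(auth) {}

    bool Init(const std::string &user) {
        // would consult m_db / m_auth here
        m_user = user;
        return !m_user.empty();
    }

private:
    DBHandle   &m_db;
    AuthHandle &m_auth;
    std::string m_user;
};
```

Each worker thread owns one set of such handles, so no locking of shared global state is needed.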

Summary & Clarification: My actual question

I am searching for an elegant alternative to the current solution ("worker object with a huge list of handler methods"): it should be maintainable, have low overhead, and not require too much glue code. Additionally, it MUST still allow each single method control over very different aspects of its execution (meaning: if a method "super flurry foo" wants to fail whenever there is a full moon, then it must be possible for that implementation to do so). It also means that I do not want any kind of entity abstraction (create/read/update/delete XFoo-type) at this architectural layer of my code (such abstractions exist at other layers). This architectural layer is pure protocol, nothing else.

In the end, it will surely be a compromise, but I am interested in any ideas!

The AAA bonus: a solution with interchangeable protocol implementations (instead of just the current class Worker, which is responsible for both parsing requests and writing responses). There could be an interchangeable class ProtocolSyntax that handles those protocol syntax details but still uses our new shiny structured command handlers.
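To make that bonus idea concrete, a rough sketch of what I imagine (simplified, hypothetical types; not my real code):

```cpp
#include <map>
#include <string>

// Simplified stand-ins for my real Request/Response objects.
struct Request  { std::string instruction; std::map<std::string, std::string> parameters; };
struct Response { bool success = false; std::string info; };

// Encapsulates the wire syntax; the worker and its command handlers
// would only ever see Request/Response objects.
class ProtocolSyntax {
public:
    virtual ~ProtocolSyntax() {}
    virtual bool ParseRequest(const std::string &raw, Request &out) = 0;
    virtual std::string FormatResponse(const Response &rsp) = 0;
};

// One concrete syntax: "method key=value ..." on a single line.
class LineProtocolSyntax : public ProtocolSyntax {
public:
    bool ParseRequest(const std::string &raw, Request &out) {
        // For this sketch, only the method name is extracted.
        std::string::size_type sp = raw.find(' ');
        out.instruction = raw.substr(0, sp);
        return !out.instruction.empty();
    }
    std::string FormatResponse(const Response &rsp) {
        return (rsp.success ? "OK " : "ERR ") + rsp.info;
    }
};
```

A different wire format would then just be another ProtocolSyntax subclass, with the command handlers untouched.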

Frunsi
  • This seems like a bad job for C++. I would re-consider your design approach, and maybe design a HTTP servlet in Java. What specific things can't you do with HTTP that you want to do with FRUNSTP (your protocol)? – Richard J. Ross III Aug 08 '12 at 00:40
  • HTTP methods and REST style interface would work for most things. But then, there are also commands like e.g. "files meta update", which should set different metadata fields on a bunch of different files. That would break with REST concept (where each file would have its single URL). Of course, one could just use "file meta update" for a single file. But the multi-file-update (besides many other "multi-entity" queries) is a performance critical thing in my software. – Frunsi Aug 08 '12 at 00:46
  • Oh, and of course, my protocol is NOT STATELESS, unlike HTTP. A typical session consists of logging in, querying different data, updating something, and finally closing the connection. – Frunsi Aug 08 '12 at 00:50
  • Then you would just use REST with a POST / PUT that describes the action you are taking in detail, e.g. a path like: `POST http://myserver.com/files/update-meta?token=XX_XXXX_XX, [ 'fileName' : { meta-attribute : 'meta-value' }, ... ]` where `token` is an access token. Why reinvent the wheel when there are already great tools out there? – Richard J. Ross III Aug 08 '12 at 00:50
  • Using POSTs that describe the action results in the same dilemma server-side (it also has to implement one handler per action). – Frunsi Aug 08 '12 at 00:51
  • Yes, but using tools like Apache Tomcat, you can efficiently use attributes to distribute the methods using classes & such. However, your approach has its advantages, to be certain. It just doesn't appear to be as maintainable as other approaches available. – Richard J. Ross III Aug 08 '12 at 00:53
  • What's wrong with simply passing the request parameters and response object to each handler as input parameters? – Remy Lebeau Aug 08 '12 at 00:55
  • @RemyLebeau: The problem is that the handler must have access to some "connection-specific state" (a session object, and more), besides the request & response. – Frunsi Aug 08 '12 at 01:01
  • So why can't you have a connection/session state object as well? The thread that is reading each request knows the connection it is reading from. – Remy Lebeau Aug 08 '12 at 01:28
  • @RemyLebeau: Yes, no problem, I can have those state objects (they already exist..). I am searching for a good way to structure that whole thing (in a maintainable, low-overhead way & without writing too much code at all). – Frunsi Aug 08 '12 at 01:49
  • @RemyLebeau: Please read my edit (the appended section "Summary & Clarification: My actual question"). Hope this explains my intent!? – Frunsi Aug 08 '12 at 02:36
  • Would it be possible to use thread local storage for storing the database handle? If so, then you could use a "HandlerDispatch" class which stores a thread local connection to the database. – TheSteve Aug 10 '12 at 04:10
  • @TheSteve: TLS would be possible. But it actually would not solve any problem, right? I do not see where that could help. – Frunsi Aug 10 '12 at 04:33
  • It appears I misunderstood the problem. I'll will post an idea as an answer. – TheSteve Aug 10 '12 at 04:54
  • I'm not expert on this, but will implementing the protocol as a state machine help? – Ankush Aug 10 '12 at 18:45
  • @Ankush: No, that is not part of the actual problem – Frunsi Aug 10 '12 at 20:04
  • Slightly off topic: Consider using code generation to handle all the boiler plate. This opens up possibilities that might otherwise fall into the 'hard to maintain' or 'too much glue code' category. Where I currently work uses hand written generators (a bunch of ugly `ostream << "class " << varClassname` type statements) with XML config for all the protocol interfaces and implementations. It's inelegant, and works extremely well if it's integrated with the build process. Adding a new message is a simple XML edit, and then write the function definition (the declaration would be generated). – Zero Aug 15 '12 at 01:59
  • @Zero: Yes, code generation would have some advantages. But the biggest problem (IMHO) would be that one would lose IDE support (syntax highlighting, code completion, rough syntax validation, ...). – Frunsi Aug 15 '12 at 14:32

6 Answers

16

You've already got most of the right ideas, here's how I would proceed.

Let's start with your second question: interchangeable protocols. If you have generic request and response objects, you can have an interface that reads requests and writes responses:

class Protocol {
public:
  virtual ~Protocol() {}
  virtual Request *readRequest() = 0;
  virtual void writeResponse(Response *response) = 0;
};

and you could have an implementation called HttpProtocol for example.

As for your command handlers, "one class per command handler" is the right approach:

class Command {
public:
  virtual ~Command() {}
  virtual void execute(Request *request, Response *response, Session *session) = 0;
};

Note that I rolled all the common session handles (DB, Foo, etc.) up into a single object instead of passing around a whole bunch of parameters. Also, making these method parameters instead of constructor arguments means you only need one instance of each command.

Next, you would have a CommandFactory which contains the map of command names to command objects:

class CommandFactory {
  std::map<UTF8String, Command *> handlers;

public:
  Command *getCommand(const UTF8String &name) {
    std::map<UTF8String, Command *>::const_iterator it = handlers.find(name);
    return it != handlers.end() ? it->second : NULL;  // NULL for unknown commands
  }
};

If you've done all this, the Worker becomes extremely thin and simply coordinates everything:

class Worker {
  Protocol *protocol;
  CommandFactory *commandFactory;
  Session *session;

  void handleRequest() {
    Request *request = protocol->readRequest();
    Response response;

    Command *command = commandFactory->getCommand(request->getCommandName());
    command->execute(request, &response, session);

    protocol->writeResponse(&response);
  }
};
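Here's a self-contained sketch of the command/factory pieces to show the wiring (types pared down to minimal stand-ins, and ownership via `unique_ptr` is my choice here; adapt as needed):

```cpp
#include <map>
#include <memory>
#include <string>
#include <utility>

// Bare-bones stand-ins for the real types.
struct Request  { std::string command; std::map<std::string, std::string> params; };
struct Response { bool success = false; std::string info; };
struct Session  { /* db handle, auth handle, log handle, ... would live here */ };

struct Command {
    virtual ~Command() {}
    virtual void execute(Request &req, Response &rsp, Session &session) = 0;
};

// One stateless instance per protocol method.
struct LoginCommand : Command {
    void execute(Request &req, Response &rsp, Session &) override {
        rsp.success = !req.params["username"].empty();
        rsp.info    = rsp.success ? "welcome" : "missing username";
    }
};

class CommandFactory {
public:
    void add(const std::string &name, std::unique_ptr<Command> cmd) {
        handlers[name] = std::move(cmd);
    }
    Command *get(const std::string &name) {
        std::map<std::string, std::unique_ptr<Command>>::iterator it = handlers.find(name);
        return it != handlers.end() ? it->second.get() : nullptr;
    }
private:
    std::map<std::string, std::unique_ptr<Command>> handlers;
};
```

The worker would then only add a null check on the factory result to produce the "unknown or invalid instruction" response.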
casablanca
  • +1 because I really like this solution. I've used it myself in production code. However, here's a word of advice: beware that with the "one class per command handler" approach, you will literally end up with over a hundred classes of mostly boilerplate. I once wrote an XMLRPC server where I had "only" about two dozen handler classes, and that became a real pain after a while. If I could do it again, I'd definitely think twice and probably opt for multiple member functions. Perhaps you could refactor related functions into a handful of separate classes. Try to find a good middle way. – djf Aug 13 '12 at 22:42
  • @djf: Having many small classes should not be a pain if designed properly. But usually this is a sign that something is wrong with the service interface itself -- why is a single service exposing so many methods? – casablanca Aug 14 '12 at 03:23
  • I agree completely. Frunsi is looking at 130 methods in the interface!! I think redesigning the interface should happen before refactoring the server implementation. Just my 2¢. – djf Aug 14 '12 at 10:12
  • @djf: I didn't realize that the OP actually mentioned 130 methods in his question, so yes, your comment is all the more relevant. :) – casablanca Aug 15 '12 at 03:56
  • @djf: Having those 130 methods is a topic of its own. Yes, a redesign would help a bit (e.g. a more OOP-like protocol), but there would still be a lot of methods and many special cases. It's complex software, not just an abstract "manage some objects" thing, nor a pure CRUD thing. – Frunsi Aug 15 '12 at 14:38
  • Nice solution. Is it reasonable to encapsulate read/write command non-primitive parameters? – user1154664 Aug 15 '12 at 18:12
  • @djf Great response to the given question. Especially the runtime factory lookup -- I find a lot of C++ programs shy away from these practices too often as a form of pre-optimization. – Pyrce Aug 15 '12 at 21:47
  • OK, accepted! However, I will try to keep heap allocations to a minimum, as well as all initialization work (though workers are pre-allocated, there may be high-load situations where worker instances must be created when new clients connect [it works similar to "Apache Worker MPM": some pre-allocated workers in standby mode, and up to max no. additional workers when required]). Maybe I will also use two-staged handlers (as in the answer from TheSteve). – Frunsi Aug 15 '12 at 22:33
  • With DI frameworks like [Boost DI](https://github.com/boost-experimental/di) or [Fruit](https://github.com/google/fruit) available, it's actually preferable to take those dependencies via the Command handler constructor. Then you can have a narrow signature for `execute`, and declare only what you need explicitly in each individual ctor without it ballooning. – John Neuhaus Apr 26 '18 at 22:51
5

If it were me, I would probably use a hybrid of the two solutions in your question:
have a handler base class that can handle multiple related commands, and let your main "dispatch" class probe each handler for the commands it supports. For the glue, you simply need to tell the dispatch class about each handler class.

class HandlerBase
{
public:
    typedef bool (HandlerBase::*CommandHandler)(Request &, Response &);
    typedef std::map<UTF8String, CommandHandler> CommandHandlerMap;

    // Note: PopulateCommands() must be called from the derived class's
    // constructor -- a virtual call from the base constructor would not
    // reach the derived override.
    HandlerBase(HandlerDispatch & dispatch) : m_dispatch(dispatch) {}
    virtual ~HandlerBase();

    bool CommandSupported(const UTF8String & cmdName);

    virtual bool HandleCommand(const UTF8String & cmdName, Request & req, Response & res);
    virtual void PopulateCommands() = 0;

protected:
    CommandHandlerMap m_CommandHandlers;
    HandlerDispatch & m_dispatch;
};

class AuthenticationHandler : public HandlerBase
{
public:
    AuthenticationHandler(HandlerDispatch & dispatch) : HandlerBase(dispatch) {
        PopulateCommands();
    }

    bool HandleCommand(const UTF8String & cmdName, Request & req, Response & res) {
        CommandHandlerMap::const_iterator ihandler;
        if( (ihandler=m_CommandHandlers.find(cmdName)) != m_CommandHandlers.end() ) {
            // call actual handler
            return (this->*(ihandler->second))(req,res);
        }
        // error case:
        res.success = false;
        res.info = "unknown or invalid instruction";
        return true;
    }

    void PopulateCommands() {
        // static_cast from derived to base member-function pointer is legal,
        // and safe here because the handlers are only invoked on this object
        m_CommandHandlers["login"]  = static_cast<CommandHandler>(&AuthenticationHandler::Handle_LOGIN);
        m_CommandHandlers["logout"] = static_cast<CommandHandler>(&AuthenticationHandler::Handle_LOGOUT);
    }

    bool Handle_LOGIN(Request & req, Response & res) {
        Session & session = m_dispatch.GetSessionForRequest(req);
        // ...
        return true;
    }
};

class HandlerDispatch
{
public:
    HandlerDispatch();
    virtual ~HandlerDispatch() {  
        // delete all handlers 
    }

    void AddHandler(HandlerBase * pHandler);
    bool HandleRequest() {
        vector<HandlerBase *>::iterator i;
        for ( i=m_handlers.begin() ; i != m_handlers.end(); ++i ) {
            if ((*i)->CommandSupported(m_CurRequest.instruction)) {
                return (*i)->HandleCommand(m_CurRequest.instruction,m_CurRequest,m_CurResponse);
            }
        }
        // error case:                                                            
        m_CurResponse.success = false;
        m_CurResponse.info = "unknown or invalid instruction";

        return true; 
    }
protected:
    std::vector<HandlerBase*> m_handlers;
};

And then to glue it all together you would do something like this:

// Init
m_handlerDispatch.AddHandler(new AuthenticationHandler(m_handlerDispatch));
TheSteve
  • Thanks, looks promising. Having two map lookups (instead of one) should be an acceptable trade-off. – Frunsi Aug 15 '12 at 22:25
3

As for the transport (TCP) specific part, have you had a look at the ZMQ library, which supports various distributed computing patterns via messaging sockets/queues? IMHO you should find an appropriate pattern that serves your needs in their Guide document.

For the choice of protocol message implementation, I would personally favor Google protocol buffers, which work very well with C++; we are using them for a couple of projects now.

In the end you'll boil it down to dispatcher and handler implementations for the specific requests, their parameters, and the necessary return parameters. Google protobuf message extensions allow you to do this in a generic way.

EDIT:

To get a bit more concrete: using protobuf messages, the main difference of this dispatcher model from yours is that you don't need to do the complete message parsing before dispatch. Instead, you register handlers that decide for themselves whether they can handle a particular message, based on the message's extensions. The (main) dispatcher class doesn't need to know about the concrete extensions to handle; it just asks the registered handler classes. You can easily extend this mechanism with sub-dispatchers to cover deeper message category hierarchies.
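A stripped-down illustration of that self-selecting dispatch idea, without protobuf (all names here are invented for the example):

```cpp
#include <string>
#include <vector>

// Stand-in for a decoded protocol message.
struct Message { std::string type; std::string payload; };

// Each handler decides for itself whether it can process a message.
struct Handler {
    virtual ~Handler() {}
    virtual bool canHandle(const Message &msg) const = 0;
    virtual std::string handle(const Message &msg) = 0;
};

struct LoginHandler : Handler {
    bool canHandle(const Message &msg) const override { return msg.type == "login"; }
    std::string handle(const Message &msg) override { return "login:" + msg.payload; }
};

// The dispatcher knows nothing about concrete message types;
// it just asks each registered handler.
class Dispatcher {
public:
    void add(Handler *h) { handlers.push_back(h); }
    std::string dispatch(const Message &msg) {
        for (Handler *h : handlers)
            if (h->canHandle(msg))
                return h->handle(msg);
        return "unhandled";
    }
private:
    std::vector<Handler *> handlers;
};
```

With protobuf, canHandle() would check for the presence of a particular message extension instead of a type string.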

Because the protobuf compiler can already see your messaging data model completely, you don't need any kind of reflection or dynamic class polymorphism tests to figure out the concrete message content. Your C++ code can statically ask for possible extensions of a message and won't compile if such doesn't exist.

I don't know how to explain this in a better way, or how to show a concrete example that improves your existing code with this approach. I'm afraid you have already spent some effort on the de-/serialization code of your message formats that could have been avoided by using Google protobuf messages (what kind of classes are Request and Response?).

The ZMQ library might help to implement your Session context to dispatch requests through the infrastructure.

Certainly you shouldn't end up with a single interface that handles all kinds of possible requests, but rather with a number of interfaces that each specialize in a category of messages (extension points).

πάντα ῥεῖ
  • Thanks, but my actual protocol is fine. I know google protocol buffers, but do not need them here. ZMQ and its Guide doesn't help either. – Frunsi Aug 15 '12 at 22:19
  • @Frunsi OK, that's just an idea and would have been my 1st approach to the problem as I have used this several times successfully. From an abstract point view I would consider your problem to be solved within the [Chain of Responsibility](http://en.wikibooks.org/wiki/C++_Programming/Code/Design_Patterns#Chain_of_Responsibility) pattern. – πάντα ῥεῖ Aug 15 '12 at 22:51
  • I think that chain of responsibility pattern is inefficient for my use case (it would make sense if multiple handlers are involved in processing a single request)... – Frunsi Aug 16 '12 at 04:27
  • Yes, my code already spends some effort on de-/serialization (the protocol is simple and can be efficiently parsed & written). The `Request` and `Response` objects basically contain just a list of key-value pairs. A mode for sending/receiving binary data rounds it off (and it can use the `sendfile()` system call for zero-copy...) – Frunsi Aug 16 '12 at 04:34
  • Could you please present a tiny example of dispatching extensions? For example, message is `Animal`, and extensions are `Cat` and `Dog`. Handler for `Cat` prints `"Meow"`. Handler for `Dog` prints `"Bark"`. –  Sep 28 '17 at 09:30
  • @wowofbob You may have noticed that πάντα ῥεῖ's account is suspended and they can't respond to your comment. Regarding your question you should know that protobuf extensions are completely different from polymorphic inheritance, and you have to write the dispatching code manually for each known message extension, or let a number of registered handlers do it, and let the dispatcher class ask each of them if they can parse and handle a specific message extension. – user0042 Sep 29 '17 at 00:07
2

I think this is an ideal case for a REST-like implementation. Another option would be to group the handler methods by category (or any other criteria) into several worker classes.

Elchin
1

If the protocol methods can only be grouped by type, and methods of the same group have nothing in common in their implementations, then possibly the only thing you can do to improve maintainability is to distribute the methods across different files, one file per group.

But it is very likely that methods of the same group have some of the following common features:

  1. There may be some data fields in the Worker class that are used by only one group of methods, or by several (but not all) groups. For example, m_AuthHandle may be used only by the user management and session management methods.
  2. There may be some groups of input parameters, used by every method of some group.
  3. There may be some common data, written to the response by every method of some group.
  4. There may be some common methods, called by several methods of some group.

If some of these facts are true, there is a good reason to group those features into different classes: not one class per command handler, but one class per method group. Or, if some features are common to several groups, a hierarchy of classes.

It may be convenient to group instances of all these group classes in one place:

class UserManagement: public IManagement {...};
class FileManagement: public IManagement {...};
class SessionManagement: public IManagement {...};
struct Handlers {
  smartptr<IManagement> userManagement;
  smartptr<IManagement> fileManagement;
  smartptr<IManagement> sessionManagement;
  ...
  Handlers():
    userManagement(new UserManagement),
    fileManagement(new FileManagement),
    sessionManagement(new SessionManagement),
    ...
  {}
};

Instead of new SomeClass, some template like make_unique may be used. Or, if "interchangeable protocol implementations" are needed, one possibility is to use factories instead of some (or all) of the new SomeClass calls.

m_CommandHandlers.find() should be split into two map searches: one to find the appropriate group object in this structure, the other (inside the appropriate implementation of IManagement) to find the member function pointer to the actual handler.

In addition to finding a member function pointer, the HandleRequest method of an IManagement implementation may extract the parameters common to its method group and pass them to the handlers (one by one if there are just a few, or grouped in a structure if there are many).

An IManagement implementation may also contain a WriteCommonResponse method to simplify writing response fields that are common to all of its handlers.
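A rough sketch of this two-stage lookup (hypothetical names, details omitted):

```cpp
#include <map>
#include <string>

struct Response { bool success = false; };

// Second stage: each group maps method names to its own member functions.
class IManagement {
public:
    virtual ~IManagement() {}
    virtual bool HandleRequest(const std::string &method, Response &rsp) = 0;
};

class UserManagement : public IManagement {
    typedef bool (UserManagement::*Handler)(Response &);
    std::map<std::string, Handler> handlers;
public:
    UserManagement() {
        handlers["user add"] = &UserManagement::HandleUserAdd;
    }
    bool HandleRequest(const std::string &method, Response &rsp) override {
        std::map<std::string, Handler>::iterator it = handlers.find(method);
        if (it == handlers.end()) return false;
        return (this->*(it->second))(rsp);
    }
private:
    bool HandleUserAdd(Response &rsp) { rsp.success = true; return true; }
};

// First stage: map from method name to the group object that owns it.
class Dispatcher {
    std::map<std::string, IManagement *> groups;
public:
    void AddGroup(const std::string &method, IManagement *g) { groups[method] = g; }
    bool Dispatch(const std::string &method, Response &rsp) {
        std::map<std::string, IManagement *>::iterator it = groups.find(method);
        return it != groups.end() && it->second->HandleRequest(method, rsp);
    }
};
```

The first-stage map could also key on a group prefix ("user", "file", ...) instead of the full method name, trading map size for a bit of string parsing.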

Evgeny Kluev
1

The Command pattern is your solution to both aspects of this problem.

Use it to implement your protocol handler with a generalised IProtocol interface (and/or abstract base class), and different protocol handler implementations, with a class specialised for each protocol.

Then implement your commands the same way, with an ICommand interface and each command method implemented in a separate class. You are nearly there with this; split your existing methods into new specialised classes.

Wrap your requests and responses as Memento objects.

Martin Spamer
  • Come on - if it were as easy as throwing the first patterns that come to mind into a kettle and stirring for an hour, then I would not have asked at all. The Command pattern is not applicable here (it does not solve any of my problems), and the Memento pattern makes it all worse; it does not help either. – Frunsi Aug 15 '12 at 22:10