12

Is there a tool to auto-generate the ostream << operator for a struct or class?

Input (taken from One Debug-Print function to rule them all):

typedef struct ReqCntrlT    /* Request control record */
{
  int             connectionID;
  int             dbApplID;
  char            appDescr[MAX_APPDSCR];
  int             reqID;
  int         resubmitFlag;
  unsigned int    resubmitNo;
  char            VCIver[MAX_VCIVER];
  int             loginID;
}   ReqCntrlT;

Output:

std::ostream& operator <<(std::ostream& os, const ReqCntrlT& r) 
{
   os << "reqControl { "
      << "\n\tconnectionID: " << r.connectionID 
      << "\n\tdbApplID: " << r.dbApplID 
      << "\n\tappDescr: " << r.appDescr
      << "\n\treqID: " << r.reqID
      << "\n\tresubmitFlag: " << r.resubmitFlag
      << "\n\tresubmitNo: " << r.resubmitNo
      << "\n\tVCIver: " << r.VCIver
      << "\n\tloginID: " << r.loginID
      << "\n}";
   return os; 
}

Any tool would be fine, Python / Ruby scripts would be preferred.

Christopher Oezbek

5 Answers

3

What is needed for this is a tool that can parse C++ accurately, enumerate the various classes/structs, determine the slots of each, generate your "serializations" on a per class/struct basis, and then park the generated code in the "right place" (presumably the same scope in which the struct was found). It needs a full preprocessor to handle expansion of directives in real code.

Our DMS Software Reengineering Toolkit with its C++11 front end could do this. DMS enables the construction of custom tools by providing generic parsing/AST building, symbol table construction, flow and custom analysis, transformation and source code regeneration capability. The C++ front end enables DMS to parse C++ and build accurate symbol tables, as well as to pretty print modified or new ASTs back to compilable source form. DMS and its C++ front end have been used to carry out massive transformations on C++ code.

You have to explain to DMS what you want to do. It seems straightforward to enumerate the symbol table entries, check whether each is a struct/class type declaration, determine the scope of the declaration (recorded in the symbol table entry), construct an AST by composing surface syntax patterns, and then apply a transformation to insert the constructed AST.

The core surface syntax patterns needed are those for slots and for the function body:

 pattern ostream_on_slot(i:IDENTIFIER):expression =
   " << "\n\t" << \tostring\(\i\) << r.\i "; -- tostring is a function that generates "<name>"

 pattern ostream_on_struct(i:IDENTIFIER,ostream_on_slots:expression): declaration =
   " std::ostream& operator <<(std::ostream& os, const \i& r) 
     { os << \tostring\(\i\) << " { " << \ostream_on_slots << "\n}";
       return os; 
     }

One has to compose the individual trees for ostream_on_slot:

 pattern compound_ostream(e1:expression, e2:expression): expression
     = " \e1 << \e2 ";

With these patterns it is straightforward to enumerate the slots of a struct, construct the ostream expression for the body, and insert that into the overall function for the struct.

Ira Baxter
2

There are two main ways to do this:

  • using an external parsing tool (such as a Python script hooked up on Clang bindings)
  • using metaprogramming techniques

...and of course they can be mixed.

I don't have enough knowledge about the Clang Python bindings to answer using them, so I will concentrate on metaprogramming.


Basically, what you are asking for requires introspection. C++ does not support full introspection; however, using metaprogramming tricks (and template matching) it can support a limited subset of introspection techniques at compile time, which is sufficient for our purpose.

In order to easily mix metaprogramming and runtime operation, it's easier to bring a library into play: Boost.Fusion.

If you tweak your structure such that its attributes are described in terms of a Boost.Fusion sequence, then you can apply plenty of algorithms to the sequence automatically. Here, an associative sequence is best.

Because we are talking metaprogramming, the map associates a type to a typed value.

You can then iterate over that sequence using for_each.


I'll gloss over the details, simply because it's been a while and I don't remember the syntax involved, but basically the idea is to get to:

// Can be created using Boost.Preprocessor, but makes array types a tad difficult
DECL_ATTRIBUTES((connectionId, int)
                (dbApplId, int)
                (appDescr, AppDescrType)
                ...
                );

which is syntactic sugar for declaring the Fusion map and its associated tags:

struct connectionIdTag {};
struct dbApplIdTag {};

// The elements of a fusion::map are fusion::pair<Key, Value>, not std::pair
typedef boost::fusion::map<
    boost::fusion::pair<connectionIdTag, int>,
    boost::fusion::pair<dbApplIdTag, int>,
    ...
    > AttributesType;
AttributesType _attributes;

Then, any operation that needs to be applied to the attributes can be built simply with:

// 1. A function object, called once per (tag, value) element of the map:
struct Predicate {
    template <typename T, typename U>
    void operator()(boost::fusion::pair<T, U> const&) const { ... }
};

// 2. The for_each function
boost::fusion::for_each(_attributes, Predicate());
Matthieu M.
  • @MatthieM: I would prefer an external solution, into which I paste the struct, press generate, and then copy back the generated operator code. Most of the structs I want to output are in external libraries. – Christopher Oezbek May 08 '12 at 14:44
  • @ChristopherOezbek: I do not know of any "integrated" tool for this (too specialized I think), but if you don't want to pay for DMS (Ira's answer), then I suggest you look into the Clang project. With libclang you can parse C++ files and then operate on the AST. You can ask specific questions on the clang dev mailing list. – Matthieu M. May 08 '12 at 14:53
  • @MatthieM: Thanks! Boost::Fusion looks interesting for another project of mine! – Christopher Oezbek May 08 '12 at 14:55
1

To achieve that, the only way is to use an external tool that you run on your source files.

First, use a C/C++ analysis tool to retrieve the parse tree from your source code. Once you have the parse tree, search it for structures. For each structure, you can then generate an operator<< overload that serializes the fields of the structure. You can also generate the deserialization operator.

Whether it is worth it depends on how many structures you have: for a dozen, it is better to write the operators manually, but if you have several hundred structures you might want to write a generator for the (de)serialization operators.
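
As a rough illustration of the generation step only, here is a minimal Python sketch. It assumes the parse-tree step has already produced a mapping from struct name to member names; the function names (serialize_operator, deserialize_operator, generate_header) and the whitespace-separated text format are choices made for this example, not part of any existing tool. Members that can contain spaces (such as char arrays holding free text) would not round-trip with this format.

```python
def serialize_operator(name, members):
    """Emit an operator<< that writes the members separated by spaces."""
    body = " << ' ' << ".join(f"v.{m}" for m in members)
    stmt = f"  os << {body};\n" if members else ""
    return (
        f"inline std::ostream& operator<<(std::ostream& os, const {name}& v) {{\n"
        f"{stmt}"
        "  return os;\n"
        "}\n"
    )


def deserialize_operator(name, members):
    """Emit the matching operator>> that reads the members back in order."""
    body = "".join(f" >> v.{m}" for m in members)
    stmt = f"  is{body};\n" if members else ""
    return (
        f"inline std::istream& operator>>(std::istream& is, {name}& v) {{\n"
        f"{stmt}"
        "  return is;\n"
        "}\n"
    )


def generate_header(structs):
    """structs: dict mapping struct name -> list of member names."""
    parts = ["#include <istream>\n#include <ostream>\n"]
    for name, members in structs.items():
        parts.append(serialize_operator(name, members))
        parts.append(deserialize_operator(name, members))
    return "\n".join(parts)


if __name__ == "__main__":
    # Hypothetical input, standing in for what a real parser would extract.
    print(generate_header({"Point": ["x", "y"]}))
```

With several hundred structures, the mapping would come from the parse tree rather than being written by hand, and the generated header would be rebuilt whenever the structures change.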

Synxis
0

I understand your question in two ways.

If you want to generate an automatic state report of your program, I suggest you look at Boost.Serialization. It will not generate the code for you at compile time, but it can serve as a first step or as inspiration. The code below lets you write XML or text files that you can read back later.

typedef struct ReqCntrlT    /* Request control record */
{
  int             connectionID;
  int             dbApplID;
  char            appDescr[MAX_APPDSCR];
  int             reqID;
  int         resubmitFlag;
  unsigned int    resubmitNo;
  char            VCIver[MAX_VCIVER];
  int             loginID;

    template<class Archive>
    void serialize(Archive & ar, const unsigned int version)
    {
        ar & connectionID;
        ar & reqID;
        ...
    }
}   ReqCntrlT;

See the tutorial for more detail : http://www.boost.org/doc/libs/1_49_0/libs/serialization/doc/index.html

If you are only trying to "write" the code by just giving the member names, then you should have a look at regular expressions in Python or Perl, for example. The main drawback of this solution is that it is "offline" with respect to your structure, i.e. you have to rerun it every time you change something.
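
A minimal sketch of that regular-expression approach in Python, for what it is worth. It only handles flat structs with one simple member per declaration (plain types and fixed-size arrays); nested structs, bit-fields, function pointers, several declarators on one line, and macros that expand to members are not supported. The names parse_structs and make_ostream_operator are made up for this example.

```python
import re

# "struct Name { ... }" with no nested braces in the body.
STRUCT_RE = re.compile(r'struct\s+(\w+)\s*\{([^{}]*)\}')
# Last identifier of a declaration, optionally followed by an array bound.
FIELD_RE = re.compile(r'(\w+)\s*(?:\[[^\]]*\])?$')


def strip_comments(src):
    """Remove /* ... */ and // comments."""
    src = re.sub(r'/\*.*?\*/', '', src, flags=re.S)
    return re.sub(r'//[^\n]*', '', src)


def parse_structs(src):
    """Return a list of (struct_name, [member_name, ...]) tuples."""
    result = []
    for match in STRUCT_RE.finditer(strip_comments(src)):
        fields = []
        for decl in match.group(2).split(';'):
            found = FIELD_RE.search(decl.strip())
            if found:
                fields.append(found.group(1))
        result.append((match.group(1), fields))
    return result


def make_ostream_operator(name, fields):
    """Build the operator<< source text in the layout used in the question."""
    lines = [
        f'std::ostream& operator<<(std::ostream& os, const {name}& r)',
        '{',
        f'   os << "{name} {{ "',
    ]
    for field in fields:
        lines.append(f'      << "\\n\\t{field}: " << r.{field}')
    lines += ['      << "\\n}";', '   return os;', '}']
    return '\n'.join(lines)


SAMPLE = """
typedef struct ReqCntrlT    /* Request control record */
{
  int             connectionID;
  char            appDescr[MAX_APPDSCR];
  unsigned int    resubmitNo;
}   ReqCntrlT;
"""

if __name__ == '__main__':
    for struct_name, struct_fields in parse_structs(SAMPLE):
        print(make_ostream_operator(struct_name, struct_fields))
```

Pasting a struct into SAMPLE (or reading it from a file) and running the script prints the operator to copy back. For anything beyond simple C-style records, a real parser such as libclang is the safer route.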

Benoit.

Quanteek
  • the main problem is that I don't want to write the serialize method myself, as it is boring and tedious work. Since all the structs I want to output are simple, I want to autogenerate the serialize() or operator<<() code. – Christopher Oezbek May 08 '12 at 13:08
0

You can use LibClang to parse the source code and generate ostream operators:

# © 2020 Erik Rigtorp <erik@rigtorp.se>
# SPDX-License-Identifier: CC0-1.0
import sys
from clang.cindex import *

idx = Index.create()
tu = idx.parse(sys.argv[1], ['-std=c++11'])

for n in tu.cursor.walk_preorder():
    if n.kind == CursorKind.ENUM_DECL:
        print(
            f'std::ostream &operator<<(std::ostream &os, {n.spelling} v) {{\n  switch(v) {{')
        for i in n.get_children():
            print('    case {type}::{value}: os << "{value}"; break;'.format(
                type=n.type.spelling, value=i.spelling))
        print('  }\n  return os;\n}')
    elif n.kind == CursorKind.STRUCT_DECL:
        print(
            f'std::ostream &operator<<(std::ostream &os, const {n.spelling} &v) {{')
        for i, m in enumerate(n.get_children()):
            print(
                f'  os << "{", " if i != 0 else ""}{m.spelling}=" << v.{m.spelling};')
        print('  return os;\n}')

From my article: https://rigtorp.se/generating-ostream-operator/

Erik