What is the C/C++ equivalent of java.io.Serializable?

Question

What is the C/C++ equivalent of java.io.Serializable?

There are references to serialization libraries in:

Serialize Data Structures in C

And there are:

But does such an equivalent even exist?

So if I have an interface as follows in Java, what would a serializable class in C/C++ look like?

import java.io.Serializable;

public interface SuperMan extends Serializable{

    /**
     * Count the number of abilities.
     * @return
     */
    public int countAbility();

    /**
     * Get the ability with index k.
     * @param k
     * @return
     */
    public long getAbility(int k);

    /**
     * Get the array of ability from his hand.
     * @param k
     * @return
     */
    public int[] getAbilityFromHand(int k);

    /**
     * Get the finger of the hand.
     * @param k
     * @return
     */
    public int[][] getAbilityFromFinger(int k);

    //check whether the finger with index k is removed.
    public boolean hasFingerRemoved(int k);

    /**
     * Remove the finger with index k.
     * @param k
     */
    public void removeFinger(int k);

}

Could any serializable C/C++ object just be inherited like in Java?

There is none from C++ standard library. Other libraries (like MFC I know of) might have support for serialization. There are serialization libraries for XML, JSON etc, though. — Ajay, Jun 09 '16 at 08:19
It's not quite what you're asking for, but in terms of what's commonly done in C++ (and usually proves quite workable), [boost](http://www.boost.org/doc/libs/1_61_0/libs/serialization/doc/) is worth a look. — Tony Delroy, Jun 09 '16 at 08:24
I'd not look for a C++ equivalent of a basically language level Java feature but concentrate on how serialization in C++ is handled generally. The standard approaches might be similar but don't necessarily have to be - after all, C++ and Java have quite a different memory model etc. — Thomas, Jun 09 '16 at 08:24
With C++ you could create a struct with int arrays, variables etc, and write/read it straight to/from binary file. You could encounter issues with this but you could think about it as a kind of equivalent to Java serialization. At least as a C++ built in feature. — marcinj, Jun 09 '16 at 08:28
There is no "standard approach" in C++. There are several popular third-party libraries, each with their own pros and cons. But really, rather than serializing a complete internal object state in some proprietary fashion, it's probably much better engineering-wise to store your *data* in a well-known wire format and allow your code to load from and store to that format. — Kerrek SB, Jun 09 '16 at 08:29
It is actually a common practice to mix C and C++ and even [view C++ as a federation of languages](http://www.codeproject.com/Articles/713401/View-Cplusplus-as-a-Federation-of-Languages) one of which is C as C++ was highly influenced by C and for better or worse contains legacy constructs from C. It has also been important for C++ to maintain ABI compatibility with C as C++ compilers can build to C ABI. I test C++ code compiled with clang to see the LLVM IR it emits as my own OOP language targets C ABI as well. So in general it is safe to talk in terms of C/C++. — Matthew Sanders, Jun 21 '16 at 02:18

Galik · Answer 1 · 2016-06-09T09:03:14.947

There are no standard library classes that implement serialization the same way Java does. There are some libraries that facilitate serialization but for basic needs you typically make your class serializable by overloading the insertion and extraction operators like this:

#include <iostream>
#include <sstream>
#include <string>
#include <vector>

class MyType
{
    int value;
    double factor;
    std::string type;

public:
    MyType()
    : value(0), factor(0.0), type("none") {}
    MyType(int value, double factor, const std::string& type)
    : value(value), factor(factor), type(type) {}

    // Serialized output
    friend std::ostream& operator<<(std::ostream& os, const MyType& m)
    {
        return os << m.value << ' ' << m.factor << ' ' << m.type;
    }

    // Serialized input
    friend std::istream& operator>>(std::istream& is, MyType& m)
    {
        return is >> m.value >> m.factor >> m.type;
    }
};

int main()
{
    std::vector<MyType> v {{1, 2.7, "one"}, {4, 5.1, "two"}, {3, 0.6, "three"}};

    std::cout << "Serialize to standard output." << '\n';

    for(auto const& m: v)
        std::cout << m << '\n';

    std::cout << "\nSerialize to a string." << '\n';

    std::stringstream ss;
    for(auto const& m: v)
        ss << m << '\n';

    std::cout << ss.str() << '\n';

    std::cout << "Deserialize from a string." << '\n';

    std::vector<MyType> v2;

    MyType m;
    while(ss >> m)
        v2.push_back(m);

    for(auto const& m: v2)
        std::cout << m << '\n';

}

Output:

Serialize to standard output.
1 2.7 one
4 5.1 two
3 0.6 three

Serialize to a string.
1 2.7 one
4 5.1 two
3 0.6 three

Deserialize from a string.
1 2.7 one
4 5.1 two
3 0.6 three

The serialization format is entirely up to the programmer and you are responsible for making sure that each member of the class that you want to serialize is itself serializable (has an insertion/extraction operator defined). You also have to deal with how fields are separated (spaces or new-lines or zero-terminated?).

All the basic types have serialization (insertion/extraction) operators pre-defined but you still need to be careful with things like std::string that can contain (for example) spaces or new-lines (if you are using spaces or new-lines as your field delimiter).
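One way to keep a string field intact when spaces are the delimiter is to quote it on output and unquote it on input. The sketch below assumes C++14 or later, where std::quoted from <iomanip> is available; the Person type and the round_trip helper are hypothetical names used only for illustration.

```cpp
#include <iomanip>
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical type with a string field that may contain spaces.
struct Person
{
    int id;
    std::string name;
};

// std::quoted wraps the string in double quotes and escapes embedded
// quotes and backslashes, so the field survives space-delimited parsing.
std::ostream& operator<<(std::ostream& os, const Person& p)
{
    return os << p.id << ' ' << std::quoted(p.name);
}

// On input, std::quoted reads up to the closing quote and removes escapes.
std::istream& operator>>(std::istream& is, Person& p)
{
    return is >> p.id >> std::quoted(p.name);
}

// Serialize to a string stream and read the value back.
Person round_trip(const Person& in)
{
    std::stringstream ss;
    ss << in;
    Person out{};
    ss >> out;
    return out;
}
```

Without the quoting, extraction into a std::string would stop at the first space and the remaining words would be misread as the next field.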

Comments are not for extended discussion; this conversation has been [moved to chat](http://chat.stackoverflow.com/rooms/115109/discussion-on-answer-by-galik-what-is-the-c-c-equivalence-of-java-io-serializa). — Madara's Ghost, Jun 20 '16 at 12:28

Daniel Frużyński · Answer 2 · 2016-06-18T10:52:09.237

There is no single standard for this. In fact, every library can implement it in a different way. Here are some approaches that can be used:

The class has to derive from a common base class and implement virtual read() and write() methods:

class SuperMan : public BaseObj
{
public:
    virtual void read(Stream& stream);
    virtual void write(Stream& stream);
};

The class should implement a special interface. In C++ this is done by deriving the class from a special abstract class. This is a variation of the previous method:

class Serializable
{
public:
    virtual ~Serializable() {}
    virtual void read(Stream& stream) = 0;
    virtual void write(Stream& stream) = 0;
};

class SuperMan : public Man, public Serializable
{
public:
    virtual void read(Stream& stream);
    virtual void write(Stream& stream);
};

A library may allow (or require) registering "serializers" for a given type. They can be implemented by deriving a class from a special base class or interface and then registering it for the given type:

#define SUPERMAN_CLASS_ID 111

class SuperMan
{
public:
    virtual int getClassId()
    {
        return SUPERMAN_CLASS_ID;
    }
};

class SuperManSerializer : public Serializer
{
    virtual void* read(Stream& stream);
    virtual void write(Stream& stream, void* object);
};

int main()
{
    register_class_serializer(SUPERMAN_CLASS_ID, new SuperManSerializer());
}

Serializers can also be implemented using functors, e.g. lambdas:

int main()
{
    register_class_serializer(SUPERMAN_CLASS_ID,
                              [](Stream&, const SuperMan&) { /* write */ },
                              [](Stream&) -> SuperMan { return SuperMan(); /* read */ });
}

Instead of passing a serializer object to some function, it may be enough to pass its type to a special function template:
```
int main()
{
    register_class_serializer<SuperManSerializer>();
}
```
The class should provide overloaded operators such as '<<' and '>>'. Their first argument is some stream class and the second is an instance of our class. The stream can be a standard stream (std::ostream or std::istream), but this conflicts with the default use of these operators, which is converting to and from a user-friendly text format. Because of this, the stream class is usually a dedicated one (it can wrap a standard stream, though), or the library supports an alternative method if << also has to be available for text output.
```
class SuperMan
{
public:
    friend Stream& operator>>(Stream& stream, SuperMan& obj);
    friend Stream& operator<<(Stream& stream, const SuperMan& obj);
};
```

There should be a specialization of some class template for our class type. This solution can be used together with the << and >> operators: the library first tries to use this template and falls back to the operators if it is not specialized (this can be implemented as the default template version, or using SFINAE).

// default implementation
template<class T>
class Serializable
{
public:
    void read(Stream& stream, T& val)
    {
        stream >> val;
    }
    void write(Stream& stream, const T& val)
    {
        stream << val;
    }
};

// specialization for given class
template<>
class Serializable<SuperMan>
{
public:
    void read(Stream& stream, SuperMan& val);
    void write(Stream& stream, const SuperMan& val);
};

Instead of a class template, the library may also use a C-style interface with global overloaded functions:

template<class T>
void read(Stream& stream, T& val);
template<class T>
void write(Stream& stream, const T& val);

template<>
void read(Stream& stream, SuperMan& val);
template<>
void write(Stream& stream, const SuperMan& val);

The C++ language is flexible, so the list above is certainly not complete. I am convinced it would be possible to invent other solutions.
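To make the interface-based variant from the list concrete, here is a minimal sketch. It assumes that std::istream and std::ostream stand in for the unspecified Stream type and that the data is stored as whitespace-separated text; the abilities member and the to_text helper are hypothetical and not taken from any particular library.

```cpp
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Abstract "interface": every derived class promises read() and write().
class Serializable
{
public:
    virtual ~Serializable() {}
    virtual void read(std::istream& stream) = 0;
    virtual void write(std::ostream& stream) const = 0;
};

// Hypothetical implementation that stores a list of abilities.
class SuperMan : public Serializable
{
public:
    std::vector<long> abilities;

    // Assumed format: element count followed by the elements.
    void write(std::ostream& stream) const override
    {
        stream << abilities.size();
        for (long a : abilities)
            stream << ' ' << a;
    }

    void read(std::istream& stream) override
    {
        std::size_t n = 0;
        if (!(stream >> n))
            return; // leave the object unchanged on malformed input
        abilities.assign(n, 0L);
        for (std::size_t i = 0; i < n; ++i)
            stream >> abilities[i];
    }
};

// Code that only knows the interface can serialize any implementing object.
std::string to_text(const Serializable& obj)
{
    std::ostringstream out;
    obj.write(out);
    return out.str();
}
```

This is the closest structural match to inheriting java.io.Serializable, with the important difference that the read and write bodies have to be written by hand.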

score 3 · Answer 3 · answered Jun 20 '16 at 19:42

As other answers have mentioned, C++ does not have nearly the sort of built-in serialization/deserialization capabilities that Java (or other managed languages) have. This is in part due to the minimal run-time type information (RTTI) available in C++. C++ by itself does not have reflection, so each serializable object must be completely responsible for serialization. In managed languages like Java and C#, the language includes enough RTTI for an external class to be able to enumerate the public fields on an object in order to perform the serialization.
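Because the language cannot enumerate fields, a common workaround is for each type to list its members once in a single function template and let different "archive" objects decide what to do with them. The sketch below illustrates that idea; Point, TextWriter, TextReader, save and load are hypothetical names and the space-separated text format is an assumption. The `ar & member` style resembles the one used by Boost.Serialization, but the archive types here are not Boost's.

```cpp
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Archive that writes each visited field followed by a space.
struct TextWriter
{
    std::ostream& os;
    template <class T>
    TextWriter& operator&(const T& field) { os << field << ' '; return *this; }
};

// Archive that reads each visited field in the same order.
struct TextReader
{
    std::istream& is;
    template <class T>
    TextReader& operator&(T& field) { is >> field; return *this; }
};

// The type lists its own members exactly once; this stands in for the
// field enumeration that reflection provides in Java or C#.
struct Point
{
    int x;
    int y;
    double weight;

    template <class Archive>
    void serialize(Archive& ar) { ar & x & y & weight; }
};

// Write a Point to text using the shared member list.
std::string save(Point p)
{
    std::ostringstream out;
    TextWriter w{out};
    p.serialize(w);
    return out.str();
}

// Read a Point back using the same member list.
Point load(const std::string& text)
{
    std::istringstream in(text);
    TextReader r{in};
    Point p{};
    p.serialize(r);
    return p;
}
```

The member list lives in one place, so reading and writing cannot drift apart, but adding a field still requires editing serialize() by hand.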

Matthew Sanders · Answer 4 · 2016-06-21T02:43:43.717

Luckily... C++ does not impose a default mechanism for serialization of a class hierarchy. (I wouldn't mind it supplying an optional mechanism supplied by a special base type in the standard library or something, but overall this could put limits on existing ABIs)

Yes, serialization is incredibly important and powerful in modern software engineering. I use it any time I need to translate a class hierarchy to and from some form of runtime-consumable data. The mechanism I always choose is based on some form of reflection. More on this below.

You may also want to look here for an idea of the complexities to consider and if you really wanted to verify against the standard you could purchase a copy here. It looks like the working draft for the next standard is on github.

Application specific systems

C and C++ give the author of the application the freedom to select the mechanics behind many of the technologies people take for granted in newer and often higher-level languages: reflection (RTTI), exceptions, and resource/memory management (garbage collection, RAII, etc.). These systems can all potentially impact the overall quality of a particular product.

I have worked on everything from real time games, embedded devices, mobile apps, to web applications and the overall goals of the particular project vary between them all.

Often for real time high performance games you will explicitly disable RTTI (it isn't very useful in C++ anyway to be honest) and possibly even Exceptions (Many people don't desire the overhead produced here either and if you were really crazy you could implement your own form from long jumps and such. For me Exceptions create an invisible interface that often creates bugs people wouldn't even expect to be possible, so I often avoid them anyway in favor of more explicit logic. ).

Garbage collection isn't included in C++ by default either, and in real-time games this is a blessing. Sure, you can have incremental GC and other optimized approaches, which I have seen many games use (often it is a modification of an existing GC like the one used in Mono for C#). Many games use pooling and, for C++, often RAII driven by smart pointers. It isn't unusual to have different systems with different patterns of memory usage either, which can be optimized in different ways. The point is that some applications care more than others about the nitty-gritty details.

General idea of automatic serialization of type hierarchy

The general idea of an automatic serialization system of type hierarchies is to use a reflection system that can query type information at runtime from a generic interface. My solution below relies on building that generic interface by extending upon some base type interfaces with the help of the macros. In the end you basically get a dynamic vtable of sorts that you can iterate by index or query by string names of members/types.

I also use a base reflection reader/writer type that exposes some iostream interfaces for derived formatters to override. I currently have a BinaryObjectIO, JSONObjectIO, and ASTObjectIO, but it is trivial to add others. The point of this is to remove the responsibility of serializing a particular data format from the hierarchy and put it into the serializer.

Reflection at the language level

In many situations the application knows what data it would like to serialize and there is no reason to build it into every object in the language. Many modern languages include RTTI even in the basic types of the system (if they are type based common intrinsics would be int, float, double, etc.). This requires extra data to be stored for everything in the system regardless of the usage by the application. I'm sure many modern compilers can at times optimize away some with tree shaking and such, but you can't guarantee that either.

A Declarative approach

The methods already mentioned are all valid use cases, although they lack some flexibility by having the hierarchy handle the actual serialization task. This can also bloat your code with boilerplate stream manipulation on the hierarchy.

I personally prefer a more declarative approach via reflection. What I have done in the past, and continue to do in some situations, is create a base Reflectable type in my system. I end up using template metaprogramming to help with some boilerplate logic, as well as the preprocessor for string concatenation macros. The end result is a base type that I derive from, a reflectable macro declaration to expose the interface, and a reflectable macro definition to implement the guts (tasks like adding the registered member to the type's lookup table).

So I normally end up with something that looks like this in the header file:

class ASTNode : public Reflectable 
{

...

public:
    DECLARE_CLASS

    DECLARE_MEMBER(mLine,int)
    DECLARE_MEMBER(mColumn,int)

...

};

Then something like this in the cpp:

BEGIN_REGISTER_CLASS(ASTNode,Reflectable);
REGISTER_MEMBER(ASTNode,mLine);
REGISTER_MEMBER(ASTNode,mColumn);
END_REGISTER_CLASS(ASTNode);

ASTNode::ASTNode() 
: mLine( 0 )
, mColumn( 0 )
{
}

I can then use the reflection interface directly with some methods such as:

int id = myreflectedObject.Get<int>("mID");
myreflectedObject.Set( "mID", 6 );

But much more commonly I just iterate some "Traits" data that I have exposed with another interface:

ReflectionInfo::RefTraitsList::const_iterator it = info->getReflectionTraits().begin();

Currently the traits object looks something like this:

class ReflectionTraits
    {
    public:
        ReflectionTraits( const uint8_t& type, const uint8_t& arrayType, const char* name, const ptrType_t& offset );

        std::string getName() const{ return mName; }
        ptrType_t getOffset() const{ return mOffset; }
        uint8_t getType() const{ return mType; }
        uint8_t getArrayType() const{ return mArrayType; }

    private:    
        std::string     mName;
        ptrType_t       mOffset;
        uint8_t         mType;
        uint8_t         mArrayType; // if mType == TYPE_ARRAY this will give the type of the underlying data in the array
    };

I have actually come up with improvements to my macros that allow me to simplify this a bit... but those are taken from an actual project I'm working on currently. I'm developing a programming language using Flex, Bison, and LLVM that compiles to C ABI and webassembly. I'm hoping to open source it soon enough, so if you are interested in the details let me know.

The thing to note here is that "Traits" information is metadata that is accessible at runtime and describes the member and is often much larger for general language level reflection. The information I have included here was all I needed for my reflectable types.
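To illustrate how such traits can drive serialization, here is a minimal sketch. It is not the author's implementation: the Trait struct, the FieldType tags, the mWeight member and the writeFields function are hypothetical, the table is written by hand instead of being generated by macros, and the offsets come from the standard offsetof macro.

```cpp
#include <cstddef>
#include <cstring>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical type tags; a real system would cover more types.
enum FieldType { TYPE_INT, TYPE_DOUBLE };

// Runtime description of one member: name, type tag and byte offset.
struct Trait
{
    const char* name;
    FieldType type;
    std::size_t offset;
};

// Plain standard-layout struct so that offsetof is well defined.
struct ASTNode
{
    int mLine;
    int mColumn;
    double mWeight;
};

// Hand-written table standing in for what registration macros would build.
const std::vector<Trait>& traitsOfASTNode()
{
    static const std::vector<Trait> traits = {
        {"mLine", TYPE_INT, offsetof(ASTNode, mLine)},
        {"mColumn", TYPE_INT, offsetof(ASTNode, mColumn)},
        {"mWeight", TYPE_DOUBLE, offsetof(ASTNode, mWeight)},
    };
    return traits;
}

// Generic writer: it knows nothing about ASTNode except its traits.
std::string writeFields(const void* object, const std::vector<Trait>& traits)
{
    std::ostringstream out;
    const char* base = static_cast<const char*>(object);
    for (const Trait& t : traits)
    {
        out << t.name << '=';
        if (t.type == TYPE_INT)
        {
            int v;
            std::memcpy(&v, base + t.offset, sizeof v);
            out << v;
        }
        else
        {
            double v;
            std::memcpy(&v, base + t.offset, sizeof v);
            out << v;
        }
        out << ';';
    }
    return out.str();
}
```

The output format lives entirely in writeFields, so a binary or JSON writer could iterate the same table without touching ASTNode.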

The other important aspect to keep in mind when serializing any data is version information. The above approach will deserialize data just fine until you start changing the internal data structure. You could, however, include a post and possibly pre data serialization hook mechanism with your serialization system so you can fix up data to comply with newer versions of types. I have done this a few times with setups like this and it works really well.
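A minimal sketch of the versioning idea, assuming a text format that begins with a version number. The Settings type, its fields, the function names and the default applied to old data are all hypothetical.

```cpp
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Version 1 stored only width; version 2 added height.
struct Settings
{
    int width;
    int height;
};

const int CURRENT_VERSION = 2;

// Always write the newest layout, prefixed with its version number.
void writeSettings(std::ostream& os, const Settings& s)
{
    os << CURRENT_VERSION << ' ' << s.width << ' ' << s.height;
}

// Read any known version; returns false on unknown versions or bad input.
bool readSettings(std::istream& is, Settings& s)
{
    int version = 0;
    if (!(is >> version))
        return false;

    if (version == 1)
    {
        if (!(is >> s.width))
            return false;
        s.height = s.width; // post-read fix-up: old data had no height
        return true;
    }
    if (version == 2)
        return static_cast<bool>(is >> s.width >> s.height);

    return false;
}
```

The fix-up for version 1 plays the role of the post-deserialization hook: old data is upgraded in one place and the rest of the program only ever sees the current layout.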

One final note about this technique is that you are explicitly controlling what is serialized here. You can pick and choose the data you want to serialize and the data that may just be keeping track of some transient object state.

C++ Lax guarantees

One thing to note: C++ is VERY lax about what data actually looks like, so you often have to make some platform-specific choices (this is probably one of the main reasons a standard system isn't provided). You can actually do a great deal at compile time with template metaprogramming, but sometimes it is easier to just assume your char to be 8 bits in length. Yes, even this simple assumption isn't 100% universal in C++; luckily, in most situations it holds.

The approach I use also does some non-standard casting of NULL pointers to determine memory layout (again for my purposes this is the nature of the beast). The following is an example snippet from one of the macro implementations to calculate the member offset in the type where CLASS is provided by the macro.

(ptrType_t)&reinterpret_cast<ptrType_t&>((reinterpret_cast<CLASS*>(0))->member)
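For standard-layout types, the standard offsetof macro from <cstddef> yields the same information without going through a null pointer, and platform assumptions such as 8-bit char can be checked at compile time. A minimal sketch under those assumptions; the Node type and the readIntAt helper are hypothetical and this is an alternative to, not a copy of, the macro shown above.

```cpp
#include <climits>
#include <cstddef>
#include <cstring>
#include <type_traits>

// Fail the build early if a platform assumption does not hold.
static_assert(CHAR_BIT == 8, "this code assumes 8-bit bytes");

struct Node
{
    int mLine;
    int mColumn;
};

// offsetof is only guaranteed to work for standard-layout types.
static_assert(std::is_standard_layout<Node>::value,
              "offsetof requires a standard-layout type");

// Read an int member given its byte offset inside the object.
int readIntAt(const void* object, std::size_t offset)
{
    int value;
    std::memcpy(&value, static_cast<const char*>(object) + offset, sizeof value);
    return value;
}
```

A call such as readIntAt(&node, offsetof(Node, mColumn)) then fetches the member without knowing its name at the call site. Types with virtual functions or mixed access control are generally not standard-layout, which may be one reason hand-rolled offset tricks appear in reflection systems.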

A general warning about reflection

The biggest issue with reflection is how powerful it can be. You can quickly turn an easily maintainable codebase into a huge mess with too much inconsistent usage of reflection.

I personally reserve reflection for lower level systems (primarily serialization) and avoid using it for runtime type checking for business logic. Dynamic dispatching with language constructs such as virtual functions should be preferred to reflection type check conditional jumps.

Issues are even harder to track down if the language has inherent all-or-nothing support for reflection as well. In C#, for example, you cannot guarantee, given a random codebase, that a function isn't being used simply by letting the compiler alert you to any usage. Not only can you invoke the method via a string from the codebase or, say, from a network packet; you could also break the ABI compatibility of some other unrelated assembly that reflects on the target assembly. So again, use reflection consistently and sparingly.

Conclusion

There is currently no standard equivalent to the common paradigm of a serializable class hierarchy in C++, but it can be added much like any other system you see in newer languages. After all everything eventually translates down to simplistic machine code that can be represented by the binary state of the incredible array of transistors included in your CPU die.

I'm not saying that everyone should roll their own here by any means. It is complicated and error prone work. I just really liked the idea and have been interested in this sort of thing for a while now anyways. I'm sure there are some standard fallbacks people use for this sort of work. The first place to look for C++ would be boost as you mentioned above.

If you do a search for "C++ Reflection" you will see several examples of how others achieve a similar result.

A quick search pulled up this as one example.

kudos for the effort of redaction. just don't know if I want to upvote since it doesn't focus enough on the question. — v.oddou, Feb 09 '23 at 08:15