I'm trying to measure serialization overhead with the following code:

    const int message_size=1000;

    std::vector<short> message(message_size);

    std::string s((char*)(&message[0]), message_size * sizeof(short));

    double size= 1000*sizeof(short);
    double size2= s.size();
    double overhead = size2 - size; //is zero

Is it correct? (It was taken from vector serialization)

How can I measure serialization overhead? The main problem is measuring the size of a serialized vector. I can use Boost for serialization.

damian
  • What does this have to do with Boost.Serialization? – Dark Falcon Mar 17 '14 at 16:23
  • I can (or even should) use the Boost library for serialization. – damian Mar 17 '14 at 16:24
  • The point is that your question makes no sense. There is no serialization involved, and of course `overhead` is 0 because you just created `s` with a length of `1000*sizeof(short)`. Why do you expect it to be a different size than you requested? Casting is not serialization. – Dark Falcon Mar 17 '14 at 17:02
  • Do you mean the memory overhead of Boost text serialization over binary serialization? Otherwise I don't get it either. A binary archive is supposed to have no memory overhead because it is the actual data. – Oleg Andriyanov Mar 17 '14 at 19:16
  • @OlegAndriyanov of course it has overhead because it has metadata as well. – sehe Mar 17 '14 at 19:53
  • I've actually measured the overhead. Keep in mind that if you compress the archives, the vast differences between binary/non-binary should go away in practice (for large enough archives) – sehe Mar 17 '14 at 20:26
  • I've measured the overhead by simply sending 1000 integers in an array and then 1000 integers in a vector. For my purposes it was enough. – damian May 06 '14 at 14:40

1 Answer

This generic test bed should enable you to decide: see it Live On Coliru

#include <boost/archive/binary_oarchive.hpp>
#include <boost/archive/text_oarchive.hpp>
#include <boost/archive/xml_oarchive.hpp>
#include <boost/array.hpp>
#include <boost/fusion/adapted/boost_tuple.hpp>
#include <boost/fusion/include/for_each.hpp>
#include <boost/make_shared.hpp>
#include <boost/phoenix.hpp>
#include <boost/serialization/array.hpp>
#include <boost/serialization/base_object.hpp>
#include <boost/serialization/nvp.hpp>
#include <boost/serialization/shared_ptr.hpp>
#include <boost/serialization/string.hpp>
#include <boost/serialization/vector.hpp>
#include <boost/tuple/tuple.hpp>
#include <iostream>
#include <sstream>
#include <string>
#include <vector>

namespace detail
{
    struct add_to_archive_f 
    {
        template <typename, typename> struct result { typedef void type; };
        template <typename Archive, typename T> 
            void operator()(Archive& ar, T const& t) const {
                ar << BOOST_SERIALIZATION_NVP(t);
            }
    };

    static const boost::phoenix::function<add_to_archive_f> add_to_archive { };
}

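// Serializes every argument into an in-memory archive of the requested type
// and returns the number of bytes that ended up in the underlying stream.
template <typename Archive = boost::archive::binary_oarchive, typename... Data>
size_t archive_size(Data const&... data)
{
    std::ostringstream oss;
    Archive oa(oss);

    // Apply add_to_archive(oa, element) to each element of the tuple.
    boost::fusion::for_each(boost::make_tuple(data...), 
            detail::add_to_archive(
                boost::phoenix::ref(oa), 
                boost::phoenix::arg_names::arg1
                ));

    // Note: oa is still alive here, so anything an archive writes only on
    // destruction (such as the closing tag of an XML archive) is not counted.
    return oss.str().size();
}

// Prints the instantiated signature (__PRETTY_FUNCTION__ is a GCC/Clang
// extension) followed by the archive size.
template <typename Archive = boost::archive::binary_oarchive, typename... Data>
void benchmark(Data const&... data)
{
    std::cout << __PRETTY_FUNCTION__ << ":\t" << archive_size<Archive>(data...) << "\n";
}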

struct Base {
    boost::array<double, 1000> data;
    virtual ~Base() {}

  private:
    friend class boost::serialization::access;
    template <typename Archive> void serialize(Archive& ar, unsigned /*version*/) {
        ar & BOOST_SERIALIZATION_NVP(data);
    }
};

struct Derived : Base {
    std::string x;
    Derived() : x(1000, '\0') { }

  private:
    friend class boost::serialization::access;
    template <typename Archive> void serialize(Archive& ar, unsigned /*version*/) {
        ar & boost::serialization::make_nvp("base", boost::serialization::base_object<Base>(*this));
        ar & BOOST_SERIALIZATION_NVP(x);
    }
};

Test driver:

template <typename Archive> 
void some_scenarios()
{
    benchmark<Archive>(std::vector<char>(1000));
    benchmark<Archive>(boost::make_shared<std::vector<char>>(1000));
    benchmark<Archive>(3.14f, 42, 42ull, "hello world");
    benchmark<Archive>(boost::make_shared<Base>());
    benchmark<Archive>(boost::make_shared<Derived>());
}

int main()
{
    some_scenarios<boost::archive::binary_oarchive>();
    some_scenarios<boost::archive::text_oarchive>();
    some_scenarios<boost::archive::xml_oarchive>();
}

The output on my 64-bit Ubuntu with Boost 1.55:

void benchmark(const Data& ...) [with Archive = boost::archive::binary_oarchive; Data = {std::vector<char, std::allocator<char> >}]:    1052
void benchmark(const Data& ...) [with Archive = boost::archive::binary_oarchive; Data = {boost::shared_ptr<std::vector<char, std::allocator<char> > >}]:    1059
void benchmark(const Data& ...) [with Archive = boost::archive::binary_oarchive; Data = {float, int, long long unsigned int, char [12]}]:   76
void benchmark(const Data& ...) [with Archive = boost::archive::binary_oarchive; Data = {boost::shared_ptr<Base>}]: 8069
void benchmark(const Data& ...) [with Archive = boost::archive::binary_oarchive; Data = {boost::shared_ptr<Derived>}]:  9086
void benchmark(const Data& ...) [with Archive = boost::archive::text_oarchive; Data = {std::vector<char, std::allocator<char> >}]:  2037
void benchmark(const Data& ...) [with Archive = boost::archive::text_oarchive; Data = {boost::shared_ptr<std::vector<char, std::allocator<char> > >}]:  2043
void benchmark(const Data& ...) [with Archive = boost::archive::text_oarchive; Data = {float, int, long long unsigned int, char [12]}]: 92
void benchmark(const Data& ...) [with Archive = boost::archive::text_oarchive; Data = {boost::shared_ptr<Base>}]:   2049
void benchmark(const Data& ...) [with Archive = boost::archive::text_oarchive; Data = {boost::shared_ptr<Derived>}]:    3083
void benchmark(const Data& ...) [with Archive = boost::archive::xml_oarchive; Data = {std::vector<char, std::allocator<char> >}]:   16235
void benchmark(const Data& ...) [with Archive = boost::archive::xml_oarchive; Data = {boost::shared_ptr<std::vector<char, std::allocator<char> > >}]:   17307
void benchmark(const Data& ...) [with Archive = boost::archive::xml_oarchive; Data = {float, int, long long unsigned int, char [12]}]:  436
void benchmark(const Data& ...) [with Archive = boost::archive::xml_oarchive; Data = {boost::shared_ptr<Base>}]:    19393
void benchmark(const Data& ...) [with Archive = boost::archive::xml_oarchive; Data = {boost::shared_ptr<Derived>}]: 21508

As you can see:

  • the overhead for XML is considerable
  • for binary, the overhead becomes significant for small archives with many elements of differing (e.g. polymorphic) small types
sehe
  • Added measurements with bzip2 compression: **[Live On Coliru](http://coliru.stacked-crooked.com/a/2918e2c4aec6a862)** (I've randomized the data to make the data itself incompressible) (_interestingly the 'small' data set was more efficient with compressed text archive than in compressed binary archive. All the rest of the cases were pretty much as expected_) – sehe Mar 17 '14 at 21:10