
Does anyone have any information on the performance characteristics of Protocol Buffers versus BSON (binary JSON) or versus JSON in general?

  • Wire size
  • Serialization speed
  • Deserialization speed

These seem like good binary protocols for use over HTTP. I'm just wondering which would be better in the long run for a C# environment.

Here's some info that I was reading on BSON and Protocol Buffers.
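To make the comparison concrete, here is roughly the harness I have in mind (a sketch only - it assumes the Json.NET and protobuf-net NuGet packages, `Person` is a made-up DTO, and deserialization would be timed the same way):

```csharp
// Rough sketch of a size/speed comparison (assumes the Json.NET and
// protobuf-net NuGet packages; Person is a made-up DTO for illustration).
using System;
using System.Diagnostics;
using System.IO;
using System.Text;
using Newtonsoft.Json;
using ProtoBuf;

[ProtoContract]
public class Person
{
    [ProtoMember(1)] public int Id { get; set; }
    [ProtoMember(2)] public string Name { get; set; }
}

class SerializationBenchmark
{
    static void Main()
    {
        var person = new Person { Id = 1, Name = "Alice" };
        const int iterations = 100000;

        // Wire size
        string json = JsonConvert.SerializeObject(person);
        byte[] protoBytes;
        using (var ms = new MemoryStream())
        {
            Serializer.Serialize(ms, person);
            protoBytes = ms.ToArray();
        }
        Console.WriteLine($"JSON: {Encoding.UTF8.GetByteCount(json)} bytes, protobuf: {protoBytes.Length} bytes");

        // Serialization speed
        var sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
            JsonConvert.SerializeObject(person);
        Console.WriteLine($"JSON serialize:     {sw.ElapsedMilliseconds} ms");

        sw.Restart();
        for (int i = 0; i < iterations; i++)
            using (var ms = new MemoryStream())
                Serializer.Serialize(ms, person);
        Console.WriteLine($"protobuf serialize: {sw.ElapsedMilliseconds} ms");
    }
}
```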

Jeff Meatball Yang
  • Some argue (I think this includes a former protobuf author) that it's a better idea to use a larger but cheaper-to-serialize format and then compress the output with a fast standard compressor (a sketch of this follows these comments). – CodesInChaos Apr 23 '13 at 14:06
  • http://devblog.corditestudios.com/blog/2012/10/29/bson-vs-yaml-vs-protobuf/ – laike9m Sep 22 '14 at 02:27
  • I don't think this should be reopened until a specific comparison method is proposed in the question itself (otherwise this invites rather opinionated discussion / is too broad) – YakovL Oct 22 '19 at 15:55
  • Perhaps more in terms of each format's strengths and weaknesses, and the answer might include a decision tree. – Technophile Nov 28 '20 at 00:37
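To illustrate the compress-the-output idea from the first comment, here is a minimal sketch. GZip stands in for whichever fast compressor you would actually pick (e.g. LZ4 or Snappy), and the `CompressedJson` helper is made up for illustration:

```csharp
// Sketch of the "cheap format + fast compressor" idea: serialize with a
// simple text format (JSON via Json.NET here) and run the bytes through a
// general-purpose compressor before putting them on the wire.
using System.IO;
using System.IO.Compression;
using System.Text;
using Newtonsoft.Json;

static class CompressedJson
{
    public static byte[] Serialize(object value)
    {
        byte[] json = Encoding.UTF8.GetBytes(JsonConvert.SerializeObject(value));
        using (var output = new MemoryStream())
        {
            // Dispose the GZipStream before reading the buffer so it flushes.
            using (var gzip = new GZipStream(output, CompressionMode.Compress))
                gzip.Write(json, 0, json.Length);
            return output.ToArray();
        }
    }
}
```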

4 Answers


This post compares serialization speeds and sizes in .NET, including JSON, BSON and XML.

http://james.newtonking.com/archive/2010/01/01/net-serialization-performance-comparison.aspx
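For anyone trying the formats from that post, a minimal sketch of writing the same object as JSON text and as BSON with Json.NET (class names assume the current Newtonsoft.Json.Bson package; older Json.NET versions ship an equivalent `BsonWriter` in the main package):

```csharp
// Sketch: Json.NET can write the same object graph as JSON text or as BSON.
using System.IO;
using Newtonsoft.Json;
using Newtonsoft.Json.Bson;

static class JsonVsBson
{
    public static string ToJson(object value) =>
        JsonConvert.SerializeObject(value);

    public static byte[] ToBson(object value)
    {
        using (var stream = new MemoryStream())
        {
            using (var writer = new BsonDataWriter(stream))
                new JsonSerializer().Serialize(writer, value);
            return stream.ToArray();
        }
    }
}
```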

James Newton-King

Thrift is another Protocol Buffers-like alternative.

There are good benchmarks from the Java community on serialization/deserialization and wire size of these technologies: https://github.com/eishay/jvm-serializers/wiki

In general, JSON has a slightly larger wire size and slightly worse DeSer (serialization/deserialization) performance, but it wins in ubiquity and in the ability to interpret it easily without the source IDL. The last point is something that Apache Avro is trying to solve, and Avro beats both in terms of performance.

Microsoft has released a C# NuGet package Microsoft.Hadoop.Avro.
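A rough sketch of the reflection-based serializer in that package follows; `SensorReading` is a made-up DTO, and the exact API details may vary between versions of the package:

```csharp
// Sketch of the reflection-based serializer in Microsoft.Hadoop.Avro.
// The reflection serializer picks up DataContract/DataMember attributes.
using System.IO;
using System.Runtime.Serialization;
using Microsoft.Hadoop.Avro;

[DataContract]
public class SensorReading
{
    [DataMember] public long Timestamp { get; set; }
    [DataMember] public double Value { get; set; }
}

static class AvroExample
{
    public static byte[] Serialize(SensorReading reading)
    {
        var serializer = AvroSerializer.Create<SensorReading>();
        using (var stream = new MemoryStream())
        {
            serializer.Serialize(stream, reading);
            return stream.ToArray();
        }
    }
}
```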

Michael Greene
  • Small message size doesn't automatically translate into fast performance; see this article http://soa.sys-con.com/node/250512 – vtd-xml-author Mar 03 '10 at 20:07
  • Good link; the only thing I am not sure about is the comment about Avro -- while it could work more efficiently for its core use cases (tons of similar data entries), it does not seem to perform very fast in this benchmark (which tests handling of a single request) – StaxMan Jan 05 '11 at 22:54
  • CoDec, MoDem.... I like "SeDes" better :) – nawfal Jun 15 '15 at 10:00

Here are some recent benchmarks showing the performance of popular .NET Serializers.

The Burning Monks benchmarks show the performance of serializing a simple POCO whilst the comprehensive Northwind benchmarks show the combined results of serializing a row in every table of Microsoft's Northwind dataset.

Basically, protocol buffers (protobuf-net) is around 7x quicker than the fastest base class library serializer in .NET (the XML DataContractSerializer). It is also smaller than the competition: 2.2x smaller than Microsoft's most compact serialization format (the DataContractJsonSerializer).

ServiceStack's text serializers come closest to matching the performance of the binary protobuf-net, with its JSON serializer only 2.58x slower than protobuf-net.
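For context, a sketch of the two base class library serializers those numbers refer to; `Customer` is a made-up stand-in for a Northwind row, and protobuf-net's `Serializer.Serialize(stream, obj)` plays the same role for the binary format:

```csharp
// Sketch of the two BCL serializers the benchmark compares protobuf-net against.
using System.IO;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Json;

[DataContract]
public class Customer
{
    [DataMember] public int Id { get; set; }
    [DataMember] public string Name { get; set; }
}

static class BclSerializerSizes
{
    public static long XmlSize(Customer customer)
    {
        var serializer = new DataContractSerializer(typeof(Customer));
        using (var ms = new MemoryStream())
        {
            serializer.WriteObject(ms, customer);
            return ms.Length;   // serialized size in bytes
        }
    }

    public static long JsonSize(Customer customer)
    {
        var serializer = new DataContractJsonSerializer(typeof(Customer));
        using (var ms = new MemoryStream())
        {
            serializer.WriteObject(ms, customer);
            return ms.Length;   // serialized size in bytes
        }
    }
}
```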

mythz
  • Great post - but if possible you should always put error bars onto your bar charts when showing averages. – jtromans Oct 23 '13 at 07:30

Protocol Buffers is designed for the wire:

  1. Very small message size - one aspect is its very efficient variable-sized integer representation.
  2. Very fast decoding - it is a binary protocol.
  3. protobuf generates super-efficient C++ for encoding and decoding the messages - hint: if you encode only var-integers or statically sized items into it, it will encode and decode at deterministic speed.
  4. It offers a VERY rich data model - efficiently encoding very complex data structures.

JSON is just text, and it needs to be parsed. Hint: encoding a billion (10^9) as text takes 10 characters ("1000000000"), while in binary it fits in a uint32_t (4 bytes, or about 5 bytes as a protobuf varint - see the sketch below). Now what about trying to encode a double? That would be FAR FAR worse.
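To make the integer example concrete, here is a sketch of the base-128 varint scheme protobuf uses for integers (field tags and ZigZag encoding of signed values are omitted):

```csharp
// Sketch of protobuf-style base-128 varint encoding: 7 payload bits per byte,
// continuation bit set on all but the last byte. Encode(1000000000) yields
// 5 bytes, versus the 10 characters "1000000000" as JSON text.
using System.Collections.Generic;

static class Varint
{
    public static byte[] Encode(ulong value)
    {
        var bytes = new List<byte>();
        while (value >= 0x80)
        {
            bytes.Add((byte)((value & 0x7F) | 0x80)); // low 7 bits + continuation bit
            value >>= 7;
        }
        bytes.Add((byte)value);                       // final byte, high bit clear
        return bytes.ToArray();
    }
}
```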

Hassan Syed
  • It does, however, have the rather unfortunate downside of not handling inheritance, and while composition is a valid alternative, I prefer not to be forced by my data transfer object to use composition rather than inheritance. – Barracoder Oct 31 '11 at 13:58
  • I believe Extensions can be used in a way very similar to inheritance... https://developers.google.com/protocol-buffers/docs/reference/cpp-generated#extension – kralyk Jun 01 '12 at 11:27
  • Yes, extensions are a very good point. I use them in practice at work every day. – Yngve Sneen Lindal Sep 11 '12 at 06:11
  • "protocol buffers is designed for the wire" What is "the wire"? – Marcos Pereira Jan 18 '19 at 19:47
  • @marcospgp `the wire` just means the network. Now that we use so many wireless networks it may sound weird. – Victor Yarema Feb 14 '19 at 16:26
  • @MarcosPereira think data streamed over a wire (or coax, twisted pair, fiber, whatever). – Technophile Nov 28 '20 at 00:43