14

While reading this, one quote stood out:

BSON is also designed to be fast to encode and decode. For example, integers are stored as 32 (or 64) bit integers, so they don't need to be parsed to and from text. This uses more space than JSON for small integers, but is much faster to parse.

From what I am reading, the entire point of using BSON is that it is less taxing on the CPU and faster to encode/decode.

But I did some tests with Node.js, and a native JSON approach blows BSON out of the water. My tests show JSON is around 3 to 5 times faster (and around 6 to 8 times when using more data types).

Benchmark Code:

var bson = require('bson');          // npm "bson" package (0.x API)
var BSON = new bson.BSONPure.BSON(); // pure-JavaScript parser

var os = require('os');

console.log(" OS: " + os.type() + " " + os.release() + " (" + os.arch() + ")");
console.log("RAM: " + os.totalmem() / 1048576 + " MB (total), " + os.freemem() / 1048576 + " MB (free)");
console.log("CPU: " + os.cpus()[0].speed + " MHz " + os.cpus()[0].model);

for (var r = 1; r < 4; r++) {
    console.log("\nRun #" + r + ":");
    var obj = {
        'abcdef': 1,
        'qqq': 13,
        '19': [1, 2, 3, 4]
    };

    var start = Date.now();
    for (var i = 0; i < 500000; i++) {
        JSON.parse(JSON.stringify(obj));
    }
    var stop = Date.now();
    console.log("\t      JSON: " + (stop - start) + " ms");

    start = Date.now();
    for (var i = 0; i < 500000; i++) {
        BSON.deserialize(BSON.serialize(obj));
    }
    stop = Date.now();
    console.log("\t      Bson: " + (stop - start) + " ms");
}

Results:

OS: Windows_NT 6.1.7601 (x64)
RAM: 8174.1171875 MB (total), 5105.03515625 MB (free)
CPU: 3515 MHz AMD FX(tm)-6300 Six-Core Processor

Run #1:
              JSON: 1820 ms
              Bson: 8639 ms

Run #2:
              JSON: 1890 ms
              Bson: 8627 ms

Run #3:
              JSON: 1882 ms
              Bson: 8692 ms

With that said, I am looking for a binary format to send and receive data through websockets, and BSON fits that perfectly. But looking at the benchmark results, how can BSON be less taxing on the CPU when it takes longer to serialize/deserialize objects?

Does BSON make up for the extra CPU it uses because there is no conversion to UTF-8, as there is with text-based websockets? Would that even out the performance in that regard?

@Joe Clay (below), here are the results for stringify and serialize only:

Run #1:
              JSON: 922 ms
              Bson: 3555 ms
jimjim
NiCk Newman
  • Hm, have you tried testing the serialization and the deserialization separately? The phrasing of **"much faster to parse"** makes me wonder if it's the serialization that's making the BSON benchmark so much slower, rather than it just being slower overall. – Joe Clay Apr 21 '16 at 10:53
  • 7
    Did you also install [`bson-ext`](https://github.com/christkv/bson-ext), so you are comparing _native_ implementations of both JSON and BSON? Otherwise you're comparing a pure-JS BSON implementation against a native JSON implementation. – robertklep Apr 21 '16 at 10:58
  • @JoeClay Good idea, I added those tests to my post – NiCk Newman Apr 21 '16 at 10:59
  • @robertklep Oh, nope. oops... – NiCk Newman Apr 21 '16 at 11:04
  • @robertklep: Good point, didn't consider that. – Joe Clay Apr 21 '16 at 11:08
  • I think you should do your test again with a `Buffer` object as one of the properties of your test object. Then I think you will see where BSON shines. – Wyck Jun 13 '19 at 22:43
  • That doesn't matter: the JS implementation is actually faster than the native C++ bson-ext extension. – Marc J. Schmidt Aug 03 '20 at 20:49

3 Answers

25

The question should not be "Why is JSON faster than BSON?" but "Why is JSON faster than BSON in Node.js?".

In most environments, binary encodings like BSON, MessagePack, or CBOR would be easier to encode than the textual JSON encoding. However, JavaScript environments (like V8/Node.js) are heavily optimized for JSON handling (because it's a subset of JavaScript). JSON de/encoding is most likely implemented there as optimized native code directly in the JS VM. The JavaScript VMs are, however, not as optimized for representing and manipulating byte arrays (which is what a BSON library uses). Node's native Buffer type is probably better than a pure JS array, but working with it (and doing, for example, the JS string (UTF-16) -> UTF-8 byte encoding in JS) is still slower than the built-in JSON serialization.

In other languages, like C++, with direct byte array access and UTF-8 string types, the results might be completely different.

Matthias247
  • 1
    It's worth noting that since this answer was provided, the bson-ext extension has been rewritten; while I haven't tested it myself, the node-mongodb-native page now says "The bson-ext module is an alternative BSON parser that is written in C++. It delivers better deserialization performance and similar or somewhat better serialization performance to the pure javascript parser." Since the C++ extension was previously slower than Node, that indicates a significant improvement, and it might be worth testing again. – taxilian May 01 '19 at 00:03
4

I believe Node.js and most browsers are the exception.

The simple answer is that the JSON parser/serializer/deserializer (i.e., V8's) is extremely optimized and written in C/C++. The BSON parser is written in JavaScript. But even if the parser is written in native code (and I believe BSON has such an implementation), JSON will probably still win, given how optimized V8 is for JSON.

If you use a platform like Java or C# the BSON format is probably going to be faster.

See @Matthias247's answer, which was posted after mine but is much more complete.

Peter Mortensen
Adam Gent
3

I think you can't judge the performance only by looking at serialize/deserialize. You've simply chosen the wrong use case for BSON. BSON shines in databases, where you can do calculations on the data without the need to serialize it first. Storing and retrieving binary data such as images also makes BSON more efficient, as you don't need to encode the data as hex/Base64 or similar.

Try doing some calculations while retrieving/storing the values directly in JSON and BSON. But use random access (not always the same item), so that there is little chance the access gets optimized under the hood.

Valentin H