I've read everywhere that JSON cannot encode binary data, so I wrote this simple test to check if that's actually true.

// Reads the chosen file as a binary string and checks whether it survives a JSON round trip.
function test(elem) {
    const reader = new FileReader();
    reader.onload = () => {
        const json = JSON.stringify(reader.result);
        const isCorrect = JSON.parse(json) === reader.result;
        alert('JSON stringification correct: ' + isCorrect);
    };
    reader.readAsBinaryString(elem.files[0]);
}
Choose a binary file: <br>
<input type=file onchange="test(this)">

You choose a binary file from your computer. The test function reads that file as a binary string, passes the string through JSON.stringify, parses the result back, and compares it with the original binary string.

I have tried with lots and lots of binary files (.exe files mostly), and I just can't find a single file that cannot be JSON-ified.

Can you give an example of something that cannot be converted to a JSON string?

Seth
  • `readAsBinaryString` gives you a **string** representation of the data, not raw binary data. – Quentin Nov 17 '19 at 22:31
  • @Quentin, Ok, then what would be an example of something that cannot be converted to JSON? – Seth Nov 17 '19 at 22:34
  • I suspect strings that contain (or binary data that represent) invalid escape sequences would pose a problem. – Bergi Nov 17 '19 at 23:02
  • @Seth Try using [`readAsArrayBuffer`](https://developer.mozilla.org/en-US/docs/Web/API/FileReader/readAsArrayBuffer) to get raw binary data. `JSON.stringify` will not work on that (unless you encode it as an array of integers, or a string, etc) – Bergi Nov 17 '19 at 23:04
  • @Quentin, an `ArrayBuffer` object cannot be read directly. You have to use one of the [`TypedArray`](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/TypedArray)s. – Seth Nov 17 '19 at 23:13

1 Answer

I think you do not understand this correctly.

First of all, what do you mean by "a JSON string"? Do you mean the result of JSON.stringify() or the string data type in a JSON document? Let's look at the latter, because I think this is what the statement "cannot contain binary data" is about.

If you look at the spec, a JSON string cannot contain every possible character directly. In particular, control characters (U+0000 through U+001F), the double quote, and the backslash must not appear unescaped. This means a JSON string cannot contain arbitrary (binary) data directly. However, you can use an escape sequence (such as \uXXXX) to represent these characters, which is a type of encoding. JSON.stringify() does this for you automatically.

For example:

const s = String.fromCodePoint(65, 0, 66); // A "binary" string: 'A', 0x00, 'B'
JSON.stringify(s); // returns the JSON text "A\u0000B" (quotes included)

JSON.parse() knows about these escape sequences as well and will restore the binary data.

So a JSON string data type can encode binary data but it cannot contain all binary data directly, without encoding.
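
To see why the test in the question never fails, here is a minimal sketch that builds a string containing every byte value from 0x00 to 0xFF once (roughly what readAsBinaryString() produces) and sends it through the same round trip. The variable names are made up for illustration.

```javascript
// Build a "binary" string holding each byte value 0x00..0xFF exactly once,
// similar in shape to the result of FileReader.readAsBinaryString().
const bytes = Array.from({ length: 256 }, (_, i) => i);
const binaryString = String.fromCharCode(...bytes);

// Same round trip as in the question.
const json = JSON.stringify(binaryString);
const restored = JSON.parse(json);

console.log(restored === binaryString);         // true
console.log(json.includes("\\u0000"));          // true: the NUL character was escaped
console.log(json.length > binaryString.length); // true: escaping makes the JSON text longer
```

Every character survives because the ones that are not allowed unescaped are written as escape sequences and decoded again by JSON.parse().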

Some additional notes:

  • Handling binary data correctly in JavaScript (and many other languages) can be difficult. String data types were not designed for binary data. For example, you have to know the encoding that is used to store the String data internally (JavaScript strings are sequences of UTF-16 code units, not bytes).
  • Usually, binary data is not encoded using escape sequences but using more efficient encoding schemes such as Base64.
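
The Base64 approach from the last note could look roughly like the sketch below. It assumes an environment where `btoa` and `atob` are available as globals (browsers and recent Node.js versions); `bytesToBase64` and `base64ToBytes` are hypothetical helper names, not part of any library.

```javascript
// Hypothetical helper: encode raw bytes (Uint8Array) as a Base64 string.
function bytesToBase64(bytes) {
  let binary = "";
  for (const b of bytes) binary += String.fromCharCode(b);
  return btoa(binary);
}

// Hypothetical helper: decode a Base64 string back into a Uint8Array.
function base64ToBytes(b64) {
  const binary = atob(b64);
  return Uint8Array.from(binary, (ch) => ch.charCodeAt(0));
}

const original = new Uint8Array([0x00, 0xff, 0x10, 0x80, 0x41]);

// Raw binary containers are not serialized in a useful way:
console.log(JSON.stringify(original.buffer)); // "{}"

// Encoding the bytes as Base64 first gives a plain string that JSON can carry.
const json = JSON.stringify({ data: bytesToBase64(original) });
const restored = base64ToBytes(JSON.parse(json).data);

console.log(json);
console.log(restored.every((b, i) => b === original[i])); // true
```

Base64 uses four output characters for every three input bytes, whereas \u escapes need six characters for a single control character.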
rveerd
  • In this question https://stackoverflow.com/questions/1443158/binary-data-in-json-string-something-better-than-base64 what do they mean by "**The JSON format natively doesn't support binary data**" – Seth Nov 18 '19 at 02:39
  • Also, on this page https://github.com/nlohmann/json/issues/587 somebody says "**JSON cannot encode arbitrary binary, just sequences of code points**" – Seth Nov 18 '19 at 02:55
  • @Seth The first post mentions the need to escape characters that are not allowed in JSON strings. The second post relates to the fact that it is difficult to handle binary data using JavaScript strings. To avoid all these difficulties, the common approach is to convert binary data to a Base64 encoded JavaScript string and then convert it to JSON. – rveerd Nov 19 '19 at 01:56