How to compare big JSONs?

Question

I have two big JSON files (~GB each) created from the same source by two pieces of code that should work exactly the same, but small differences may sometimes appear.

I want to be sure that both conversions did the same job, and did it properly. I made a small project that converts a JSON file to a kind of CSV, listing the path to each element together with its content (a single value, or an array/object). The plan is then to compare the two "CSV" outputs with any text diff tool.

The source (on my GitHub) is too long for the limited space here and is not the main part of the question. In general it does not work properly yet (tested on a big simple JSON and on the one shown here; I did not want to use recursion, so the code is quite tricky).

To illustrate the idea, here is an example export of JSON Data Set Sample (Example 4) with ID set to type, Horizontal format and Sort All:

batters\batter\Blueberry\   id  1003
batters\batter\Devil's Food\    id  1004
batters\batter\Chocolate\   id  1002
batters\batter\Regular\ id  1001
donut\  id  0001    name    Cake    ppu 0.55
topping\Glazed\ id  5002
topping\Chocolate with Sprinkles\   id  5006
topping\Chocolate\  id  5003
topping\Maple\  id  5004
topping\None\   id  5001
topping\Powdered Sugar\ id  5007
topping\Sugar\  id  5005

Is this a good idea, or are there better options?

Processed JSON preview:

{ "type": "donut",
  "id": "0001", "name": "Cake", "ppu": 0.55,
  "batters": { "batter": [ {
        "id": "1001", "type": "Regular"
      },{
        "id": "1002", "type": "Chocolate"
      },{
        "id": "1003", "type": "Blueberry"
      },{
        "id": "1004", "type": "Devil's Food"
  }]},
  "topping": [ {
      "id": "5001", "type": "None"
    },{
      "id": "5002", "type": "Glazed"
    },{
      "id": "5005", "type": "Sugar"
    }, ...
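The path/value flattening described in the question can be sketched without recursion by keeping an explicit stack. This is only a minimal illustration in Python, not the asker's C# project: the names `flatten`, `to_lines` and `diff_documents` are hypothetical, array elements are addressed by index rather than by their `type` value, and the whole document is assumed to fit in memory (for GB-sized files a streaming parser would be needed instead).

```python
import difflib
import json


def flatten(doc, sep="\\"):
    """Yield (path, value) for every leaf, using an explicit stack (no recursion)."""
    stack = [((), doc)]
    while stack:
        parts, node = stack.pop()
        if isinstance(node, dict) and node:
            # Non-empty object: descend into each member.
            stack.extend((parts + (key,), child) for key, child in node.items())
        elif isinstance(node, list) and node:
            # Non-empty array: the element index becomes part of the path.
            stack.extend((parts + (str(i),), child) for i, child in enumerate(node))
        else:
            # Scalar, or an empty object/array, is emitted as a leaf.
            yield sep.join(parts), node


def to_lines(doc):
    """One 'path<TAB>value' line per leaf, sorted so object key order does not matter."""
    return sorted(f"{path}\t{json.dumps(value)}" for path, value in flatten(doc))


def diff_documents(left, right):
    """Unified text diff of the flattened, sorted representations."""
    return list(difflib.unified_diff(to_lines(left), to_lines(right),
                                     "left", "right", lineterm=""))


if __name__ == "__main__":
    a = json.loads('{"id": "0001", "ppu": 0.55, "topping": [{"id": "5001", "type": "None"}]}')
    b = json.loads('{"ppu": 0.56, "id": "0001", "topping": [{"type": "None", "id": "5001"}]}')
    print("\n".join(diff_documents(a, b)))
```

Because the lines are sorted, a different key order inside an object produces no diff, while a reordered array does, since the index is part of the path.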

It is a little unclear to me what you are trying to achieve here. Do you want to compare the first snippet and the second? Or do you produce the first snippet to compare it to something else ?? — Fildor, May 09 '19 at 14:21
what's your definition of "compare"? Are you trying to verify that the conversion process from JSON to CSV was correct? — ADyson, May 09 '19 at 15:16
so you want to know if any properties have been added/changed/removed within a JSON object or array? And recursively within that structure too, I assume. — ADyson, May 09 '19 at 15:28
And you're asking if converting it to CSV is useful for doing that? Is that your question? — ADyson, May 09 '19 at 15:28
How about computing a hash? If anything changes, the hashes will be different. Way more efficient, given you only need to know that "something has changed". — Fildor, May 10 '19 at 06:44
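The hash suggestion from the last comment could look roughly like the sketch below. It assumes that a difference in object key order should not count as a change, so the document is serialized with sorted keys before hashing; `canonical_hash` is a hypothetical helper name, and the document is again assumed to fit in memory. A matching hash only says "nothing changed"; it does not say where a difference is.

```python
import hashlib
import json


def canonical_hash(doc):
    """SHA-256 of a canonical serialization (sorted keys, no extra whitespace)."""
    canonical = json.dumps(doc, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    a = json.loads('{"id": "0001", "ppu": 0.55}')
    b = json.loads('{"ppu": 0.55, "id": "0001"}')
    # Same content in a different key order gives the same hash.
    print(canonical_hash(a) == canonical_hash(b))
```

Array order still matters here, so two files that list the same elements in a different order would hash differently.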

Jan · Answer 1 · 2019-05-24T07:41:52.740


I have at least two options so far: the first is described here in another question, and the second is a new method, RemoveTwins, added to Gason (C++ translated to C#), on my GitHub.

edited May 24 '19 at 07:41

answered May 21 '19 at 07:47

