20

I want to diff two JSON text files. Unfortunately they're constructed in arbitrary order, so I get diffs when they're semantically identical. I'd like to use jq (or whatever) to sort them in any kind of full order, to eliminate differences due only to element ordering.

--sort-keys solves half the problem, but it doesn't sort arrays.

I'm pretty ignorant of jq and don't know how to write a jq recursive filter that preserves all data; any help would be appreciated.

I realize that line-by-line 'diff' output isn't necessarily the best way to compare two complex objects, but in this case I know the two files are very similar (nearly identical) and line-by-line diffs are fine for my purposes.

"Using jq or alternative command line tools to diff JSON files" answers a very similar question, but doesn't print the differences. Also, I want to save the sorted results, so what I really want is just a filter program to sort JSON.

Community
Jeff Learman
  • 2
    Possible duplicate of [Using jq or alternative command line tools to diff JSON files](http://stackoverflow.com/questions/31930041/using-jq-or-alternative-command-line-tools-to-diff-json-files) – Ewan Mellor Jul 08 '16 at 02:12

4 Answers

19

Here is a solution using a generic function sorted_walk/1 (so named for the reason described in the postscript below).

normalize.jq:

# Apply f to composite entities recursively using keys[], and to atoms
def sorted_walk(f):
  . as $in
  | if type == "object" then
      reduce keys[] as $key
        ( {}; . + { ($key):  ($in[$key] | sorted_walk(f)) } ) | f
  elif type == "array" then map( sorted_walk(f) ) | f
  else f
  end;

def normalize: sorted_walk(if type == "array" then sort else . end);

normalize

Example using bash:

diff <(jq -S -f normalize.jq FILE1) <(jq -S -f normalize.jq FILE2)

POSTSCRIPT: The builtin definition of walk/1 was revised after this response was first posted: it now uses keys_unsorted rather than keys.
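Given that postscript, a shorter variant is possible on newer jq releases. The following is only a sketch, assuming jq 1.6 or later (where walk/1 is a builtin); the function name normalize_json and the two sample documents are made up for illustration. Since the builtin walk no longer sorts keys itself, -S is left to sort them on output.

```shell
# Sketch only: assumes jq 1.6+ so that walk/1 is available as a builtin.
# normalize_json is a hypothetical helper name, not part of jq.
normalize_json() {
  # walk works bottom-up, so nested arrays are sorted before their parents;
  # -S sorts the keys of every object when the result is printed.
  jq -S 'walk(if type == "array" then sort else . end)' "$@"
}

# Hypothetical inputs: the same data with different key and element order.
a='{"b":[3,1,2],"a":[{"y":1,"x":2},{"x":1}]}'
b='{"a":[{"x":1},{"x":2,"y":1}],"b":[2,3,1]}'

na=$(printf '%s\n' "$a" | normalize_json)
nb=$(printf '%s\n' "$b" | normalize_json)

if [ "$na" = "$nb" ]; then
  echo "identical after normalization"
fi
```

Because every array is sorted, files that differ only in array order compare equal. That is the goal here, but it also hides differences in arrays whose order is meaningful.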

peak
  • Just what I needed, thanks! I see you posted a variation of this solution in the related post, but the simpler example here answered a number of questions. – Jeff Learman Jul 08 '16 at 16:24
14

I want to diff two JSON text files.

Use jd with the -set option:

No output means no difference.

$ jd -set A.json B.json

Differences are shown as an @ path and + or -.

$ jd -set A.json C.json

@ ["People",{}]
+ "Carla"

The output diffs can also be used as patch files with the -p option.

$ jd -set -o patch A.json C.json; jd -set -p patch B.json

{"City":"Boston","People":["John","Carla","Bryan"],"State":"MA"}

https://github.com/josephburnett/jd#command-line-usage

Joe Burnett
  • 1
    Love learning about new tools that make my life easier. Thanks for sharing the answer using the "new to me" `jd` :) – TryTryAgain Sep 11 '19 at 17:37
4

I'm surprised this isn't a more popular question/answer. I haven't seen any other json deep sort solutions. Maybe everyone likes solving the same problem over and over.

Here's a wrapper for @peak's excellent solution above. It packages the filter as a shell script that works in a pipe or with a file argument.

#!/usr/bin/env bash

# JSON normalizer
# Recursively sorts an entire JSON file: object keys and array elements.
# jq --sort-keys alone sorts object keys but leaves arrays unsorted.
# Sorting both puts the data in a deterministic order, which makes JSON diffable.
# Run it on any JSON data kept in source control to prevent excessive diffs from reordering.

[ "${DEBUG}" ] && set -x
TMP_FILE="$(mktemp)"
trap 'rm -f -- "${TMP_FILE}"' EXIT

cat > "${TMP_FILE}" <<-EOT
# Apply f to composite entities recursively using keys[], and to atoms
def sorted_walk(f):
  . as \$in
  | if type == "object" then
      reduce keys[] as \$key
        ( {}; . + { (\$key):  (\$in[\$key] | sorted_walk(f)) } ) | f
  elif type == "array" then map( sorted_walk(f) ) | f
  else f
  end;

def normalize: sorted_walk(if type == "array" then sort else . end);

normalize
EOT

# Don't pollute stdout with debug output
[ "${DEBUG}" ] && cat "${TMP_FILE}" >&2

# Quote the paths so file names containing spaces are handled correctly
if [ "$1" ] ; then
    jq -S -f "${TMP_FILE}" "$1"
else
    jq -S -f "${TMP_FILE}"
fi
Bruce Edge
0

jq has a --sort-keys option you can use.

See the jq manpage for reference.
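As the question points out, this option covers only part of the problem. A quick check with a made-up sample (the variable names are illustrative) shows what it does and does not reorder:

```shell
# Hypothetical sample input: keys and array elements are both out of order.
sample='{"b":[3,1,2],"a":{"d":2,"c":1}}'

# -S (--sort-keys) sorts the keys of every object, including nested ones;
# -c prints compact single-line output.
sorted_keys=$(printf '%s\n' "$sample" | jq -S -c .)

printf '%s\n' "$sorted_keys"
# prints {"a":{"c":1,"d":2},"b":[3,1,2]}
```

The object keys come out sorted at every level, but the array [3,1,2] keeps its original order, so two files that differ only in array order would still produce a diff.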

mozway