158

How can I test whether two JSON objects are equal in python, disregarding the order of lists?

For example ...

JSON document a:

{
    "errors": [
        {"error": "invalid", "field": "email"},
        {"error": "required", "field": "name"}
    ],
    "success": false
}

JSON document b:

{
    "success": false,
    "errors": [
        {"error": "required", "field": "name"},
        {"error": "invalid", "field": "email"}
    ]
}

a and b should compare equal, even though the order of the "errors" lists is different.

Petter Friberg
  • 2
    Duplicate of http://stackoverflow.com/questions/11141644/how-to-compare-2-json-in-python – user2085282 Sep 15 '14 at 15:09
  • 1
    Why not just decode them and compare? Or do you mean that the order of the "Array" or `list` elements doesn't matter either? – mgilson Sep 15 '14 at 15:11
  • @user2085282 That question has a different problem going on. – user193661 Dec 10 '15 at 01:29
  • 2
    Please forgive my naivety, but why? List elements have a specific order for a reason. – ATOzTOA Apr 07 '17 at 20:55
  • 1
    As noted in this answer, a JSON array is sorted so these objects containing arrays with different sort orders wouldn't be equal in the strict sense. https://stackoverflow.com/a/7214312/18891 – Eric Ness May 04 '18 at 13:32
  • To @ATOzTOA and others asking why: a common reason would be that you really have sets, and you want to test for set equality, but you had to force your data into lists because JSON doesn't have sets. – Ken Williams Jun 24 '20 at 21:05

9 Answers

207

If you want two objects with the same elements but in a different order to compare equal, then the obvious thing to do is compare sorted copies of them - for instance, for the dictionaries represented by your JSON strings a and b:

import json

a = json.loads("""
{
    "errors": [
        {"error": "invalid", "field": "email"},
        {"error": "required", "field": "name"}
    ],
    "success": false
}
""")

b = json.loads("""
{
    "success": false,
    "errors": [
        {"error": "required", "field": "name"},
        {"error": "invalid", "field": "email"}
    ]
}
""")
>>> sorted(a.items()) == sorted(b.items())
False

... but that doesn't work, because in each case, the "errors" item of the top-level dict is a list with the same elements in a different order, and sorted() doesn't try to sort anything except the "top" level of an iterable.

To fix that, we can define an ordered function which will recursively sort any lists it finds (and convert dictionaries to lists of (key, value) pairs so that they're orderable):

def ordered(obj):
    if isinstance(obj, dict):
        return sorted((k, ordered(v)) for k, v in obj.items())
    if isinstance(obj, list):
        return sorted(ordered(x) for x in obj)
    else:
        return obj

If we apply this function to a and b, the results compare equal:

>>> ordered(a) == ordered(b)
True
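
As the comments on this answer point out, sorting raises a TypeError when a list mixes types (for instance a string next to a dict). One possible workaround, offered as a sketch rather than as part of the original answer, is to keep dictionaries as dictionaries and sort each list by the JSON serialization of its elements. The names canonical and json_equal_unordered are made up for this illustration:

```python
import json


def canonical(obj):
    """Return a copy of obj in which every list is sorted by its JSON text."""
    if isinstance(obj, dict):
        # Dict equality already ignores key order, so only the values need work.
        return {k: canonical(v) for k, v in obj.items()}
    if isinstance(obj, list):
        # Strings are always orderable, so mixed-type lists can be sorted too.
        return sorted((canonical(x) for x in obj),
                      key=lambda x: json.dumps(x, sort_keys=True))
    return obj


def json_equal_unordered(a, b):
    """Compare two decoded JSON values, ignoring the order of list elements."""
    return canonical(a) == canonical(b)


left = {"errors": [{"error": "invalid", "field": "email"}, "oops"], "success": False}
right = {"success": False, "errors": ["oops", {"field": "email", "error": "invalid"}]}
print(json_equal_unordered(left, right))
```

This assumes both inputs came from json.loads (or are otherwise JSON-serializable), because json.dumps is used as the sort key.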
Zero Piraeus
  • 1
    Thank you so much Zero Piraeus. It's exactly the general solution that I need, but the only problem is that the code works only for Python 2.x, not for Python 3. I get the following error: TypeError: unorderable types: dict() < dict(). Anyway, the solution is now clear. I will try to make it work for Python 3. Thanks a lot –  Sep 16 '14 at 09:20
  • 4
    @HoussamHsm I meant to fix this to work with Python 3.x when you first mentioned the unorderable dicts problem, but somehow it got away from me. It now works in both 2.x and 3.x :-) – Zero Piraeus Mar 19 '15 at 06:18
  • When there's a list like `['astr', {'adict': 'something'}]`, I get a `TypeError` when trying to sort it. – betteroutthanin Jan 04 '16 at 08:13
  • 1
    @Blairg23 you've misunderstood the question, which is about comparing JSON objects as equal when they contain lists whose elements are the same, but in a different order, *not* about any supposed order of dictionaries. – Zero Piraeus May 21 '17 at 19:49
  • Yes, if you compare two dictionaries with the same elements, but in a different order, you don't need to order them beforehand. JSON objects are equivalent to dictionaries when you import them into Python. Try it, you'll see what I mean. Update: I see what the question asks now. It says "different order of list elements", which is a totally different problem. Two dictionaries will be equivalent regardless of order, unless they contain a list with differently ordered elements. – Blairg23 May 22 '17 at 09:19
  • 1
    @Blairg23 I agree that the question could be more clearly written (although if you look at the [edit history](http://stackoverflow.com/posts/25851183/revisions), it's better than it started out). Re: dictionaries and order – [yes, I know](http://stackoverflow.com/a/25830458) ;-) – Zero Piraeus May 22 '17 at 09:39
  • replace `return obj` for `return str(obj)` if you are ordering objects that contain different types otherwise you will get `TypeError`. – Aikanáro Apr 30 '21 at 02:19
  • I was comparing a data structure loaded from disk with the supposedly equivalent "fresh" data structure (to decide whether it had changed and thus should be written again). Only catch: the new data contains some tuples, while the data from disk has lists in their place, causing the comparison to fail. Solution: test `ordered(data_from_disk) == ordered(json.loads(json.dumps(fresh_data)))` – Shi Jan 26 '22 at 05:21
83

Another way is to use the sort_keys=True option of json.dumps:

import json
a, b = json.dumps(a, sort_keys=True), json.dumps(b, sort_keys=True)
a == b # a normal string comparison

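This works for nested dictionaries, including dictionaries inside lists. List elements keep their original order, so lists that differ only in element order still compare unequal.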
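
To make the scope of sort_keys concrete, here is a small sketch (the helper name dumps_equal is made up for illustration). It shows that key order is normalized at every nesting level, while list order is preserved, which is the limitation raised in the comments below:

```python
import json


def dumps_equal(x, y):
    """Compare two JSON-compatible values by their key-sorted serialization."""
    return json.dumps(x, sort_keys=True) == json.dumps(y, sort_keys=True)


# Same content, different key order at two nesting levels: equal.
print(dumps_equal({"a": 1, "b": {"c": 2, "d": 3}}, {"b": {"d": 3, "c": 2}, "a": 1}))  # True

# Same elements, different list order: not equal, because lists are left untouched.
print(dumps_equal({"foo": [3, 1, 2]}, {"foo": [2, 1, 3]}))  # False
```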

stpk
  • 1
    ```{"error":"a"}, {"error":"b"}``` vs ```{"error":"b"}, {"error":"a"}``` it won't be able to sort the latter case into first case – ChromeHearts Oct 07 '16 at 14:36
  • @Blairg23 but what would you do if you have lists nested in the dict? You can't just compare the top-level dict and call it a day, this is not what this question is about. – stpk May 23 '17 at 13:19
  • 9
    This doesn't work if you have lists inside. e.g. `json.dumps({'foo': [3, 1, 2]}, sort_keys=True) == json.dumps({'foo': [2, 1, 3]}, sort_keys=True)` – Danil Feb 20 '18 at 21:16
  • 14
    @Danil and probably it shouldn't. Lists are an ordered structure and if they differ only in order, we should consider them different. Maybe for your usecase the order doesn't matter, but we shouldn't assume that. – stpk Feb 22 '18 at 13:24
  • because lists are ordered by index they won't be resorted. [0, 1] shouldn't equal [1, 0] in most situations. So this is a good solution for the normal case, but not for the question above. still +1 – Harrison Oct 05 '18 at 13:37
  • 4
    @stpk given that lists are an ordered structure does not mean there can be no task to check whether two lists contain same elements regardless of their order. Just the same thing applies to a dictionary aka the question – igrek Feb 06 '20 at 13:06
  • At least this is relatively easy, but it really does mask intent. Every time I dive back into python I'm shocked at how "batteries NOT included" it is – Robert Moskal Mar 02 '22 at 17:30
20

Decode them and compare them, as mgilson's comment suggests.

Order does not matter for dictionaries as long as the keys and values match. (Dictionary equality ignores key order in Python.)

>>> {'a': 1, 'b': 2} == {'b': 2, 'a': 1}
True

But order is important in lists; sorting will solve the problem for the lists.

>>> [1, 2] == [2, 1]
False
>>> [1, 2] == sorted([2, 1])
True

>>> import json
>>> a = '{"errors": [{"error": "invalid", "field": "email"}, {"error": "required", "field": "name"}], "success": false}'
>>> b = '{"errors": [{"error": "required", "field": "name"}, {"error": "invalid", "field": "email"}], "success": false}'
>>> a, b = json.loads(a), json.loads(b)
>>> # dicts are not orderable in Python 3, so sort by each dict's sorted items
>>> a['errors'].sort(key=lambda e: sorted(e.items()))
>>> b['errors'].sort(key=lambda e: sorted(e.items()))
>>> a == b
True

The above example works for the JSON in the question. For a general solution, see Zero Piraeus's answer.

falsetru
8

Yes! You can use jycm

from jycm.helper import make_ignore_order_func
from jycm.jycm import YouchamaJsonDiffer

a = {
    "errors": [
        {"error": "invalid", "field": "email"},
        {"error": "required", "field": "name"}
    ],
    "success": False
}
b = {
    "success": False,
    "errors": [
        {"error": "required", "field": "name"},
        {"error": "invalid", "field": "email"}
    ]
}
ycm = YouchamaJsonDiffer(a, b, ignore_order_func=make_ignore_order_func([
    "^errors",
]))
ycm.diff()
assert ycm.to_dict(no_pairs=True) == {} # aka no diff

For a more complex example (value changes in a deep structure):

from jycm.helper import make_ignore_order_func
from jycm.jycm import YouchamaJsonDiffer

a = {
    "errors": [
        {"error": "invalid", "field": "email"},
        {"error": "required", "field": "name"}
    ],
    "success": True
}

b = {
    "success": False,
    "errors": [
        {"error": "required", "field": "name-1"},
        {"error": "invalid", "field": "email"}
    ]
}
ycm = YouchamaJsonDiffer(a, b, ignore_order_func=make_ignore_order_func([
    "^errors",
]))
ycm.diff()
assert ycm.to_dict() == {
    'just4vis:pairs': [
        {'left': 'invalid', 'right': 'invalid', 'left_path': 'errors->[0]->error', 'right_path': 'errors->[1]->error'},
        {'left': {'error': 'invalid', 'field': 'email'}, 'right': {'error': 'invalid', 'field': 'email'},
         'left_path': 'errors->[0]', 'right_path': 'errors->[1]'},
        {'left': 'email', 'right': 'email', 'left_path': 'errors->[0]->field', 'right_path': 'errors->[1]->field'},
        {'left': {'error': 'invalid', 'field': 'email'}, 'right': {'error': 'invalid', 'field': 'email'},
         'left_path': 'errors->[0]', 'right_path': 'errors->[1]'},
        {'left': 'required', 'right': 'required', 'left_path': 'errors->[1]->error',
         'right_path': 'errors->[0]->error'},
        {'left': {'error': 'required', 'field': 'name'}, 'right': {'error': 'required', 'field': 'name-1'},
         'left_path': 'errors->[1]', 'right_path': 'errors->[0]'},
        {'left': 'name', 'right': 'name-1', 'left_path': 'errors->[1]->field', 'right_path': 'errors->[0]->field'},
        {'left': {'error': 'required', 'field': 'name'}, 'right': {'error': 'required', 'field': 'name-1'},
         'left_path': 'errors->[1]', 'right_path': 'errors->[0]'},
        {'left': {'error': 'required', 'field': 'name'}, 'right': {'error': 'required', 'field': 'name-1'},
         'left_path': 'errors->[1]', 'right_path': 'errors->[0]'}
    ],
    'value_changes': [
        {'left': 'name', 'right': 'name-1', 'left_path': 'errors->[1]->field', 'right_path': 'errors->[0]->field',
         'old': 'name', 'new': 'name-1'},
        {'left': True, 'right': False, 'left_path': 'success', 'right_path': 'success', 'old': True, 'new': False}
    ]
}

whose results can be rendered as a visual diff (image omitted).

eggachecat
  • 133
  • 2
  • 6
  • 1
    JYCM now has a CLI tool you can use directly to visualize the diff result! https://github.com/eggachecat/jycm – eggachecat Aug 17 '22 at 07:01
  • +1 for the capability to specify ignoring orders for specific keys, and for defining your own diff functions. – Gino Mempin Nov 03 '22 at 10:13
3

You can write your own equals function:

  • dicts are equal if: 1) all keys are equal, 2) all values are equal
  • lists are equal if: all items are equal and in the same order
  • primitives are equal if a == b

Because you're dealing with JSON, you'll have standard Python types: dict, list, etc., so you can do strict type checking with if type(obj) == dict:, etc.

Rough example (not tested):

def json_equals(jsonA, jsonB):
    if type(jsonA) != type(jsonB):
        # values of different types are never equal
        return False
    if type(jsonA) == dict:
        if len(jsonA) != len(jsonB):
            return False
        for keyA in jsonA:
            if keyA not in jsonB or not json_equals(jsonA[keyA], jsonB[keyA]):
                return False
        # same number of keys and every key/value matched
        return True
    elif type(jsonA) == list:
        if len(jsonA) != len(jsonB):
            return False
        for itemA, itemB in zip(jsonA, jsonB):
            if not json_equals(itemA, itemB):
                return False
        # same length and every item matched in order
        return True
    else:
        return jsonA == jsonB
John Doe
  • 354
  • 2
  • 10
Gordon Bean
  • 4,272
  • 1
  • 32
  • 47
2

For the following two dicts 'dictWithListsInValue' and 'reorderedDictWithReorderedListsInValue' which are simply reordered versions of each other

dictObj = {"foo": "bar", "john": "doe"}
reorderedDictObj = {"john": "doe", "foo": "bar"}
dictObj2 = {"abc": "def"}
dictWithListsInValue = {'A': [{'X': [dictObj2, dictObj]}, {'Y': 2}], 'B': dictObj2}
reorderedDictWithReorderedListsInValue = {'B': dictObj2, 'A': [{'Y': 2}, {'X': [reorderedDictObj, dictObj2]}]}
a = {"L": "M", "N": dictWithListsInValue}
b = {"L": "M", "N": reorderedDictWithReorderedListsInValue}

print(sorted(a.items()) == sorted(b.items()))  # gives false

gave me the wrong result, i.e. False.

So I created my own custom ObjectComparator like this:

def my_list_cmp(list1, list2):
    if len(list1) != len(list2):
        return False

    for l in list1:
        found = False
        for m in list2:
            if my_obj_cmp(l, m):
                found = True
                break

        if not found:
            return False

    return True


def my_obj_cmp(obj1, obj2):
    if isinstance(obj1, list):
        if (not isinstance(obj2, list)):
            return False
        return my_list_cmp(obj1, obj2)
    elif (isinstance(obj1, dict)):
        if (not isinstance(obj2, dict)):
            return False
        exp = set(obj2.keys()) == set(obj1.keys())
        if (not exp):
            # print(obj1.keys(), obj2.keys())
            return False
        for k in obj1.keys():
            # recurse, so a list/dict vs. scalar mismatch returns False instead of raising
            if not my_obj_cmp(obj1[k], obj2[k]):
                return False
    else:
        return obj1 == obj2

    return True


dictObj = {"foo": "bar", "john": "doe"}
reorderedDictObj = {"john": "doe", "foo": "bar"}
dictObj2 = {"abc": "def"}
dictWithListsInValue = {'A': [{'X': [dictObj2, dictObj]}, {'Y': 2}], 'B': dictObj2}
reorderedDictWithReorderedListsInValue = {'B': dictObj2, 'A': [{'Y': 2}, {'X': [reorderedDictObj, dictObj2]}]}
a = {"L": "M", "N": dictWithListsInValue}
b = {"L": "M", "N": reorderedDictWithReorderedListsInValue}

print(my_obj_cmp(a, b))  # gives true

which gave me the correct expected output!

Logic is pretty simple:

If the objects are lists, compare each item of the first list with the items of the second list until a match is found. If no match is found after going through the whole second list, 'found' stays False and the comparison returns False.

Else if the objects to be compared are dicts, compare the values for each key in both objects. (The comparison is recursive.)

Else simply evaluate obj1 == obj2. This works for strings and numbers, and for any objects whose __eq__() is defined appropriately.

(Note that the algorithm can further be improved by removing the items found in object2, so that the next item of object1 would not compare itself with the items already found in the object2)
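
The improvement mentioned in the note also affects correctness: without removing matched items, lists such as [1, 1, 2] and [1, 2, 2] compare equal, because every item of the first list is found somewhere in the second. A minimal sketch of the variant that consumes each match is below. The function name unordered_equal is made up for illustration and is not part of the code above:

```python
def unordered_equal(x, y):
    """Recursively compare decoded JSON values, ignoring list order."""
    if isinstance(x, dict) and isinstance(y, dict):
        return x.keys() == y.keys() and all(unordered_equal(x[k], y[k]) for k in x)
    if isinstance(x, list) and isinstance(y, list):
        if len(x) != len(y):
            return False
        remaining = list(y)  # copy, so the caller's list is not modified
        for item in x:
            for i, candidate in enumerate(remaining):
                if unordered_equal(item, candidate):
                    del remaining[i]  # each item of y may be matched only once
                    break
            else:
                return False  # no unmatched partner left for this item
        return True
    return x == y


print(unordered_equal([1, 1, 2], [1, 2, 2]))  # False
print(unordered_equal({"A": [{"Y": 2}, {"X": [1, 2]}]}, {"A": [{"X": [2, 1]}, {"Y": 2}]}))  # True
```

The matching is still quadratic in the list length, so this is a sketch for small documents rather than a tuned implementation.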

NiksVij
2

For others who'd like to debug two JSON objects (usually there is a reference and a target), here is a solution you may use. It lists the "path" of each value in the reference that is missing from, or different in, the target.

The level option selects how deep to look.

The show_variables option can be turned on to show the relevant values.

def compareJson(example_json, target_json, level=-1, show_variables=False):
  _different_variables = _parseJSON(example_json, target_json, level=level, show_variables=show_variables)
  return len(_different_variables) == 0, _different_variables

def _parseJSON(reference, target, path=[], level=-1, show_variables=False):  
  if level > 0 and len(path) == level:
    return []
  
  _different_variables = list()
  # the case that the inputs is a dict (i.e. json dict)  
  if isinstance(reference, dict):
    for _key in reference:      
      _path = path+[_key]
      try:
        _different_variables += _parseJSON(reference[_key], target[_key], _path, level, show_variables)
      except KeyError:
        _record = ''.join(['[%s]'%str(p) for p in _path])
        if show_variables:
          _record += ': %s <--> MISSING!!'%str(reference[_key])
        _different_variables.append(_record)
  # the case that the inputs is a list/tuple
  elif isinstance(reference, list) or isinstance(reference, tuple):
    for index, v in enumerate(reference):
      _path = path+[index]
      try:
        _target_v = target[index]
        _different_variables += _parseJSON(v, _target_v, _path, level, show_variables)
      except IndexError:
        _record = ''.join(['[%s]'%str(p) for p in _path])
        if show_variables:
          _record += ': %s <--> MISSING!!'%str(v)
        _different_variables.append(_record)
  # the actual comparison about the value, if they are not the same, record it
  elif reference != target:
    _record = ''.join(['[%s]'%str(p) for p in path])
    if show_variables:
      _record += ': %s <--> %s'%(str(reference), str(target))
    _different_variables.append(_record)

  return _different_variables
0
import json

#API response sample
# some JSON:

x = '{ "name":"John", "age":30, "city":"New York"}'

# parse x json to Python dictionary:
y = json.loads(x)

#access Python dictionary
print(y["age"])


# expected json as dictionary
thisdict = { "name":"John", "age":30, "city":"New York"}
print(thisdict)


# access Python dictionary
print(thisdict["age"])

# compare the two Python dictionaries

if thisdict == y:
    print("dict1 is equal to dict2")
else:
    print("dict1 is not equal to dict2")
Henry Ecker
  • 34,399
  • 18
  • 41
  • 57
  • https://drive.google.com/file/d/1_nUU_w0mA1Rl9izves-6flSkN7i7bIxI/view?usp=share_link – Kamaraj Kannan Nov 21 '22 at 11:17
  • Hello, please see https://meta.stackoverflow.com/editing-help Thanks! – Eric Aya Nov 21 '22 at 15:55
  • As it’s currently written, your answer is unclear. Please [edit] to add additional details that will help others understand how this addresses the question asked. You can find more information on how to write good answers [in the help center](/help/how-to-answer). – Community Nov 23 '22 at 13:24
-2

With KnoDL, it can match data without mapping fields.

Aahz78
  • 1
  • 1
    This should be a comment, not an answer. If you want to convert to an answer, please add functional code or a deeper explanation. – BLimitless May 17 '21 at 15:16