I am trying to compare two pieces of JavaScript/JSON code using the difflib module in Python 3.8:

{"message": "Hello world", "name": "Jack"}

and

{"message": "Hello world", "name": "Ryan"}

Problem: When these 2 strings are prettified and compared using difflib, we get the inline differences, as well as all the common lines.

Is there a way to show only the lines that differ, in a form that's clearer to read? This would help significantly when both files are much larger, where the changed lines are harder to spot.

Thanks!


Actual Output

{
    "message": "Hello world",
    "name": "{J -> Ry}a{ck -> n}"
}

Desired Output

    "name": "{J -> Ry}a{ck -> n}"

Even better will be something like:

    {"name": "Jack"} -> {"name": "Ryan"}

Python Code Used

We use jsbeautifier here instead of json because the files we are comparing may sometimes be malformed JSON. json will throw an error while jsbeautifier still formats it the way we expect it to.

import jsbeautifier

def inline_diff(a, b):
    """
    https://stackoverflow.com/questions/774316/python-difflib-highlighting-differences-inline/47617607#47617607
    """
    import difflib
    matcher = difflib.SequenceMatcher(None, a, b)
    def process_tag(tag, i1, i2, j1, j2):
        if tag == 'replace':
            return '{' + matcher.a[i1:i2] + ' -> ' + matcher.b[j1:j2] + '}'
        if tag == 'delete':
            return '{- ' + matcher.a[i1:i2] + '}'
        if tag == 'equal':
            return matcher.a[i1:i2]
        if tag == 'insert':
            return '{+ ' + matcher.b[j1:j2] + '}'
        assert False, "Unknown tag %r" % tag
    return ''.join(process_tag(*t) for t in matcher.get_opcodes())


# File content to compare
file1 = '{"message": "Hello world", "name": "Jack"}'
file2 = '{"message": "Hello world", "name": "Ryan"}'

# Prettify JSON
f1 = jsbeautifier.beautify(file1)
f2 = jsbeautifier.beautify(file2)

# Print the differences to stdout
print(inline_diff(f1, f2))
Nyxynyx

1 Answer


For your desired output you don't even need difflib. For example:

import json

def find_diff(a, b):
    # Parse both JSON strings into dictionaries
    a = json.loads(a)
    b = json.loads(b)
    result = []
    for key in a:
        if key not in b:
            # Key exists only in the first document
            result.append(f'{dict({key: a[key]})} -> {"key deleted"}')
        elif a[key] != b[key]:
            # Key exists in both documents but its value changed
            result.append(f'{dict({key: a[key]})} -> {dict({key: b[key]})}')
    return '\n'.join(result)

# File content to compare
file1 = '{"new_message": "Hello world", "name": "Jack"}'
file2 = '{"message": "Hello world", "name": "Ryan"}'

print(find_diff(file1, file2))

#{'new_message': 'Hello world'} -> key deleted
#{'name': 'Jack'} -> {'name': 'Ryan'}
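
Note that the function above only walks the keys of the first document, so a key that exists only in the second one ("message" in this example) is never reported. Here is a minimal sketch that also covers added keys. The name find_diff_both and the "key added" label are my own choices for illustration, and like the function above it only compares top-level keys:

```python
import json

def find_diff_both(a, b):
    """Report changed, deleted, and added top-level keys of two JSON objects."""
    a = json.loads(a)
    b = json.loads(b)
    result = []
    for key in a:
        old = {key: a[key]}
        if key not in b:
            # Key exists only in the first document
            result.append(f'{old} -> key deleted')
        elif a[key] != b[key]:
            # Key exists in both documents but its value changed
            new = {key: b[key]}
            result.append(f'{old} -> {new}')
    for key in b:
        if key not in a:
            # Key exists only in the second document
            new = {key: b[key]}
            result.append(f'key added -> {new}')
    return '\n'.join(result)

file1 = '{"new_message": "Hello world", "name": "Jack"}'
file2 = '{"message": "Hello world", "name": "Ryan"}'

print(find_diff_both(file1, file2))

#{'new_message': 'Hello world'} -> key deleted
#{'name': 'Jack'} -> {'name': 'Ryan'}
#key added -> {'message': 'Hello world'}
```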

There are plenty of ways to handle this, so adapt it to your needs.
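
One such alternative, in case the input is not always valid JSON (json.loads raises an error on malformed input): stay with difflib but compare line by line with zero context lines, so only the changed lines come back. This is a rough sketch, not a drop-in replacement. The helper name changed_lines is hypothetical, and the two sample strings are hand-formatted stand-ins for the prettified text you would get from jsbeautifier.beautify:

```python
import difflib
from itertools import islice

def changed_lines(a, b):
    """Return only the removed ('-') and added ('+') lines between two texts."""
    diff = difflib.unified_diff(
        a.splitlines(), b.splitlines(),
        lineterm='',
        n=0,  # zero context lines: common lines are left out
    )
    # The first two lines are the '---' / '+++' file headers; skip them,
    # then drop the '@@ ... @@' hunk markers.
    body = islice(diff, 2, None)
    return [line for line in body if not line.startswith('@@')]

# Stand-ins for the prettified strings (assumed formatting)
f1 = '{\n    "message": "Hello world",\n    "name": "Jack"\n}'
f2 = '{\n    "message": "Hello world",\n    "name": "Ryan"\n}'

print('\n'.join(changed_lines(f1, f2)))

#-    "name": "Jack"
#+    "name": "Ryan"
```

Because this works on plain lines of text, it does not care whether the content parses as JSON, but it also means a purely cosmetic change (for example reordered keys) shows up as a difference.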

funnydman