I have two versions of a large book in txt format and I'd like to compare them to find significant changes between the versions, ignoring small single character differences.
There are lots of diffing tools that can ignore whitespace differences, but I also want to ignore small typos and differences of one or two characters. For example, one version of the book has the misspelling leige repeated hundreds of times, and this is corrected in the next version to liege. Some proper nouns have also changed their spelling. (I could make custom workarounds for each misspelling, but would like something more general purpose.)
Since I only care about more significant multi-word differences, what I really want is to set a filter that ignores changes to a line unless the Levenshtein edit distance is above some threshold.
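To make the idea concrete, here is a rough sketch in Python of the kind of filter I have in mind. It is only an illustration, not a finished tool: the function names `levenshtein` and `significant_changes` and the default `threshold` of 3 are made up for this example, and pairing changed lines by position inside each changed block is a naive assumption that will misalign when lines are inserted or deleted in the middle of a block.

```python
import difflib


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance between two strings."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution
            ))
        previous = current
    return previous[-1]


def significant_changes(old_lines, new_lines, threshold=3):
    """Yield (old, new) line pairs whose edit distance exceeds threshold.

    Lines without a counterpart (pure insertions or deletions) are
    paired with an empty string. Pairing is positional within each
    changed block, which is a simplifying assumption.
    """
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        old_block = old_lines[i1:i2]
        new_block = new_lines[j1:j2]
        for k in range(max(len(old_block), len(new_block))):
            old = old_block[k] if k < len(old_block) else ""
            new = new_block[k] if k < len(new_block) else ""
            if levenshtein(old, new) > threshold:
                yield old, new


if __name__ == "__main__":
    old = ["My leige, the castle stands.", "The army marched north."]
    new = ["My liege, the castle stands.", "The army fled south at dawn."]
    for before, after in significant_changes(old, new):
        print("-", before)
        print("+", after)
```

With this, the leige/liege line is skipped because its edit distance is 2, while the reworded second line is reported. The pure-Python distance function will be slow on a whole book, so a real tool would presumably need something faster.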
Looking around, all the diff/comparison tools I find seem to have code in mind, so they lack any feature for ignoring small text changes. Google's diff_match_patch library is great for diffing plaintext and ignoring whitespace changes (demo here) but doesn't seem to have an out-of-the-box way to ignore single-character non-whitespace differences.
tl;dr: Are there any diff tools that can compare text documents but filter out minor single-character non-whitespace differences?