I need to highlight the differences between two simple strings in Python, enclosing the differing substrings in an HTML span element. I'm looking for a simple way to implement the function illustrated by the following example:

highlight_diff('Hello world', 'HeXXo world', 'red')

...it should return the string:

'He<span style="color:red">XX</span>o world'

I have googled and seen difflib mentioned, but it's supposed to be obsolete and I haven't found any good simple demo.

Jakub M.
user1069609
  • if a difference is found, should it always show the substring of the second string (in your example: `'XX'`)? You're just looking for positional differences right? this means, `s1[0]` with `s2[0]`, `s1[1]` with `s2[1]` and so on .. – juliomalegria Feb 22 '12 at 14:04
  • This is similar to the question answered [Here](http://stackoverflow.com/questions/1576459/generate-pretty-diff-html-in-python) – Bharat B Feb 22 '12 at 14:05
  • @julio.alegria Well, I am interested in highlighting the differing part of the first string as well, 'll' in my example. Indeed I'm looking for positional diffs. – user1069609 Feb 22 '12 at 14:12

1 Answer

Everything that you need comes out of difflib -- for example:

>>> import difflib
>>> d = difflib.Differ()
>>> l = list(d.compare("hello", "heXXo"))
>>> l
['  h', '  e', '- l', '- l', '+ X', '+ X', '  o']

Each element in that list is a character from your two input strings, prefixed with one of

  • "  " (2 spaces), character present in both strings
  • "- " (dash space), character present only in the first string
  • "+ " (plus space), character present only in the second string.

Iterate through that list and you can build exactly the output you're looking to create.
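As a rough sketch of that iteration (not necessarily how the answer's author would write it), the function below groups consecutive entries by their two-character prefix and wraps the runs found only in the second string in a span. The name highlight_diff and its signature come from the question; the grouping with itertools.groupby and the choice to drop the "- " entries are assumptions made for illustration. The text is not HTML-escaped.

```python
import difflib
from itertools import groupby


def highlight_diff(first, second, color):
    """Return `second` with the characters missing from `first` wrapped in a span."""
    pieces = []
    entries = difflib.Differ().compare(first, second)
    # Each entry is a two-character code ('  ', '- ' or '+ ') followed by one character.
    # groupby merges consecutive entries that share the same code into one run.
    for code, run in groupby(entries, key=lambda entry: entry[:2]):
        text = "".join(entry[2:] for entry in run)
        if code == "  ":
            # Common to both strings: copy through unchanged.
            pieces.append(text)
        elif code == "+ ":
            # Only in the second string: highlight it.
            pieces.append('<span style="color:%s">%s</span>' % (color, text))
        # Entries with any other code (such as '- ', only in the first string)
        # are dropped in this sketch.
    return "".join(pieces)


print(highlight_diff('Hello world', 'HeXXo world', 'red'))
```

To highlight the differing part of the first string instead, the same loop could wrap the "- " runs and drop the "+ " ones.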

There's no mention of difflib being in any way obsolete or deprecated in the docs.

bgporter
  • Thanks, this is exactly the kind of thing I needed! I had the idea that difflib should be obsolete from the book "Python Essential Reference 4th ed." by D. M. Beazley 2009, page 586: "String processing. The following modules are some older, now obsolete, modules used for string processing ... difflib, fpformat, stringprep, textwrap" – user1069609 Feb 22 '12 at 14:41
  • Far from being outdated, difflib is included in the Python standard library, even in Python 3.11 as of this writing. It's very mature (reliable and fast). – hobs May 19 '23 at 23:51