
I have a requirement where I store HTML text as strings in Python and want to compare them.

str1 = '<br> Example1'
str2 = '<br/>     Example1'

If I do a plain `str1 == str2`, the result is `False`, but as HTML the two are equal. At the same time,

str1 = '<br> Example1'
str2 = '<p> Example1'

are not equal as HTML. The same goes for `str2 = '<b> Example </b>'`, where `str1 != str2`.

Is there any way to do this in Python? I know the test case has `self.assertInEmail`, which does an HTML comparison, but I don't want to use test functions in my production code.

Mohan
  • Split by whitespace and join again on the empty string? `''.join(str1.split()) == ''.join(str2.split())`. Though this will quickly go wrong. If you want to fully compare HTML, you're probably better off using a library like BeautifulSoup. – Jan 08 '16 at 07:53
  • I'd like to mention that those two strings are not equal even in HTML. The spaces are kept by the browser and are accessible from JavaScript (example: http://pastebin.com/iP9rpEv0 ), so the presence of an additional space character can actually affect what's going on. So, as @Evert said, you probably want to use BeautifulSoup (or any other HTML parser) and try to analyze whether the two DOMs are close enough to be treated as equal. – Vladimir Jan 08 '16 at 08:06
  • I think the question has been solved in [how to using python to diff two html files](http://stackoverflow.com/questions/9562269/how-to-using-python-to-diff-two-html-files) – Jacky1205 Jan 08 '16 at 08:39
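
The parser-based approach suggested in the comments could be sketched roughly as follows. This is only a minimal illustration, not a full DOM comparison: it uses the standard library's `html.parser` in place of BeautifulSoup, and the names `html_tokens`, `html_equal`, `_TokenCollector` and `VOID_TAGS` are hypothetical helpers made up for this example. It also assumes that runs of whitespace in text are insignificant, which, as noted above, is not strictly true in a browser.

```python
from html.parser import HTMLParser

# Elements that never take a closing tag, so <br> and <br/> mean the same thing.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class _TokenCollector(HTMLParser):
    """Collect a normalized stream of tag and text tokens (hypothetical helper)."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.tokens = []

    def handle_starttag(self, tag, attrs):
        # Sort attributes so their order does not matter; a missing value becomes "".
        normalized = tuple(sorted((name, value or "") for name, value in attrs))
        self.tokens.append(("start", tag, normalized))

    def handle_startendtag(self, tag, attrs):
        # Treat <br/> like <br>: void elements only ever produce a start token.
        self.handle_starttag(tag, attrs)
        if tag not in VOID_TAGS:
            self.handle_endtag(tag)

    def handle_endtag(self, tag):
        # Ignore stray closing tags for void elements such as </br>.
        if tag not in VOID_TAGS:
            self.tokens.append(("end", tag))

    def handle_data(self, data):
        # Assumption: collapse whitespace runs and drop whitespace-only text.
        text = " ".join(data.split())
        if text:
            self.tokens.append(("text", text))


def html_tokens(markup):
    """Return the normalized token list for an HTML fragment."""
    parser = _TokenCollector()
    parser.feed(markup)
    parser.close()  # flush any text still buffered by the parser
    return parser.tokens


def html_equal(a, b):
    """True if both fragments yield the same normalized token stream."""
    return html_tokens(a) == html_tokens(b)


print(html_equal('<br> Example1', '<br/>     Example1'))
print(html_equal('<br> Example1', '<p> Example1'))
print(html_equal('<br> Example1', '<b> Example </b>'))
```

Comparing token streams rather than raw strings keeps the tag structure significant while ignoring the `<br>` versus `<br/>` spelling and extra spaces. It does not repair malformed markup or account for implied closing tags, so for anything beyond simple fragments a real DOM built by a library such as BeautifulSoup is likely the safer choice.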
