When I compare two Unicode strings in a Python unit test, it gives a nice failure message highlighting which lines and characters are different. However, comparing two 8-bit strings just shows the two strings with no highlighting.

How can I get the highlighting for both Unicode and 8-bit strings?

Here is an example unit test that shows both comparisons:

import unittest

class TestAssertEqual(unittest.TestCase):
    def testString(self):
        a = 'xax\nzzz'
        b = 'xbx\nzzz'
        self.assertEqual(a, b)

    def testUnicode(self):
        a = u'xax\nzzz'
        b = u'xbx\nzzz'
        self.assertEqual(a, b)

if __name__ == '__main__':
    unittest.main()

The results of this test show the difference:

FF
======================================================================
FAIL: testString (__main__.TestAssertEqual)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/mnt/data/don/workspace/scratch/scratch.py", line 7, in testString
    self.assertEqual(a, b)
AssertionError: 'xax\nzzz' != 'xbx\nzzz'

======================================================================
FAIL: testUnicode (__main__.TestAssertEqual)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/mnt/data/don/workspace/scratch/scratch.py", line 12, in testUnicode
    self.assertEqual(a, b)
AssertionError: u'xax\nzzz' != u'xbx\nzzz'
- xax
?  ^
+ xbx
?  ^
  zzz

----------------------------------------------------------------------
Ran 2 tests in 0.001s

FAILED (failures=2)

Update for Python 3

In Python 3, string literals are Unicode by default, so this is mostly irrelevant. assertMultiLineEqual() no longer supports byte strings, so you're pretty much stuck with regular assertEqual() unless you're willing to decode the byte strings to Unicode.
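
A minimal sketch of the decoding route mentioned above, assuming Python 3 and byte strings that hold UTF-8 text. The helper name `decoded_diff_message` is hypothetical and only illustrates the idea: once both values are `str`, the regular assertEqual() produces the highlighted diff again.

```python
import unittest


def decoded_diff_message(first, second, encoding="utf-8"):
    """Return the assertEqual failure message for two byte strings.

    Hypothetical helper for illustration. Returns None when the decoded
    strings are equal. Assumes both arguments decode with `encoding`.
    """
    case = unittest.TestCase()
    try:
        # After decoding, both values are str, so assertEqual produces
        # the line-by-line diff.
        case.assertEqual(first.decode(encoding), second.decode(encoding))
    except AssertionError as error:
        return str(error)
    return None


if __name__ == "__main__":
    print(decoded_diff_message(b"xax\nzzz", b"xbx\nzzz"))
```

In a real test you would simply call `self.assertEqual(a.decode('utf-8'), b.decode('utf-8'))`; the helper only exists to make the message easy to inspect.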

Don Kirkby

1 Answer

A little digging in the Python source code shows that TestCase registers a bunch of methods to test equality for different types.

self.addTypeEqualityFunc(dict, 'assertDictEqual')
self.addTypeEqualityFunc(list, 'assertListEqual')
self.addTypeEqualityFunc(tuple, 'assertTupleEqual')
self.addTypeEqualityFunc(set, 'assertSetEqual')
self.addTypeEqualityFunc(frozenset, 'assertSetEqual')
try:
    self.addTypeEqualityFunc(unicode, 'assertMultiLineEqual')
except NameError:
    # No unicode support in this build
    pass

You can see that unicode is registered to use assertMultiLineEqual(), but str is not registered for anything special. I have no idea why str is left out, but so far I have been happy with either of the following two methods.

Call Directly

If an 8-bit string isn't registered to use assertMultiLineEqual() by default, you can still call it directly.

def testString(self):
    a = 'xax\nzzz'
    b = 'xbx\nzzz'
    self.assertMultiLineEqual(a, b)

Register String Type

You can also register it yourself. Just add an extra line to your test case's setUp() method. Do it once, and all your test methods will use the right method to test equality. If your project has a common base class for all test cases, that would be a great place to put it.

class TestAssertEqual(unittest.TestCase):
    def setUp(self):
        super(TestAssertEqual, self).setUp()
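        # Python 2: route str comparisons through assertMultiLineEqual
        # so that failures include a line-by-line diff.
        self.addTypeEqualityFunc(str, self.assertMultiLineEqual)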

    def testString(self):
        a = 'xax\nzzz'
        b = 'xbx\nzzz'
        self.assertEqual(a, b)

    def testUnicode(self):
        a = u'xax\nzzz'
        b = u'xbx\nzzz'
        self.assertEqual(a, b)

Either of these methods will include highlighting when the string comparison fails.
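
For Python 3, where assertMultiLineEqual() rejects byte strings, the same registration idea can be combined with decoding. This is a sketch under assumptions, not a built-in feature: `BytesDiffTestCase` and `assertBytesMultiLineEqual` are hypothetical names, and Latin-1 is chosen only because it can decode any byte sequence without raising an error.

```python
import unittest


class BytesDiffTestCase(unittest.TestCase):
    """Hypothetical base class that gives bytes comparisons a diff."""

    def setUp(self):
        super().setUp()
        # Route assertEqual() on two bytes objects to the helper below.
        self.addTypeEqualityFunc(bytes, self.assertBytesMultiLineEqual)

    def assertBytesMultiLineEqual(self, first, second, msg=None):
        # Assumption: Latin-1 maps every byte to one code point, so
        # decoding cannot fail. Non-ASCII bytes will appear in the diff
        # as Latin-1 characters rather than in their real encoding.
        self.assertMultiLineEqual(
            first.decode("latin-1"), second.decode("latin-1"), msg
        )
```

Test classes that inherit from this base class can keep calling `self.assertEqual(a, b)` on byte strings. If the data is known to be UTF-8 text, decoding with that encoding would give a more readable diff.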

Don Kirkby
  • Registering the string type doesn't work for me, but calling the method directly does. – 2rs2ts Jul 22 '16 at 22:17
  • 1
    TLDR : Use assertMultiLineEqual for all. – Pierre.Sassoulas Nov 16 '16 at 14:41
  • Wow. Ridiculously complicated. Is there a pytest variant for this? – Zephaniah Grunschlag Apr 29 '21 at 00:30
  • 1
    Unless you're stuck using Python 2, @ZephaniahGrunschlag, this shouldn't be an issue anymore. String literals are Unicode by default in Python 3. – Don Kirkby Apr 29 '21 at 19:12
  • Thanks @DonKirkby. My problem was that I was comparing a multiline `f-string` to a standard multiline string. It was adding a whole bunch of `\n` characters to one and not the other. Then I converted the standard string to an `f-string` and the comparison worked (but I had to put in a lint ignore for using an `f-string` without any interpolated variable). – Zephaniah Grunschlag Apr 30 '21 at 00:35
  • 1
    Sounds like you should ask a separate question, @ZephaniahGrunschlag. F-strings shouldn't change white space behaviour, as far as I know. – Don Kirkby Apr 30 '21 at 18:17