I have a function which works with unicode internally, and I would like to test it using py.test. Currently, I have the following code:
    def test_num2word():
        assert num2word(2320) == u"dva tisíce tři sta dvacet"
However, the assertion fails with:
E assert u'dva tis\xed...i sta dvacet ' == u'dva tis\xc3\...9i sta dvacet'
E - dva tis\xedce t\u0159i sta dvacet
E ? ^ ^ -
E + dva tis\xc3\xadce t\xc5\x99i sta dvacet
E ?
As I understand it, my function correctly returns unicode, which is then compared to a UTF-8 encoded string, and that (obviously) fails. Yet I thought using u"..." in my source would also convert the string to the same encoding used internally by Python.
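To make the mismatch in the output concrete, here is a minimal sketch that reproduces the right-hand value from the failure message. It rests on an assumption the traceback does not confirm: that the UTF-8 bytes of the literal were read one byte per character (as Latin-1) instead of being decoded as UTF-8. The variable names are purely illustrative.

```python
# -*- coding: utf-8 -*-
# Sketch only: assumes the literal's UTF-8 bytes were misread as Latin-1.

# The intended text, written with escapes so the source file stays ASCII.
expected = u"dva tis\xedce t\u0159i sta dvacet"

# The same text as UTF-8 bytes, which is how it is stored in a UTF-8 file.
utf8_bytes = expected.encode("utf-8")

# Reading those bytes one byte per character gives a different unicode value.
misread = utf8_bytes.decode("latin-1")

# This matches the right-hand side shown in the assertion failure.
assert misread == u"dva tis\xc3\xadce t\xc5\x99i sta dvacet"

# It is not equal to the intended text, so the comparison fails.
assert misread != expected

# Decoding with the correct codec recovers the intended text.
assert utf8_bytes.decode("utf-8") == expected
```

If that assumption holds, both sides of the comparison are unicode objects, and the right-hand one simply holds the wrong code points.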
My question is: is there a sane way of comparing these, or do I need to pepper each test statement with a decode('utf-8') (on the right-hand side) or an encode('utf-8') (on the left-hand side)? Even if I write a wrapper function, this doesn't strike me as ideal; there must be a way to compare this sanely! No, using Python 3 is not an option.
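For reference, the kind of wrapper meant above might look roughly like the following sketch. `as_text` is a hypothetical name, not something from my code base, and it assumes that any byte string passed in is UTF-8 encoded. It only helps when one side of the comparison really is a byte string.

```python
def as_text(value, encoding="utf-8"):
    """Return value as unicode text, decoding byte strings (hypothetical helper)."""
    # On Python 2, `bytes` is an alias for `str`, so this catches byte strings.
    if isinstance(value, bytes):
        return value.decode(encoding)
    # Already unicode text: return it unchanged.
    return value


# Usage: normalize both sides before comparing.
assert as_text(b"dva tis\xc3\xadce") == as_text(u"dva tis\xedce")
```

Every assertion would then need both sides wrapped, which is exactly the clutter I would like to avoid.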