How to compare two files as part of unittest, while getting useful output in case of mismatch?

Question

As part of some Python tests using the unittest framework, I need to compare two relatively short text files, where the one is a test output file and the other is a reference file.

The immediate approach is:

import filecmp
...
self.assertTrue(filecmp.cmp(tst_path, ref_path, shallow=False))

It works fine if the test passes, but in the event of failure, the output is not much help:

AssertionError: False is not true

Is there a better way of comparing two files as part of the unittest framework, so some useful output is generated in case of mismatch?

that will depend a LOT on what the files are expected to contain, I guess... — Jblasco, Feb 28 '17 at 15:01
@Jblasco: Good point; the files are text files, so I will update the question with that info. — EquipDev, Feb 28 '17 at 15:39

score 25 · Answer 1 · edited Aug 11 '23 at 19:28

To get a report of which line differs, along with a printout of that line, use assertListEqual on the contents, e.g.:

self.assertListEqual(
    list(open(tst_path)),
    list(open(ref_path)))
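
A minimal sketch of the same comparison wrapped in a reusable helper that closes both files before asserting. The class name FileComparisonTestCase and the method name assertFileLinesEqual are hypothetical, chosen only for illustration:

```python
import unittest


class FileComparisonTestCase(unittest.TestCase):
    """Hypothetical base class; the names here are illustrative only."""

    def assertFileLinesEqual(self, tst_path, ref_path):
        # Read both files inside a with block so they are closed
        # before the comparison runs, even if the assertion fails.
        with open(tst_path) as tst_f, open(ref_path) as ref_f:
            tst_lines = list(tst_f)
            ref_lines = list(ref_f)
        # assertListEqual reports the first differing element and a diff.
        self.assertListEqual(tst_lines, ref_lines)
```

A test class could inherit from this base and call self.assertFileLinesEqual(tst_path, ref_path) in its test methods.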

edited Aug 11 '23 at 19:28 by Michael Mior · answered Jun 10 '19 at 20:09 by Ethan Bradford

Under my understanding, this will leave the files open until the garbage-collector notices, which leaves the files locked for too long under Windows. Consider using context managers to limit the time the files are open. – Oddthinking Sep 04 '21 at 06:13
@Oddthinking Probably something like: with open(...) as tst, open(...) as ref: ... Open the files in a with statement; list() works on them as well, so there is no need for io.open and such. The files should be closed once execution leaves the with scope. – MolbOrg Oct 09 '21 at 18:01
Yes, it's not a lot more complicated to include the auto-closing, e.g. with io.open(tst_path) as tst_f, io.open(ref_path) as ref_f: self.assertListEqual(list(tst_f), list(ref_f)) – Ethan Bradford Oct 12 '21 at 16:42

score 10 · Answer 2 · answered Mar 01 '17 at 03:55

All you need to do is add your own message for the error condition, using the optional msg argument that the unittest assert methods accept.

self.assertTrue(filecmp.cmp(...), 'You error message')

answered Mar 01 '17 at 03:55 by Dan

A reminder for those who care: if the two files are different, it prints 'You error message' ONLY. – Tengerye Jun 18 '21 at 08:09

score 2 · Answer 3 · edited Dec 16 '19 at 12:47

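Comparing the files as lists of lines yields meaningful assertion errors, at least under a runner such as pytest that rewrites assert statements to show the differing elements: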

assert [row for row in open(actual_path)] == [row for row in open(expected_path)]

You could use that each time you need to compare files, or put it in a function. You could also read the files as text strings instead of lists.
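
Putting it in a function might look like the sketch below. The function names read_lines and assert_same_lines are hypothetical, and the files are read inside a with block so they are closed promptly:

```python
def read_lines(path):
    """Return the file's lines, closing the file before returning."""
    with open(path) as f:
        return f.readlines()


def assert_same_lines(actual_path, expected_path):
    # Hypothetical helper name. A bare assert carries no message of its
    # own; the detailed report comes from the test runner (e.g. pytest).
    assert read_lines(actual_path) == read_lines(expected_path)
```

Under pytest, the helper may need to live in a test module or conftest.py (or be registered for assertion rewriting) for the detailed report to appear.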

edited Dec 16 '19 at 12:47 · answered Feb 14 '19 at 08:30 by Adrien H

In the event of multiple rows with mismatches, this will only report the first one. Not ideal. – Clint Eastwood Aug 23 '21 at 21:28
@ClintEastwood You can always join them I guess. Depending on your use case, it might be enough to fail with only one reported line. – Adrien H Aug 25 '21 at 07:52

score 0 · Answer 4 · answered Feb 28 '17 at 15:04

Isn't it better to compare the contents of the two files directly? For example, if they are text files, compare the text of the two files; this will produce a more meaningful error message.
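
One way to do that within unittest, sketched here under the assumption that the files are small enough to read whole, is assertMultiLineEqual, which includes a line-by-line diff in the failure message. The class and method names are hypothetical:

```python
import unittest
from pathlib import Path


class TextFileTestCase(unittest.TestCase):
    """Hypothetical base class; the names here are illustrative only."""

    # Do not truncate long diffs in the failure message.
    maxDiff = None

    def assertTextFilesEqual(self, tst_path, ref_path):
        # read_text opens, reads and closes each file.
        tst_text = Path(tst_path).read_text()
        ref_text = Path(ref_path).read_text()
        # On mismatch, the message contains a diff of the two strings.
        self.assertMultiLineEqual(tst_text, ref_text)
```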

answered Feb 28 '17 at 15:04 by Bart

The intention is to compare the contents, so I added ', shallow=False' to 'filecmp.cmp' to make that clear. – EquipDev Feb 28 '17 at 15:31

score 0 · Answer 5 · answered Aug 05 '23 at 17:18

You can use the built-in difflib module for this.

Use the unified_diff format, which is plain text and will be empty if the contents of the files match. The file contents need to be read into lists first, and the return of unified_diff is a generator, so we wrap it in a list so we can inspect it. Here's a template you can use:

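from difflib import unified_diff

# Read both files into lists of lines; the with blocks close them promptly.
with open("my/expected/file.txt", "r") as f:
    expected_lines = f.readlines()
with open("my/actual/file.txt", "r") as f:
    actual_lines = f.readlines()

# unified_diff returns a generator, so materialize it; it is empty on a match.
diff = list(unified_diff(expected_lines, actual_lines))
assert diff == [], "Unexpected file contents:\n" + "".join(diff)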

My only complaint here is that I wish I had colors. If you wanted them really badly, you could implement your own diff formatting based on the output of get_grouped_opcodes from the same module.
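
For the colored variant, a rough sketch built on SequenceMatcher.get_grouped_opcodes could look like this. The function name colored_diff, the ANSI color choices, and the output layout are my own assumptions; it is not a drop-in replacement for unified_diff and emits no hunk headers:

```python
from difflib import SequenceMatcher

# ANSI escape codes; these assume a terminal that understands them.
RED, GREEN, RESET = "\033[31m", "\033[32m", "\033[0m"


def _paint(color, sign, line):
    # Strip the newline so the reset code lands before the line break.
    text = line.rstrip("\n")
    return f"{color}{sign}{text}{RESET}\n"


def colored_diff(expected_lines, actual_lines, context=3):
    """Yield diff lines: removals in red, additions in green."""
    matcher = SequenceMatcher(None, expected_lines, actual_lines)
    # Each group is a list of opcodes with up to `context` equal lines
    # around the changes; identical inputs produce no groups at all.
    for group in matcher.get_grouped_opcodes(context):
        for tag, i1, i2, j1, j2 in group:
            if tag == "equal":
                for line in expected_lines[i1:i2]:
                    yield " " + line
                continue
            if tag in ("replace", "delete"):
                for line in expected_lines[i1:i2]:
                    yield _paint(RED, "-", line)
            if tag in ("replace", "insert"):
                for line in actual_lines[j1:j2]:
                    yield _paint(GREEN, "+", line)
```

It can be used the same way as the template above: build list(colored_diff(expected_lines, actual_lines)), assert that it is empty, and join it into the assertion message.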
