Because of how my IMAP filtering and searching is set up, my script works as follows: each night I receive an email with a relevant .doc file attached. My Python script always selects the last (most recent) email and performs certain operations on that .doc file.

All .doc files sent daily are named exactly the same.

The issue is that the sender sometimes neglects to send a new file. As a result, my script runs on the wrong file (the one from the day before). If I could somehow check that two files are actually copies of the same file, I could skip the operation.

How is this most easily/effectively achievable in Python?

zerohedge

1 Answer

To compare two files byte for byte, the simplest and quickest way is to use the filecmp module from the standard library:

>>> import filecmp
>>> filecmp.cmp("first.doc", "second.doc", shallow=False)

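This returns True if both files have exactly the same contents. shallow is set to False so that the contents are always read and compared, rather than the files being treated as equal whenever their os.stat() signatures (file type, size, and modification time) match. The modification dates will differ anyway once you extract both attachments, so they say nothing about whether the contents are the same.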
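As a rough sketch of how this could fit into the nightly script, the helper below wraps filecmp.cmp and also shows a hash-based alternative for cases where you would rather store a digest of the last processed attachment than keep the file itself. The function names same_contents and sha256_of, and the idea of keeping yesterday's file or digest around, are assumptions for illustration and not part of the original script.

```python
import filecmp
import hashlib
from pathlib import Path


def same_contents(path_a, path_b):
    """Return True if both files exist and are byte-for-byte identical."""
    a, b = Path(path_a), Path(path_b)
    if not (a.is_file() and b.is_file()):
        # Nothing to compare against (e.g. first run), so treat as different.
        return False
    # shallow=False forces a comparison of the actual file contents.
    return filecmp.cmp(a, b, shallow=False)


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

With something like this, the script could skip its work when same_contents("today.doc", "yesterday.doc") is True (hypothetical file names), or when sha256_of("today.doc") equals the digest saved after the previous run. Both checks only detect exact byte-for-byte copies.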

Moinuddin Quadri
Jean-François Fabre