I am writing a Python script to monitor changes in a webpage. I have implemented the diff command in Python, and the diff output files are stored in a folder.

I have 260 diff output files, so I cannot realistically open all 260 by hand to find out which ones contain changes.

Is there a Python solution that reads all the diff files and alerts me with the names of the files that have changes?

Sample filename in my diff output folder: ['4streaming', 'net-log-2016-09-26-12:29:32']-diff-output-2016-09-27-13:07:32.html



Required output: 4streaming has changed

Forgive me if I have asked this question the wrong way. I am new to asking questions on Stack Overflow.

nits

1 Answer

To check if two files have the same content you can use the filecmp module:

>>> import filecmp
>>> filecmp.cmp('a_file.txt', 'another_file.txt')
True
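One detail worth knowing: by default filecmp.cmp runs in "shallow" mode, where two files whose os.stat() signatures (type, size, modification time) match are treated as equal without their contents being read. A minimal sketch of a stricter check follows; the helper name same_content is only an illustration, not something from the question.

```python
import filecmp


def same_content(path_a, path_b):
    """Return True only if both files hold identical bytes.

    shallow=False stops filecmp from declaring the files equal based on
    their os.stat() signatures alone and makes it compare the contents.
    """
    return filecmp.cmp(path_a, path_b, shallow=False)
```

The stricter mode is slower on large files, but it does not depend on timestamps.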

So in your case, where you have a lot of files, you could store their names in a list (e.g. File_list) and use itertools.combinations to compare each pair of files exactly once:

import filecmp
import itertools

# File_list is assumed to be a list of file names (paths)
for i, j in itertools.combinations(File_list, 2):
    same = filecmp.cmp(i, j)   # True if files i and j compare equal
    # do something based on the result

* To get a list of all the file names in a directory, take a look at this post.
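For reference, one possible way to build such a list with the standard library is sketched below. The function name list_files and the folder argument are placeholders for illustration; the linked post may suggest a different approach.

```python
from pathlib import Path


def list_files(folder):
    """Return the sorted paths of the regular files directly inside folder.

    Subdirectories are skipped and not searched recursively.
    """
    return sorted(str(p) for p in Path(folder).iterdir() if p.is_file())
```

The result could then be used as File_list in the loop above, for example File_list = list_files("diff-output"), where "diff-output" stands in for your own folder name.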

Another way would be by hashing them and comparing the hashes.
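A rough sketch of the hashing idea, assuming SHA-256 from hashlib is acceptable (any stable hash would do). The names file_digest and group_by_content are made up for this example. Each file is read once, so this avoids comparing every pair.

```python
import hashlib
from collections import defaultdict


def file_digest(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps reading until f.read returns b""
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def group_by_content(paths):
    """Map each digest to the list of paths whose content has that digest."""
    groups = defaultdict(list)
    for path in paths:
        groups[file_digest(path)].append(path)
    return dict(groups)
```

Files that end up in the same group have identical content; a file alone in its group differs from all the others.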

Community
coder