Compare 2 files in python and print non-matching text

Question

I have two files:

Resp.txt:
vrf XXX
 address-family ipv4 unicast
  import route-target
   123:45
   212:43
  !
  export route-policy ABCDE
  export route-target
   9:43
  !
  maximum prefix 12 34
  spanning tree enable
  bandwidth 10
 !
!

and sample.txt

vrf
address-family ipv4 unicast
import route-target
export route-target
maximum prefix

I want to match resp.txt and sample.txt such that if contents of sample are not present in resp, I get those lines of text. The output should be like:

spanning tree enable
bandwidth 10

I am using:

t2=open('sample.txt','r')
abc=open('resp.txt','r')
for x in t2:
  for line in abc:

         if x.strip() in line.strip():
          print 'yes'
         else:
          print line

But it's matching every line in both the text files and hence, not showing the correct result.

What is `resp`? Note you can only iterate over a file handle once, so only the first iteration of your loop over `t2` will actually do anything. — jonrsharpe, May 29 '17 at 08:05
@jonrsharpe resp is a file against which I have to match. Is there any alternate way of achieving my output? — alisha, May 29 '17 at 08:06
strip() is empty so it's useless, do you want to skip whitespaces? then use strip(' ') — Ale, May 29 '17 at 08:07
@Ale, I am able to match but not getting the desired output. — alisha, May 29 '17 at 08:11
@jonrsharpe resp.txt is my file and I am reading it using resp[1]. — alisha, May 29 '17 at 08:12
@alisha it's strange if they match but the output is wrong, try some of these comparisons: https://stackoverflow.com/a/5473150/7094875 — Ale, May 29 '17 at 08:12
@alisha resp[1] is grammatically a list because you are using an index '[1]', so maybe the problem is in there — Ale, May 29 '17 at 08:13
Your indentation is still completely wrong. People should be able to copy and paste this and replicate the issue. Did you *try* using readlines? — jonrsharpe, May 29 '17 at 08:20
@Ale - [`strip()`](https://docs.python.org/3.6/library/stdtypes.html#str.strip) without arguments removes whitespace. — Darkstarone, May 29 '17 at 08:53
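
The point raised in the comments about iterating over a file handle only once can be seen in a small sketch. Here `io.StringIO` stands in for the opened file, which is an assumption made so the example runs without files on disk; a real handle from `open()` behaves the same way when iterated.

```python
import io

# io.StringIO stands in for open('resp.txt', 'r'); it iterates like a file handle.
abc = io.StringIO("vrf XXX\nbandwidth 10\n")

first_pass = [line.strip() for line in abc]
second_pass = [line.strip() for line in abc]  # the handle is already at the end

print(first_pass)   # ['vrf XXX', 'bandwidth 10']
print(second_pass)  # []

# Reading the lines into a list once lets an inner loop reuse them.
abc.seek(0)
resp_lines = [line.strip() for line in abc]
print(resp_lines)   # ['vrf XXX', 'bandwidth 10']
```

This is why the inner loop in the question only does any work for the first line of sample.txt: after that, the inner handle is exhausted.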

Darkstarone · Answer 1 · 2017-05-29T08:59:50.693

The simplest way to get all the lines of Resp.txt that are not in sample.txt is a set difference:

file_1 = set()
file_2 = set()

with open('Resp.txt', 'r') as f:
    for line in f:
        file_1.add(line.strip())

with open('sample.txt', 'r') as f:
    for line in f:
        file_2.add(line.strip())

print(file_1 - file_2)

This prints (the order of a set may vary):

{'export route-policy ABCDE', 'vrf XXX', 'spanning tree enable', '!', '212:43', 'bandwidth 10', 'maximum prefix 12 34', '9:43', '123:45'}

However, this ignores the rules you implicitly apply to Resp.txt, for example:

If a line starts with "maximum prefix", ignore the numbers that follow.

These rules can be applied while reading Resp.txt:

import re

file_1 = set()
file_2 = set()

with open('Resp.txt', 'r') as f:
    for line in f:
        line = line.strip()
        if line == "!":
            continue
        elif re.match(r'\d+:\d+', line):  # Matches route-target values such as 123:45.
            continue
        elif line.startswith("vrf"):
            line = "vrf"
        elif line.startswith("maximum prefix"):
            line = "maximum prefix"
        file_1.add(line)

with open('sample.txt', 'r') as f:
    for line in f:
        file_2.add(line.strip())

print(file_1 - file_2)

This prints:

{'export route-policy ABCDE', 'bandwidth 10', 'spanning tree enable'}

This is correct: 'export route-policy ABCDE' is reported because sample.txt has no route-policy line.

These rules could be made more robust, but they should be illustrative enough.
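
One way to make the rules less hand-written is to treat every line of sample.txt as a prefix, so no per-keyword `elif` is needed. The following is only a sketch under that assumption: the file contents are inlined so it runs as-is, and `unmatched_lines` and `ROUTE_TARGET_VALUE` are hypothetical names, not part of the original code.

```python
import re

# Inline copies of the two files from the question, so the sketch runs as-is.
RESP_TEXT = """\
vrf XXX
 address-family ipv4 unicast
  import route-target
   123:45
   212:43
  !
  export route-policy ABCDE
  export route-target
   9:43
  !
  maximum prefix 12 34
  spanning tree enable
  bandwidth 10
 !
!
"""

SAMPLE_TEXT = """\
vrf
address-family ipv4 unicast
import route-target
export route-target
maximum prefix
"""

# Hypothetical pattern for lines that hold only a route-target value.
ROUTE_TARGET_VALUE = re.compile(r"\d+:\d+$")


def unmatched_lines(resp_text, sample_text):
    """Return the resp lines that start with none of the sample lines."""
    prefixes = tuple(s.strip() for s in sample_text.splitlines() if s.strip())
    result = []
    for raw in resp_text.splitlines():
        line = raw.strip()
        if not line or line == "!" or ROUTE_TARGET_VALUE.match(line):
            continue  # separators and route-target values are not compared
        if not line.startswith(prefixes):  # str.startswith accepts a tuple
            result.append(line)
    return result


print(unmatched_lines(RESP_TEXT, SAMPLE_TEXT))
# ['export route-policy ABCDE', 'spanning tree enable', 'bandwidth 10']
```

Note that plain prefix matching is loose: a sample line `vrf` would also match a line such as `vrfoo`, so a stricter check may be needed for real configurations.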

Keep in mind that a set only finds unique differences, not all of them (say you have multiple 'spanning tree enable' lines and would like to know how many times they appear). In that case, you could do something more in line with your original code:

import re

file_1 = []
file_2 = []

with open('Resp.txt', 'r') as f:
    for line in f:
        line = line.strip()
        if line == "!":
            continue
        elif re.match(r'\d+:\d+', line):
            continue
        elif line.startswith("vrf"):
            line = "vrf"
        elif line.startswith("maximum prefix"):
            line = "maximum prefix"
        file_1.append(line)

with open('sample.txt', 'r') as f:
    for line in f:
        file_2.append(line.strip())

diff = []

for line in file_1:
    if line not in file_2:
        diff.append(line)

print(diff)

Result:

['export route-policy ABCDE', 'spanning tree enable', 'bandwidth 10']

While this method is slower (although you probably won't notice), it can find duplicate lines and maintains the order of the lines found.
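
If the speed ever matters, the list-based version can keep its order and duplicates while using a set only for the membership test. The sketch below assumes the lines have already been read and normalised as in the loops above; `resp_lines`, `sample_lines` and the duplicated line are made-up illustration data, and `collections.Counter` is used to count repeats.

```python
from collections import Counter

# Hypothetical, already-normalised lines, as file_1 and file_2 would hold them.
# One line is duplicated on purpose.
resp_lines = [
    "vrf",
    "address-family ipv4 unicast",
    "export route-policy ABCDE",
    "spanning tree enable",
    "bandwidth 10",
    "spanning tree enable",
]
sample_lines = ["vrf", "address-family ipv4 unicast"]

sample_set = set(sample_lines)  # fast membership tests

# Keeps the original order and any duplicates.
diff = [line for line in resp_lines if line not in sample_set]
print(diff)

# Counts how often each unmatched line occurs.
counts = Counter(diff)
print(counts["spanning tree enable"])  # 2
```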
