1

Hello, I have some code that I have been working on.

I have two files: standard.txt and new.txt.

standard.txt has: ABC123 ABC003 ABC004
new.txt has: ABC123 ABC004

I was able to display the differences between the files, but I would like to display which line each difference is on. If someone could take a look and perhaps give me an example of what I am doing wrong, that would be very helpful. My code is:

def open_file_and_return_list(file_path):
    list = []
    with open(file_path, 'r') as f:
        line = f.readline()
        while line:
            list.append(line)
            line = f.readline()
    return list

def clean_new_line(list):
    for i in range(len(list)):
        if "\n" in list[i]:
            list[i] = list[i].replace("\n", "")
    return list


if __name__ == "__main__":
    list1 = open_file_and_return_list(r"C:\Users\a\a\b\file_compare\new.txt")
    list2 = open_file_and_return_list(r"C:\Users\a\a\b\file_compare\standard.txt")
    list1 = clean_new_line(list1)
    list2 = clean_new_line(list2)
    diff = []
    for obj in list1:
        if obj not in list2:
            diff.append(obj)
    for obj in list2:
        if obj not in list1:
            diff.append(obj)

    print(diff)

    diff_file = input("\nINFO: Select what to name the difference(s) : ")
    with open(diff_file, 'w') as file_out:
        for line in diff:
            file_out.write("** WARNING: Difference found in New Config:\n " + line + "\n")
            print("WARNING: Difference in file: " + line)

For example, the files I am comparing are two config files, so the differences might be on two different lines. I therefore do not want to number each difference as 1, 2, 3; instead I want to say, for example: Difference found on Line 105: *****difference***

Maybe I need to do something like this?

# hosts1 and lines1 stand for the two lists of lines being compared
for i, lines2 in enumerate(hosts1):
    if lines2 != lines1[i]:
        print("line ", i, " in hosts1 is different \n")
        print(lines2)
    else:
        print("same")

and use enumerate?

Alehandro
  • I mean, what you're doing wrong is not including the line number. Create a counter and print that along with the line itself. – Michael Bianconi Feb 05 '20 at 15:52
  • Would I use something along these lines? https://pypi.org/project/line-counter/ I am not the best with Python, but I am learning. Sorry for asking such redundant questions. Do you recommend I use something that comes with Python, or create the line counter myself? What is the best method? Thx. – Alehandro Feb 05 '20 at 15:56
  • There's no need to import anything. As you loop through the lines, increment a counter. When you check `if obj not in list1` and `if obj not in list2`, add the line number + the line to the diff. – Michael Bianconi Feb 05 '20 at 15:58
  • I should mention that line numbers become more complicated if you remove empty lines (as you do with clean_new_line()). – Michael Bianconi Feb 05 '20 at 16:35
  • In Unix/Linux the `diff` command is great at this sort of thing. You could run it programmatically ... – Mike Robinson Feb 05 '20 at 16:40
  • The problem is that I would be numbering each difference #1, #2, and so on, but the difference might be on line 10, and I want to display that the difference was on line 10 instead of showing each difference as line 1,2,3,4,5,6, as the first couple of lines or other lines might be the same. I tried adding `if obj in list2: if obj not in list1: counter =+ 1` and adding counter, but it's giving me errors. – Alehandro Feb 05 '20 at 16:43
  • Only problem is this is going to be executed as a python .exe for Windows computers, or I would have used diff, would have saved my life plenty. Problem is I'm being told to reinvent the wheel. – Alehandro Feb 05 '20 at 16:46
  • You can use [FC](https://stackoverflow.com/questions/6877238/what-is-the-windows-equivalent-of-the-diff-command) on windows. – Michael Bianconi Feb 05 '20 at 16:53
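
The counter idea from the comments above can be sketched as follows. This is only an illustration: `diff_with_line_numbers` is a hypothetical helper name, the two lists assume that each entry in the sample files sits on its own line, and `enumerate(..., start=1)` stands in for the hand-written counter.

```python
def diff_with_line_numbers(lines_a, lines_b):
    """Return (line_number, line) pairs for lines of lines_a that are missing from lines_b."""
    other = set(lines_b)  # set membership avoids rescanning a list for every line
    missing = []
    # enumerate(..., start=1) plays the role of the manual counter
    for number, line in enumerate(lines_a, start=1):
        if line not in other:
            missing.append((number, line))
    return missing


# Assumed contents of the two sample files, one entry per line
standard = ["ABC123", "ABC003", "ABC004"]
new = ["ABC123", "ABC004"]

for number, line in diff_with_line_numbers(standard, new):
    print(f"Difference found on line {number} of standard.txt: {line}")
```

Because the line number comes from the position in the original list, it stays correct even when the earlier lines are identical in both files.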

2 Answers

1

enumerate and zip are your friends here. To obtain the differences I would do something like:

# Make sure both lists of lines are same length (for zip)
maxl = max(len(list1), len(list2))
list1 += [''] * (maxl - len(list1))
list2 += [''] * (maxl - len(list2))

for iline, (l1, l2) in enumerate(zip(list1, list2)):
    if l1 != l2:
        print(iline, l1, l2)

Also, (1) you should never use list as a variable name, because doing so shadows the built-in list class in Python, and (2) there is a one-liner to obtain all the lines of a file:

lines = open('path_to_file').read().splitlines()
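
One caveat: the one-liner never explicitly closes the file. A minimal sketch of the same idea inside a `with` block is shown here, where `read_lines` is a hypothetical helper name and the temporary file exists only so the example can run anywhere:

```python
import os
import tempfile


def read_lines(file_path):
    """Return the lines of a text file without their trailing newlines."""
    # The with block closes the file as soon as the read finishes
    with open(file_path, "r") as f:
        return f.read().splitlines()


# Demo on a throwaway file
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "standard.txt")
    with open(path, "w") as f:
        f.write("ABC123\nABC003\nABC004\n")
    print(read_lines(path))
```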
  • This is perfect! But I am always off by one line: for example, if the difference is on line 4 it keeps saying 3. How can I fix this? Also, how would I write this to a file if difference.txt is expecting a string and is instead getting an int? (Sorry, newbie questions.) – Alehandro Feb 05 '20 at 17:19
  • Indexes start from 0. – Ekrem Dinçel Feb 05 '20 at 17:20
  • That's because you can't read from a file opened in write mode `'w'`. Either you write, or you read – Francisco Leon Feb 05 '20 at 17:29
  • Okay, so how would I add one to the index to get the correct number, and then how would I write the results to the file? Sorry, I am very new and confused when it comes to this, but I appreciate all your advice!! – Alehandro Feb 05 '20 at 17:37
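
For the two follow-up questions in the comments above, one possible sketch: `enumerate` accepts a `start` argument, so counting can begin at 1, and an f-string turns the integer line number into text before it is written. `write_differences` and the message wording are hypothetical, and the sample lists assume one entry per line in each file.

```python
import os
import tempfile


def write_differences(list1, list2, out_path):
    """Write one message per differing line to out_path and return the messages."""
    # Pad copies of the lists so zip() reaches the end of the longer one
    maxl = max(len(list1), len(list2))
    padded1 = list1 + [''] * (maxl - len(list1))
    padded2 = list2 + [''] * (maxl - len(list2))

    messages = []
    with open(out_path, 'w') as file_out:
        # start=1 makes enumerate count 1, 2, 3, ... instead of 0, 1, 2, ...
        for iline, (l1, l2) in enumerate(zip(padded1, padded2), start=1):
            if l1 != l2:
                # The f-string converts the int line number to text for write()
                message = f"Difference found on line {iline}: {l1!r} vs {l2!r}"
                file_out.write(message + "\n")
                messages.append(message)
    return messages


# Demo with the sample data, writing to a throwaway file
with tempfile.TemporaryDirectory() as tmp:
    out_path = os.path.join(tmp, "difference.txt")
    for message in write_differences(["ABC123", "ABC004"],
                                     ["ABC123", "ABC003", "ABC004"], out_path):
        print(message)
```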
0

I was able to accomplish what I wanted by mixing a bunch of your approaches. THANK YOU!

def open_file_and_return_list(file_path):
    # Named `lines` rather than `list` to avoid shadowing the built-in
    with open(file_path, 'r') as f:
        lines = f.readlines()
    return lines


def clean_new_line(lines):
    # Strip the newline characters from every entry
    return [line.replace("\n", "") for line in lines]


if __name__ == "__main__":
    list1 = clean_new_line(open_file_and_return_list(r"new.txt"))
    list2 = clean_new_line(open_file_and_return_list(r"standard.txt"))
    # Pad the shorter list so zip() covers every line of the longer file
    maxl = max(len(list1), len(list2))
    list1 += [''] * (maxl - len(list1))
    list2 += [''] * (maxl - len(list2))
    diff_file = input("\nINFO: Select what to name the difference(s) : ")

    # Open the chosen output file once; 'w' discards any previous contents
    with open(diff_file, 'w') as file_out:
        # start=1 so the reported numbers match the line numbers in an editor
        for iline, (l1, l2) in enumerate(zip(list1, list2), start=1):
            if l1 != l2:
                print(iline, l1, l2)
                print(iline, l1, l2, file=file_out)
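
One limitation of comparing the files position by position: a single inserted or deleted line shifts everything after it, so every later line is reported as different. Python's standard library module `difflib` can align the two files first. The sketch below is a different technique from the one used in this answer; `describe_changes` is a hypothetical helper name, and the sample lists assume one entry per line.

```python
import difflib


def describe_changes(old_lines, new_lines):
    """Return (tag, old_line_number, old_chunk, new_line_number, new_chunk) tuples."""
    matcher = difflib.SequenceMatcher(None, old_lines, new_lines)
    report = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue  # skip stretches that are identical in both files
        # i1 and j1 are 0-based indexes; +1 converts them to line numbers
        report.append((tag, i1 + 1, old_lines[i1:i2], j1 + 1, new_lines[j1:j2]))
    return report


# Assumed contents of the two sample files, one entry per line
standard = ["ABC123", "ABC003", "ABC004"]
new = ["ABC123", "ABC004"]

for tag, old_no, old_chunk, new_no, new_chunk in describe_changes(standard, new):
    print(f"{tag}: standard.txt line {old_no} {old_chunk}, new.txt line {new_no} {new_chunk}")
```

With the sample data this reports a single deletion at line 2 of standard.txt, rather than flagging lines 2 and 3 as both different.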
Alehandro