
I'm trying to learn Python :) and sorry if this is a really noob question. I tried to search but couldn't find exactly what I was looking for.

I'm trying to compare 2 files, but not with just a straightforward comparison. I want something more precise.

There will be more than 200 name entries, and I need to extract the new entries from file 2.

Let's say file 1 has these entries:

some random text -Name: Item_01_01- some random text
some random text -Name: Thing_01_01- some random text

and file 2 has:

some random text -Name: Item_01- some random text
some random text -Name: Thing_01- some random text
some random text -Name: Object_02- some random text

I want to do something that will compare the 2 files and extract the new item in file 2,

so I want Object_02 to appear in my output file,

searching for the info inside -Name: XXXXX-.

I know how to read and write files in Python; it's getting the info out that I'm not sure how to do.

And yes, file 1 has more numbers at the end of each item.

I hope it's clear

(Sorry, English is not my main language.)

Thanks a lot in advance for the help.

Stator
  • Welcome to SO. Please take the time to read [ask] and the other links found on that page. – wwii Jan 29 '19 at 20:24
  • Will the lines in `file 2` always be in the order you showed: will the *new* `Object_xxxx` line always be two lines after the ` -Name: Item_xx` line? Do you know how to [iterate over files](https://docs.python.org/3/tutorial/inputoutput.html#methods-of-file-objects)? – wwii Jan 29 '19 at 20:26
  • Well, let me explain the use. file1 has a list of objects that have previously been placed there. Some objects will be added later on. I create a list to generate all objects in a database, and I want to compare it with what has been placed in file1. So the order may change when assets are added to file2. I want to do the comparison to find new objects in file2 so they can be added to file1. Hope that helps. So the output must be the assets that are in file2 but not in file1. In my example the result should be Object_02. @wwii – Stator Jan 30 '19 at 17:57

2 Answers


Well, assuming the order in which the lines are written does not matter, I think the most effective way to do this is to use a set to store the lines from both files and then write them back to file 2. Here is my code.

data = set()  # a set keeps each distinct line only once
with open("file1.txt", "r") as f:
    for line in f:
        data.add(line)  # store lines from file 1
with open("file2.txt", "r") as f:
    for line in f:
        data.add(line)  # store lines from file 2
with open("file2.txt", "w") as f:
    for line in data:
        f.write(line)  # write each distinct line back to file 2
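
The code above merges whole lines from both files. If only the names that are new in file 2 are wanted, a rough sketch of the same set idea applied to the `-Name: XXXXX-` format from the question could look like the following. The helper names `extract_names` and `new_names`, the regular expression, and the rule that a file 1 name is the file 2 name plus an extra `_NN` suffix are all assumptions based on the sample lines in the question, not something the asker confirmed.

```python
import re

# Assumed pattern: the name is the text between "-Name: " and the next "-".
NAME_RE = re.compile(r"-Name:\s*([^-\s]+)-")

def extract_names(lines):
    """Return the set of names found in an iterable of text lines."""
    names = set()
    for line in lines:
        names.update(NAME_RE.findall(line))
    return names

def new_names(old_lines, new_lines):
    """Names in new_lines with no match in old_lines.

    Assumption: an old name matches if it equals the new name or is the
    new name followed by "_" and a suffix (Item_01 -> Item_01_01).
    """
    old = extract_names(old_lines)
    candidates = extract_names(new_lines)
    return sorted(
        n for n in candidates
        if not any(o == n or o.startswith(n + "_") for o in old)
    )

# Sample lines copied from the question.
file1 = [
    "some random text -Name: Item_01_01- some random text",
    "some random text -Name: Thing_01_01- some random text",
]
file2 = [
    "some random text -Name: Item_01- some random text",
    "some random text -Name: Thing_01- some random text",
    "some random text -Name: Object_02- some random text",
]
print(new_names(file1, file2))  # ['Object_02']
```

With real files, the two lists could be replaced by open file objects, since the helpers only need something that yields lines.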

Here is the final working result of my code. I had to adjust it to make sure that my file2 doesn't get that weird extra space that I was not aware of. I made some custom adjustments so it works nicely for my needs.

Instead of using names, I used IDs, which are 16-digit numbers.

Thanks to @davedwards for the precious help.

The code:

with open('file1.txt', 'r') as f:
    f1_data = f.readlines()
with open('file2.txt', 'r') as f:
    f2_data = f.readlines()
# Only the first line of file2 is used; split() with no argument
# also drops the trailing newline and any empty strings.
f2_data = f2_data[0].split()
new_data = []
for term in f2_data:
    # Keep the ID only if no line of file1 contains it.
    if not any(term in line for line in f1_data):
        new_data.append(term)
with open('new_data.txt', 'w') as f:
    # join() returns a single string, so write() is enough here.
    f.write(" OR ".join(new_data))
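
For larger files, the loop above rescans every line of file1 for each ID. A possible variation, sketched here under the assumption that every ID is a standalone run of exactly 16 digits, collects the IDs of file1 into a set once and then does a fast membership test. The function name `missing_ids` and the sample IDs are made up for illustration. Note that this matches whole IDs only, whereas the code above matches any substring.

```python
import re

# Assumption: an ID is exactly 16 digits, not part of a longer digit run.
ID_RE = re.compile(r"(?<!\d)\d{16}(?!\d)")

def missing_ids(file1_text, file2_text):
    """IDs from file2_text that never appear in file1_text, in original order."""
    known = set(ID_RE.findall(file1_text))
    seen = set()
    result = []
    for candidate in ID_RE.findall(file2_text):
        # Skip IDs already in file1 and duplicates within file2.
        if candidate not in known and candidate not in seen:
            seen.add(candidate)
            result.append(candidate)
    return result

# Made-up sample data standing in for the contents of the two files.
file1_text = "asset 1111222233334444 placed\nasset 5555666677778888 placed\n"
file2_text = "1111222233334444 9999000011112222 5555666677778888 \n"
print(" OR ".join(missing_ids(file1_text, file2_text)))  # 9999000011112222
```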
Stator