I am trying to copy lines from file2 to file1 when a line does not already exist in file1. I am using symmetric_difference, but it gives me an unordered result. The file contents in this example are not the actual contents: my real files contain no numbers, just strings, and I used numbers only to show the problem. I could probably add numbers to file2 and sort it as a list, but file2 receives its information at random from another program that I am not familiar with and don't want to interfere with.

content of file1:

'1\n','2\n','3\n'

content of file2:

'1\n','2\n','3\n','4\n','5\n','6\n','7\n','8\n','9\n','10\n'

Every line is just a string.

diff = set(file1).symmetric_difference(file2)

set(['8\n', '10\n', '9\n', '6\n', '7\n', '4\n', '5\n'])

My goal is

set(['4\n', '5\n', '6\n', '7\n', '8\n', '9\n', '10\n'])

Pavel Vanchugov

5 Answers

Use join() and split():

line1 = "'1\n','2\n','3\n'"
line2 = "'1\n','2\n','3\n','4\n','5\n','6\n','7\n','8\n','9\n','10\n'"

# Split file1's content once instead of once per item
existing = line1.split(',')
''.join([i for i in line2.split(',') if i not in existing])
plasmon360

You can do this from the set obtained by symmetric_difference:

a_list = list(set_instance)
a_list.sort()

Then you have a sorted list that you can append to file1. Note that sort() orders strings lexicographically, so '10\n' comes before '4\n'.
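
Sorting alphabetically may not match the order of the lines in file2. As a rough sketch, the set can instead be ordered by where each line first appears in file2. The names file1_lines, file2_lines, and position are made up for this illustration, and the two lists stand in for the lines read from the files in the question.

```python
# Stand-ins for the lines read from file1 and file2 (assumed data from the question).
file1_lines = ['1\n', '2\n', '3\n']
file2_lines = ['1\n', '2\n', '3\n', '4\n', '5\n', '6\n', '7\n', '8\n', '9\n', '10\n']

diff = set(file1_lines).symmetric_difference(file2_lines)

# Map each line of file2 to the index of its first occurrence.
position = {}
for index, line in enumerate(file2_lines):
    position.setdefault(line, index)

# Lines that exist only in file1 have no position in file2, so they go last.
ordered = sorted(diff, key=lambda line: position.get(line, len(file2_lines)))
print(ordered)
```

With the example data this yields the lines '4\n' through '10\n' in the same order as in file2, as a list rather than a set.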

Mikedev

If you aren't married to Python, this can be done very easily with the comm Unix executable (if you're on a Unix-based system):

$ comm -13 file1.txt file2.txt
4
5
6
7
8
9
10

This assumes the files are pre-sorted.

You could easily call this from Python.
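
One possible way to do that is with the standard subprocess module. This is only a sketch: the function name lines_only_in_second is hypothetical, it assumes comm is installed and on the PATH, and it sets LC_ALL=C on the assumption that both files are sorted in plain byte order.

```python
import os
import subprocess


def lines_only_in_second(path1, path2):
    """Return the lines that comm reports as unique to path2.

    Assumes both files are already sorted and that comm is available.
    """
    result = subprocess.run(
        ["comm", "-13", path1, path2],
        capture_output=True,   # collect stdout instead of printing it
        text=True,             # decode the output to str
        check=True,            # raise CalledProcessError on a non-zero exit
        env=dict(os.environ, LC_ALL="C"),  # assumed: byte-order collation
    )
    return result.stdout.splitlines(keepends=True)
```

Calling lines_only_in_second("file1.txt", "file2.txt") would then return the same lines as the shell command above, each with its trailing newline.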

More info on how to use comm

Community
Julien

For now I am using

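with open("file1") as f1:
    seen = set(f1)  # lines already present in file1
with open("file2") as f2, open("file1", "a") as out:
    for line in f2:
        if line not in seen:  # appending keeps file2's order
            out.write(line)
            seen.add(line)  # avoid writing duplicates from file2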
Pavel Vanchugov

If you transform a list into a set, the order of the elements is lost. That's completely normal, because mathematically order is meaningless in a set. You will have to reorder the result afterwards if you use set.symmetric_difference. If that doesn't give you a satisfactory result, then you should write your own algorithm.
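
As an illustration of what such an algorithm might look like, here is a minimal sketch of an order-preserving symmetric difference. The function name ordered_symmetric_difference is invented for this example, and putting the lines unique to the second list first is an arbitrary choice. It relies on dicts keeping insertion order, which is guaranteed from Python 3.7.

```python
def ordered_symmetric_difference(first, second):
    """Items found in exactly one of the two lists, in their original order.

    Items unique to `second` come first, then items unique to `first`.
    """
    first_set, second_set = set(first), set(second)  # sets only for fast lookups
    only_in_second = [item for item in second if item not in first_set]
    only_in_first = [item for item in first if item not in second_set]
    # dict.fromkeys drops repeated items while keeping first-seen order.
    return list(dict.fromkeys(only_in_second + only_in_first))


file1_lines = ['1\n', '2\n', '3\n']
file2_lines = ['1\n', '2\n', '3\n', '4\n', '5\n', '6\n', '7\n', '8\n', '9\n', '10\n']
print(ordered_symmetric_difference(file1_lines, file2_lines))
```

The sets are used only for membership tests; the result is built by walking the original lists, so their order survives.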

Ervin Szilagyi