Suppose your file is called test.txt:
first_names = set()
last_names = set()

# Collect the name (second whitespace-separated field) from each line.
with open('test.txt', 'r') as infile:
    for line in infile:
        if line.startswith('last:'):
            last_names.add(line.split()[1])
        if line.startswith('first:'):
            first_names.add(line.split()[1])

# Keep only the names that appear as both a first and a last name.
output_names = [name for name in first_names if name in last_names]

with open('new.txt', 'w') as f:
    for name in output_names:
        f.write('last: ' + name + '\n')
        f.write('first: ' + name + '\n')
To explain a little: the first part creates two empty sets, first_names and last_names. You could use lists for these, but checking for membership (which is what happens later with if name in last_names) is faster for a set. It's O(1) on average for a set and O(n) for a list, where n is the size of the list.
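If you want to see the difference for yourself, here is a rough sketch using the standard timeit module. The names and sizes are made up for illustration, and the exact timings will vary by machine, so treat the printed numbers as indicative only.

```python
import timeit

# Hypothetical data: 100,000 distinct names, stored both ways.
n = 100_000
names_list = [f"name{i}" for i in range(n)]
names_set = set(names_list)

# Worst case for the list: the target is the last element,
# so the whole list has to be scanned.
target = f"name{n - 1}"

list_time = timeit.timeit(lambda: target in names_list, number=100)
set_time = timeit.timeit(lambda: target in names_set, number=100)

print(f"list lookup: {list_time:.4f}s for 100 checks")
print(f"set lookup:  {set_time:.4f}s for 100 checks")
```

Both containers give the same answer to the membership question; only the cost of getting it differs.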
A nice feature of Python is that you can naturally iterate over the lines of a file object. The line.split()[1] part splits each line on whitespace and takes the second element (Python indexes from 0).
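To make the split step concrete, here is a small sketch. The sample line is an assumption about what your file looks like (a label followed by a single name), and second_field is a hypothetical helper, not part of the code above. It shows one way to avoid an IndexError if a line has a label but no name.

```python
# Assumed line format: a label, whitespace, then the name.
line = "first: Alice\n"

parts = line.split()   # splits on any whitespace, drops the newline
print(parts)           # ['first:', 'Alice']
print(parts[1])        # Alice


def second_field(line):
    """Return the second whitespace-separated field, or None if missing."""
    parts = line.split()
    return parts[1] if len(parts) > 1 else None


print(second_field("last: Smith\n"))  # Smith
print(second_field("last:\n"))        # None
```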
While sets are faster for membership checking, they are unordered, so they won't preserve the order of the names in the file. To construct output_names I use what's called a list comprehension, which here keeps every name in first_names that is also in last_names. The last part writes the results to the file new.txt.
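If the order of names matters to you, one possible variation is to keep the first names in a list (in file order) and still use a set for the membership test. This is only a sketch with made-up data; first_in_order and seen are names I've introduced for illustration.

```python
# Hypothetical data: first names in the order they appeared in the file.
first_in_order = ["Carol", "Alice", "Bob", "Alice"]
last_names = {"Alice", "Carol", "Dave"}

# Walk the list in order, keep names that are also last names,
# and use a second set to skip duplicates.
seen = set()
output_names = []
for name in first_in_order:
    if name in last_names and name not in seen:
        seen.add(name)
        output_names.append(name)

print(output_names)  # ['Carol', 'Alice']

# If order does not matter, a set intersection gives the same names.
common = set(first_in_order) & last_names
```

The membership checks are still O(1) on average because they go against sets; only the iteration happens over the list.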