Problem merging two texts, erasing duplicates, maintaining word order

Question

I wanted to do what is described in "Compare two text of words, check and erase duplicates, and merge with Python", using exactly this code:

# open files a.txt and b.txt and get the content as a list of lines
with open('a.txt') as f:
    a = f.readlines()

with open('b.txt') as f:
    b = f.readlines()

# get the string from the list
a_str = ''.join(a)
b_str = ''.join(b)

# get sets of unique words
a_set = set(a_str.split(" "))
b_set = set(b_str.split(" "))

# merge sets
c_set = a_set.union(b_set)

# write to a new file
with open('c.txt', 'w') as f:
    f.write(' '.join(c_set))

It works perfectly for merging two texts and erasing duplicates, but how can I create the new text c so that the words are arranged in columns and in alphabetical order?

To be clearer, these are my two input texts.

text a:

    NewYork     London       Paris
    Rome        Tokyo        Berlin
    Edinburgh   LosAngeles   Madrid

text b:

    Madrid      Cracow       Porto
    Rome        Berlin       Barcelona
    Manchester  Tokyo        Dublin

I would like text c to be organized in 3 columns, with the words in alphabetical order:

    Barcelona   Berlin     Cracow
    Dublin      Edinburgh  London
    LosAngeles  Madrid     Manchester
    NewYork     Paris      Porto
    Rome        Tokyo

Can you provide a meaningful example for a.txt and b.txt and the matching expected output? — mozway, Sep 23 '22 at 13:46
Alphabetical is easy -> `sorted(c_set)`. I do not quite understand what you mean by columns for your output. — g.d.d.c, Sep 23 '22 at 13:47
The question title says "maintaining order", but the question says "in sorted order"; which is it? — Karl Knechtel, Sep 23 '22 at 14:26

mozway · Answer 1 · 2022-09-23T14:21:50.103

Here would be my approach:

# open files a.txt and b.txt and get the content as a list of lines
with open('a.txt') as f:
    a = f.readlines()

with open('b.txt') as f:
    b = f.readlines()

# get the string from the list
a_str = ''.join(a)
b_str = ''.join(b)

# get sets of unique words
a_set = set(a_str.split())  # default separator: splits on any run of whitespace, including newlines
b_set = set(b_str.split())

# merge sets
c_set = a_set.union(b_set)

# sort the words alphabetically (a set is UNORDERED!)
c_lst = sorted(c_set)

# get max word length
max_len = max(len(x) for x in c_lst)

# left-justify each word (pad on the right) to max_len, N words per row
N = 3
spacer = '  '
with open('c.txt', 'w') as f:
    f.write('\n'.join([spacer.join(s.ljust(max_len) for s in c_lst[i:i+N])
                      for i in range(0, len(c_lst), N)])
           )

output file:

Barcelona   Berlin      Cracow    
Dublin      Edinburgh   London    
LosAngeles  Madrid      Manchester
NewYork     Paris       Porto     
Rome        Tokyo
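
For testing without files on disk, the same steps can be wrapped in a small function. This is only a sketch of the approach above, not a replacement for it: the name `format_columns` and its parameters `n_cols` and `spacer` are made up for illustration, and it works on in-memory strings instead of a.txt and b.txt. The one behavioural difference is the `rstrip()` call, which drops the padding after the last word of each row.

```python
def format_columns(words, n_cols=3, spacer="  "):
    """Deduplicate words, sort them, and lay them out in rows of n_cols.

    Hypothetical helper: name and parameters are illustrative only.
    """
    unique = sorted(set(words))  # a set is unordered, so sort explicitly
    if not unique:
        return ""
    width = max(len(w) for w in unique)  # every column gets the same width
    rows = []
    for i in range(0, len(unique), n_cols):
        row = spacer.join(w.ljust(width) for w in unique[i:i + n_cols])
        rows.append(row.rstrip())  # remove padding after the row's last word
    return "\n".join(rows)


# Sample data copied from the question
text_a = """NewYork     London       Paris
Rome        Tokyo        Berlin
Edinburgh   LosAngeles   Madrid"""
text_b = """Madrid      Cracow       Porto
Rome        Berlin       Barcelona
Manchester  Tokyo        Dublin"""

result = format_columns(text_a.split() + text_b.split())
print(result)
```

Note that `sorted` compares strings by code point, so uppercase letters sort before lowercase ones. If the input mixes cases, passing `key=str.casefold` to `sorted` is one way to get a case-insensitive order.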
