Delete the 1st instance of duplicated items and keep others in .txt file in Python

Question

I have a .txt file that contains many duplicate lines. I want to replace only the first occurrence and keep the others unchanged. Can anyone help me?

the original test.txt content

the file I want

I have tried this method

Search and replace a line in a file in Python

But that method replaces all of the duplicate lines, not just the first one.

Anyway, I found a solution. It's really simple:

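import fileinput

filename = "test.txt"  # assumed: the file from the question

replaced = False
for line in fileinput.input(filename, inplace=True):
    # inplace=True redirects print() into the file, so every line must be printed
    if "111" in line and not replaced:
        # Replace only in the first matching line
        print(line.replace("111", "22222").rstrip())
        replaced = True
    else:
        print(line.rstrip())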

I don't think this is efficient, and I hope someone has a better answer.
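For reference, the same "replace only the first match" logic can be sketched as a plain function that works on a list of lines, which makes it easy to try without touching a file. The name `replace_first` and its parameters are made up for illustration; this is a minimal sketch, not a drop-in replacement for the fileinput loop.

```python
def replace_first(lines, needle, replacement):
    """Return a copy of lines with needle replaced only in the first line containing it."""
    result = []
    replaced = False
    for line in lines:
        if not replaced and needle in line:
            # First matching line: apply the replacement once, then stop matching
            result.append(line.replace(needle, replacement))
            replaced = True
        else:
            # Every other line is passed through unchanged
            result.append(line)
    return result


if __name__ == "__main__":
    sample = ["111", "333", "111", "111"]
    print(replace_first(sample, "111", "22222"))
    # prints ['22222', '333', '111', '111']
```

To apply it to a file, read the lines with `f.read().splitlines()`, pass them through the function, and write the result back joined with newlines.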

Please post what you've tried so far and where did you encounter problems. [\[SO\]: How to create a Minimal, Complete, and Verifiable example (mcve)](https://stackoverflow.com/help/mcve). — CristiFati, Oct 23 '18 at 08:51

score 0 · Answer 1 · answered Oct 23 '18 at 19:42

You could use collections.defaultdict to build a dictionary that maps each value in the document to the list of indexes where it occurs. If a value has more than one index, slice off the first index and keep only the remaining occurrences in a new list; values that occur only once are kept as they are.

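from collections import defaultdict

# Read the file as a list of lines (splitlines keeps lines that contain spaces intact)
with open('test.txt') as f:
    content = f.read().splitlines()

# Map each distinct line to the list of indexes where it occurs
dd = defaultdict(list)

for i, v in enumerate(content):
    dd[v].append(i)

# Collect the indexes to keep: unique lines as they are, duplicates minus their first occurrence
keep = []

for v in dd.values():
    if len(v) == 1:
        keep.append(v[0])
    else:
        keep.extend(v[1:])

# Write the kept lines in their original order
with open('out.txt', 'w') as f:
    f.write('\n'.join(content[i] for i in sorted(keep)))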
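A variant of the same idea that avoids storing indexes might look like the sketch below. It counts the lines with collections.Counter, then walks through them once and skips the first occurrence of every line that appears more than once. The function name `drop_first_duplicates` is hypothetical, and the sketch assumes the whole file fits in memory.

```python
from collections import Counter


def drop_first_duplicates(lines):
    """Drop the first occurrence of each duplicated line; keep everything else in order."""
    counts = Counter(lines)   # how many times each line occurs
    skipped = set()           # duplicated lines whose first occurrence was already dropped
    kept = []
    for line in lines:
        if counts[line] > 1 and line not in skipped:
            # First occurrence of a duplicated line: drop it
            skipped.add(line)
            continue
        kept.append(line)
    return kept


if __name__ == "__main__":
    sample = ["a", "b", "a", "c", "a", "b"]
    print(drop_first_duplicates(sample))
    # prints ['a', 'c', 'a', 'b']
```

With a file, the input would be `f.read().splitlines()` and the result could be written out with `'\n'.join(...)`, as in the code above.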
