I want to print the lines that appear exactly once in a text file.
For example, if my text file contains:
12345
12345
12474
54675
35949
35949
74564
I want my Python program to print:
12474
54675
74564
I'm using Python 2.7.
Try this:
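from collections import OrderedDict
seen = OrderedDict()  # remembers the order in which lines were first seen
with open('file.txt') as f:
    for line in f:
        line = line.strip()
        seen[line] = seen.get(line, 0) + 1  # count occurrences of each line
# keep only the lines that occurred exactly once
print("\n".join([k for k, v in seen.items() if v == 1]))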
This prints:
12474
54675
74564
Update: thanks to the comments below, this is even nicer:
from collections import Counter, OrderedDict
class OrderedCounter(Counter, OrderedDict):
    pass
with open('file.txt') as f:
    seen = OrderedCounter([line.strip() for line in f])
print("\n".join([k for k, v in seen.items() if v == 1]))
You can combine OrderedDict and Counter to count the lines while preserving their order, then keep only the lines that occur once:
from collections import OrderedDict, Counter
class OrderedCounter(Counter, OrderedDict):
    pass
with open('/tmp/hello.txt') as f:
    # strip before counting so a final line without a trailing newline still matches
    ordered_counter = OrderedCounter(line.strip() for line in f)
new_list = [k for k, v in ordered_counter.items() if v == 1]
# ['12474', '54675', '74564']
Use count() to check the number of occurrences of each element in the list, and remove each occurrence using index() in a for loop:
with open("file.txt", "r") as f:
    data = f.readlines()
for x in data[:]:  # iterate over a copy, because data is modified inside the loop
    if data.count(x) > 1:  # if item is a duplicate
        for i in range(data.count(x)):
            data.pop(data.index(x))  # find the index of each occurrence and remove it
with open("file.txt", "w") as f:
    f.write("".join(data))  # write data back to file as string
Contents of file.txt afterwards:
12474
54675
74564
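One thing to watch with this remove-in-a-loop approach: removing items from a list while a for loop is iterating over that same list can make the loop skip elements, so some repeated values may survive. Below is a small sketch of the effect. The function names and the sample data are made up for illustration; looping over a snapshot of the list is one way to avoid the problem, not the only one.

```python
def drop_repeated_in_place(items):
    """Remove repeated values while looping over the list that is being changed."""
    for x in items:
        if items.count(x) > 1:
            for _ in range(items.count(x)):
                items.pop(items.index(x))
    return items


def drop_repeated_from_copy(items):
    """Same idea, but loop over a snapshot so removals cannot shift the loop."""
    for x in list(items):
        if items.count(x) > 1:
            for _ in range(items.count(x)):
                items.pop(items.index(x))
    return items


sample = ['a', 'a', 'b', 'c', 'c', 'b']  # made-up data: no value occurs exactly once

print(drop_repeated_in_place(list(sample)))   # ['b', 'b'] (the two 'b' items were skipped)
print(drop_repeated_from_copy(list(sample)))  # []
```

With the data from the question the two versions happen to give the same result, which is why the problem is easy to miss.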
Not the most efficient, since it calls count() once per line (quadratic in the number of lines), but simple:
with open("input.txt") as f:
    orig = list(f)
filtered = [x for x in orig if orig.count(x) == 1]
print("".join(filtered))
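If the file is large, the repeated count() calls can be avoided by counting everything once with collections.Counter and then filtering the original list, which keeps the file order without needing an ordered dictionary. This is only a sketch: the function name lines_occurring_once is hypothetical, and an in-memory sample list stands in for the lines of the file so the snippet runs on its own.

```python
from collections import Counter


def lines_occurring_once(lines):
    """Return the lines that appear exactly once, in their original order."""
    # Drop the newline first so a last line without one still compares equal.
    stripped = [line.rstrip('\n') for line in lines]
    counts = Counter(stripped)  # one pass to count every line
    return [line for line in stripped if counts[line] == 1]  # one pass to filter


# Stand-in for list(open('input.txt')); note the last line has no newline.
sample = ['12345\n', '12345\n', '12474\n', '54675\n', '35949\n', '35949\n', '74564']
print('\n'.join(lines_occurring_once(sample)))
```

To read from a real file, pass the open file object (or list(f)) to the function in place of sample. Counter is available from Python 2.7 onwards.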