
I want to print only the lines that occur exactly once in a text file; any line that appears more than once should be dropped entirely.

For example: if the content of my text file is:

12345
12345
12474
54675
35949
35949
74564

I want my Python program to print:

12474
54675
74564

I'm using Python 2.7.

Moinuddin Quadri
  • And your own attempt? – Willem Van Onsem Jan 21 '17 at 20:41
  • It looks like you want us to write some code for you. While many users are willing to produce code for a coder in distress, they usually only help when the poster has already tried to solve the problem on their own. A good way to demonstrate this effort is to include the code you've written so far, example input (if there is any), the expected output, and the output you actually get (output, tracebacks, etc.). The more detail you provide, the more answers you are likely to receive. Check the [FAQ](http://stackoverflow.com/tour) and [How to Ask](http://stackoverflow.com/questions/how-to-ask). – TigerhawkT3 Jan 21 '17 at 20:44
  • @TigerhawkT3 I hope you don't mind I closed the question. I feel you wanted to provide an answer :) – Jean-François Fabre Jan 21 '17 at 20:44
  • @Jean-FrançoisFabre - It does deserve closure, but I don't think that's an accurate dupe. This question wants entries with a count greater than one to be removed entirely. – TigerhawkT3 Jan 21 '17 at 20:46
  • You may also check: [How to return unique words from the text file using Python](http://stackoverflow.com/questions/22978602/how-to-return-unique-words-from-the-text-file-using-python) – Moinuddin Quadri Jan 21 '17 at 20:46
  • Right! Do you want me to reopen it so you can close it with the proper original question? – Jean-François Fabre Jan 21 '17 at 20:48
  • @Jean-FrançoisFabre: It's certainly not a duplicate of the linked question. – Eric Duminil Jan 21 '17 at 20:53
  • Okay, reopening. After that I cannot close anymore. Don't complain about duplicate answers :) – Jean-François Fabre Jan 21 '17 at 20:57
  • @Jean-FrançoisFabre I think that's justified, since we have fun down there trying to solve the riddle with different approaches.. even though it probably defeats the purpose of giving the fish instead of the fishing rod – hansaplast Jan 21 '17 at 21:10
  • Okay, I try to give the fish, but before OP tries to eat it, the fish explains the solution :) – Jean-François Fabre Jan 21 '17 at 21:12

4 Answers


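Try this. It counts each line while preserving first-seen order, then prints only the lines that were counted once:

from collections import OrderedDict

# Map each line to its number of occurrences, in first-seen order.
seen = OrderedDict()
with open('file.txt') as f:
    for line in f:
        line = line.strip()
        seen[line] = seen.get(line, 0) + 1

# Keep only the lines that occurred exactly once.
print("\n".join([k for k, v in seen.items() if v == 1]))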

prints

12474
54675
74564

Update: thanks to the comments below, this is even nicer:

from collections import Counter, OrderedDict

class OrderedCounter(Counter, OrderedDict):
    # Counter supplies the counting, OrderedDict keeps first-seen order.
    pass

with open('file.txt') as f:
    seen = OrderedCounter([line.strip() for line in f])
    print("\n".join([k for k, v in seen.items() if v == 1]))
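
To try the combined class without a file on disk, here is a minimal sketch that feeds it the sample lines from the question as an in-memory list. The helper name `lines_occurring_once` is hypothetical and not part of the answer; the sketch assumes lines should be compared after stripping their line endings.

```python
from collections import Counter, OrderedDict


class OrderedCounter(Counter, OrderedDict):
    """Counter that remembers the order in which keys were first seen."""
    pass


def lines_occurring_once(lines):
    # Hypothetical helper: strip line endings, count every line,
    # then keep the lines whose count is exactly one, in original order.
    counts = OrderedCounter(line.strip() for line in lines)
    return [line for line, n in counts.items() if n == 1]


sample = ["12345\n", "12345\n", "12474\n", "54675\n",
          "35949\n", "35949\n", "74564\n"]
result = lines_occurring_once(sample)
print("\n".join(result))
```

On Python 3.7 and later a plain `Counter` already keeps insertion order, so the subclass mainly matters on older versions such as the 2.7 mentioned in the question.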
hansaplast
  • indeed! didn't catch that, hold on – hansaplast Jan 21 '17 at 20:49
  • Yes, Eric is right. This code isn't what I want – Altay Karakalpaklı Jan 21 '17 at 20:49
  • @AltayKarakalpaklı I updated the code so it now does what it should – hansaplast Jan 21 '17 at 20:56
  • @AltayKarakalpaklı the comments above are of course right: did you actually try anything before you posted the question? – hansaplast Jan 21 '17 at 20:56
  • @hansaplast: Your code works now, but you shouldn't write any code if the OP didn't bother to. Some text explaining what your method would do would have been better IMHO. – Eric Duminil Jan 21 '17 at 20:57
  • @EricDuminil: I will try to hold back next time. At least this question was formulated clearly which is somehow an exception on SO these days :) – hansaplast Jan 21 '17 at 20:58
  • A better way would have been to use `Counter` together with `OrderedDict` to get an ordered count of each word/line. No need for a set here – Moinuddin Quadri Jan 21 '17 at 20:59
  • For reference, here's what I wrote before you updated the answer : You didn't provide any code, so you'll only get pointers. You could use a dictionary, with strings as keys and count as values. The default value would be 0. You could iterate over your file, and for every line, you increase the value of the corresponding string by 1. Once you read all the file, you can iterate over the values in your dictionary : if it's 1, you can output the key. – Eric Duminil Jan 21 '17 at 21:00
  • thank you it worked – Altay Karakalpaklı Jan 21 '17 at 21:01
  • @MoinuddinQuadri: I thought of combining `Counter` and `OrderedDict` but thought it would not be possible, turns out it is, this is a lot nicer of course, thanks for the tip – hansaplast Jan 21 '17 at 21:05

You can combine OrderedDict and Counter to count each line while preserving the original order, then keep only the lines that occur once:

from collections import OrderedDict, Counter

class OrderedCounter(Counter, OrderedDict):
    pass

with open('/tmp/hello.txt') as f:
    # Strip before counting so a final line without a newline still matches.
    ordered_counter = OrderedCounter(line.strip() for line in f)

new_list = [k for k, v in ordered_counter.items() if v == 1]
# ['12474', '54675', '74564']
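
A detail worth keeping in mind when counting lines read from a file: each string keeps its line ending, so a last line without a trailing newline compares unequal to an earlier, otherwise identical line. The sketch below illustrates this with a made-up in-memory list standing in for what `readlines()` might return; the variable names are illustrative only.

```python
from collections import Counter

# Assumed input: the final line has no trailing newline.
raw = ["35949\n", "74564\n", "35949"]

# Counting the raw strings treats "35949\n" and "35949" as different keys.
raw_counts = Counter(raw)
once_raw = [line.strip() for line in raw if raw_counts[line] == 1]

# Counting the stripped strings merges them into one key with count 2.
stripped_counts = Counter(line.strip() for line in raw)
once_stripped = [line for line, n in stripped_counts.items() if n == 1]

print(once_raw)       # ['35949', '74564', '35949']
print(once_stripped)  # ['74564']
```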
Moinuddin Quadri

Use count() to check the number of occurrences of each element in the list, and remove each occurrence using index() in a for loop:

with open("file.txt", "r") as f:
    data = f.readlines()
# Loop over a copy: popping from the list being iterated would skip elements.
for x in list(data):
    if data.count(x) > 1:   # if item is a duplicate
        for i in range(data.count(x)):
            data.pop(data.index(x))  # find indexes of duplicates, and remove them
with open("file.txt", "w") as f:
    f.write("".join(data))  # write data back to file as string

file.txt:

12474
54675
74564
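
A general caveat with this kind of in-place removal: popping items from a list while a `for` loop is iterating over that same list shifts the remaining items, so the loop can skip some of them. The sketch below contrasts the two behaviours on a made-up input; both function names are hypothetical and exist only for this comparison.

```python
def drop_repeated_in_place(items):
    # Iterates over the very list it pops from, so elements can be skipped.
    for x in items:
        if items.count(x) > 1:
            for _ in range(items.count(x)):
                items.pop(items.index(x))
    return items


def drop_repeated_from_copy(items):
    # Iterates over a snapshot, so removals do not disturb the loop.
    for x in list(items):
        if items.count(x) > 1:
            for _ in range(items.count(x)):
                items.pop(items.index(x))
    return items


tricky = ["a", "a", "y", "c", "c", "y", "z"]
print(drop_repeated_in_place(list(tricky)))   # ['y', 'y', 'z']
print(drop_repeated_from_copy(list(tricky)))  # ['z']
```

The sample data from the question happens not to trigger the skipping, which is why the problem is easy to miss.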
Trelzevir

Not the most efficient, since it calls count() once per line (quadratic time overall), but simple:

with open("input.txt") as f:
    orig = list(f)
    filtered = [x for x in orig if orig.count(x)==1]

print("".join(filtered))
  • convert the file to a list of lines
  • create list comprehension: keep only lines occurring once
  • print the list (joining with empty string since linefeeds are still in the lines)
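
If the repeated `count()` calls become too slow on a large file, one possible alternative is to count every line once up front and then filter against those counts. This is a sketch rather than part of the answer above; `keep_singletons` is a hypothetical helper name, and the list stands in for `list(f)`.

```python
from collections import Counter


def keep_singletons(lines):
    # One pass to count all lines, one pass to filter: linear time overall.
    counts = Counter(lines)
    return [x for x in lines if counts[x] == 1]


# Stand-in for list(f); linefeeds are still attached to each line.
orig = ["12345\n", "12345\n", "12474\n", "54675\n",
        "35949\n", "35949\n", "74564\n"]
filtered = keep_singletons(orig)
print("".join(filtered))
```

Filtering the original list, instead of iterating over the counter, keeps the input order on any Python version.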
Jean-François Fabre