
I have a file, one.txt, with this data:

 822.25 111.48 883.59 256.68
 822.25 111.48 883.59 256.68
 8.6 123.68 467.27 276.69
 0.0 186.77 165.62 375.0
 0.0 186.77 165.62 375.0
 724.76 177.83 923.52 316.78
 724.76 177.83 923.52 316.78
 724.76 177.83 923.52 316.78
 724.76 177.83 923.52 316.78
 724.76 177.83 923.52 316.78
 438.03 148.5 540.88 198.54
 511.99 170.97 571.74 215.81
 511.99 170.97 571.74 215.81

For lines that are repeated, I want to write only one copy. For instance:

724.76 177.83 923.52 316.78

is repeated 5 times. I want to write it only once, do the same for the other repeated lines, and write the new data to a file.

My code:

with open('one.txt', 'r') as infile:
    with open('output.txt', 'w') as outfile:
        for line in infile:
            # How to do this?
            # If a line is repeated, keep only one copy of it.
            outfile.write(line)
martineau
    Do you mean "repeated immediately after" or "repeated anywhere"? For the former, it's a simple loop with "last_input" as a loop variable. For the latter, it could be done with a sorted map, or, if you don't care about the order of the uniquified lines, you could sort the lines first and then use the "repeated immediately after" version of the code. – Mark Lavin Nov 09 '21 at 21:19
    If it is a one-off exercise you could use a Linux command: sort yourfile.txt | uniq (note that sorting does not keep the original line order). – sydadder Nov 09 '21 at 21:21

4 Answers


This can be done with the Linux utility uniq, which collapses adjacent repeated lines. In the terminal, type uniq <infile.txt >outfile.txt. Here the symbols < and > tell the shell to use the given files instead of standard input and output.

To reinvent this utility in Python, one could write:

with open('one.txt', 'r') as infile:
    with open('output.txt', 'w') as outfile:
        prev_line = infile.readline()  # read first line
        outfile.write(prev_line)
        for line in infile:
            if line != prev_line:  # if the line is a different one, print it
                prev_line = line
                outfile.write(line)
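
The same adjacent-only behaviour can be checked on a small in-memory list before touching any files. What follows is only a minimal sketch: the helper name drop_adjacent_repeats is made up for illustration and is not part of any library.

```python
def drop_adjacent_repeats(lines):
    """Yield each line unless it equals the line immediately before it."""
    prev = object()  # sentinel that never compares equal to a real line
    for line in lines:
        if line != prev:
            yield line
        prev = line


sample = ["a\n", "a\n", "b\n", "a\n"]
result = list(drop_adjacent_repeats(sample))
print(result)  # ['a\n', 'b\n', 'a\n']
```

Note that the last "a" survives: like uniq, this only collapses repeats that sit next to each other.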
zaabson

You probably want itertools.groupby. Without a key function it groups consecutive identical lines, so you can ignore each group's contents and write just its key, once per group.

import itertools

with open('one.txt', 'r') as infile:
    with open('output.txt', 'w') as outfile:
        # each key is one line; the group of its consecutive repeats is ignored
        for line, _ in itertools.groupby(infile):
            outfile.write(line)

This only collapses repeats that are adjacent. If repeated lines may appear in several places in the file (e.g. a a b a would be written as a b a), you can instead keep a set of the lines you have already seen:

seen_lines = set()
with open('one.txt', 'r') as infile:
    with open('output.txt', 'w') as outfile:
        for line in infile:
            if line in seen_lines:
                continue
            outfile.write(line)
            seen_lines.add(line)
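
One caveat when comparing raw lines: if the last line of the file has no trailing newline, it will not compare equal to an earlier, otherwise identical line that does. A possible variation, sketched here under the assumption that line endings should be ignored, strips them first and relies on dict.fromkeys, since dicts keep insertion order in Python 3.7+. The function name unique_lines is hypothetical.

```python
def unique_lines(lines):
    """Return lines with duplicates removed, keeping first-seen order."""
    # Strip the line ending so 'x\n' and a final 'x' compare equal.
    stripped = (line.rstrip("\r\n") for line in lines)
    # dict keys are unique and keep insertion order (Python 3.7+).
    return list(dict.fromkeys(stripped))


sample = ["a\n", "b\n", "a\n", "b"]
result = unique_lines(sample)
print(result)  # ['a', 'b']
```

With a file, the open infile object can be passed as lines, and the result written back with one newline appended per entry.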
Tadhg McDonald-Jensen

You may leverage itertools.groupby() as in:

from itertools import groupby

data = """
822.25 111.48 883.59 256.68
 822.25 111.48 883.59 256.68
 8.6 123.68 467.27 276.69
 0.0 186.77 165.62 375.0
 0.0 186.77 165.62 375.0
 724.76 177.83 923.52 316.78
 724.76 177.83 923.52 316.78
 724.76 177.83 923.52 316.78
 724.76 177.83 923.52 316.78
 724.76 177.83 923.52 316.78
 438.03 148.5 540.88 198.54
 511.99 170.97 571.74 215.81
 511.99 170.97 571.74 215.81
"""

ones = [key 
        for key, _ in groupby(
            (line.strip() for line in data.split("\n") if line)
        )]
print(ones)

Which would yield

[
 '822.25 111.48 883.59 256.68',
 '8.6 123.68 467.27 276.69', 
 '0.0 186.77 165.62 375.0', 
 '724.76 177.83 923.52 316.78', 
 '438.03 148.5 540.88 198.54', 
 '511.99 170.97 571.74 215.81'
]
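
The snippet above works on a string and prints a list. To apply the same idea to files, as the question asks, something along these lines might do. It is a sketch only: write_collapsed is a made-up name, and the demo uses a temporary directory (an assumption for illustration) rather than the question's real one.txt and output.txt.

```python
from itertools import groupby
import os
import tempfile


def write_collapsed(src_path, dst_path):
    """Copy src to dst, collapsing runs of identical (stripped) lines."""
    with open(src_path) as infile, open(dst_path, "w") as outfile:
        # Strip whitespace and skip blank lines, as in the snippet above.
        cleaned = (line.strip() for line in infile if line.strip())
        for key, _ in groupby(cleaned):
            outfile.write(key + "\n")


# Demo in a temporary directory so no real files are touched.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "one.txt")
    dst = os.path.join(tmp, "output.txt")
    with open(src, "w") as f:
        f.write(" 0.0 186.77 165.62 375.0\n"
                " 0.0 186.77 165.62 375.0\n"
                " 438.03 148.5 540.88 198.54\n")
    write_collapsed(src, dst)
    with open(dst) as f:
        output = f.read()

print(output)
```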
Jan

Maybe try an if statement: if a line has not been written yet, write it; otherwise use an elif branch to skip any later line that is the same.

Viridian