2

I am trying to format text in a .txt file. The content is also in an XML file, but I copied it to a text file and I am trying to format it. It is currently set up like:

Pufferfish  Ocean
Anchovy Ocean
Tuna    Ocean
Sardine Ocean
Bream   River
Largemouth_Bass Mountain_Lake
Smallmouth_Bass River
Rainbow_Trout   River

I am trying to figure out how to open the file and for each line convert it to:

('Pufferfish', 'Ocean')

Is there a way to do this?

This is what I am trying so far, which I know is wrong. I am still trying to look up the correct syntax and a way to change a 'str':

f1 = open('fish.txt', 'r')
f2 = open('fish.txt.tmp', 'w')

for line in f1:
    f2.write(line.replace(' ', ','))
    for word in line:
        f2.write(word.append('(', [0]))
        f2.write(word.append(')', (len(word))))
f1.close()
f2.close()
martineau
Babeeshka
  • What did you try and what doesn't work? – vallentin Apr 18 '17 at 00:28
  • Are all of the words in a single line or different lines? – stanleyli Apr 18 '17 at 00:28
  • See [this](http://stackoverflow.com/questions/3277503/how-do-i-read-a-file-line-by-line-into-a-list), while it may or may not give you a ready made solution to your issue, it will certainly teach how to (and how not to) read a file and put the contents into a container. If all your elements are on the same line, you'd just need to add a call to `split` and convert the lists to tuples, if you so wish. – Paul Rooney Apr 18 '17 at 00:31
  • There are two words per line, such as "Pufferfish Ocean". I am trying to write something, and I will upload it. But it has been unsuccessful so far. – Babeeshka Apr 18 '17 at 00:32

4 Answers

4

You may need something like:

with open('input.txt') as infile, open("output.txt", "a") as outfile:
    for line in infile:
        # split() breaks the line on any whitespace; tuple() and str() give "('A', 'B')"
        outfile.write(str(tuple(line.split())) + "\n")

Output:

('Pufferfish', 'Ocean')
('Anchovy', 'Ocean')
('Tuna', 'Ocean')
('Sardine', 'Ocean')
('Bream', 'River')
('Largemouth_Bass', 'Mountain_Lake')
('Smallmouth_Bass', 'River')
('Rainbow_Trout', 'River')
Pedro Lobito
3

A variation on Pedro Lobito's answer, using str.format for more precise control of the output string format:

with open('old.txt') as f_in, open("new.txt", "a") as f_out:
    for line in f_in:
        a, b = line.split()
        f_out.write("('{}', '{}')\n".format(a, b))

Version with comma at the end of each line except the last line:

with open('old.txt') as f_in, open("new.txt", "a") as f_out:
    for n, line in enumerate(f_in):
        a, b = line.split()
        if n > 0:
            f_out.write(",\n")
        f_out.write("('{}', '{}')".format(a, b))
    # do not leave the last line without newline ("\n"):
    f_out.write("\n")

enumerate does this: list(enumerate(["a", "b", "c"])) returns [(0, "a"), (1, "b"), (2, "c")]
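As an alternative to the enumerate-based loop, the formatted lines can be collected in a list and joined with ",\n", which places the comma between lines only. This is a minimal sketch, not the method used above: the helper name `format_pairs` is hypothetical, it works on an in-memory list of lines rather than a file, and skipping blank lines is an added assumption.

```python
def format_pairs(lines):
    """Hypothetical helper: format "A B" lines as "('A', 'B')", comma-separated."""
    formatted = []
    for line in lines:
        if not line.strip():
            continue  # assumption: silently skip blank lines
        a, b = line.split()
        formatted.append("('{}', '{}')".format(a, b))
    # join() puts ",\n" between items only, so the last line gets no comma.
    return ",\n".join(formatted) + "\n"


sample = ["Pufferfish  Ocean\n", "Anchovy Ocean\n", "Tuna    Ocean\n"]
print(format_pairs(sample), end="")
```

The trade-off is that every formatted line is held in memory before anything is written, which is fine for a small file like this one.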

Messa
  • Is there a way to add a comma to the end of each line except the last? Could you do: `new_file.write("('{}', '{}', ',')\n".format(a, b, c))`? – Babeeshka Apr 18 '17 at 00:43
  • Add comma to the end of each line except last? Sometimes it is better to add comma _before_ the line :) `",\n('{}', '{}')".format(a, b)` – Messa Apr 18 '17 at 00:45
  • Of course there would be a line only with a comma at the beginning... Can be fixed with some `if`. – Messa Apr 18 '17 at 00:46
  • That's clever, worked better than what I was trying to do with it just now. That first one can always be deleted manually in no time. I'll see what I can do with an 'if' statement about that though. Thanks, Messa! – Babeeshka Apr 18 '17 at 00:47
  • @Babeeshka I've updated the answer with code example – Messa Apr 18 '17 at 00:48
1

Probably the most important thing to learn from this exercise is that a str object has no method like append() or insert(). This is because str objects (strings) are immutable in Python: you cannot change a string. You can only use it to make a new string (and throw away the old one).

Since the first space on each line of your file appears exactly where you want the comma inserted, you could use the replace() method, as you are trying to do, like so:

line = line.replace(' ', ', ', 1)

Note that the replace() method on a string does not modify the original string; instead, it returns a new string. That is why you have to use the line = part at the beginning of the line, thereby replacing the old string.

The third argument, the number 1, makes sure that only the first space in the line is affected. Any further spaces, including trailing spaces at the end of the line, are not replaced.
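Both points can be seen in a short sketch that uses in-memory sample lines instead of the file. The helper name `to_pair_text` is hypothetical, and the sketch assumes the two words are separated by exactly one space. For tabs or runs of spaces, split() is the safer tool.

```python
# Strings are immutable: replace() returns a new string and leaves the original alone.
line = "Pufferfish Ocean"
replaced = line.replace(" ", ", ", 1)
print(line)      # the original string is unchanged
print(replaced)  # the new string has the comma inserted


def to_pair_text(line):
    """Hypothetical helper: build "('A', 'B')" from a line like "A B".

    Assumes exactly one space between the two words.
    """
    # Replace the first space with the closing quote, comma, and opening quote,
    # then add the outer parentheses and quotes by concatenation.
    return "('" + line.strip().replace(" ", "', '", 1) + "')"


print(to_pair_text("Bream River\n"))
```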

Rick
1

There are shorter ways of writing it, but here is one way to solve your problem of taking a simple text file and writing it out as you asked. Save your text file as something like ocean.txt.

output = ""
with open("ocean.txt") as f:
    for line in f:
        # strip() returns a new string, so the result must be assigned
        line = line.strip()
        # wrap each word in double quotes and join the words with commas
        line_fmt = ",".join('"' + item + '"' for item in line.split())
        output += "({})".format(line_fmt) + "\n"

print(output)
# To save as a file:
with open('formatted.txt', 'w') as outfile:
    outfile.write(output)

This opens the text file and reads each line. It strips the newline character from the line, splits the line into words, and wraps each word in double quotes ('"' + item + '"'). It then joins the quoted words together with a comma using ",".join(...).

Last, it adds this string to the overall output, and prints it out at the end.
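If the single-quoted form shown in the question is wanted instead of double quotes, one option is to let repr() supply the quotes. This is only a sketch under that assumption: the function name `format_line` is hypothetical, and repr() switches to double quotes for any word that itself contains a single quote.

```python
def format_line(line):
    """Hypothetical helper: turn "A B" into "('A', 'B')" using repr() for the quotes."""
    # split() with no arguments drops the trailing newline and handles tabs or repeated spaces.
    quoted = ", ".join(repr(item) for item in line.split())
    return "({})".format(quoted)


print(format_line("Largemouth_Bass Mountain_Lake\n"))
```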

Nik Roby