34

I'm new to Python. I want to open a file and replace every instance of certain words with a given replacement. For example, replace every 'zero' with '0', 'temp' with 'bob', and 'garbage' with 'nothing'.

I had first started to use this:

for line in fileinput.input(fin):
        fout.write(line.replace('zero', '0'))
        fout.write(line.replace('temp','bob'))
        fout.write(line.replace('garbage','nothing'))

but I don't think this is even remotely the correct way to do it. I then thought about using if statements to check whether the line contains any of these words and, if so, replacing whichever ones it contains, but from what I know of Python this isn't an ideal solution either. I would love to know the best way to do this. Thanks ahead of time!

shadonar
  • I'll be doing a lot more, but this would give me the best practice for doing this sort of thing. – shadonar Oct 26 '12 at 14:52
  • 1
    In your current approach, every input line is written to the output three times. Is that what you intended to do? – Junuxx Oct 26 '12 at 14:52
  • 1
    Also, you're missing an apostrophe after `'bob`. – Junuxx Oct 26 '12 at 14:53
  • Thanks about the apostrophe. And @Junuxx, I did not intend to do this (my stupidity is showing). As mentioned, I'm new to Python, and from my experience with other languages, reading line by line is standard. Is this the same with Python, or is there a better way to search through a file and replace those particular words with others? – shadonar Oct 26 '12 at 14:59
  • 1
    related: [How to search and replace text in a file using Python?](http://stackoverflow.com/q/17140886/4279) – jfs Sep 17 '14 at 07:49

7 Answers

84

This should do it

replacements = {'zero':'0', 'temp':'bob', 'garbage':'nothing'}

with open('path/to/input/file') as infile, open('path/to/output/file', 'w') as outfile:
    for line in infile:
        for src, target in replacements.items():
            line = line.replace(src, target)
        outfile.write(line)

EDIT: To address Eildosa's comment, if you wanted to do this without writing to another file, then you'll end up having to read your entire source file into memory:

lines = []
with open('path/to/input/file') as infile:
    for line in infile:
        for src, target in replacements.items():
            line = line.replace(src, target)
        lines.append(line)
with open('path/to/input/file', 'w') as outfile:
    for line in lines:
        outfile.write(line)

Edit: If you are using Python 2.x, use replacements.iteritems() instead of replacements.items()
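
For in-place editing without holding the whole file in memory, the standard library's `fileinput` module (which the question already uses) accepts `inplace=True`; while iterating, anything written to standard output goes into the file instead. A minimal sketch, assuming the same `replacements` dict as above; the function name `replace_in_place` is only illustrative and not part of the original answer:

```python
import fileinput

replacements = {'zero': '0', 'temp': 'bob', 'garbage': 'nothing'}

def replace_in_place(path, replacements):
    """Rewrite the file at `path`, applying each replacement to every line."""
    # With inplace=True, fileinput moves the original aside and redirects
    # standard output into a new file at `path` while we iterate.
    with fileinput.input(files=[path], inplace=True) as lines:
        for line in lines:
            for src, target in replacements.items():
                line = line.replace(src, target)
            # `line` still ends with its newline, so suppress print's own.
            print(line, end='')
```

Calling `replace_in_place('path/to/input/file', replacements)` would then update that file line by line.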

Mayou36
inspectorG4dget
  • You probably want to move `outfile.write(line)` out of the inner loop...(although this is a literal translation of OP's code) – mgilson Oct 26 '12 at 15:03
  • What do you really gain here by using a dictionary (as opposed to a list of 2-tuples)? That's effectively what you get with the `iteritems()` anyway ... – mgilson Oct 26 '12 at 15:06
  • 1
    @Thales: thanks for the bug report (esp. after so long). I've edited my answer – inspectorG4dget Jul 19 '13 at 18:40
  • 2
    does this solution really work? the file is overwritten immediately when calling `outfile = open('path/to/input/file', 'w')` so `line` is always empty – Nimrod Dayan Jun 18 '14 at 12:58
  • @inspectorG4dget I really don't think it's fixed at all. The line that CodePond.org has mentioned still breaks it – user1567453 Apr 09 '15 at 07:55
  • @user1567453: Look closely at the filepaths, and you'll notice that `infile` and `outfile` refer to two different file paths. This is how it was fixed. Please let me know if I've missed something – inspectorG4dget Apr 10 '15 at 01:24
  • 1
    How do I do this using only infile? I don't want a second file – Heetola May 29 '15 at 14:18
  • I think the append needs to be inside the for loop, otherwise lines will only consist of the last line. – coyot Aug 18 '20 at 16:26
9

If your file is short (or even not extremely long), you can use the following snippet to replace text in place:

# Replace variables in file
with open('path/to/in-out-file', 'r+') as f:
    content = f.read()
    f.seek(0)
    f.truncate()
    f.write(content.replace('replace this', 'with this'))
John Calcote
7

I might consider using a dict and re.sub for something like this:

import re
repldict = {'zero':'0', 'one':'1' ,'temp':'bob','garbage':'nothing'}
def replfunc(match):
    return repldict[match.group(0)]

regex = re.compile('|'.join(re.escape(x) for x in repldict))
with open('file.txt') as fin, open('fout.txt','w') as fout:
    for line in fin:
        fout.write(regex.sub(replfunc,line))

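This has a slight advantage over `str.replace` in that it is a bit more robust to overlapping matches: every replacement is made in a single pass over the original line, so the output of one replacement is never picked up as input to another.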
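
To make the overlap issue concrete, here is a small sketch comparing chained `str.replace` calls with a single regex pass. The `bob`/`robert` mapping is a hypothetical example (borrowed from the comment thread), the helper names are illustrative, and sorting the keys longest-first is an extra precaution not present in the answer's code:

```python
import re

# Hypothetical replacements where one target is also a source.
repldict = {'bob': 'robert', 'robert': 'foo'}

def chained_replace(text, mapping):
    """Apply str.replace once per pair; later pairs see earlier output."""
    for src, target in mapping.items():
        text = text.replace(src, target)
    return text

def single_pass_replace(text, mapping):
    """Replace every key in one regex pass over the original text."""
    # Longest keys first, so a key that is a prefix of another cannot shadow it.
    keys = sorted(mapping, key=len, reverse=True)
    regex = re.compile('|'.join(re.escape(k) for k in keys))
    return regex.sub(lambda m: mapping[m.group(0)], text)

print(chained_replace('bob and robert', repldict))      # foo and foo
print(single_pass_replace('bob and robert', repldict))  # robert and foo
```

With chained calls, the `robert` produced by the first replacement is replaced again by the second; the single pass only ever looks at the original text.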

sajjadG
mgilson
  • If OP wants to do absolute string replacements, `re` might be overkill… or am I missing something? – inspectorG4dget Oct 26 '12 at 15:01
  • 3
    @inspectorG4dget -- If there are overlapping matches, it's necessary. (`line.replace('bob','robert').replace('robert','foo')`) changes `bob` to `foo`, which might not be desirable, but you avoid that with `re`. Also, since it's all done in one go, it might be more efficient (unlikely to matter for small files, but important for big ones). – mgilson Oct 26 '12 at 15:03
5

The essential way is

  • read(),
  • data = data.replace() as often as you need and then
  • write().

Whether you read and write the whole data at once or in smaller parts is up to you; base the choice on the expected file size.

`read()` can be replaced by iterating over the file object line by line.
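
The three steps above can be sketched as follows for the whole-file case. This is only an illustration: the function name `replace_words` and its parameters are assumptions, not something from the answer:

```python
def replace_words(in_path, out_path, replacements):
    """Read a file, apply every replacement, and write the result."""
    # Step 1: read() the whole input at once.
    with open(in_path) as infile:
        data = infile.read()
    # Step 2: data = data.replace(...), as often as needed.
    for src, target in replacements.items():
        data = data.replace(src, target)
    # Step 3: write() the result.
    with open(out_path, 'w') as outfile:
        outfile.write(data)
```

For files too large to hold in memory, the same replacement loop can be applied per line while iterating over `infile` and writing each line as it is processed.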

glglgl
3

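A faster way of writing it would be:

replacements = {'zero':'0', 'temp':'bob', 'garbage':'nothing'}

# Read the whole file into memory before reopening it for writing
with open('path/to/input/file') as infile:
    finput = infile.read()
for src, target in replacements.items():
    finput = finput.replace(src, target)
# Opening with 'w' truncates the file, so do it only once the new text is ready
with open('path/to/input/file', 'w') as out:
    out.write(finput)

This eliminates a lot of the per-line iteration that the other answers use, which can speed up the process for longer files, provided the whole file fits in memory.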

ouroboros1
Matt Olan
  • 1
    But it reads the whole file (and essentially duplicates it for each replacement) -- which is a big downside for large files. – mgilson Oct 26 '12 at 15:10
0

To read from standard input, write 'code.py' as follows:

import sys

rep = {'zero':'0', 'temp':'bob', 'garbage':'nothing'}

for line in sys.stdin:
    for k, v in rep.items():
        line = line.replace(k, v)
    # line already ends with a newline, so write it as-is
    sys.stdout.write(line)

Then execute the script with redirection or piping (http://en.wikipedia.org/wiki/Redirection_(computing)):

python code.py < infile > outfile
satomacoto
-1

This is a short and simple example I just used:

If:

fp = open("file.txt", "w")

Then:

fp.write(line.replace('is', 'now'))
# "This is me" is written as "Thnow now me" (the 'is' inside "This" is replaced too)

Not:

line.replace('is', 'now')
fp.write(line)
# "This is me" is written unchanged: replace() returns a new string and the result is discarded
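
The difference comes from strings being immutable in Python: `str.replace` returns a new string and leaves the original alone. A small sketch, using a made-up sample line:

```python
line = "zero degrees"

# str.replace returns a new string; `line` itself is not modified.
line.replace('zero', '0')
print(line)     # zero degrees

# Keep the result by assigning it (or by passing it straight to write()).
changed = line.replace('zero', '0')
print(changed)  # 0 degrees
```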
AmazingDayToday