15

How can you replace a string match inside a file with the given replacement, recursively, inside a given directory and its subdirectories?

Pseudo-code:

import os
import re
for root, dirs, files in os.walk("/home/noa/Desktop/codes"):
    for name in files:
        re.search("dbname=noa user=noa", "dbname=masi user=masi")
        # I am trying to replace a given match in a file here
smci
Léo Léopold Hertz 준영

5 Answers

26

Put all this code into a file called mass_replace. Under Linux or Mac OS X, you can do chmod +x mass_replace and then just run this. Under Windows, you can run it with python mass_replace followed by the appropriate arguments.

#!/usr/bin/python

import os
import re
import sys

# list of extensions to replace
DEFAULT_REPLACE_EXTENSIONS = None
# example: uncomment next line to only replace *.c, *.h, and/or *.txt
# DEFAULT_REPLACE_EXTENSIONS = (".c", ".h", ".txt")

def try_to_replace(fname, replace_extensions=DEFAULT_REPLACE_EXTENSIONS):
    if replace_extensions:
        return fname.lower().endswith(replace_extensions)
    return True


def file_replace(fname, pat, s_after):
    # first, see if the pattern is even in the file.
    with open(fname) as f:
        if not any(re.search(pat, line) for line in f):
            return # pattern does not occur in file so we are done.

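    # pattern is in the file, so perform replace operation.
    out_fname = fname + ".tmp"
    with open(fname) as f, open(out_fname, "w") as out:
        for line in f:
            out.write(re.sub(pat, s_after, line))
    # Rename only after both files are closed. os.replace() (Python 3.3+)
    # overwrites an existing destination, which os.rename() refuses on Windows.
    os.replace(out_fname, fname)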


def mass_replace(dir_name, s_before, s_after, replace_extensions=DEFAULT_REPLACE_EXTENSIONS):
    pat = re.compile(s_before)
    for dirpath, dirnames, filenames in os.walk(dir_name):
        for fname in filenames:
            if try_to_replace(fname, replace_extensions):
                fullname = os.path.join(dirpath, fname)
                file_replace(fullname, pat, s_after)

if len(sys.argv) != 4:
    u = "Usage: mass_replace <dir_name> <string_before> <string_after>\n"
    sys.stderr.write(u)
    sys.exit(1)

mass_replace(sys.argv[1], sys.argv[2], sys.argv[3])

EDIT: I have changed the above code from the original answer. There are several changes. First, mass_replace() now calls re.compile() to pre-compile the search pattern; second, to check what extension the file has, we now pass in a tuple of file extensions to .endswith() rather than calling .endswith() three times; third, it now uses the with statement available in recent versions of Python; and finally, file_replace() now checks to see if the pattern is found within the file, and doesn't rewrite the file if the pattern is not found. (The old version would rewrite every file, changing the timestamps even if the output file was identical to the input file; this was inelegant.)
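
As a small illustration of the tuple form of .endswith() mentioned above, here is a minimal sketch that mirrors try_to_replace(). The helper name and the file names are made up for the example.

```python
# Hypothetical extension tuple, used only for illustration.
REPLACE_EXTENSIONS = (".c", ".h", ".txt")

def has_wanted_extension(fname, extensions=REPLACE_EXTENSIONS):
    # str.endswith() accepts a tuple and returns True if any suffix matches.
    return fname.lower().endswith(extensions)

print(has_wanted_extension("README.TXT"))  # True
print(has_wanted_extension("main.py"))     # False
```

Lower-casing the file name first is what makes the check case-insensitive, so the tuple itself should contain lower-case extensions.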

EDIT: I changed this to default to replacing every file, but with one line you can edit to limit it to particular extensions. I think replacing every file is a more useful out-of-the-box default. This could be extended with a list of extensions or filenames not to touch, options to make it case insensitive, etc.

EDIT: In a comment, @asciimo pointed out a bug. I edited this to fix the bug. str.endswith() is documented to accept a tuple of strings to try, but not a list. Fixed. Also, I made a couple of the functions accept an optional argument to let you pass in a tuple of extensions; it should be pretty easy to modify this to accept a command-line argument to specify which extensions.
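
One possible way to add the command-line option for extensions is sketched below with argparse. The parse_args() helper and the --ext option name are assumptions made for this example, not part of the script above.

```python
import argparse

def parse_args(argv=None):
    # Hypothetical command-line interface; the option name --ext is an assumption.
    parser = argparse.ArgumentParser(prog="mass_replace")
    parser.add_argument("dir_name")
    parser.add_argument("string_before")
    parser.add_argument("string_after")
    parser.add_argument("--ext", action="append", default=None,
                        help="extension to process, e.g. --ext .c (repeatable)")
    args = parser.parse_args(argv)
    # str.endswith() needs a tuple (or None here), so convert the collected list.
    args.ext = tuple(e.lower() for e in args.ext) if args.ext else None
    return args

args = parse_args(["src", "dbname=noa", "dbname=masi", "--ext", ".c", "--ext", ".H"])
print(args.ext)  # ('.c', '.h')
```

The resulting args.ext could then be passed as the replace_extensions argument of mass_replace(); with no --ext given it stays None, which keeps the replace-everything default.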

steveha
  • I like how your code separates the logical parts into functions. – Léo Léopold Hertz 준영 Oct 20 '09 at 23:14
  • In `file_replace` I had to change `os.rename` to `shutil.move` for it to work for me in Windows. – paul Jul 18 '12 at 18:26
  • On my system (python 2.7.5), I got `TypeError: endswith first arg must be str, unicode, or tuple, not list`. Changing the list to a tuple worked e.g. `[".fudge", "pancake"]` -> `(".fudge", "pancake")`. – asciimo Oct 29 '13 at 00:14
  • 1
    @asciimo, thank you for pointing that out. Usually I'm good about testing code before posting it here, but I guess I was sloppy when I wrote that! Fixed now. – steveha Oct 29 '13 at 01:14
  • 2
    Two fixes for Windows: 1) Unindent `os.rename(out_fname, fname)` so that it is outside the `with` scope. 2) Precede this line with `os.remove(fname)` so that `rename()` succeeds. – dolphin Nov 14 '14 at 12:53
9

Do you really need regular expressions?

import os

def recursive_replace(root, pattern, replace):
    for dirpath, subdirs, names in os.walk(root):
        for name in names:
            path = os.path.join(dirpath, name)
            with open(path) as f:
                text = f.read()
            # rewrite the file only if the literal string occurs in it
            if pattern in text:
                with open(path, 'w') as f:
                    f.write(text.replace(pattern, replace))
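
To show where the choice between a plain string replace and a regular expression matters, here is a small sketch with made-up sample text. The string from the question contains no regex metacharacters, so there both approaches should give the same result.

```python
import re

text = "price: 1.50 (was 1x50)"  # made-up sample text

# As a regular expression, "." matches any character, so "1x50" also matches.
as_regex = re.sub("1.50", "2.00", text)

# str.replace() treats the search string literally.
as_literal = text.replace("1.50", "2.00")

# re.escape() makes the regular expression match the literal string only.
as_escaped = re.sub(re.escape("1.50"), "2.00", text)

print(as_regex)    # price: 2.00 (was 2.00)
print(as_literal)  # price: 2.00 (was 1x50)
print(as_escaped)  # price: 2.00 (was 1x50)
```
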
kurosch
4

Of course, if you just want to get it done without coding it up, use find and xargs:

find /home/noa/Desktop/codes -type f -print0 | \
xargs -0 sed --in-place "s/dbname=noa user=noa/dbname=masi user=masi/g"

(And you could likely do this with find's -exec or something as well, but I prefer xargs.)

retracile
  • 3
    The find and sed solution is fine when you have a simple task, such as "replace the string in every *.txt file". Once you have a more complicated set of files to match, and if you have multiple replacements to do, the Python solution really wins. – steveha Oct 21 '09 at 02:52
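
As a rough illustration of the multiple-replacements case mentioned in the comment above, the sketch below applies several literal replacements in one pass over a string. The replace_many() helper and the replacement table are hypothetical.

```python
import re

# Hypothetical replacement table; keys are literal strings, not regexes.
REPLACEMENTS = {
    "dbname=noa": "dbname=masi",
    "user=noa": "user=masi",
}

def replace_many(text, replacements=REPLACEMENTS):
    # Build one alternation, longest keys first so longer matches win.
    keys = sorted(replacements, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in keys))
    # The callback looks up the replacement for whichever key matched.
    return pattern.sub(lambda m: replacements[m.group(0)], text)

print(replace_many("dbname=noa user=noa"))  # dbname=masi user=masi
```

Because the text is scanned once, a replacement is never fed back into a later replacement, which can happen when several .replace() calls are chained.
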
3

This is how I would find and replace strings in files using Python. This simple function recursively searches a directory for a string and replaces it with another string. You can also limit the search to files with a certain extension, as in the example below.

import os, fnmatch
def findReplace(directory, find, replace, filePattern):
    for path, dirs, files in os.walk(os.path.abspath(directory)):
        for filename in fnmatch.filter(files, filePattern):
            filepath = os.path.join(path, filename)
            with open(filepath) as f:
                s = f.read()
            s = s.replace(find, replace)
            with open(filepath, "w") as f:
                f.write(s)

This allows you to do something like:

findReplace("some_dir", "find this", "replace with this", "*.txt")
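
For comparison, here is a sketch of a variant built on pathlib that rewrites only the files that actually contain the search string. It is not the code above: the name find_replace() and its return value are assumptions for this example, and it assumes Python 3.5+ and text files readable in the default encoding.

```python
from pathlib import Path

def find_replace(directory, find, replace, file_pattern="*"):
    """Replace text in matching files; return the paths that were changed."""
    changed = []
    # rglob() walks the directory tree and applies the glob at every level.
    for path in Path(directory).rglob(file_pattern):
        if not path.is_file():
            continue
        text = path.read_text()
        if find in text:
            # Rewrite only files that actually contain the search string.
            path.write_text(text.replace(find, replace))
            changed.append(path)
    return changed
```

Skipping unchanged files leaves their timestamps alone, and the returned list makes it easy to report what was modified.
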
David Sulpy
2

This should work:

import re, os
import fnmatch

directory = "/home/noa/Desktop/codes"  # directory from the question
filePattern = "*"                      # assumption: match every file
for path, dirs, files in os.walk(os.path.abspath(directory)):
    for filename in fnmatch.filter(files, filePattern):
        filepath = os.path.join(path, filename)
        # read all lines first, then rewrite the same file in place
        with open(filepath, 'r') as readf:
            lines = readf.readlines()
        with open(filepath, 'w') as out:
            for line in lines:
                line = re.sub(r"dbname=noa user=noa", "dbname=masi user=masi", line)
                out.write(line)
Léo Léopold Hertz 준영
Nepomuzem
  • I forgot the `import fnmatch`. – Nepomuzem Feb 15 '16 at 15:57
  • 1
    Welcome to Stack Overflow! You can edit your answer by clicking `edit` at the bottom of the body of your answer. Please see if the edit is correct. I like your `with open("namelist.wps", 'r') as readf`, which is much clearer than breaking it into two lines. The last for-loop is also clear. Very good additions! – Léo Léopold Hertz 준영 Feb 15 '16 at 16:54