3

I am fairly new to this, and I need a shell script that loops through all ".xml" files in a folder and does some text replacements. So far I have come up with this:

sed "s/old_text/new_text/g" testfile.xml -i

However, I want this to run on all XML files in the current folder, not just on "testfile.xml". Furthermore, how can I make a backup of each original file?

Any input is more than welcome! Thanks a lot!

horace_vr

3 Answers

9

To run sed on all the XML files, just specify a wildcard:

sed "s/old_text/new_text/g" *.xml -i

To create a backup, just specify the extension after -i:

sed "s/old_text/new_text/g" *.xml -i~
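If an explicit loop is preferred, the same edit could be sketched roughly as below. This is only a sketch: `replace_in_dir` is a made-up helper name, and it assumes a sed that accepts a suffix attached directly to `-i` (GNU sed does). The `[ -f "$f" ]` test skips the literal `*.xml` that the shell leaves behind when nothing matches.

```shell
# replace_in_dir [DIR]: hypothetical helper, not part of the original answer.
# Edits every .xml file directly inside DIR (default: current directory)
# and keeps the original of each file as FILE.xml~.
replace_in_dir() {
    dir=${1:-.}
    for f in "$dir"/*.xml; do
        # When no file matches, the pattern stays unexpanded; skip it.
        [ -f "$f" ] || continue
        sed -i~ 's/old_text/new_text/g' "$f"
    done
}
```

Calling `replace_in_dir` with no argument works on the current folder; files with other extensions are left alone.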

Note that it's usually better to use XML-aware tools to handle XML.

choroba
  • ...and just pray that neither `old_text` nor `new_text` contains any of `$, /, \1, &, ?, *, (, ), [, ], \+, .`, etc. Just be aware that sed does NOT operate on strings; it operates on regexps with a restricted character set. See http://stackoverflow.com/questions/29613304/is-it-possible-to-escape-regex-metacharacters-reliably-with-sed/29626460#29626460 and consider using a tool that does operate on strings, e.g. awk. You can reduce the risk of breakage slightly by using single quotes instead of double quotes around the command. – Ed Morton May 19 '15 at 12:57
  • @EdMorton: True, but awk doesn't parse XML, either :) – choroba May 19 '15 at 12:58
  • It does if you use the XML library, see https://www.gnu.org/software/gawk/manual/html_node/gawkextlib.html, but mainly I just wanted to give the self-declared newbie a heads up that he will not be using strings with sed. – Ed Morton May 19 '15 at 12:59
  • @EdMorton: Interesting, I didn't know they exist. Could you provide links? – choroba May 19 '15 at 13:01
  • just updated my previous comment to include a link. See also http://sourceforge.net/projects/gawkextlib/ – Ed Morton May 19 '15 at 13:03
  • I have about 150 substitutions that I need to make, and yes, they involve ampersands, hash signs, semicolons and some others. I tried via a batch file, but I could not find an easy, workable solution. I am using sed because this is what I found by googling around, but if you have a safer (easy) idea, I would be delighted to research it. Just point me in a direction :) – horace_vr May 19 '15 at 16:17
  • See my first comment for direction. Without sample input/output there's not much more to be said and you've already selected an answer so I doubt if anyone's looking at this any more anyway. It was dumb luck that I happened across this again just now while I was looking for something else. – Ed Morton May 19 '15 at 18:50
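For the literal-string route mentioned in the comments, one possible awk sketch uses index() and substr(), so nothing in the search or replacement text is treated as a regexp. The helper name `replace_literal` and the `.bak` backup suffix are assumptions made for illustration; they do not come from the thread.

```shell
# replace_literal OLD NEW FILE...: hypothetical helper.
# Replaces every occurrence of the literal string OLD with NEW in each FILE,
# keeping the original as FILE.bak. No character in OLD or NEW is special.
replace_literal() {
    old=$1 new=$2
    shift 2
    for f in "$@"; do
        [ -f "$f" ] || continue
        cp -- "$f" "$f.bak" || return 1
        # The strings travel through the environment, so awk does not
        # interpret backslashes in them (as it would with -v assignments).
        OLD=$old NEW=$new awk '
            BEGIN { old = ENVIRON["OLD"]; new = ENVIRON["NEW"]; len = length(old) }
            {
                out = ""
                # Copy text up to each match, append NEW, continue after the match.
                while (len > 0 && (i = index($0, old)) > 0) {
                    out = out substr($0, 1, i - 1) new
                    $0 = substr($0, i + len)
                }
                print out $0
            }
        ' "$f.bak" > "$f" || return 1
    done
}
```

Something like `replace_literal 'R&D (a.b)' 'A/B & C' *.xml` would then treat `&`, `.`, `(` and `/` as ordinary characters.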
3

For all .xml files that lie in the current directory:

sed -i.bak 's/old_text/new_text/g' *.xml

To recurse into subdirectories, combine with find:

find . -name '*.xml' -exec sed -i.bak 's/old_text/new_text/g' '{}' \;

The backup files will end in .xml.bak this way (the parameter to -i is appended to the original file name).
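If a replacement turns out to be wrong, the `.xml.bak` copies can be moved back over the edited files. A rough sketch, assuming the `.bak` suffix used above; `restore_backups` is a hypothetical name chosen here:

```shell
# restore_backups [DIR]: hypothetical helper.
# Moves every *.xml.bak found under DIR (default: current directory)
# back over the file it was copied from, discarding the edited version.
restore_backups() {
    find "${1:-.}" -type f -name '*.xml.bak' -exec sh -c '
        for bak do
            # "${bak%.bak}" strips the trailing .bak to get the original name.
            mv -- "$bak" "${bak%.bak}"
        done
    ' sh {} +
}
```

Running `restore_backups .` after the find command above should undo the substitution in every subdirectory.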

Wintermute
2

A practical shell script, for when you intend to sanitize a bunch of files with a number of measures, which gets a little impractical on a single line:

# only take files from certain subfolders and with certain extensions

# be careful not to tamper with .git or .svn folders
# - thus excluding all hidden folders as an extra precaution
# - also tampering with node_modules is a bad idea

# -print0 together with "read -d ''" keeps file names containing spaces intact
find . -type f -regextype posix-extended \
    -regex "^\./(public|source)/.*\.(scss|js)$" \
    -not -regex ".*\/(\.|node_modules).*" -print0 |
while IFS= read -r -d '' f
do
    echo "Processing $f file..."

    # all files: prune trailing spaces on each line.
    sed -i 's/ *$//' "$f"

    if [[ $f =~ \.js$ ]]; then
        echo "javascript file!"
        # DO stuff
    fi

    if [[ $f =~ \.scss$ ]]; then
        echo "scss file!"
        # \b whole word matching – stackoverflow.com/a/1032039/444255
        sed -i 's/\#000\b/black/g' "$f"
        sed -i 's/\#000000\b/black/g' "$f"
        sed -i 's/\#fff\b/white/g' "$f"
        sed -i 's/\#ffffff\b/white/g' "$f"
    fi
done

Caveat: with great power comes great responsibility, and mass replacement means great power...
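One way to act on that caveat is to look at what a sed expression would change before running it with -i. The following is a minimal sketch, not part of the script above; `preview_sed` is a hypothetical helper name, and it assumes a diff that accepts `-` for standard input (GNU and BSD diff both do).

```shell
# preview_sed SCRIPT FILE...: hypothetical helper.
# Prints a unified diff of what "sed SCRIPT" would change in each FILE.
# Nothing is written back; the files stay untouched.
preview_sed() {
    script=$1
    shift
    for f in "$@"; do
        [ -f "$f" ] || continue
        # diff exits with status 1 when the inputs differ, which is expected here.
        sed "$script" "$f" | diff -u "$f" - || true
    done
}
```

For example, `preview_sed 's/ *$//' public/main.scss` (the path is only an illustration) would list the lines that carry trailing spaces.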

Frank N