
I've found a way to replace words in a file inputfile.txt with matching words from substitutes.txt in bash with sed.

For example:
File substitutes.txt contains word pairs to be replaced:
Good=ok
sat=cat

I use the following code:

sed -e 's/^/s%/' -e 's/=/%/' -e 's/$/%g/' substitutes.txt |
sed -f - inputfile.txt >outputfile.txt

This replacement is a bit too aggressive: it turns Goodyear into okyear and saturday into caturday, but it should leave those words alone.

Here's the question:
How can word boundaries (\b) be added to this replacement, so that only whole words (and not parts of words) are replaced?

Inian
kabauter

1 Answer


If your search-and-replace list only contains words made up of letters, just enclosing the LHS in \b word boundaries should work (note that \b is a GNU sed extension):

sed -e 's/^/s%\\b/' -e 's/=/\\b%/' -e 's/$/%g/' substitutes.txt
#             ^^^           ^^^

The generated list of sed commands will then look like this:

root@ip-172-30-0-77:/home/ubuntu# sed -e 's/^/s%\\b/' -e 's/=/\\b%/' -e 's/$/%g/' substitutes.txt
s%\bGood\b%ok%g
s%\bsat\b%cat%g
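
To check the whole pipeline end to end, here is a minimal sketch using the sample data from the question. It assumes GNU sed (both \b and reading the script from stdin with -f - are not guaranteed elsewhere); the function name replace_words and the temporary directory are made up for illustration only.

```shell
# Assumption: GNU sed. replace_words is a hypothetical helper name.
replace_words() {
    # $1: file with word=replacement pairs, $2: text file to rewrite
    # First sed turns each pair into an s%\bword\b%replacement%g command,
    # second sed runs those commands against the input file.
    sed -e 's/^/s%\\b/' -e 's/=/\\b%/' -e 's/$/%g/' "$1" | sed -f - "$2"
}

# Sample data from the question, written to a throwaway directory.
demo_dir=$(mktemp -d)
printf 'Good=ok\nsat=cat\n' > "$demo_dir/substitutes.txt"
printf 'Good Goodyear sat saturday\n' > "$demo_dir/inputfile.txt"

result=$(replace_words "$demo_dir/substitutes.txt" "$demo_dir/inputfile.txt")
printf '%s\n' "$result"   # prints: ok Goodyear cat saturday
rm -r "$demo_dir"
```

Goodyear and saturday are left alone because the character following Good and sat is a letter, so \b does not match there.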

Note that you may need more preprocessing if the terms can contain special characters; see "Is it possible to escape regex metacharacters reliably with sed". In that case you will also have to reconsider = as the delimiter between the search and replacement terms (a multicharacter separator is a better bet).
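
As a rough sketch of what such preprocessing could look like (not taken from the linked question), the following assumes GNU sed, a pairs file that uses the made-up separator " => ", and a hypothetical helper named build_script. It escapes basic-regex metacharacters and the % delimiter on the search side, and \, & and % on the replacement side, before emitting the s commands.

```shell
# Assumptions: GNU sed; " => " as separator; build_script is a hypothetical name.
build_script() {
    # $1: pairs file, one "search => replacement" per line
    while IFS= read -r line; do
        lhs=${line%% => *}   # text before the first " => "
        rhs=${line#* => }    # text after the first " => "
        # Escape BRE metacharacters and the % delimiter in the search term.
        lhs=$(printf '%s' "$lhs" | sed 's/[][\.*^$%]/\\&/g')
        # Escape backslash, & and % in the replacement.
        rhs=$(printf '%s' "$rhs" | sed 's/[\&%]/\\&/g')
        printf 's%%\\b%s\\b%%%s%%g\n' "$lhs" "$rhs"
    done < "$1"
}

demo_dir=$(mktemp -d)
printf 'v1.0 => v2\nand => &\n' > "$demo_dir/pairs.txt"
printf 'v1.0 v1x0 salt and pepper\n' > "$demo_dir/input.txt"

result=$(build_script "$demo_dir/pairs.txt" | sed -f - "$demo_dir/input.txt")
printf '%s\n' "$result"   # prints: v2 v1x0 salt & pepper
rm -r "$demo_dir"
```

Without the escaping, the dot in v1.0 would also match v1x0, and an unescaped & in the replacement would re-insert the matched text instead of a literal ampersand. Keep in mind that \b only behaves as expected when the search term starts and ends with a word character.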

Wiktor Stribiżew
  • In fact the replacements only contain letter words, so your example works pretty well. Thanks for the good hint about escaping! – kabauter Sep 03 '19 at 08:07
    @Kabauter Glad it does, I had to add that bit since dynamic regex creation is fraught with potential bottlenecks. – Wiktor Stribiżew Sep 03 '19 at 08:11