I am trying to replace multiple words in a txt file.

An example with two lines would be

phone number: 123 addr: xyz
phone no: 456 home address: abc 

Let's say that I'd like to replace "phone number" and "phone no" with "phonenum", and "addr" and "address" with "address1".

Currently I only know how to do this by running multiple sed commands, and I'm looking for a more efficient way.

Thank you!

Ryszard Czech
Franky

1 Answer

Use Perl with a hash whose keys are the words to be replaced and whose values are the desired replacements. The keys, joined with a pipe (|), serve as the pattern in the substitution operator s///, and the /g modifier enables multiple substitutions per line.

printf 'phone number: 123 addr: xyz\nphone no: 456 home address: abc\n' > in.txt

perl -lpe '
BEGIN {
    %re = (
        q{phone number}  => q{phonenum},
        q{phone no}      => q{phonenum},
        q{addr}          => q{address1},
        q{address}       => q{address1},
    );
    # For example "phone number|phone no|addr|address"; hash key order is not guaranteed.
    $re_str = join q{|}, keys %re;
}
s/\b($re_str)\b/$re{$1}/g;
' in.txt > out.txt

Output in file out.txt:

phonenum: 123 address1: xyz
phonenum: 456 home address1: abc

The Perl one-liner uses these command line flags:
-e : Tells Perl to look for code in-line, instead of in a file.
-p : Loop over the input one line at a time, assigning it to $_ by default. Add print $_ after each loop iteration.
-l : Strip the input line separator ("\n" on *NIX by default) before executing the code in-line, and append it when printing.

q{...} : an alternative way to write single-quoted strings ('...' is not used here because single quotes would have to be escaped inside a single-quoted Perl one-liner).

s/\b($re_str)\b/$re{$1}/g; : The parentheses around $re_str capture the matched text into $1, which is then used as the hash key to look up the replacement. \b matches a word boundary, that is, the position at the start or the end of a word. Without \b, the shorter key addr could match inside address, turning home address into home address1ess, depending on the order in which the keys appear in the pattern.

SEE ALSO:
perldoc perlrun: how to execute the Perl interpreter: command line switches
perldoc perlre: Perl regular expressions (regexes)
perldoc perlrequick: Perl regular expressions quick start
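Since the question mentions sed: the same replacements can also be done in a single sed invocation by grouping the alternatives and separating the two substitutions with a semicolon. This is a sketch that assumes GNU sed, where -E enables extended regular expressions and \b is available as an extension; other sed implementations may not support \b. The function name replace_with_sed is made up for the example.

```shell
# replace_with_sed is a hypothetical helper; it reads text on standard input.
# Assumes GNU sed (for \b word boundaries).
replace_with_sed() {
    sed -E 's/\b(phone number|phone no)\b/phonenum/g; s/\b(addr|address)\b/address1/g'
}
```

Usage: replace_with_sed < in.txt > out.txt. Unlike the hash-based approach, each new replacement target needs its own s/// expression here, so this suits a short, fixed list of words.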

Timur Shtatland
  • It may convert `home address` into `home address1ess` depending on the order of evaluation. – tshiono Oct 19 '20 at 05:35
  • @tshiono Thank you for pointing out the bug in the regex that caused, for example, replacing `home address` with `home address1ess`, depending on the order of evaluation. Fixed by adding `\b`. – Timur Shtatland Oct 19 '20 at 13:18
  • Thank you for the update. Now it works fine! I've ++ed for your nice answer. – tshiono Oct 20 '20 at 01:35