I have a mapping file containing a list of regular expressions and literal replacement strings, one pair per line, in the following format:
OLD_REGEXP_1 NEW_STRING_1
OLD_REGEXP_2 NEW_STRING_2
...
I want to replace all of the strings that match OLD_REGEXP_X with NEW_STRING_X in multiple *.txt files.
I believe this is a common problem and that someone has probably solved it before, but I couldn't find an existing solution written in bash.
For example, given this mapping file:
Tom Thompson
Billy Bill&Ted
goog1e\.com google.com
https?://www\.google\.com https://google.com
Input:
Tom and Billy are visiting http://www.goog1e.com
Expected output:
Thompson and Bill&Ted are visiting https://google.com
The major challenges are:
- The strings to be replaced are described by POSIX Extended Regular Expressions, not literal strings, and any character that is not a POSIX ERE metacharacter, including /, which is often used as a regexp delimiter by some tools, must be treated as literal.
- The replacement strings are literal and can contain any literal character, including chars like & and \1 that are often used as backreference metacharacters in replacement strings but must be literal in this case.
- Replacements must occur in the order they appear in the mapping file, so if we have A->B and B->C in that order in the mapping file and A appears in the text file that is to be changed, then the output will contain "C" in place of "A", not "B".
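To make the three requirements concrete, here is a minimal sketch of the intended semantics in POSIX sh and awk. It is not a vetted solution. The names `apply_map`, `replace_all`, `pat` and `rep` are hypothetical, and it assumes that the first space on each mapping line separates the regexp from the replacement (so a regexp cannot itself contain a space). Replacements are built with `match()` and `substr()` rather than `gsub()`, so the replacement text is never scanned for & or \1. One known limitation of this approach: a regexp anchored with ^ can match again at the start of the text left over after an earlier replacement on the same line.

```shell
# apply_map MAPFILE [FILE...]
# Print FILEs (or stdin) with every mapping from MAPFILE applied in file order.
apply_map() {
    map=$1
    shift
    awk '
        # Replace every match of the ERE "re" in "str" with the literal "new".
        # The replacement is concatenated as-is, so & and \1 stay literal.
        function replace_all(str, re, new,    out) {
            out = ""
            while (str != "" && match(str, re)) {
                if (RLENGTH == 0) {
                    # Empty match: copy one character to guarantee progress.
                    out = out substr(str, 1, 1)
                    str = substr(str, 2)
                } else {
                    out = out substr(str, 1, RSTART - 1) new
                    str = substr(str, RSTART + RLENGTH)
                }
            }
            return out str
        }
        BEGIN {
            # Read the mapping file here so the rules exist before any input.
            mapfile = ARGV[1]
            ARGV[1] = ""    # an empty operand is skipped as an input file
            while ((getline line < mapfile) > 0) {
                sep = index(line, " ")    # assumption: first space is the separator
                if (sep > 1) {
                    n++
                    pat[n] = substr(line, 1, sep - 1)
                    rep[n] = substr(line, sep + 1)
                }
            }
            close(mapfile)
        }
        {
            # Apply the rules in mapping-file order, so A->B then B->C yields C.
            for (i = 1; i <= n; i++)
                $0 = replace_all($0, pat[i], rep[i])
            print
        }
    ' "$map" "$@"
}
```

With the mapping above saved as, say, map.txt, running `apply_map map.txt file.txt` prints the converted text to standard output. Updating each *.txt file in place would need an extra step, such as writing to a temporary file and moving it over the original.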