I have a file, index.html, containing data like this:
<li><a href="/battered-fried-chicken-breast-no-skin.html">battered fried chicken breast, no skin</a></li>
<li><a href="/bbq-short-ribs-with-sauce.html">bbq short ribs with sauce</a></li>
<li><a href="/bbq-spareribs-&-sauce-eat-lean-&-fat.html">bbq spareribs & sauce (eat lean & fat)</a></li>
<li><a href="/bbq-spareribs-&-sauce-eat-lean-only.html">bbq spareribs & sauce (eat lean only)</a></li>
I need to strip the & symbols from the URLs, such that "/bbq-spareribs-&-sauce-eat-lean-&-fat.html" becomes "/bbq-spareribs--sauce-eat-lean--fat.html". However, I do not want to remove the & symbol from parts of the file that are not URLs, such as the link text, bbq spareribs & sauce (eat lean & fat).
How would I accomplish this on a standard Linux install? It doesn't matter to me what specific tool/language is used to achieve the result so long as it works.
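
To make the desired transformation concrete, here is a minimal sketch in Python, assuming Python 3 is available on the machine and that every URL sits inside a double-quoted href attribute as in the sample above. The function names strip_amp_in_hrefs and rewrite_file are made up for illustration, and a regular expression is only a rough stand-in for a real HTML parser.

```python
import re

# Matches a double-quoted href attribute and captures its value separately.
# Assumption: every URL appears as href="..." with double quotes, as in the sample data.
HREF_PATTERN = re.compile(r'(href=")([^"]*)(")')


def strip_amp_in_hrefs(html: str) -> str:
    """Remove '&' characters inside href values only; all other text is untouched."""
    def clean(match: re.Match) -> str:
        prefix, url, suffix = match.groups()
        return prefix + url.replace("&", "") + suffix

    return HREF_PATTERN.sub(clean, html)


def rewrite_file(path: str) -> None:
    """Hypothetical helper: rewrite the file at `path` in place."""
    with open(path, encoding="utf-8") as f:
        content = f.read()
    with open(path, "w", encoding="utf-8") as f:
        f.write(strip_amp_in_hrefs(content))


sample = ('<li><a href="/bbq-spareribs-&-sauce-eat-lean-&-fat.html">'
          'bbq spareribs & sauce (eat lean & fat)</a></li>')
print(strip_amp_in_hrefs(sample))
# prints: <li><a href="/bbq-spareribs--sauce-eat-lean--fat.html">bbq spareribs & sauce (eat lean & fat)</a></li>
```

Calling rewrite_file("index.html") would apply the same change to the whole file. It overwrites the file, so working on a copy first is the safer choice.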