I am attempting to replace:
<td id="logo_divider"><a href="http://www.the-site.com"><img src=
"/ART/logo.140.gif" width="140" height="84" alt="logo" border=
"0" id="logo" name="logo" /></a></td>
with:
<td id="logo_divider"><span itemscope itemtype="http://schema.org/Organization"><a itemprop="url" href="http://www.the-site.com"><img itemprop="logo" src=
"/ART/logo.140.gif" width="140" height="84" alt="logo" border=
"0" id="logo" name="logo" /></a></span></td>
The sed command I've written:
sed -E s#\(\<td id=\"logo_divider\"\>\)\(\<a \)\(href=\"http://www\.the-site\.com\"\>\<img \)\(src=\n\"/ART/logo\.140\.gif\".*?\n.*?\>\)#\1\<span itemscope itemtype=\"http://schema\.org/Organization\"\>\2itemprop=\"url\"\3itemprop=\"logo\"\4\</span\>\5#g default.ctp
There are two problems. The first is that the command fails with:
sed: 1: "s#(<td": unterminated substitute pattern
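The script that sed reports, `s#(<td`, stops at the first unescaped space, which suggests the shell is splitting the unquoted argument into several words before sed ever sees it. A minimal check of that reading, assuming a POSIX-style shell; `count_args` is a hypothetical helper introduced only for this illustration, and the script is shortened to its first group:

```shell
# count_args is a hypothetical helper: it prints how many arguments it received.
count_args() { echo "$#"; }

# Unquoted: the space after "td" is not escaped, so the shell passes two words.
count_args s#\(\<td id=\"logo_divider\"\>\)#X#     # prints 2

# Single-quoted: the whole script reaches the program as one argument.
count_args 's#(<td id="logo_divider">)#X#'         # prints 1
```

If that is the cause, wrapping the whole sed script in single quotes should also remove the need for most of the backslashes.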
The second is that, even if it were to succeed, matching needs to be robust to line breaks. A more robust solution would first remove any line breaks between:
<td id="logo_divider">
and:
</td>
Then execute the replacement against the cleaned file. Something like:
sed -E s#\n##g | ...
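One way the join-then-replace idea might look in a single sed invocation is sketched below. It is not a definitive solution. It assumes that the opening `<td id="logo_divider">` and its closing `</td>` are the only cell of that kind in the file, that continuation lines carry no leading whitespace (as in the snippet above), and that simply deleting the newlines is acceptable. The function name `add_logo_microdata` is hypothetical.

```shell
# add_logo_microdata is a hypothetical wrapper; $1 is the file to read.
# The result is written to standard output, the file itself is not modified.
add_logo_microdata() {
  sed -E '
  # Only act on the line that opens the logo cell.
  /<td id="logo_divider">/ {
    :join
    # Until the closing </td> is in the pattern space, append the next line
    # (the $! guard avoids calling N on the last line of the file).
    /<\/td>/! {
      $! {
        N
        b join
      }
    }
    # Remove the embedded newlines so the cell becomes a single line.
    s/\n//g
    # Wrap the link in the Organization span and add the two itemprop attributes.
    s#(<td id="logo_divider">)(<a )(href="http://www\.the-site\.com"><img )(src="/ART/logo\.140\.gif"[^>]*></a>)(</td>)#\1<span itemscope itemtype="http://schema.org/Organization">\2itemprop="url" \3itemprop="logo" \4</span>\5#
  }
  ' "$1"
}
```

Called as `add_logo_microdata default.ctp > default.ctp.new`, this emits the rewritten cell on a single line, so the original line breaks inside the `<td>` are not preserved. It differs from my command above in three ways: POSIX extended regular expressions have no non-greedy `.*?`, so `[^>]*` stands in for it; `</td>` is captured as a fifth group so that `\5` refers to something; and a space follows each added `itemprop` attribute so it does not run into the next attribute.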