I’m having difficulty getting a sed search and replace to work on an HTML file.

I have multiple sections that look like this:

<TABLE class="cattable">   
  <TBODY>
    <TR>
      <TH colspan="2">Header</TH></TR>
    <TR>
      <TD>Random Amount of Data</TD>
      <TD>3</TD></TR>
    <TR>
      <TD>Moar Data</TD>
      <TD>3</TD></TR>
    <TR>
      <TD>Yup, More</TD>
      <TD>4</TD></TR></TBODY></TABLE>

I need to:

replace with xxxxFOOxxxx:

<TABLE class="cattable">
  <TBODY>
    <TR>
     <TH colspan="2">

keep this:

Header

Replace this with yyyyFOOyyyy:

</TH></TR>

Keep this:

<TR>
   <TD>Random Amount of Data</TD>
   <TD>3</TD></TR>
<TR>
   <TD>Moar Data</TD>
   <TD>3</TD></TR>
<TR>
   <TD>Yup, More</TD>
   <TD>4</TD></TR>

replace this with zzzzFOOzzzz:

</TBODY></TABLE>

Here’s what I’ve tried in Vim, but I can’t limit the greedy .* properly:

s:\(<TABLE class="cattable">\_s\s*<TBODY>\_s\s*<TR>\_s\s*<TH colspan="2">\)\(.*\)\(<\/TH><\/TR>\)\(\_.*[^<]*\)\(<\/TABLE>\):xxxxFOOxxxx\2yyyyFOOyyyy\4zzzzFOOzzz<br>:g

tia

Cabal
  • 1
    Regex is not a good choice of tools for this task. Also, vim does not use the standard lazy/greedy regex syntax. – jahroy Mar 20 '13 at 17:43
  • 2
    use one of the many good perl packages to deal with xml. Otherwise, if you insist on using a regexp, see http://stackoverflow.com/a/1732454/1841533 ^^ – Olivier Dulac Mar 20 '13 at 19:30

1 Answer

Replace * with \{-} to get a non-greedy match (the same as *? in PCRE/Perl regexes), so .* becomes .\{-} and \_.* becomes \_.\{-}. For more complicated cases you will have to use a negative look-ahead or look-behind, for example \(\(<\/TH><\/TR>\)\@!.\)* in place of .*, but here .\{-} is probably enough.

Note: this won’t work in sed, whose BRE and ERE dialects have no non-greedy quantifier.

Note 2: Vim does not use BRE or ERE (basic/extended regular expressions) as sed does, and :s in Vim does not invoke any external program (including sed). Your attempt is not suitable for sed either. So if you did not mean to ask how to do this in sed, remove sed from the tags.
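
For comparison, here is a minimal sketch of the same non-greedy idea outside Vim, using Python's standard re module, where *? plays the role of \{-} and re.DOTALL lets . cross newlines much like \_. does. The sample input is modeled on the question, and the names SAMPLE, PATTERN and rewrite_tables are made up for illustration. It assumes the markup is as regular as shown; for real-world HTML a parser is the safer choice.

```python
import re

# Two tables modeled on the question, so we can see that each one is
# matched separately instead of being swallowed by a single greedy match.
SAMPLE = '''<TABLE class="cattable">
  <TBODY>
    <TR>
      <TH colspan="2">Header</TH></TR>
    <TR>
      <TD>Moar Data</TD>
      <TD>3</TD></TR></TBODY></TABLE>
<TABLE class="cattable">
  <TBODY>
    <TR>
      <TH colspan="2">Second</TH></TR>
    <TR>
      <TD>Yup, More</TD>
      <TD>4</TD></TR></TBODY></TABLE>
'''

# (.*?) is lazy: it stops at the first place the rest of the pattern can match.
PATTERN = re.compile(
    r'<TABLE class="cattable">\s*<TBODY>\s*<TR>\s*<TH colspan="2">'
    r'(.*?)</TH></TR>'        # group 1: header text to keep
    r'(.*?)</TBODY></TABLE>',  # group 2: data rows to keep
    re.DOTALL,
)


def rewrite_tables(html: str) -> str:
    """Swap the table scaffolding for the placeholder strings from the question."""
    return PATTERN.sub(r'xxxxFOOxxxx\1yyyyFOOyyyy\2zzzzFOOzzzz', html)


result = rewrite_tables(SAMPLE)
print(result)
```

With greedy .* in place of .*? the first group would run on to the last </TH></TR> in the input, which is the problem described in the question.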

ZyX
  • Thanks, this helped. I wanted to use sed because I wasn’t having any luck in Vim. The Vim code was just for reference, to show what I was trying to do. I do see the wisdom in what jahroy and Olivier say, and have moved on to Python and BeautifulSoup. – Cabal Mar 20 '13 at 23:32