
I have an HTML file that uses the <table> tag multiple times. I want to use sed to print just the first <table> element to the console.

This is a snippet of the HTML that I am trying to parse. The full file has more than 10 <table> elements.

My HTML:

<table border="0" class="first">
  <tr class="a">
     <th>Tests</th>
     <th>Errors </th>
  </tr>
  <tr class="b">
     <td>32</td>
     <td>0</td>
  </tr>
</table>
<table border="0" class="second">
  <tr class="c">
     <th>Tests</th>
     <th>Errors </th>
  </tr>
  <tr class="d">
     <td>32</td>
     <td>0</td>
  </tr>
</table>

Here is the command I'm running:

sed -n 's:.*<table\(.*\)</table>.*:\1:p' surefire-report.html

I want to grab everything within the first <table> element, including the opening and closing tags. The output should be just this:

<table border="0" class="first">
  <tr class="a">
     <th>Tests</th>
     <th>Errors </th>
  </tr>
  <tr class="b">
     <td>32</td>
     <td>0</td>
  </tr>
</table>

1 Answer


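If I understand you correctly, this should work: find the line numbers of the first opening and the first closing table tag with grep, then print that line range with sed.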

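FILE=surefire-report.html

# Line numbers of the first opening and the first closing table tag
START=$(grep -n -m1 "<table" "$FILE" | cut -d ':' -f1)
END=$(grep -n -m1 "</table" "$FILE" | cut -d ':' -f1)

# Print only that line range
sed -n "${START},${END}p" "$FILE"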
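
As an alternative, here is a minimal sketch that does the same thing with sed alone, using an address range and quitting after the first closing tag. It assumes, as in your snippet, that each <table> and </table> tag sits on its own line and that tables are not nested. The function name first_table is only an illustrative helper, not something from your setup.

```shell
# Hypothetical helper: print the first <table>...</table> block of the file given as $1.
first_table() {
    # The first expression prints every line inside the range.
    # The second quits as soon as the first closing tag has been handled,
    # so later tables are never read.
    sed -n -e '/<table/,/<\/table>/p' -e '/<\/table>/q' "$1"
}
```

You would call it as first_table surefire-report.html. Because sed stops at the first </table> line, the rest of the file is not scanned, which may help a little on large reports.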