I have an HTML file that I process with a bash script, and I want to remove the empty tables from it. The file is generated from a SQL statement, and a table's header is still written even when the query returns no records. I want to remove the header in those cases, where no records are found.
<table border="1">
<caption>Table with data</caption>
<tr>
<th align="center">type</th>
<th align="center">column1</th>
<th align="center">column2</th>
<th align="center">column3</th>
<th align="center">column4</th>
</tr>
Data rows exist here
</table>
<table border="1">
<caption>Empty Table To Remove</caption>
<tr>
<th align="center">type</th>
<th align="center">column1</th>
<th align="center">column2</th>
<th align="center">column3</th>
<th align="center">column4</th>
<th align="center">column5</th>
<th align="center">column6</th>
<th align="center">column7</th>
</tr>
</table>
<table border="1">
<caption>Table with data</caption>
<tr>
<th align="center">type</th>
<th align="center">column1</th>
<th align="center">column2</th>
<th align="center">column3</th>
<th align="center">column4</th>
</tr>
Data rows exist here
</table>
I tried a combination of grep and sed to remove the empty table. This worked when every table had the same number of columns, but I am having issues now that the tables have different numbers of columns.
When the tables all had the same number of columns, I could loop through them by caption, do a count, and then remove the table. This no longer works because the number of columns varies.
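For reference, one way to avoid depending on the column count might be to buffer each table and keep it only if it contains at least one data cell. The following is a minimal sketch, not a tested solution for the real file. It assumes that real data rows contain `<td>` cells (the sample above only shows a placeholder), that tables are not nested, and that each table begins on a line containing `<table` and ends on a line containing `</table>`. The function name `remove_empty_tables` is hypothetical.

```shell
#!/usr/bin/env bash

# Hypothetical helper: copy HTML from the given file(s) or stdin to stdout,
# dropping every table that contains no <td> cell.
# Assumptions: data rows use <td>, tables are not nested.
remove_empty_tables() {
  awk '
    # A line containing <table starts a new buffered table.
    /<table/ { in_table = 1; buf = ""; has_data = 0 }

    in_table {
      buf = buf $0 "\n"                 # collect the table line by line
      if ($0 ~ /<td/) has_data = 1      # header cells (<th>) do not match
      if ($0 ~ /<\/table>/) {
        if (has_data) printf "%s", buf  # emit only tables that have data
        in_table = 0
      }
      next
    }

    # Anything outside a table is passed through unchanged.
    { print }
  ' "$@"
}
```

It could be called as `remove_empty_tables report.html > report.clean.html`, where both file names are placeholders. Because the decision is based on the presence of `<td>` and not on a line count, the number of `<th>` columns in each table should not matter.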