15

I am trying to use sed to replace every other occurrence of an HTML element in a file so that I can make rows with alternating colors.

Here is what I tried; it doesn't work:

sed 's/<tr valign=top>/<tr valign=top bgcolor='#E0E0E0'>/2' untitled.html
tim

4 Answers

11

I'd solve it with awk:

awk '/<tr valign=top>/&&v++%2{sub(/<tr valign=top>/, "<tr valign=top bgcolor='\''#E0E0E0'\''>")}{print}' untitled.html

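First, it checks whether the line contains <tr valign=top>

/<tr valign=top>/&&v++%2

and whether this is an even-numbered match (the 2nd, 4th, and so on). v counts the matching lines starting from 0, and it is incremented only when the line matches: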

v++%2

If so, it replaces the <tr valign=top> in the line

{sub(/<tr valign=top>/, "<tr valign=top bgcolor='#E0E0E0'>")}

Since every line has to be printed, a final block with no condition runs for all lines and prints the current line:

{print}
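
For readers who find the counter easier to follow in Python, here is a rough translation of the same logic. It is only an illustrative sketch, not part of the awk answer: the function name shade_alternate_rows and its parameters are made up for this example, and it uses plain substring matching, which is equivalent here because the pattern contains no regex metacharacters.

```python
def shade_alternate_rows(lines, old="<tr valign=top>",
                         new="<tr valign=top bgcolor='#E0E0E0'>"):
    """Mimic the awk one-liner: rewrite every second matching line."""
    seen = 0  # plays the role of awk's v: matching lines seen so far
    out = []
    for line in lines:
        if old in line:
            # awk's v++%2 tests the value before the increment,
            # so matches number 2, 4, 6, ... are the ones rewritten
            if seen % 2:
                line = line.replace(old, new, 1)  # like sub(): first match only
            seen += 1
        out.append(line)  # like the unconditional {print}
    return out


rows = ["<tr valign=top><td>%d</td></tr>" % i for i in range(1, 5)]
for line in shade_alternate_rows(["<table>"] + rows + ["</table>"]):
    print(line)
```

Changing the test to `not seen % 2` would shade the 1st, 3rd, 5th, ... matches instead.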
brandizzi
4

This works for me:

sed -e "s/<tr/<TR bgcolor='#E0E0E0'/g;n" simpletable.htm

sample input:

<table>
  <tr><td>Row1 / col1</td><td>col2</td><td>col3</td></tr>
  <tr><td>Row2 / col1</td><td>col2</td><td>col3</td></tr>
  <tr><td>Row3 / col1</td><td>col2</td><td>col3</td></tr>
  <tr><td>Row4 / col1</td><td>col2</td><td>col3</td></tr>
  <tr><td>Row5 / col1</td><td>col2</td><td>col3</td></tr>
</table>

sample output:

<table>
  <tr><td>Row1 / col1</td><td>col2</td><td>col3</td></tr>
  <TR bgcolor='#E0E0E0'><td>Row2 / col1</td><td>col2</td><td>col3</td></tr>
  <tr><td>Row3 / col1</td><td>col2</td><td>col3</td></tr>
  <TR bgcolor='#E0E0E0'><td>Row4 / col1</td><td>col2</td><td>col3</td></tr>
  <tr><td>Row5 / col1</td><td>col2</td><td>col3</td></tr>
</table>

The key is the n command in sed, which prints the current line and advances to the next one. As a result, the substitution is attempted only on odd-numbered lines of the input (lines 3 and 5 above, because <table> occupies line 1), so which rows get colored depends on the line on which the first row starts. This works only if each TR occupies its own line. It will break with nested tables, or if there are multiple TRs on a single line.
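
To see why the line layout matters, here is a rough Python model of what the `s/.../g;n` script does on each cycle: the substitution is attempted on one line, then n passes the following line through untouched. This is a sketch under simplifying assumptions (literal string matching instead of sed regexes, default auto-print), and the function name sed_s_then_n is made up for the example.

```python
def sed_s_then_n(lines, old="<tr", new="<TR bgcolor='#E0E0E0'"):
    """Model of sed 's/old/new/g;n' applied to a list of lines."""
    out = []
    it = iter(lines)
    for line in it:
        # s/old/new/g runs on the line that starts the cycle
        out.append(line.replace(old, new))
        # n: the current line is printed and the next line is fetched
        nxt = next(it, None)
        if nxt is not None:
            # the fetched line reaches the end of the script unchanged
            out.append(nxt)
    return out


html = ["<table>", "<tr>1</tr>", "<tr>2</tr>", "<tr>3</tr>", "</table>"]
print("\n".join(sed_s_then_n(html)))      # <table> is on line 1
print("\n".join(sed_s_then_n(html[1:])))  # first row is on line 1
```

The two calls shade different rows, which shows that the result depends on whether the first row falls on an odd or an even line.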

Cheeso
  • Also, this command works only if there is nothing except `tr`s in the rows, right? Each `tr` should start and end on the same line, and there cannot be empty lines between them. Am I right? (Nonetheless, I found your solution very instructive because I'm not used to the `n` command.) – brandizzi May 02 '11 at 14:37
  • It replaces every odd occurrence, but how do I alter it for every even occurrence? – Offenso Aug 16 '16 at 21:17
0

According to http://www.linuxquestions.org/questions/programming-9/replace-2nd-occurrence-of-a-string-in-a-file-sed-or-awk-800171/

Try this.

sed '0,/<tr/! s/<tr/<TR bgcolor='\''#E0E0E0'\''/' file.txt

The exclamation mark negates the address range 0,/<tr/, which covers everything from the beginning of the file through the first line containing <tr, so the substitution operates on all the following lines. Note that I believe the 0,/regex/ address form is GNU sed only.

If you need to operate on only the second occurrence, and ignore any subsequent matches, you can use a nested expression.

sed '0,/<tr/! {0,/<tr/ s/<tr/<TR bgcolor='\''#E0E0E0'\''/}' file.txt

Here, the bracketed expression operates only on the lines that the first part lets through, and it stops substituting after changing the first <tr it sees there, which is the second one in the file.
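
The effect this answer describes, changing only the second matching line and leaving every other line alone, can be sketched in Python as follows. This illustrates the intended result rather than translating sed's address ranges, and the function name replace_second_match is hypothetical.

```python
def replace_second_match(lines, old="<tr", new="<TR bgcolor='#E0E0E0'"):
    """Rewrite only the second line that contains `old`."""
    matches = 0  # number of matching lines seen so far
    out = []
    for line in lines:
        if old in line:
            matches += 1
            if matches == 2:
                # like s/old/new/ without the g flag: first match on the line only
                line = line.replace(old, new, 1)
        out.append(line)
    return out


table = ["<table>", "<tr>1</tr>", "<tr>2</tr>", "<tr>3</tr>", "</table>"]
print("\n".join(replace_second_match(table)))
```

Like the sed command, this changes a single row, so it does not by itself produce alternating colors.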

P.S. I've found the sed FAQ to be very helpful in cases like this.

Offenso
0

You can use a Python script to fix the HTML:

from bs4 import BeautifulSoup

html_doc = """
<table>
   <tr><td>Row1 / col1</td><td>col2</td><td>col3</td></tr>
   <tr><td>Row2 / col1</td><td>col2</td><td>col3</td></tr>
   <tr><td>Row3 / col1</td><td>col2</td><td>col3</td></tr>
   <tr><td>Row4 / col1</td><td>col2</td><td>col3</td></tr>
   <tr><td>Row5 / col1</td><td>col2</td><td>col3</td></tr>
</table>
"""

soup = BeautifulSoup(html_doc, 'html.parser')

# Alternate the background color from one row to the next.
index = 0
for tr in soup.find_all('tr'):
    if tr.find('td'):  # only touch rows that contain data cells
        # Style the row itself so that every cell in it is colored.
        if index % 2:
            tr.attrs['style'] = 'background-color: #ff0000;'
        else:
            tr.attrs['style'] = 'background-color: #00ff00;'
    index += 1

print(soup)
Golden Lion