I have HTML that looks like this:

<tr>
<td align="left"><a target="_blank" href="/tags/ref_color_tryit.asp?color=Yellow">Yellow</a>&nbsp;</td>
<td align="left"><a target="_blank" href="/tags/ref_color_tryit.asp?hex=FFFF00">#FFFF00</a></td>
<td bgcolor="#FFFF00">&nbsp;</td>
<td align="left"><a href="/tags/ref_colorpicker.asp?colorhex=FFFF00">Shades</a></td>
<td align="left"><a href="/tags/ref_colormixer.asp?colorbottom=FFFF00&colortop=FFFFFF">Mix</a></td>
</tr>


<tr>
<td align="left"><a target="_blank" href="/tags/ref_color_tryit.asp?color=YellowGreen">YellowGreen</a>&nbsp;</td>
<td align="left"><a target="_blank" href="/tags/ref_color_tryit.asp?hex=9ACD32">#9ACD32</a></td>
<td bgcolor="#9ACD32">&nbsp;</td>
<td align="left"><a href="/tags/ref_colorpicker.asp?colorhex=9ACD32">Shades</a></td>
<td align="left"><a href="/tags/ref_colormixer.asp?colorbottom=9ACD32&colortop=FFFFFF">Mix</a></td>
</tr>

What I want to do is filter the HTML so I end up with only

<td bgcolor="#XXXXXX">&nbsp;</td>

Then filter that so I end up with a list of rows like

XXXXXX
XXXXXX

How would I do that?

Hailwood
  • You wouldn't. This is a terrible place to use any regex... Any other possibilities? – Blender Dec 03 '10 at 03:59
  • paging dr. bobince? http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454 – harpo Dec 03 '10 at 04:00
  • 3,983 people agree: [Don't parse HTML with regexes!](http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454) – John Kugelman Dec 03 '10 at 04:02

4 Answers

You can use the following regex:

<td bgcolor="#([^"]+)">&nbsp;<\/td>

Capture group 1 then holds the "XXXXXX" value (the part after the #).
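
To make the capture-group idea concrete, here is a minimal sketch in JavaScript. It assumes the markup always looks exactly like the sample in the question (double quotes, a single bgcolor attribute, &nbsp; as the only cell content). The helper name extractColors is made up for illustration and is not part of the original answer.

```javascript
// Sample input shortened from the question to the relevant cells.
const html = [
  '<td align="left"><a href="/tags/ref_colorpicker.asp?colorhex=FFFF00">Shades</a></td>',
  '<td bgcolor="#FFFF00">&nbsp;</td>',
  '<td bgcolor="#9ACD32">&nbsp;</td>',
].join('\n');

// Group 1 holds whatever follows the "#"; matchAll requires the g flag.
const cellPattern = /<td bgcolor="#([^"]+)">&nbsp;<\/td>/g;

// extractColors is a hypothetical helper: it returns group 1 of every match.
function extractColors(source) {
  return Array.from(source.matchAll(cellPattern), (m) => m[1]);
}

// Print one colour per line, as the question asks.
console.log(extractColors(html).join('\n'));
```

Any markup that deviates from the sample (single quotes, extra attributes, different whitespace) would slip past this pattern, which is the usual argument for a real HTML parser.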

Vkalal

First, a regex to match the right tags:

<td bgcolor="#[0-9A-Fa-f]{6}">&nbsp;</td>

Then you can filter that data again with the regex below (or use a capture group instead; which is more convenient depends on your language):

[0-9A-Fa-f]{6}

That is, if you want to use regex (don't shoot me, the question asks what regular expression can be used for this).
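
As a rough illustration of the two-pass approach, the sketch below is written in JavaScript and assumes the input matches the sample in the question. The names page, filterCells and hexCodes are hypothetical and only there for the example.

```javascript
// Sample input shortened from the question; the first cell must NOT match.
const page = [
  '<td align="left"><a target="_blank" href="/tags/ref_color_tryit.asp?hex=FFFF00">#FFFF00</a></td>',
  '<td bgcolor="#FFFF00">&nbsp;</td>',
  '<td bgcolor="#9ACD32">&nbsp;</td>',
].join('\n');

// Pass 1: keep only the whole <td bgcolor="#XXXXXX">&nbsp;</td> cells.
// String.prototype.match returns null when nothing matches, hence "|| []".
function filterCells(source) {
  return source.match(/<td bgcolor="#[0-9A-Fa-f]{6}">&nbsp;<\/td>/g) || [];
}

// Pass 2: pull the six hex digits out of each cell kept by pass 1.
function hexCodes(cells) {
  return cells.map((cell) => cell.match(/[0-9A-Fa-f]{6}/)[0]);
}

console.log(hexCodes(filterCells(page)).join('\n'));
```

Pass 2 is only safe because pass 1 has already guaranteed that every remaining string contains exactly one six-digit hex run; applied to the raw page it would also pick up the values inside the href attributes.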

Rafe Kettler

If you must use regex, here is a demo using Ruby's irb:

>> %Q{
<tr>
<td align="left"><a target="_blank" href="/tags/ref_color_tryit.asp?color=Yellow">Yellow</a>&nbsp;</td>
<td align="left"><a target="_blank" href="/tags/ref_color_tryit.asp?hex=FFFF00">#FFFF00</a></td>
<td bgcolor="#FFFF00">&nbsp;</td>
<td align="left"><a href="/tags/ref_colorpicker.asp?colorhex=FFFF00">Shades</a></td>
<td align="left"><a href="/tags/ref_colormixer.asp?colorbottom=FFFF00&colortop=FFFFFF">Mix</a></td>
</tr>


<tr>
<td align="left"><a target="_blank" href="/tags/ref_color_tryit.asp?color=YellowGreen">YellowGreen</a>&nbsp;</td>
<td align="left"><a target="_blank" href="/tags/ref_color_tryit.asp?hex=9ACD32">#9ACD32</a></td>
<td bgcolor="#9ACD32">&nbsp;</td>
<td align="left"><a href="/tags/ref_colorpicker.asp?colorhex=9ACD32">Shades</a></td>
<td align="left"><a href="/tags/ref_colormixer.asp?colorbottom=9ACD32&colortop=FFFFFF">Mix</a></td>
</tr>
}.scan(/<td[^>]*>&nbsp;<\/td>/).map {|s| s[/#([a-f0-9]+)/i, 1]}

=> ["FFFF00", "9ACD32"]
nonopolarity

I wouldn't parse HTML with regexes either, but if I did, I'd do it like this ;)

var html = '<tr>\n<td align="left"><a target="_blank" href="/tags/ref_color_tryit.asp?color=Yellow">Yellow</a>&nbsp;</td>\n<td align="left"><a target="_blank" href="/tags/ref_color_tryit.asp?hex=FFFF00">#FFFF00</a></td>\n<td bgcolor="#FFFF00">&nbsp;</td>\n<td align="left"><a href="/tags/ref_colorpicker.asp?colorhex=FFFF00">Shades</a></td>\n<td align="left"><a href="/tags/ref_colormixer.asp?colorbottom=FFFF00&colortop=FFFFFF">Mix</a></td>\n</tr>\n\n\n<tr>\n<td align="left"><a target="_blank" href="/tags/ref_color_tryit.asp?color=YellowGreen">YellowGreen</a>&nbsp;</td>\n<td align="left"><a target="_blank" href="/tags/ref_color_tryit.asp?hex=9ACD32">#9ACD32</a></td>\n<td bgcolor="#9ACD32">&nbsp;</td>\n<td align="left"><a href="/tags/ref_colorpicker.asp?colorhex=9ACD32">Shades</a></td>\n<td align="left"><a href="/tags/ref_colormixer.asp?colorbottom=9ACD32&colortop=FFFFFF">Mix</a></td>\n</tr>'
        .split('\n'),    
    colors = [],
    i, l,
    match;

for(i = 0, l = html.length; i < l; i++) {
    if(match = /<td bgcolor="#([\da-fA-F]{6})">&nbsp;<\/td>/.exec(html[i])) {
        colors.push(match[1]);
    }
}

console.log(colors);
Cameron Jordan