I am trying to extract the contents of the table using Regex.

I have removed most of the tags from the table, but I am stuck with <br>, <a href>, <img> and <b>. How can I remove them?

For the <b> tag I tried this regex:

 \s*<b[^>]*>\s* 
(?<value>.*?)
 \s* </b>\s*

It worked for some lines, but for others the tags are left in the output, for example:

<b class="saadirheader">Email:</b>

Can anyone help me remove these tags?

<br>, <a href>, <img> and <b>

Full tags:

<img src="Newrecord_files/spacer.gif" alt="" border="0" height="1" width="5">

<a href="mailto:first.last@email.org">

Thanking you,

Naveen HS

user596712
  • Do you already know [`strip_tags`](http://php.net/strip_tags)? – Gumbo Jul 30 '10 at 09:48
  • Also, obligatory link: http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454 – Amadan Jul 30 '10 at 09:49
  • You may also want to learn about the difference between greedy and non-greedy expressions. I.e. in vs ]*> – relet Jul 30 '10 at 09:52
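
To illustrate the greedy vs. non-greedy point from the last comment, here is a minimal sketch in Python. Python is an assumption, since the question does not name a language, and the sample string is made up for the demonstration.

```python
import re

# Made-up sample with two <b> elements on one line.
html = "<b>Name:</b> x <b>Email:</b>"

# Greedy: .* grabs as much as it can, so the match runs to the LAST </b>.
greedy = re.findall(r"<b>(.*)</b>", html)

# Non-greedy: .*? stops at the FIRST </b> it can reach.
lazy = re.findall(r"<b>(.*?)</b>", html)

print(greedy)  # ['Name:</b> x <b>Email:']
print(lazy)    # ['Name:', 'Email:']
```

Note that Python writes named groups as (?P<value>...) rather than (?<value>...).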

1 Answer

Use the following Regex:

(?:<br|<a href|<img|<b)(?:.(?!>))*.>

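This regex matches the opening tags you mentioned above. If there are more tags you forgot to mention, add a "|" followed by the tag you want to the first pair of parentheses. Note that it does not match closing tags such as </a> or </b>. Also, because the final . must consume one character before the >, a tag with no attributes (such as <br> or <b>) can make the match run on to the next > in the text.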
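
As a rough sketch of the same idea, here is a variant in Python that also removes the closing tags and keeps the text between them. Python is an assumption, since the question does not say which regex engine is used. The names TAG_RE and strip_selected_tags are hypothetical, and the link text in the sample is invented because the question only shows the opening <a href> tag.

```python
import re

# Matches opening or closing <br>, <a>, <img> and <b> tags, with any attributes.
# \b stops "b" or "a" from matching the start of longer tag names such as "body".
TAG_RE = re.compile(r"</?(?:br|a|img|b)\b[^>]*>", re.IGNORECASE)

def strip_selected_tags(html: str) -> str:
    """Remove the selected tags but keep the text between them."""
    return TAG_RE.sub("", html)

# Tags are taken from the question; the link text is an assumed placeholder.
sample = (
    '<b class="saadirheader">Email:</b>'
    '<img src="Newrecord_files/spacer.gif" alt="" border="0" height="1" width="5">'
    '<a href="mailto:first.last@email.org">first.last@email.org</a><br>'
)

print(strip_selected_tags(sample))  # Email:first.last@email.org
```

This still breaks if an attribute value contains a ">" character. For anything beyond simple, known markup, an HTML parser or a function like the strip_tags mentioned in the comments is the safer choice.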