Possible Duplicate:
How to parse and process HTML with PHP?
I need to parse blocks of HTML and replace some links with their link text (the description between the opening and closing a tags), depending on whether that text meets certain criteria.
The regex I'm using to identify specific strings is used elsewhere in my application:
$regex = "/\b[FfGg][\.][\s][0-9]{1,4}\b/";
preg_match_all($regex, $html, $matches, PREG_SET_ORDER);
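As a quick sanity check of what this pattern accepts on its own, here is a minimal sketch. The sample strings are hypothetical and not taken from the application. Note that [\s] requires exactly one whitespace character between the dot and the digits.

```php
<?php
// The pattern from the question, unchanged.
$regex = "/\b[FfGg][\.][\s][0-9]{1,4}\b/";

// Hypothetical sample strings, for illustration only.
$samples = ['F. 100', 'g. 7', 'F.100', 'H. 100'];

foreach ($samples as $sample) {
    // preg_match returns 1 when the pattern matches and 0 when it does not.
    printf("%-8s => %d\n", $sample, preg_match($regex, $sample));
}
```

With these samples, only "F. 100" and "g. 7" match; "F.100" has no whitespace after the dot, and "H. 100" starts with a letter outside the character class.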
I'm using the following SO question as a starting point for extracting href descriptions:
Replacing html link tags with a text description
The idea is to convert any link whose text is an identifier of the form "F.xxxx" or "G.xxxx" (upper or lower case, one to four digits), and leave the rest intact (i.e., the Google link).
What I have so far is:
$html = 'Ten reports <a href="http://google.com">Google!</a> on 14 mice with ABCD
show that low plasma BCAA, particularly ABC and to a lesser extent DEF, can result in
severe but reversible epithelial damage to the skin, eye and gastrointestinal tract.
</li><li>Symptoms were reported in conjunction with low plasma ABC levels in 9 case
reports. In two case reports, ABC levels were between 1.9 and 48 µmol/L (<a
href="/docpage.php?obscure==100" target="F.100">F.100</a>, <a
href="/docpage.php?obscure==68" target="F.68">F.68</a>, <a href="/docpage.php?obscure==67"
target="F.67">F.67</a>, <a href="/docpage.php?obscure==71" target="F.71">F.71</a>, <a
href="/docpage.php?obscure==122" target="F.122">F.122</a>, <a
href="/docpage.php?obscure==92" target="F.92">F.92</a>, <a href="/docpage.php?obscure==96"
target="F.96">F.96</a>);';
This converts all links, including the Google one:
$html = preg_replace("/<a.*?href=\"(.*?)\".*?>(.*?)<\/a>/i", "$2", $html);
This returns a blank HTML string:
$html = preg_replace("/<a.*?href=\"(.*?)\".*?>[FfGg][\.][\s][0-9]{1,4}<\/a>/i", "$2", $html);
I believe the problem is in how I'm embedding this regex in the second (non-working) example above:
[FfGg][\.][\s][0-9]{1,4}
What is the correct way to embed the FfGg expression in the preg_replace pattern above, so that only links with a matching description are replaced?
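One possible direction, as a hedged sketch rather than a definitive fix: match every link, then decide in a callback whether to keep it. Two assumptions are baked in. First, the link text looks like "F.100" with no whitespace after the dot (as in the sample HTML), so the whitespace class is made optional here. Second, the names $idPattern and $linkPattern and the shortened $html are illustrative only. Also note that the second attempt above has a single capture group, so "$2" refers to a group that does not exist and is substituted as an empty string.

```php
<?php
// Shortened, hypothetical sample; the real input is the $html from the question.
$html = '<a href="http://google.com">Google!</a> (<a href="/docpage.php?obscure==100" target="F.100">F.100</a>, <a
href="/docpage.php?obscure==68" target="F.68">F.68</a>)';

// Assumption: identifiers in the link text look like "F.100", so whitespace
// after the dot is optional (\s*) instead of required.
$idPattern = '/^[FfGg]\.\s*[0-9]{1,4}$/';

// Match each anchor element; group 1 captures the link text.
// [^>]* also spans the line breaks that occur inside the tags.
$linkPattern = '/<a\b[^>]*>(.*?)<\/a>/is';

$result = preg_replace_callback(
    $linkPattern,
    function (array $m) use ($idPattern) {
        $text = trim($m[1]);
        // Replace the link with its text only when the text is an identifier;
        // otherwise return the whole match unchanged.
        return preg_match($idPattern, $text) ? $text : $m[0];
    },
    $html
);

echo $result, "\n";
```

For this shortened input the Google link is left as-is and the two identifier links are reduced to "F.100" and "F.68". For anything beyond simple, well-formed markup, a DOM-based parser (as the linked duplicate suggests) is likely the more robust choice.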