Please suggest a PHP regex for preg_replace that removes all attributes from the tags in HTML code without removing the tags themselves. In hyperlinks, however, all attributes such as href, target and rel should remain as is.

Please refer to the example below. I already tried the following regex with preg_replace:

$htmltext = '<p style="float: left;">
<span style="color: #ff0000;">
<b>Some text here</b>
</span>
<a target="_blank" rel="nofollow" href="http://thebankexam.com/page/7017">Clickable Text</a>
</p>';
$my_output = preg_replace("/<([a-z][a-z0-9]*)[^>]*?(\/?)>/i",'<$1$2>',$htmltext);
echo $my_output;

Filtered output ($my_output):

<p>
<span>
<b>Some text here</b>
</span>
<a>Clickable Text</a>  <!-- Note: the hyperlink's href, rel and target are gone -->
</p>

The intended output should look like:

<p>
<span>
<b>Some text here</b>
</span>
<a target="_blank" rel="nofollow" href="http://thebankexam.com/page/7017">Clickable Text</a>
</p>
Hrishi
  • yes Owen, except the clickable link attributes – Hrishi Oct 15 '13 at 07:53
  • http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags – Sergio Oct 15 '13 at 07:54
  • I have the regex below, but it removes href, rel and target as well and keeps only `Clickable Text`. Regex: `preg_replace("/<([a-z][a-z0-9]*)[^>]*?(\/?)>/i",'<$1$2>',$htmltext)` – Hrishi Oct 15 '13 at 07:58
  • That has some of the most mental answers I have seen on SO. Apparently this is impossible without Chuck Norris – The Humble Rat Oct 15 '13 at 07:59
  • When you mention only from clickable links, does that mean you would like to remove those attributes (href, target, rel) if they are not in a clickable link? – Jerry Oct 15 '13 at 08:18

1 Answer

preg_replace('/<a\s+[^>]*href\s*=\s*"([^"]+)"[^>]*>/', '<a href="\1">', $html);
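Note that the pattern above rewrites each link to `<a href="...">`, so target and rel are dropped and the other tags are left untouched. A minimal sketch of one alternative is to keep the regex from the question and add a negative lookahead so that `<a>` tags are skipped entirely. The function name `strip_attributes_except_links` is made up for illustration, and the sketch assumes no attribute value contains a literal `>`.

```php
<?php
// Hypothetical helper; the name is not from the original post.
function strip_attributes_except_links(string $html): string
{
    // (?!a\b) refuses to match when the tag name is exactly "a", so links
    // keep href, target, rel and everything else. Tags that merely start
    // with "a" (such as <abbr>) are still stripped, because \b requires
    // the name to end right after the "a".
    $pattern = '/<(?!a\b)([a-z][a-z0-9]*)[^>]*?(\/?)>/i';

    // preg_replace returns null on error; fall back to the input then.
    return preg_replace($pattern, '<$1$2>', $html) ?? $html;
}

$sample = '<p style="float: left;"><a target="_blank" href="http://thebankexam.com/page/7017">Clickable Text</a></p>';
echo strip_attributes_except_links($sample), "\n";
```

Closing tags never match because the character after `<` must be a letter, so only opening and self-closing tags are rewritten.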
Marek
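The question linked in the comments argues against parsing HTML with regular expressions. For completeness, here is a rough sketch of the same filtering done with PHP's bundled DOM extension instead. It assumes ext-dom is available and that the input is an ASCII or otherwise libxml-friendly fragment; the function name `strip_attributes_with_dom` and the temporary `<div>` wrapper are illustrative choices, not something from the posts above.

```php
<?php
// Hypothetical helper; assumes the DOM extension (ext-dom) is loaded.
function strip_attributes_with_dom(string $html): string
{
    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // silence parser warnings

    // Wrap the fragment in one root element; the flags ask libxml not to
    // add <html>/<body> wrappers or a doctype.
    $doc->loadHTML('<div>' . $html . '</div>', LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $root = $doc->documentElement;
    foreach ($root->getElementsByTagName('*') as $element) {
        if (strtolower($element->nodeName) === 'a') {
            continue; // leave hyperlinks exactly as they are
        }
        // Copy the attribute list first, since removal mutates it.
        foreach (iterator_to_array($element->attributes) as $attribute) {
            $element->removeAttributeNode($attribute);
        }
    }

    // Serialize the children only, dropping the temporary wrapper.
    $output = '';
    foreach ($root->childNodes as $child) {
        $output .= $doc->saveHTML($child);
    }
    return $output;
}
```

Unlike the regex, this handles attribute values that contain `>`, at the cost of letting the parser normalize the markup, so the serialized output may differ from the input in minor ways.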