Okay, I might not be going about this the right way, but here goes.

I have this code that takes a link and extracts the text between the tags:

$string = $item;
$pattern = '/\<a([^>]*)\>([^<]*)\<\/a\>/i';
$replacement = '$2';
$message = preg_replace($pattern, $replacement, $string);

A few items in this string have ampersands (in the text portion, not the tag portion), but most don't. I'm trying to figure out a way to either incorporate the ampersand into the current pattern or run another preg_replace on $message to remove the ampersand after the tags are stripped away.

THANKS!

Tiffany Israel
  • If you're thinking about multiple regexes, you're either overcomplicating things, or you're a slave of [Cthulhu](http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454). Really: save yourself a lot of trouble and _PARSE_ the HTML. As you said yourself, you're not going about it the right way, so mend your ways. – Elias Van Ootegem Sep 05 '12 at 19:30

2 Answers

There's always `$message = str_replace('&', '', $message);`

Incidentally, if you are trying to strip tags from HTML input, there is also `strip_tags`.

For example, if your input is

$text = '<a href="google.com">Text</a>';

Then `strip_tags($text)` will produce `Text`.
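The two suggestions above can be combined. This is only a minimal sketch: the function name `link_text_without_ampersands` and the sample input are made up for illustration, and it assumes the ampersand may appear either literally or as the `&amp;` entity, which `strip_tags` leaves untouched.

```php
<?php
// Hypothetical helper: strips all tags, then removes ampersands from the text.
function link_text_without_ampersands(string $html): string
{
    // strip_tags() drops the markup and keeps the text between the tags.
    $text = strip_tags($html);

    // Remove the "&amp;" entity first, then any remaining literal "&".
    // str_replace() applies the search strings in array order.
    return str_replace(['&amp;', '&'], '', $text);
}

// Hypothetical sample input with an ampersand in the link text.
$item = '<a href="google.com">Salt & Pepper</a>';
$message = link_text_without_ampersands($item);
echo $message, PHP_EOL;
```

Note that removing the ampersand leaves the spaces on both sides of it in place, so a follow-up whitespace cleanup may be wanted depending on the data.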

Kevin
  • Thanks for your answer, which worked! I chose the other answer since it was a little bit cleaner. But I'll keep `strip_tags($text)` in mind for next time! THANKS! – Tiffany Israel Sep 05 '12 at 19:36

Do you want to remove the ampersand and everything after it? Then the pattern is

'/\<a([^>]*)\>([^<&]*)[^<]*\<\/a\>/i';

Otherwise, you'll need a 2nd operation.

BTW: Your regex will also match other opening tags whose names start with `a`, such as `<author>` or `<audio>`.
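To make both points concrete, here is a rough sketch. The function names and the sample input are hypothetical. The first function uses the pattern from this answer as written. The second is a variation that is not part of the answer: it adds `\b` after `<a` so that tag names such as `audio` no longer match, and it uses `preg_replace_callback` to drop only the ampersand in the same pass instead of running a second operation.

```php
<?php
// Uses the pattern from the answer: keeps only the link text before the first "&".
function text_before_ampersand(string $html): string
{
    $pattern = '/\<a([^>]*)\>([^<&]*)[^<]*\<\/a\>/i';
    return preg_replace($pattern, '$2', $html);
}

// Variation (an assumption, not from the answer): "\b" requires a word
// boundary after "<a", so <audio> or <author> are left alone, and the
// callback removes only the "&" characters from the captured link text.
function strip_links_and_ampersands(string $html): string
{
    $pattern = '/<a\b[^>]*>([^<]*)<\/a>/i';
    return preg_replace_callback($pattern, function (array $m): string {
        return str_replace('&', '', $m[1]);
    }, $html);
}

// Hypothetical sample input.
$item = '<a href="x.html">Tom & Jerry</a> <audio src="s.ogg">clip</audio>';
echo text_before_ampersand('<a href="x.html">Tom & Jerry</a>'), PHP_EOL;
echo strip_links_and_ampersands($item), PHP_EOL;
```

The callback version keeps everything in one `preg_replace_callback` call, at the cost of being slightly harder to read than a plain `str_replace` afterwards.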

Mark