Say I have the following link:

<li class="hook">
      <a href="i_have_underscores">I_have_underscores</a>

How would I remove the underscores only in the link text and not in the href? I have tried str_replace, but it removes all underscores, which isn't what I want.

So basically I would be left with this output:

<li class="hook">
      <a href="i_have_underscores">I have underscores</a>

Any help is much appreciated.

Keith Donegan
  • *(related)* [Best Methods to parse HTML](http://stackoverflow.com/questions/3577641/best-methods-to-parse-html/3577662#3577662) – Gordon Nov 27 '10 at 21:41

2 Answers


You can use an HTML DOM parser to get the text within the tags, and then run your str_replace() function on that text only.

Using the DOM parser I linked (PHP Simple HTML DOM Parser), it is as simple as something like this:

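// Assumes the Simple HTML DOM library file is available at this path
require 'simple_html_dom.php';

$html = str_get_html(
    '<li class="hook"><a href="i_have_underscores">I_have_underscores</a></li>');
$links = $html->find('a');   // You can use any CSS-style selectors here

foreach ($links as $l) {
    // innertext is the markup between the tags, so the href attribute is untouched
    $l->innertext = str_replace('_', ' ', $l->innertext);
}

echo $html;
// <li class="hook"><a href="i_have_underscores">I have underscores</a></li>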

That's it.


It's safer to parse HTML with DOMDocument instead of regex. Try this code:


function replaceInAnchors($html)
{
    $dom = new DOMDocument();
    // loadHtml() needs mb_convert_encoding() to work well with UTF-8 encoding
    $dom->loadHtml(mb_convert_encoding($html, 'HTML-ENTITIES', "UTF-8"));

    $xpath = new DOMXPath($dom);

    // select every text node that has an <a> ancestor
    foreach ($xpath->query('//text()[(ancestor::a)]') as $node) {
        $replaced = str_ireplace('_', ' ', $node->wholeText);
        $newNode  = $dom->createDocumentFragment();
        // put the replaced text into the fragment; an empty fragment would delete the link text
        $newNode->appendChild($dom->createTextNode($replaced));
        $node->parentNode->replaceChild($newNode, $node);
    }

    // get only the body tag with its contents, then trim the body tag itself to get only the original content
    return mb_substr($dom->saveXML($xpath->query('//body')->item(0)), 6, -7, "UTF-8");
}

$html = '<li class="hook">
      <a href="i_have_underscores">I_have_underscores</a>
</li>';
echo replaceInAnchors($html);
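
If trimming the body tag by character offsets feels fragile, here is a rough sketch of a variant of the same idea. It is not the code above, just one possible alternative: it updates each text node's nodeValue in place and then serializes the children of the body one by one. The function name replaceUnderscoresInLinkText is made up for illustration, and the sketch assumes ASCII (or already entity-encoded) input, so it skips the encoding conversion.

```php
<?php
// Hypothetical helper, not part of the original answer.
// Assumes ASCII or entity-encoded input, so no encoding conversion is done.
function replaceUnderscoresInLinkText($html)
{
    // Collect parser warnings internally instead of printing them
    $previous = libxml_use_internal_errors(true);

    $dom = new DOMDocument();
    $dom->loadHTML($html);

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);

    // Every text node inside an <a>; attributes such as href are not text nodes
    foreach ($xpath->query('//a//text()') as $textNode) {
        $textNode->nodeValue = str_replace('_', ' ', $textNode->nodeValue);
    }

    // loadHTML() wraps the fragment in <html><body>; emit only the body's children
    $out = '';
    foreach ($xpath->query('//body')->item(0)->childNodes as $child) {
        $out .= $dom->saveHTML($child);
    }

    return $out;
}

echo replaceUnderscoresInLinkText(
    '<li class="hook"><a href="i_have_underscores">I_have_underscores</a></li>'
);
```

Because only text nodes are modified, the href attribute keeps its underscores, while any markup nested inside the link (for example a span) is still handled.
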
István Ujj-Mészáros