
I use the function below, found in "Highlight keywords in a paragraph", to highlight keywords in a string. However, it generates this warning:

Warning: DOMDocument::loadHTML() [domdocument.loadhtml]: htmlParseEntityRef: expecting ';' in Entity, line: 1 in /../ on line 118

Following the thread "Warning: DOMDocument::loadHTML(): htmlParseEntityRef: expecting ';' in Entity", the answers suggest using HTML entity encoding, but doing that defeats the whole purpose of using DOM to search through the string and highlight without breaking the tags. E.g. htmlentities followed by html_entity_decode would highlight all occurrences.

How should I tackle this? Or is there some other problem with the function that I am missing?

function highlight($string,$query){
    $keywords = explode(" ",$query);
    //define
    $keywordsCIS = array();
    foreach($keywords as $value){
        $lcValue = strtolower($value);
        $keywordsCIS[] = $value;
        $keywordsCIS[] = $lcValue;
        $keywordsCIS[] = ucfirst($lcValue);
        $keywordsCIS[] = strtoupper($lcValue);
    }
    $dom = new DomDocument();
    $dom ->recover = true;
    $dom -> strictErrorChecking = false;
    $dom -> loadHtml($string);
    $xpath = new DomXpath($dom);
    foreach ($keywordsCIS as $keyword) {
        $elements = $xpath->query('//*[contains(.,"' . $keyword . '")]');
        foreach ($elements as $element) {
            foreach ($element->childNodes as $child) {
                if (!$child instanceof DomText) continue;
                $fragment = $dom->createDocumentFragment();
                $text = $child->textContent;
                $stubs = array();
                while (($pos = stripos($text, $keyword)) !== false) {
                    $fragment->appendChild(new DomText(substr($text, 0, $pos)));
                    $word = substr($text, $pos, strlen($keyword));
                    $highlight = $dom->createElement('strong');
                    $highlight->appendChild(new DomText($word));
                    $highlight->setAttribute('class', 'kw');
                    $fragment->appendChild($highlight);
                    $text = substr($text, $pos + strlen($keyword));
                }
                if (!empty($text)) $fragment->appendChild(new DomText($text));
                $element->replaceChild($fragment, $child);
            }
        }
    }
    //$string = $dom->saveXml($dom->getElementsByTagName('body')->item(0)->firstChild);
    $string = $dom->saveHTML();
    return $string;
}
Cœur
Joseph

2 Answers


I believe the warning comes from the HTML that DomDocument is attempting to parse. I assume you do not want to change the HTML content in $string before it is parsed.

Try using the @ operator on the loadHTML line to avoid the warning:

@$dom->loadHtml($string);
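The @ operator hides every error raised by that call, not only the entity warning. As an alternative sketch (not what the original function does), libxml's own error buffering can be switched on so the warnings are collected instead of printed. loadQuietly() is a hypothetical helper name used only for illustration:

```php
<?php
// Hypothetical helper: parse HTML while buffering libxml errors
// instead of letting them surface as PHP warnings.
function loadQuietly(string $html): array
{
    // Returns the previous setting so it can be restored afterwards.
    $previous = libxml_use_internal_errors(true);

    $dom = new DOMDocument();
    $dom->loadHTML($html);

    // Array of LibXMLError objects (may be empty).
    $errors = libxml_get_errors();
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return [$dom, $errors];
}

// Usage: the bare "&" is recovered as literal text; any parser
// complaints end up in $messages rather than in the output.
[$dom, $errors] = loadQuietly('<p>this & that</p>');
$messages = [];
foreach ($errors as $error) {
    $messages[] = trim($error->message);
}
```

Inside highlight(), the same three libxml calls could wrap the existing loadHtml() line, which keeps the rest of the function unchanged.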
ghbarratt

If your HTML contains "this & that", for example, or anything else with a bare &, the parser will be looking for an entity. It really should be &amp;. This applies to HTML validation as well.
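If the input cannot be fixed at its source, one possible workaround is to escape bare ampersands before handing the string to loadHTML(). The following is a rough sketch under stated assumptions: escapeBareAmpersands() is a hypothetical helper, and the regular expression is a heuristic that treats any & not followed by a name or numeric reference ending in ";" as a bare ampersand.

```php
<?php
// Hypothetical helper: turn "&" into "&amp;" unless it already starts
// something that looks like an entity (&name;, &#123; or &#x1F;).
function escapeBareAmpersands(string $html): string
{
    return preg_replace(
        '/&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#[xX][0-9A-Fa-f]+);)/',
        '&amp;',
        $html
    );
}

// Usage: existing entities are left alone, bare ampersands are escaped.
$clean = escapeBareAmpersands('<p>this & that &amp; more</p>');
// $clean is '<p>this &amp; that &amp; more</p>'
```

Because it is a heuristic, it will also rewrite ampersands inside script blocks and entity names written without the closing ";", so it is best applied only to markup you know to be plain content.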

You can ignore errors with @$dom->loadHTML($string);, which in this case won't be too much of an issue. That said, you should be careful to properly format your HTML when using a parser like that.

Niet the Dark Absol