
How can I do a case-insensitive comparison when checking whether the keyword appears in my content in the script below?

If I use this...

    $keyword = strtolower(rseo_getKeyword($post));

    $nodes = $x->query("//text()[
        contains(
        translate(., 'ABCDEFGHIJKLMNOPQRSTUVWXYZ',
                     'abcdefghijklmnopqrstuvwxyz'),
                     '$keyword')]");

The replacement is made only where the keyword already appears in lowercase in the content. The lookup does not appear to be case-insensitive.

    $keyword = rseo_getKeyword($post);
    $content = $postarray['post_content']; //error: Empty string supplied in loadHTML() when I use this.
    //$content = "this is a test phrase";
    @$d = new DOMDocument();
    @$d->loadHTML($content);
    @$x = new DOMXpath($d);
    @$nodes = $x->query("//text()[contains(.,'$keyword') 
        and not(ancestor::h1) 
        and not(ancestor::h2) 
        and not(ancestor::h3) 
        and not(ancestor::h4) 
        and not(ancestor::h5) 
        and not(ancestor::h6)]");
    if ($nodes && $nodes->length) {
        $node = $nodes->item(0);
        // Split just before the keyword
        $keynode = $node->splitText(strpos($node->textContent, $keyword));
        // Split after the keyword
        $node->nextSibling->splitText(strlen($keyword));
        // Replace keyword with <b>keyword</b>
        $replacement = $d->createElement('b', $keynode->textContent);
        $keynode->parentNode->replaceChild($replacement, $keynode);
    }
    echo $d->saveHTML();die;
Scott B
  • Pretty much duplicates http://stackoverflow.com/questions/625986/how-can-i-use-xpath-to-perform-a-case-insensitive-search-and-support-non-english – rik Feb 02 '11 at 18:38
  • @rik, I've attempted to substitute the translate routine into my XPath query (and updated my question with that info) but when I do, the replacement is not made at all. – Scott B Feb 02 '11 at 19:37
  • possible duplicate of [case insensitive xpath searching in php](http://stackoverflow.com/questions/3238989/case-insensitive-xpath-searching-in-php/3240155#3240155) – Gordon Feb 02 '11 at 23:23
  • Good question, +1. See my answer for a complete and easy solution. – Dimitre Novatchev Feb 03 '11 at 05:11

2 Answers

The expression in the question:

    //text()
        [contains(translate(., 'ABCDEFGHIJKLMNOPQRSTUVWXYZ',
                               'abcdefghijklmnopqrstuvwxyz'),
                  '$keyword')
        ]

The correct expression must test whether the lowercased text contains the lowercased keyword:

    //text()
        [contains(translate(., 'ABCDEFGHIJKLMNOPQRSTUVWXYZ',
                               'abcdefghijklmnopqrstuvwxyz'),
                  translate('$keyword', 'ABCDEFGHIJKLMNOPQRSTUVWXYZ',
                                        'abcdefghijklmnopqrstuvwxyz')
                  )
        ]
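
As a rough illustration of how an expression like this might be run from PHP, here is a minimal sketch. The helper name `findTextNodesIgnoringCase` and the sample HTML are hypothetical, not part of the question's code, and the sketch assumes the keyword is ASCII and contains no single quote (it is placed inside an XPath string literal).

```php
<?php
// Hypothetical helper: returns the text nodes whose content contains
// $keyword, ignoring ASCII letter case on both sides of the comparison.
function findTextNodesIgnoringCase(DOMXPath $xpath, string $keyword)
{
    $upper = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ';
    $lower = 'abcdefghijklmnopqrstuvwxyz';

    // Assumption: $keyword has no single quote, so it is safe to embed
    // in the XPath string literal below.
    $query = "//text()[contains("
           . "translate(., '$upper', '$lower'), "
           . "translate('$keyword', '$upper', '$lower'))]";

    return $xpath->query($query);
}

// Sample document, made up for this illustration.
$doc = new DOMDocument();
$doc->loadHTML('<html><body><h1>Title</h1><p>This is a Test Phrase.</p><p>Nothing here.</p></body></html>');
$xpath = new DOMXPath($doc);

// The keyword's case differs from the content's case on purpose.
$nodes = findTextNodesIgnoringCase($xpath, 'TEST phrase');

foreach ($nodes as $node) {
    echo $node->textContent, "\n";
}
```

Because both sides go through `translate()`, the keyword does not need to be lowercased in PHP beforehand.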
Dimitre Novatchev

The text() function returns all text node children of the context node. When you call it as a parameter to translate(), the context node is a text node and so will have no text node children. Instead, use . to properly select the context node itself as you really want.

Replace your attempt:

contains(translate(text(), 'ABC…

with

contains(translate(., 'ABC…
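
With `.` in the predicate, the whole routine could be sketched end to end roughly as follows. This is only a sketch under stated assumptions: the function name `boldFirstKeyword` and the sample HTML are hypothetical, the heading exclusions from the question are omitted for brevity, and the text is assumed to be ASCII so that byte offsets and character offsets agree. One change goes beyond this answer: the question computes the split offset with `strpos()`, which is case-sensitive and returns false when the content's case differs from the keyword's, so the sketch swaps in `stripos()`.

```php
<?php
// Hypothetical helper: wraps the first case-insensitive occurrence of
// $keyword in <b>...</b> and returns the resulting HTML.
function boldFirstKeyword(string $html, string $keyword): string
{
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    $xpath = new DOMXPath($doc);

    $upper = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ';
    $lower = 'abcdefghijklmnopqrstuvwxyz';

    // Assumption: the keyword is ASCII and contains no single quote.
    $needle = strtolower($keyword);
    $nodes = $xpath->query(
        "//text()[contains(translate(., '$upper', '$lower'), '$needle')]"
    );

    if ($nodes && $nodes->length) {
        $node = $nodes->item(0);

        // stripos() ignores case, so the offset is found even when the
        // content says "Test Phrase" and the keyword is "test phrase".
        $offset = stripos($node->textContent, $keyword);

        // Split just before the keyword, then just after it.
        $keynode = $node->splitText($offset);
        $keynode->splitText(strlen($keyword));

        // Replace the keyword node with <b>keyword</b>, keeping the
        // original casing from the content.
        $bold = $doc->createElement('b');
        $bold->appendChild($doc->createTextNode($keynode->textContent));
        $keynode->parentNode->replaceChild($bold, $keynode);
    }

    return $doc->saveHTML();
}

$result = boldFirstKeyword('<p>This is a Test Phrase.</p>', 'test phrase');
echo $result;
```

For non-ASCII content, `translate()` with a fixed alphabet and byte-based offsets would both need a different approach.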
salathe
  • I've changed it in my question, but I'm still doing something wrong. I'm only getting matches when the keyword is lowercase within the content. – Scott B Feb 02 '11 at 21:03
  • @Scott B, maybe provide a reproducible example and we can move from there. Upper/lower/MixEd case keywords in the content work fine with the short snippet of HTML I used just to check the notes above. – salathe Feb 02 '11 at 21:20