4

I would like to highlight search terms on a page, but not mess with any HTML tags. I was thinking of something like:

$('.searchResult *').each(function() {
    $(this).html($(this).html().replace(new RegExp('(term)', 'gi'), '<span class="highlight">$1</span>'));
});

However, $('.searchResult *').each matches all elements, not just leaf nodes. In other words, some of the elements matched have HTML inside them. So I have a few questions:

  1. How can I match only leaf nodes?
  2. Is there some built-in jQuery RegEx function to simplify things? Something like: $(this).wrap('term', $('<span />', { 'class': 'highlight' }))
  3. Is there a way to do a simple string replace and not a RegEx?
  4. Any other better/faster way of doing this?

Thanks so much!

Nelson Rothermel

7 Answers

8

[See it in action]

// escape by Colin Snover
// Note: if you don't care for (), you can remove it..
RegExp.escape = function(text) {
    return text.replace(/[-[\]{}()*+?.,\\^$|#\s]/g, "\\$&");
}

function highlight(term, base) {
  if (!term) return;
  base = base || document.body;
  var re = new RegExp("(" + RegExp.escape(term) + ")", "gi");
  // "$1" re-inserts the matched text, so the original casing is preserved
  var replacement = "<span class='highlight'>$1</span>";
  $("*", base).contents().each( function(i, el) {
    if (el.nodeType === 3) {
      var data = el.data.replace(re, replacement);
      // only rebuild text nodes that actually contain the term
      if (data !== el.data) {
        var wrapper = $("<span>").html(data);
        $(el).before(wrapper.contents()).remove();
      }
    }
  });
}

function dehighlight(term, base) {
  // term is kept for compatibility; each span's own text is restored instead
  $('span.highlight', base).each(function () {
    var text = document.createTextNode($(this).text());
    this.parentNode.replaceChild(text, this);
  });
}
gblazex
  • The `See it in action` example isn't working for me. However, I had forgotten about the `:contains` selector which should help with selecting the "leaf" nodes and not doing a replace unnecessarily. I'll give this a try. – Nelson Rothermel Jul 13 '10 at 21:10
  • I'm guessing it would be more efficient to create the RegExp variable once before the `each` and reuse it inside `each`? – Nelson Rothermel Jul 13 '10 at 21:13
  • Yes it will be precompiled and should be faster. **Check** the link now :) – gblazex Jul 13 '10 at 21:26
  • @Nelson - `contains` would do a case-sensitive search, so it might be better to do a regex search on `text()` instead. Also, any solution where `html` is being overwritten suffers from two problems - existing behavior such as events will get overwritten, and the search term may collide with the html. See http://jsfiddle.net/BcsQG/1/ – Anurag Jul 13 '10 at 23:02
  • using the "highlight(term, base) {}" function on ajax response data... I had to add this just before the closing brace: ` return base;`, to make it work. Otherwise there was no output. – govinda Jan 16 '13 at 05:24
  • Awesome! One problem -- while this will highlight matches with different case, it will _lowercase them_. You can fix it by changing the replacement to `"$1"`. – samson Apr 03 '18 at 14:00
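To illustrate the casing point from the last comment, here is a minimal sketch on plain strings (no DOM). The names `escapeRegExp` and `markMatches` are hypothetical and not part of the answer above; the idea is only that `$1` in the replacement stands for the text that actually matched.

```javascript
// Hypothetical helper: escape regex metacharacters so the term is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: wrap every case-insensitive match of term in a span.
function markMatches(text, term) {
  var re = new RegExp("(" + escapeRegExp(term) + ")", "gi");
  // "$1" is the matched text itself, so "Term" and "TERM" keep their casing
  return text.replace(re, "<span class='highlight'>$1</span>");
}

var marked = markMatches("Term, term and TERM", "term");
console.log(marked);
```

Using the typed term in the replacement instead of `$1` would turn all three matches into "term".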
3

Use contents() to get all nodes including text nodes, filter out the non-text nodes, and finally replace each remaining text node that contains the term with new nodes built from a regex replacement of its nodeValue. This keeps the HTML elements intact and only modifies the text nodes. You have to use a regex instead of a simple string substitution because, unfortunately, replace() only substitutes the first occurrence when the search term is a string.

function highlight(term) {
    var regex = new RegExp("(" + term + ")", "gi");
    var localRegex = new RegExp("(" + term + ")", "i");
    var replace = '<span class="highlight">$1</span>';

    $('body *').contents().each(function() {
        // skip all non-text nodes, and text nodes that don't contain term
        if(this.nodeType != 3 || !localRegex.test(this.nodeValue)) {
            return;
        }
        // replace text node with new node(s)
        var wrapped = $('<div>').append(this.nodeValue.replace(regex, replace));
        $(this).before(wrapped.contents()).remove();
    });
}

It can no longer easily be condensed into a much shorter one-liner, so I prefer it like this :)

See example here.
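As a small illustration of the point about global replacement, this sketch compares a string pattern, a regex with the `g` flag, and a regex-free `split`/`join` on a made-up sample string. It is only meant to show the behavior of the built-in string methods, not a complete highlighter.

```javascript
var text = "cat catalog concat";

// A string pattern replaces only the first occurrence.
var once = text.replace("cat", "[cat]");      // "[cat] catalog concat"

// A regex with the "g" flag replaces every occurrence.
var all = text.replace(/cat/g, "[cat]");      // "[cat] [cat]alog con[cat]"

// split/join replaces every occurrence without a regex,
// but it is always case-sensitive.
var joined = text.split("cat").join("[cat]"); // "[cat] [cat]alog con[cat]"

console.log(once, "|", all, "|", joined);
```

The `split`/`join` form is one possible answer to the "simple string replace" question, as long as case-insensitive matching is not needed.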

Anurag
  • This is buggy; we can't just set the `nodeValue` of a text node and hope it will work :). The text node has to be replaced with a span element. – Anurag Jul 13 '10 at 20:48
  • Fixed the bugs, now only does text node replacements. Does **not** replace the entire `html`. – Anurag Jul 13 '10 at 22:45
  • it will fail for things like `(this)` – gblazex Jul 14 '10 at 01:25
  • @galambalazs - could you elaborate more on why `(this)` would be a breaking input? – Anurag Jul 14 '10 at 01:38
  • ah I see, thanks for pointing that out. I am tempted to use your `RegExp.escape` solution, but will let this bug pass instead :) – Anurag Jul 14 '10 at 02:37
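For anyone wondering why `(this)` is a problem: a rough sketch of what happens when the term goes into `new RegExp` unescaped. The helper name `escapeRegExp` is hypothetical; it does the same job as the `RegExp.escape` function mentioned in the comments.

```javascript
// Hypothetical helper: escape regex metacharacters in the search term.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

var text = "call (this) now";

// Unescaped: the parentheses become a capture group, so only "this" matches.
var naive = new RegExp("(" + "(this)" + ")", "gi");
var naiveResult = text.replace(naive, "[$1]");   // "call ([this]) now"

// Escaped: the parentheses are matched literally.
var safe = new RegExp("(" + escapeRegExp("(this)") + ")", "gi");
var safeResult = text.replace(safe, "[$1]");     // "call [(this)] now"

// An unbalanced term is worse: the pattern does not even compile.
var threw = false;
try {
  new RegExp("(" + "a(" + ")", "gi");
} catch (e) {
  threw = e instanceof SyntaxError;
}

console.log(naiveResult, "|", safeResult, "|", threw);
```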
2

I've made a pure JavaScript version of this and packaged it into a Google Chrome plug-in, which I hope will be helpful to some people. The core function is shown below:

GitHub Page for In-page Highlighter

function highlight(term){
    if(!term){
        return false;
    }

    //use treeWalker to find all text nodes that match selection
    //supported by Chrome(1.0+)
    //see more at https://developer.mozilla.org/en-US/docs/Web/API/TreeWalker
    var treeWalker = document.createTreeWalker(
        document.body,
        NodeFilter.SHOW_TEXT,
        null,
        false
        );
    var node = null;
    var matches = [];
    while(node = treeWalker.nextNode()){
        if(node.nodeType === 3 && node.data.indexOf(term) !== -1){
            matches.push(node);
        }
    }

    //escape regex metacharacters so the term is matched literally
    var escaped = term.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');

    //deal with those matched text nodes
    for(var i=0; i<matches.length; i++){
        node = matches[i];
        var parent = node.parentNode;
        //skip detached nodes and prevent duplicate highlighting
        if(!parent || parent.className == "highlight"){
            continue;
        }

        //find every occurrence using split; the capture group keeps the matches
        var parts = node.data.split(new RegExp('('+escaped+')'));
        //build the pieces in a fragment so sibling nodes are left untouched
        var fragment = document.createDocumentFragment();
        for(var j=0; j<parts.length; j++){
            var part = parts[j];
            //continue if it's empty
            if(!part){
                continue;
            }
            //create new element node to wrap selection
            else if(part == term){
                var newNode = document.createElement("span");
                newNode.className = "highlight";
                newNode.textContent = part;
                fragment.appendChild(newNode);
            }
            //create new text node to place remaining text
            else{
                fragment.appendChild(document.createTextNode(part));
            }
        }
        //swap the original text node for the highlighted pieces
        parent.replaceChild(fragment, node);
    }
}
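The split step is the core trick here, so this is a DOM-free sketch of just that part. The function name `splitOnTerm` and the `{ text, match }` result shape are made up for illustration; the only thing it relies on is that `split` keeps the separators when the regex has a capture group.

```javascript
// Hypothetical helper: split text into pieces and flag which pieces are matches.
function splitOnTerm(text, term) {
  // escape regex metacharacters so the term is matched literally
  var escaped = term.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  // the capture group makes split() keep the matched term in the result
  var parts = text.split(new RegExp("(" + escaped + ")"));
  var result = [];
  for (var i = 0; i < parts.length; i++) {
    if (!parts[i]) continue; // skip empty strings at the edges
    // with one capture group, odd indexes hold the captured matches
    result.push({ text: parts[i], match: i % 2 === 1 });
  }
  return result;
}

var pieces = splitOnTerm("foo bar foo", "foo");
console.log(pieces);
```

Each piece flagged with `match: true` would become a highlight span, and every other piece a plain text node.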
jasonslyvia
2

I'd give the Highlight jQuery plugin a shot.

Chris Doggett
  • I saw that, but it does a temporary highlight. I need to keep the terms highlighted. Also, the fade effect probably wouldn't be a good idea with dozens or hundreds of matches on a page. – Nelson Rothermel Jul 13 '10 at 21:14
  • You may have clicked it before I edited it. I had the wrong link originally. The one in jQueryUI is indeed temporary, but the one on johannburkard.de is permanent until you call removeHighlight(), and doesn't have a fade effect. – Chris Doggett Jul 13 '10 at 21:26
  • The new link works as expected. I may end up using this in the end, but galambalazs answered my questions more directly. – Nelson Rothermel Jul 13 '10 at 21:44
1

I spent hours searching the web for code that could highlight search terms as the user types, and none could do what I wanted until I combined a bunch of stuff together to do this (jsfiddle demo here):

$.fn.replaceText = function(search, replace, text_only) {
    //http://stackoverflow.com/a/13918483/470749
    return this.each(function(){  
        var v1, v2, rem = [];
        // addBack() (jQuery 1.8+) replaces andSelf(), which was removed in jQuery 3.0
        $(this).find("*").addBack().contents().each(function(){
            if(this.nodeType === 3) {
                v1 = this.nodeValue;
                v2 = v1.replace(search, replace);
                if(v1 != v2) {
                    if(!text_only && /<.*>/.test(v2)) {  
                        $(this).before( v2 );  
                        rem.push(this);  
                    } else {
                        this.nodeValue = v2;  
                    }
                }
            }
        });
        if(rem.length) {
            $(rem).remove();
        }
    });
};

function replaceParentsWithChildren(parentElements){
    parentElements.each(function() {
        var parent = this;
        var grandparent = parent.parentNode;
        $(parent).replaceWith(parent.childNodes);
        grandparent.normalize();//merge adjacent text nodes
    });
}

function highlightQuery(query, highlightClass, targetSelector, selectorToExclude){
    replaceParentsWithChildren($('.' + highlightClass));//Remove old highlight wrappers.
    $(targetSelector).replaceText(new RegExp(query, "gi"), function(match) {
        return '<span class="' + highlightClass + '">' + match + "</span>";
    }, false);
    replaceParentsWithChildren($(selectorToExclude + ' .' + highlightClass));//Do not highlight children of this selector.
}
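One thing to watch with this approach is that the replaced string is handed to `.before()` and parsed as HTML, so text that happens to contain `<` or `&` may be misread as markup. Below is a rough sketch of one way to guard against that by escaping every piece before wrapping the matches. The helpers `escapeHtml` and `highlightToHtml` are hypothetical, and this swaps the replace callback for a split on a capture group.

```javascript
// Hypothetical helper: escape characters that are special in HTML.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Hypothetical helper: build an HTML string with matches wrapped in a span.
function highlightToHtml(text, term, highlightClass) {
  var escapedTerm = term.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  // split on a capture group: odd indexes are the matches
  var parts = text.split(new RegExp("(" + escapedTerm + ")", "gi"));
  return parts.map(function (part, i) {
    var safe = escapeHtml(part);
    return i % 2 === 1
      ? '<span class="' + highlightClass + '">' + safe + "</span>"
      : safe;
  }).join("");
}

var html = highlightToHtml("a < b & B", "b", "hl");
console.log(html);
```

The result could then be inserted in place of the original text node, with the literal `<` and `&` surviving as text.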
Ryan
0

My reputation is not high enough to comment or to add more links, so I apologize for writing a new answer without all the references.

I was interested in the performance of the solutions mentioned above and added some timing code. To keep it simple, I added only these lines:

var start = new Date();
// highlighting code goes here ...
var end = new Date();
var ms = end.getTime() - start.getTime();
jQuery("#time-ms").text(ms);

I forked Anurag's solution, added these lines, and measured 40-60 ms on average.

So I forked this fiddle and made some improvements to fit my needs. One was RegEx escaping (please see the answer from CoolAJ86 in "escape-string-for-use-in-javascript-regex" on Stack Overflow). Another was avoiding a second 'new RegExp()': RegExp.test returns on the first match, so the global regex can be reused for the test (please see the JavaScript reference on RegExp.test). Strictly speaking, test does not ignore the global flag; it advances lastIndex after a match, so reusing the regex is only safe if lastIndex is back at 0 before the next call, as it is after a global replace.
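A short sketch of how `test` behaves on a regex with the global flag, in case anyone reuses one regex for both the test and the replace. The sample strings are made up; the behavior shown is that of the built-in `RegExp` and `String` methods.

```javascript
var re = /foo/gi;

// A successful test() on a global regex advances lastIndex past the match.
var first = re.test("foo bar");          // true
var indexAfterFirst = re.lastIndex;      // 3

// The next test() starts searching at lastIndex, so this one misses
// the match at index 0 and resets lastIndex to 0.
var second = re.test("foo");             // false
var indexAfterSecond = re.lastIndex;     // 0

// A global replace() starts from index 0 regardless of lastIndex
// and leaves lastIndex at 0 afterwards.
re.test("foo bar");                      // lastIndex is 3 again
var replaced = "foo bar".replace(re, "[$&]"); // "[foo] bar"
var indexAfterReplace = re.lastIndex;    // 0

console.log(first, indexAfterFirst, second, indexAfterSecond, replaced, indexAfterReplace);
```

So a single global regex works as long as every successful `test` is followed by the `replace`; otherwise setting `re.lastIndex = 0` before each `test` avoids the stale state.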

On my machine (Chromium, Linux) the runtime is about 30-50 ms. You can test this yourself in this jsfiddle.

I also added my timers to the highest-rated solution, by galambalazs; you can find it in this jsFiddle. That one, however, has runtimes of 60-100 ms.

The values in milliseconds become even higher, and matter much more, when running in other browsers (e.g. in Firefox, about a quarter of a second).

Sbl
0

Here's a naive implementation that just blasts in HTML for any match:

<!DOCTYPE html>
<html lang="en">
<head>
    <title>Select Me</title>
    <style>
        .highlight {
            background:#FF0;
        }
    </style>
    <script type="text/javascript" src="http://ajax.microsoft.com/ajax/jquery/jquery-1.4.2.min.js"></script>
    <script type="text/javascript">

        $(function () {

            highlightKeyword('adipisicing');

        });

        function highlightKeyword(keyword) {

            var replacement = '<span class="highlight">' + keyword + '</span>';
            var search = new RegExp(keyword, "gi");
            var newHtml = $('body').html().replace(search, replacement);
            $('body').html(newHtml);
        }

    </script>
</head>
<body>
    <div>

        <p>Lorem ipsum dolor sit amet, consectetur <b>adipisicing</b> elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.</p>
        <p>Lorem ipsum dolor sit amet, <em>consectetur adipisicing elit</em>, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.</p>
        <p>Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.</p>

    </div>
</body>
</html>
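To show why this is called naive: replacing on the raw HTML string also hits tag and attribute names. A minimal sketch on a made-up snippet of markup, using the same kind of replacement as above but without touching the DOM:

```javascript
var html = '<p class="note">A class act</p>';
var keyword = "class";

var replacement = '<span class="highlight">' + keyword + '</span>';
var search = new RegExp(keyword, "gi");

// The keyword is replaced inside the tag as well as in the text.
var broken = html.replace(search, replacement);

console.log(broken);
// <p <span class="highlight">class</span>="note">A <span class="highlight">class</span> act</p>
```

Keywords such as "span", "div" or "class" will therefore corrupt the markup, which is what the text-node approaches in the other answers avoid.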
a7drew