I'm trying to parse a string as HTML, make some changes, and then turn it back into plain text.

The change I want to make is basically replacing one term with another (e.g. 'old' with 'new').

I only want to change the text nodes, not the attributes and properties of HTML elements (e.g. the href values should not be changed).

Please see the code:

var h = '<span myAtrib="something"><a href="old(this SHOULD NOT be changed)">old(this SHOULD be changed)</a></span>';

h = $("<div>" + h + "</div>").find('span[myAtrib="something"]').html(function (i, oldHtml) {
    // Replaces in the serialized HTML, so attribute values such as href change too
    return oldHtml.replace(/old/g, 'new');
}).end().html();

How can I make the replacement happen only on text nodes, or at least, how can I filter out the anchor elements?

Iryn
  • 1
    [TH̘Ë͖́̉ ͠P̯͍̭O̚​N̐Y̡ H̸̡̪̯ͨ͊̽̅̾̎Ȩ̬̩̾͛ͪ̈́̀́͘ ̶̧̨̱̹̭̯ͧ̾ͬC̷̙̲̝͖ͭ̏ͥͮ͟Oͮ͏̮̪̝͍M̲̖͊̒ͪͩͬ̚̚͜Ȇ̴̟̟͙̞ͩ͌͝S̨̥̫͎̭ͯ̿̔̀ͅ](http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags) – adeneo Dec 09 '13 at 03:07
  • 1
    @adeneo, yes, I have read that answer and I love it. But I haven't figured out how it relates to my question. I don't think I'm parsing HTML with regex... – Iryn Dec 09 '13 at 05:09

2 Answers

How should I make the replacement happen only on text nodes

By recursively iterating over all child nodes/descendants and testing whether the node is a text node or not:

function replace($element, from, to) {
    $element.contents().each(function() {
        if (this.nodeType === 3) { // text node
            this.nodeValue = this.nodeValue.replace(from, to);
        }
        else if (this.nodeType === 1) { // element node
            replace($(this), from, to);
        }
    });
}
var $ele = $("<div>" + h + "</div>").find('span[myAtrib="something"]');
replace($ele, /old/g, "new");
var html = $ele.end().html(); // .end() returns to the wrapping <div>

You can easily do this without jQuery as well.
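
A minimal sketch of what a jQuery-free version might look like. It is not part of the original answer: the helper name `replaceInTextNodes` and the `container` variable are hypothetical, and the usage part assumes a browser environment where `document` exists.

```javascript
// Hypothetical helper (an illustration, not the answer's code): walks a DOM
// subtree and applies String.prototype.replace to text nodes only.
function replaceInTextNodes(node, from, to) {
    var children = node.childNodes;
    for (var i = 0; i < children.length; i++) {
        var child = children[i];
        if (child.nodeType === 3) {          // text node
            child.nodeValue = child.nodeValue.replace(from, to);
        } else if (child.nodeType === 1) {   // element node: recurse into it
            replaceInTextNodes(child, from, to);
        }
        // other node types (comments etc.) and all attributes are left alone
    }
}

// Browser-only usage; guarded so the snippet also loads where there is no DOM.
if (typeof document !== 'undefined') {
    var h = '<span myAtrib="something"><a href="old(this SHOULD NOT be changed)">old(this SHOULD be changed)</a></span>';
    var container = document.createElement('div');
    container.innerHTML = h;
    var span = container.querySelector('span[myAtrib="something"]');
    replaceInTextNodes(span, /old/g, 'new');
    console.log(container.innerHTML);
}
```

Because the function only touches `nodeValue` on text nodes, attribute values such as `href` are never passed through the replacement.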

Felix Kling
  • Do you see any major performance issues with http://jsfiddle.net/arunpjohny/nawSW/1/ (filter + loop removed: http://jsfiddle.net/arunpjohny/nawSW/2/) – Arun P Johny Dec 09 '13 at 03:13
  • Looks OK. I just don't like `.find('*')` in general, especially if you don't know how the HTML is structured. – Felix Kling Dec 09 '13 at 03:24
Another way to do the same thing:

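// .find('*').addBack() selects the span plus all its descendants; .contents() then yields their child nodes
h = $("<div id='root'>" + h + "</div>").find('span[myAtrib="something"]').find('*').addBack().contents().each(function () {
    if (this.nodeType == 3 && $.trim(this.nodeValue) != '') { // non-empty text node
        this.nodeValue = this.nodeValue.replace(/old/g, 'new');
    }
}).end().end().end().end().html(); // four .end() calls unwind back to the wrapping <div>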

Arun P Johny
  • I don't want to totally escape elements. Please see my question again: `old(this SHOULD be changed)`. – Iryn Dec 09 '13 at 03:18
  • @Iryn the output generated by this is `new(this SHOULD be changed) `. I hope this is what you are looking for. – Arun P Johny Dec 09 '13 at 03:21
  • Yes, but when I use it with my filter (`span[myAtrib="something"]`), then it does not work. – Iryn Dec 09 '13 at 03:24
  • @Iryn can you edit & share the fiddle to show what you are trying – Arun P Johny Dec 09 '13 at 03:26
  • Thanks, Arun, but there seem to be cases that the code does not handle properly. Please see [this](http://jsfiddle.net/nawSW/5/) and [this](http://jsfiddle.net/nawSW/4/). I would be glad to hear your comments on them. Thanks again. – Iryn Dec 09 '13 at 05:04
  • @Iryn see http://jsfiddle.net/arunpjohny/nawSW/6/ and http://jsfiddle.net/arunpjohny/nawSW/7/ – Arun P Johny Dec 09 '13 at 05:20
  • I believe that four `.end()` should be used in that case (see http://jsfiddle.net/nawSW/8/). Now, will that cover all the other cases, as well? Anyway, thanks a lot. – Iryn Dec 09 '13 at 05:41
  • @Iryn yes, it will. In the previous cases the direct children of `span` were omitted. – Arun P Johny Dec 09 '13 at 05:42
  • So far, I have not found any cases where this code fails, and a jsPerf test showed me that it's faster than the other answer (please let me know if I am wrong). So I chose this as the answer. For others reading this question, here is the latest version: [http://jsfiddle.net/nawSW/9/](http://jsfiddle.net/nawSW/9/) – Iryn Dec 09 '13 at 06:59
  • @Iryn also, [this](http://jsfiddle.net/arunpjohny/nawSW/10/) might be a little cleaner. – Arun P Johny Dec 09 '13 at 07:11