How do I select text nodes with jQuery?

Question

I would like to get all descendant text nodes of an element, as a jQuery collection. What is the best way to do that?

Tim Down · Accepted Answer · 2015-02-12T09:47:04.257

273

jQuery doesn't have a convenient function for this. You need to combine contents(), which will give just child nodes but includes text nodes, with find(), which gives all descendant elements but no text nodes. Here's what I've come up with:

var getTextNodesIn = function(el) {
    return $(el).find(":not(iframe)").addBack().contents().filter(function() {
        return this.nodeType == 3;
    });
};

getTextNodesIn(el);

Note: If you're using jQuery 1.7 or earlier, the code above will not work. To fix this, replace addBack() with andSelf(). andSelf() is deprecated in favour of addBack() from 1.8 onwards.
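
If the same code has to run against both old and new jQuery versions, one option is to pick the method by feature detection rather than editing the call. This is a minimal sketch, not part of the original answer: `addBackMethodName` and `getTextNodesWith` are hypothetical names, and the jQuery function is passed in as a parameter so the helper does not assume a global `$`.

```javascript
// Hypothetical helper: returns the name of whichever method this jQuery
// build provides (addBack from 1.8 onwards, andSelf before that).
function addBackMethodName(jqProto) {
  return typeof jqProto.addBack === "function" ? "addBack" : "andSelf";
}

// Hypothetical variant of getTextNodesIn that receives jQuery as an argument.
function getTextNodesWith(jq, el) {
  var method = addBackMethodName(jq.fn);
  return jq(el).find(":not(iframe)")[method]().contents().filter(function () {
    return this.nodeType === 3; // 3 is the text node type
  });
}
```

In a page that loads jQuery, this would be called as `getTextNodesWith(jQuery, el)`.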

This is somewhat inefficient compared to pure DOM methods and has to include an ugly workaround for jQuery's overloading of its contents() function (thanks to @rabidsnail in the comments for pointing that out), so here is a non-jQuery solution using a simple recursive function. The includeWhitespaceNodes parameter controls whether or not whitespace-only text nodes are included in the output (the jQuery version above does not filter them out).

Update: Fixed bug when includeWhitespaceNodes is falsy.

function getTextNodesIn(node, includeWhitespaceNodes) {
    var textNodes = [], nonWhitespaceMatcher = /\S/;

    function getTextNodes(node) {
        if (node.nodeType == 3) {
            if (includeWhitespaceNodes || nonWhitespaceMatcher.test(node.nodeValue)) {
                textNodes.push(node);
            }
        } else {
            for (var i = 0, len = node.childNodes.length; i < len; ++i) {
                getTextNodes(node.childNodes[i]);
            }
        }
    }

    getTextNodes(node);
    return textNodes;
}

getTextNodesIn(el);
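
The same traversal can also be written with an explicit stack, which avoids deep recursion on heavily nested documents. This is a minimal sketch rather than the answer's own code: `collectTextNodes` and `sampleTree` are hypothetical names, and the plain-object tree only stands in for real DOM nodes so the example runs outside a browser.

```javascript
// Hypothetical iterative variant of getTextNodesIn. It only reads nodeType,
// nodeValue and childNodes, so it accepts DOM nodes as well as plain objects
// shaped like them.
function collectTextNodes(root, includeWhitespaceNodes) {
  var textNodes = [];
  var stack = [root];
  var nonWhitespace = /\S/;

  while (stack.length > 0) {
    var node = stack.pop();
    if (node.nodeType === 3) {
      if (includeWhitespaceNodes || nonWhitespace.test(node.nodeValue)) {
        textNodes.push(node);
      }
    } else if (node.childNodes) {
      // Push children in reverse so the first child is popped first,
      // which keeps the result in document order.
      for (var i = node.childNodes.length - 1; i >= 0; --i) {
        stack.push(node.childNodes[i]);
      }
    }
  }
  return textNodes;
}

// Plain-object stand-in for <div>a <b>b</b> </div>.
var sampleTree = {
  nodeType: 1,
  childNodes: [
    { nodeType: 3, nodeValue: "a " },
    { nodeType: 1, childNodes: [{ nodeType: 3, nodeValue: "b" }] },
    { nodeType: 3, nodeValue: " " }
  ]
};

console.log(collectTextNodes(sampleTree).length);       // whitespace-only node skipped
console.log(collectTextNodes(sampleTree, true).length); // whitespace-only node kept
```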

edited Feb 12 '15 at 09:47

answered Dec 09 '10 at 15:13

Tim Down

Can the element passed in, be the name of a div? – crosenblum Feb 10 '11 at 15:56
@crosenblum: You could call `document.getElementById()` first, if that's what you mean: `var div = document.getElementById("foo"); var textNodes = getTextNodesIn(div);` – Tim Down Feb 10 '11 at 16:43
Because of a bug in jQuery if you have any iframes in el you'll need to use .find(':not(iframe)') instead of .find('*') . – bobpoekert Feb 03 '12 at 00:29
@rabidsnail: I think, the use of `.contents()` anyways implies it will search through the iframe as well. I don't see how it could be a bug. – Robin Maben Feb 06 '12 at 11:52
http://bugs.jquery.com/ticket/11275 Whether this is actually a bug seems to be up for debate, but bug or not if you call find('*').contents() on a node that contains an iframe which hasn't been added to the dom you'll get an exception at an undefined point. – bobpoekert Feb 13 '12 at 21:31
@rabidsnail: OK, I think that's at least an annoyance (if not a bug) in jQuery and a point in favour of the plain DOM version. I'll edit my answer. Thanks. – Tim Down Feb 13 '12 at 21:57
[`andSelf()`](http://api.jquery.com/andSelf/) was deprecated in jQuery 1.8, you can use [`addBack()`](http://api.jquery.com/addBack/) instead. – Mottie Feb 14 '13 at 14:03
You could consider `nonWhitespace = /\S/` and `if (includeWhitespaceNodes || nonWhitespace.test(node.nodeValue)) {` which at least boasts greater simplicity (though it would respond differently to empty text nodes, if those are possible). I also think there could be improvement in the regex variable name... something like `whitespaceMatcher` or *something* to indicate what the variable is. – ErikE Nov 18 '13 at 23:29
@ErikE: I like descriptive variable names. I have a feeling I picked `whitespace` to avoid the code having horizontal scrollbars on my browser. – Tim Down Nov 18 '13 at 23:42
@ErikE: I agree with you on both counts and have edited my answer. Empty text nodes are indeed possible but will be treated the same by both `!/^\s*$/.test()` and `/\S/.test()` so there's no problem there. – Tim Down Nov 18 '13 at 23:55
oh, right, it was `*` not `+` so empty nodes were matched before. Glad you liked my suggestions! – ErikE Nov 19 '13 at 00:03
Great suggestion. I would recommend using `Node.TextNode` in place of `3` for better readability. – Ben Siver Mar 28 '14 at 14:58
@BenS: I would but `Node.TEXT_NODE` isn't supported in IE <= 8. – Tim Down Mar 28 '14 at 16:31
This code has a bug in it. Right now when you pass false for including whitespace, it ONLY returns whitespace nodes instead of excluding them. The line `if (includeWhitespaceNodes || !nonWhitespaceMatcher.test(node.nodeValue))` should instead read: `if (includeWhitespaceNodes || nonWhitespaceMatcher.test(node.nodeValue))`. – Brian Geihsler Jun 11 '14 at 18:55
@BrianGeihsler: You're right, thanks. I simplified the regular expression last November but failed to negate the condition. Wish I'd tested it now. – Tim Down Jun 11 '14 at 22:35
@TimDown I tried your method but it gives the nodes out of order. What must be done to have tags in order? I asked a separate question here https://stackoverflow.com/questions/63270123/extracting-text-tags-in-order-how-can-this-be-done – Amanda Aug 06 '20 at 04:16
@Amanda: I think the non-jQuery version will give you nodes in document order. – Tim Down Aug 06 '20 at 08:05
@TimDown You mean the jquery version? I am using cheerio and I am not getting so. – Amanda Aug 06 '20 at 09:01
@TimDown I came across an answer by AKX at https://stackoverflow.com/questions/63270123/extracting-text-tags-in-order-how-can-this-be-done . This seems to work. How is it different from what you proposed in the answer? Could you please explain/help? – Amanda Aug 06 '20 at 09:17
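
The whitespace check discussed in the comments above can be tried in isolation. The sketch below compares the original test, the simplified one, and the accidental negation that was later fixed; the function names are hypothetical and exist only for this comparison.

```javascript
// Three versions of the "keep this text node?" check from the comment thread.
function originalCheck(value) { return !/^\s*$/.test(value); } // before simplification
function simplifiedCheck(value) { return /\S/.test(value); }   // current version
function buggyCheck(value) { return !/\S/.test(value); }       // accidental negation

var samples = ["", " \n\t", "text", "  padded  "];
var results = samples.map(function (value) {
  return {
    value: value,
    original: originalCheck(value),
    simplified: simplifiedCheck(value),
    buggy: buggyCheck(value)
  };
});
console.log(results);
```

For every sample, including the empty string, the original and simplified checks agree, while the negated one gives the opposite answer and so keeps only the whitespace nodes.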

score 222 · Answer 2 · edited Aug 13 '12 at 14:37

Jauco posted a good solution in a comment, so I'm copying it here:

$(elem)
  .contents()
  .filter(function() {
    return this.nodeType === 3; //Node.TEXT_NODE
  });

edited Aug 13 '12 at 14:37

James Westgate

answered Nov 18 '08 at 13:47

Christian Oudard

actually $(elem) .contents() .filter(function() { return this.nodeType == Node.TEXT_NODE; }); is enough – Jauco Jul 11 '09 at 13:53
IE7 doesn't define the Node global, so you have to use this.nodeType == 3, unfortunately: http://stackoverflow.com/questions/1423599/node-textnode-and-ie7 – Christian Oudard Dec 29 '09 at 20:00
Does this not only return the text nodes that are the direct children of the element rather than descendants of the element as the OP requested? – Tim Down Oct 15 '10 at 14:12
I've just noticed that your first answer from 2008 was almost exactly what I independently came up with much later. Why did you edit it? – Tim Down Oct 10 '12 at 22:35
Add `.text()` at the end if you want it to be a string. Otherwise it's still an object, and trying to show it in the document will end up displaying [object Object]. – Shahar Sep 29 '13 at 08:14
@ChristianOudard That would be really easy to polyfill, no? Would make your code a bit more legible. – mpen Jun 10 '14 at 21:28
this will not work when the text node is deep nested inside other elements, because the contents() method only returns the immediate children nodes, https://api.jquery.com/contents/ – MinhajulAnwar Oct 16 '15 at 16:08
@Jauco, nope, not enough! as .contents() returns only the immediate children nodes – MinhajulAnwar Oct 16 '15 at 16:14
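
On the readability point raised in the comments (IE7 has no `Node` global), one way to keep a named constant without relying on `Node` is a local fallback. This is a minimal sketch; `TEXT_NODE` and `isTextNode` are hypothetical names, not part of jQuery.

```javascript
// Use the DOM constant where the environment defines it, otherwise fall back
// to its numeric value, which the DOM specification fixes at 3.
var TEXT_NODE = (typeof Node !== "undefined" && Node.TEXT_NODE) || 3;

// Hypothetical predicate, usable inside a jQuery .filter() callback.
function isTextNode(node) {
  return node.nodeType === TEXT_NODE;
}
```

With jQuery loaded, the filter would then read `$(elem).contents().filter(function () { return isTextNode(this); });`. As the comments above note, this still only covers the immediate children of `elem`.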

He Nrik · Answer 3 · 2012-10-12T15:31:27.630

19

$('body').find('*').contents().filter(function () { return this.nodeType === 3; });

edited Oct 12 '12 at 15:31

answered Oct 19 '11 at 16:07

He Nrik

Salman A · Answer 4 · 2016-01-04T20:48:16.853

jQuery.contents() can be used with jQuery.filter to find all child text nodes. With a little twist, you can find grandchildren text nodes as well. No recursion required:

$(function() {
  var $textNodes = $("#test, #test *").contents().filter(function() {
    return this.nodeType === Node.TEXT_NODE;
  });
  /*
   * for testing
   */
  $textNodes.each(function() {
    console.log(this);
  });
});

div { margin-left: 1em; }

<script src="https://ajax.googleapis.com/ajax/libs/jquery/1.11.1/jquery.min.js"></script>

<div id="test">
  child text 1<br>
  child text 2
  <div>
    grandchild text 1
    <div>grand-grandchild text 1</div>
    grandchild text 2
  </div>
  child text 3<br>
  child text 4
</div>

I tried this. It prints tag names out of order. Is there a way to print tag names in the order they occur? I asked a separate question here https://stackoverflow.com/questions/63276378/in-order-traversal-of-html-tags-how-can-i-do-this — Amanda, Aug 06 '20 at 04:27
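
A possible explanation for the ordering reported in the comment above, assuming `contents()` gathers child nodes one matched element at a time: the text nodes come back grouped by parent rather than in document order. The sketch below imitates both orders on a plain-object copy of the sample markup, with text values trimmed and whitespace-only nodes left out for brevity. `groupedByParent` and `inDocumentOrder` are hypothetical helpers, not jQuery APIs.

```javascript
function text(value) { return { nodeType: 3, nodeValue: value, childNodes: [] }; }
function element(children) { return { nodeType: 1, childNodes: children }; }

// Simplified copy of the #test markup; the empty elements stand for <br>.
var test = element([
  text("child text 1"), element([]), text("child text 2"),
  element([
    text("grandchild text 1"),
    element([text("grand-grandchild text 1")]),
    text("grandchild text 2")
  ]),
  text("child text 3"), element([]), text("child text 4")
]);

// Elements in document order: the root first, then its descendant elements.
function elementsIn(root) {
  var found = [root];
  root.childNodes.forEach(function (child) {
    if (child.nodeType === 1) { found = found.concat(elementsIn(child)); }
  });
  return found;
}

// Rough imitation of $("#test, #test *").contents(): one parent at a time.
function groupedByParent(root) {
  var result = [];
  elementsIn(root).forEach(function (el) {
    el.childNodes.forEach(function (child) {
      if (child.nodeType === 3) { result.push(child.nodeValue); }
    });
  });
  return result;
}

// Depth-first walk: text nodes in the order they appear in the markup.
function inDocumentOrder(node) {
  if (node.nodeType === 3) { return [node.nodeValue]; }
  return node.childNodes.reduce(function (acc, child) {
    return acc.concat(inDocumentOrder(child));
  }, []);
}

console.log(groupedByParent(test));
console.log(inDocumentOrder(test));
```

The first call lists the four "child" texts, then the two "grandchild" texts, then the "grand-grandchild" text. The second call lists them in the order they appear in the markup, which is what a recursive DOM walk produces.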

score 4 · Answer 5 · answered Oct 13 '14 at 18:03

I was getting a lot of empty text nodes with the accepted filter function. If you're only interested in selecting text nodes that contain non-whitespace, try adding a nodeValue condition to your filter function, such as a simple $.trim(this.nodeValue) !== '':

$('element')
    .contents()
    .filter(function(){
        return this.nodeType === 3 && $.trim(this.nodeValue) !== '';
    });

http://jsfiddle.net/ptp6m97v/

Or to avoid strange situations where the content looks like whitespace, but is not (e.g. the soft hyphen &shy; character, newlines \n, tabs, etc.), you can try using a regular expression. For example, \S will match any non-whitespace character:

$('element')
        .contents()
        .filter(function(){
            return this.nodeType === 3 && /\S/.test(this.nodeValue);
        });
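
Note that JavaScript's `\s` class does not include invisible characters such as the soft hyphen (U+00AD) or the zero-width space (U+200B), so `/\S/` still treats a node containing only those as non-blank. If such nodes should be skipped too, the character class can be widened. This is a minimal sketch; `hasVisibleText` is a hypothetical helper, and the list of extra characters is an assumption to adjust to your content.

```javascript
// Matches any character that is neither whitespace nor one of the listed
// invisible characters (soft hyphen U+00AD, zero-width space U+200B).
var visibleCharacter = /[^\s\u00AD\u200B]/;

function hasVisibleText(value) {
  return visibleCharacter.test(value);
}

console.log(/\S/.test("\u00AD"));      // the plain \S check on a lone soft hyphen
console.log(hasVisibleText("\u00AD")); // the widened check on the same input
```

Inside the filter callback this would be used as `return this.nodeType === 3 && hasVisibleText(this.nodeValue);`.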

I tried this. It prints tag names out of order. Is there a way to print tag names in the order they occur? I asked a separate question here https://stackoverflow.com/questions/63276378/in-order-traversal-of-html-tags-how-can-i-do-this — Amanda, Aug 06 '20 at 04:28

colllin · Answer 6 · 2011-04-20T14:59:09.517

3

If you can make the assumption that all children are either Element Nodes or Text Nodes, then this is one solution.

To get all child text nodes as a jQuery collection:

$('selector').clone().children().remove().end().contents();

To get a copy of the original element with non-text children removed:

$('selector').clone().children().remove().end();

edited Apr 20 '11 at 14:59

answered Apr 20 '11 at 14:52

colllin

Just noticed Tim Down's comment on another answer. This solution only gets the direct children, not all descendents. – colllin Apr 20 '11 at 14:58

iConnor · Answer 7 · 2014-02-23T20:19:46.267

For some reason contents() didn't work for me. In case it doesn't work for you either, here's a solution I made: jQuery.fn.descendants, with an option to include text nodes or not.

Usage

Get all descendants including text nodes and element nodes

jQuery('body').descendants('all');

Get all descendants returning only text nodes

jQuery('body').descendants(true);

Get all descendants returning only element nodes

jQuery('body').descendants();

Coffeescript Original:

jQuery.fn.descendants = ( textNodes ) ->

    # if textNodes is 'all' then textNodes and elementNodes are allowed
    # if textNodes is true then only textNodes will be returned
    # if textNodes is not provided as an argument then only element nodes
    # will be returned

    allowedTypes = if textNodes is 'all' then [1,3] else if textNodes then [3] else [1]

    # nodes we find
    nodes = []


    dig = (node) ->

        # loop through children
        for child in node.childNodes

            # push child to collection if has allowed type
            nodes.push(child) if child.nodeType in allowedTypes

            # dig through child if has children
            dig child if child.childNodes.length


    # loop and dig through nodes in the current
    # jQuery object
    dig node for node in this


    # wrap with jQuery
    return jQuery(nodes)

Drop In Javascript Version

var __indexOf=[].indexOf||function(e){for(var t=0,n=this.length;t<n;t++){if(t in this&&this[t]===e)return t}return-1}; /* indexOf polyfill ends here*/ jQuery.fn.descendants=function(e){var t,n,r,i,s,o;t=e==="all"?[1,3]:e?[3]:[1];i=[];n=function(e){var r,s,o,u,a,f;u=e.childNodes;f=[];for(s=0,o=u.length;s<o;s++){r=u[s];if(a=r.nodeType,__indexOf.call(t,a)>=0){i.push(r)}if(r.childNodes.length){f.push(n(r))}else{f.push(void 0)}}return f};for(s=0,o=this.length;s<o;s++){r=this[s];n(r)}return jQuery(i)}

Unminified Javascript version: http://pastebin.com/cX3jMfuD

This is cross browser, a small Array.indexOf polyfill is included in the code.
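
For readers who find the minified version hard to follow, the core of the plugin can be sketched in plain JavaScript. This is an approximation written for illustration, not the author's unminified source: `collectDescendants` is a hypothetical name, it returns a plain array, and it uses the native `Array.prototype.indexOf` rather than the bundled polyfill.

```javascript
// Hypothetical readable equivalent of the plugin's core logic.
// textNodes === "all" -> element and text nodes
// textNodes truthy    -> text nodes only
// textNodes omitted   -> element nodes only
function collectDescendants(roots, textNodes) {
  var allowedTypes = textNodes === "all" ? [1, 3] : textNodes ? [3] : [1];
  var nodes = [];

  function dig(node) {
    for (var i = 0; i < node.childNodes.length; i++) {
      var child = node.childNodes[i];
      // Keep the child if its node type is allowed.
      if (allowedTypes.indexOf(child.nodeType) !== -1) {
        nodes.push(child);
      }
      // Descend into the child if it has children of its own.
      if (child.childNodes && child.childNodes.length) {
        dig(child);
      }
    }
  }

  // roots can be an array or any array-like object, such as a jQuery object.
  for (var r = 0; r < roots.length; r++) {
    dig(roots[r]);
  }
  return nodes;
}
```

Wrapped as a plugin, this would look like `jQuery.fn.descendants = function (textNodes) { return jQuery(collectDescendants(this, textNodes)); };`.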

Mr_Green · Answer 8 · 2013-10-18T05:28:15.263

1

Can also be done like this:

var textContents = $(document.getElementById("ElementId").childNodes).filter(function(){
        return this.nodeType == 3;
});

The above code filters the text nodes from the direct child nodes of a given element.

edited Oct 18 '13 at 05:28

answered Jun 19 '13 at 11:24

Mr_Green

... but not all the *descendant* child nodes (e.g. a text node that is the child of an element that is a child of the original element). – Tim Down Oct 17 '13 at 21:15

score 0 · Answer 9 · answered Aug 23 '13 at 14:41

For me, plain old .contents() appeared to work to return the text nodes. You just have to be careful with your selectors so that you know the contents will be text nodes.

For example, this wrapped all the text content of the TDs in my table with pre tags and had no problems.

jQuery("#resultTable td").contents().wrap("<pre/>");

score 0 · Answer 10 · answered Jun 22 '11 at 18:36

if you want to strip all tags, then try this

function:

String.prototype.stripTags = function() {
    var rtag = /<.*?[^>]>/g;
    return this.replace(rtag, '');
};

usage:

var newText=$('selector').html().stripTags();
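
Extending String.prototype changes every string on the page, so a standalone function may be a safer way to package the same idea. This is a minimal sketch with a simpler pattern than the one above. The `stripTags` function here is a hypothetical helper, and a regular expression is not an HTML parser, so it is only suitable for simple, trusted markup.

```javascript
// Hypothetical standalone helper: removes anything that looks like a tag,
// i.e. a "<", any run of characters other than ">", and the closing ">".
function stripTags(html) {
  return String(html).replace(/<[^>]*>/g, "");
}

console.log(stripTags("<p>Hello <b>world</b></p>")); // prints "Hello world"
```

With jQuery loaded this would be called as `stripTags($('selector').html())`. jQuery's own `.text()` method also returns the combined text content of the matched elements without any tags.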

answered Jun 22 '11 at 18:36

Rahen Rangan

score 0 · Answer 11 · edited Feb 16 '14 at 05:22

I had the same problem and solved it with:

Code:

$.fn.nextNode = function(){
  var contents = $(this).parent().contents();
  return contents.get(contents.index(this)+1);
}

Usage:

$('#my_id').nextNode();

It works like next(), but also returns text nodes.

edited Feb 16 '14 at 05:22

iConnor

answered Jul 30 '11 at 17:47

Guillermo

.nextSibling is from Dom specification: https://developer.mozilla.org/en/Document_Object_Model_(DOM)/Node.nextSibling – Guillermo Feb 14 '12 at 10:15

score 0 · Answer 12 · answered Dec 03 '22 at 17:44

This gets the job done regardless of the tag names. Select your parent element.

It returns the text nodes of the parent and of all its descendants, with no duplicates; uncomment the map() line to get their text as strings instead.

$('parent')
.find(":not(iframe)")
.addBack()
.contents()
.filter(function() {return this.nodeType == 3;})
//.map((i,v) => $(v).text()) // uncomment if you want strings
