How to get the text node of an element?

Question

<div class="title">
   I am text node
   <a class="edit">Edit</a>
</div>

I want to get the text "I am text node" without removing the "edit" tag, and I need a cross-browser solution.

this question is pretty much identical to http://stackoverflow.com/questions/3172166/getting-the-contents-of-an-element-without-its-children - see those answers for a plain JS version of James' answer — Mala, Aug 16 '11 at 04:15

score 94 · Accepted Answer · edited Feb 20 '17 at 06:14

var text = $(".title").contents().filter(function() {
  return this.nodeType == Node.TEXT_NODE;
}).text();

This gets the contents of the selected element, and applies a filter function to it. The filter function returns only text nodes (i.e. those nodes with nodeType == Node.TEXT_NODE).
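For readers who want the same result without jQuery, here is a minimal sketch that is not part of the original answer. The function name `directText` is made up for illustration, and the literal `3` stands in for `Node.TEXT_NODE` so the code does not depend on a global `Node` object being available.

```javascript
// Sketch: concatenate the values of an element's direct text-node children,
// roughly what .contents().filter(...).text() yields for a single element.
var TEXT_NODE = 3; // numeric value of Node.TEXT_NODE

function directText(element) {
  var children = element.childNodes;
  var text = "";
  for (var i = 0; i < children.length; i++) {
    // Skip elements, comments, and so on; keep only text nodes.
    if (children[i].nodeType === TEXT_NODE) {
      text += children[i].nodeValue;
    }
  }
  return text;
}

// Hypothetical browser usage with the markup from the question:
// var text = directText(document.querySelector(".title"));
```

Like the jQuery version, the result still contains the surrounding whitespace and line breaks from the markup, so the caller would probably want to trim it.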

edited Feb 20 '17 at 06:14

Mohammed H

answered Jun 29 '11 at 11:58

James Allardice

@Val - sorry, I missed that off the original code. I will update the answer to show it. You need `text()` because the `filter` function returns the nodes themselves, not the contents of the nodes. – James Allardice Jun 29 '11 at 12:20
Not sure why, but I was unsuccessful when testing the theory above. I ran `jQuery("*").each(function() { console.log(this.nodeType); })` and got **1** for all the node types. – Batandwa Jan 06 '14 at 06:51
Is it possible to get text at the clicked node and text in all its children? – Jenna Kwon Dec 09 '16 at 04:25
This is interesting and solves this problem, but what happens when the situation gets more complex? There is a more flexible way to get the job done. – Anthony Rutledge May 15 '18 at 15:45
Without jQuery: `document.querySelector(".title").childNodes[0].nodeValue` – Balaji Gunasekaran Jun 07 '20 at 18:14
Also, `document.querySelector(".title").childNodes[0].wholeText` returns the combined text of that node and any text nodes directly adjacent to it. – Balaji Gunasekaran Jun 07 '20 at 18:26

score 67 · Answer 2 · answered Jun 29 '11 at 11:58

You can get the nodeValue of the first childNode using

$('.title')[0].childNodes[0].nodeValue

http://jsfiddle.net/TU4FB/

answered Jun 29 '11 at 11:58

Dogbert

While that will work, it depends on the position of the child nodes. If (when) that changes, it will break. – Armstrongest Feb 26 '18 at 23:46
If the text node is not the first child, you may get `null` for a return value. – Anthony Rutledge May 15 '18 at 08:27

score 27 · Answer 3 · edited Dec 03 '19 at 11:52

Another native JS solution that can be useful for "complex" or deeply nested elements is to use NodeIterator. Put NodeFilter.SHOW_TEXT as the second argument ("whatToShow"), and iterate over just the text node children of the element.

var root = document.querySelector('p'),
    iter = document.createNodeIterator(root, NodeFilter.SHOW_TEXT),
    textnode;

// print all text nodes
while (textnode = iter.nextNode()) {
  console.log(textnode.textContent)
}

<p>
<br>some text<br>123
</p>

You can also use TreeWalker. The difference between the two is that NodeIterator is a simple linear iterator, while TreeWalker allows you to navigate via siblings and ancestors as well.
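As a rough illustration of the TreeWalker variant, the sketch below collects every text node under a root element and then narrows the result to direct children. It assumes a browser-like DOM where `root.ownerDocument.createTreeWalker` and the global `NodeFilter` exist; the function names `textNodesUnder` and `directTextNodesUnder` are hypothetical.

```javascript
// Sketch: walk all text nodes below `root` (an Element) in document order.
function textNodesUnder(root) {
  var walker = root.ownerDocument.createTreeWalker(root, NodeFilter.SHOW_TEXT);
  var found = [];
  var node;
  // nextNode() returns null once the walker runs out of matching nodes.
  while ((node = walker.nextNode())) {
    found.push(node);
  }
  return found;
}

// Keep only the text nodes whose parent is `root` itself.
function directTextNodesUnder(root) {
  return textNodesUnder(root).filter(function (node) {
    return node.parentNode === root;
  });
}

// Hypothetical browser usage:
// var nodes = directTextNodesUnder(document.querySelector(".title"));
```

Walking the whole subtree and filtering afterwards is wasteful for large trees; when only top-level text is needed, looping over `childNodes` directly is likely the simpler choice.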

The iterator will pull text nodes deeply, undesirable if you only want the top-level children. — ggorlen, Sep 12 '22 at 17:50

jujule · Answer 4 · 2020-04-22T23:13:25.463

23

ES6 version that returns the content of the first #text node:

const extract = (node) => {
  const text = [...node.childNodes].find(child => child.nodeType === Node.TEXT_NODE);
  return text && text.textContent.trim();
}

edited Apr 22 '20 at 23:13

answered Dec 09 '16 at 00:31

jujule

I am wondering about efficiency and flexibility. (1) The use of `.from()` to make a shallow-copied array instance. (2) The use of `.find()` to do string comparisons using `.nodeName`. Using `node.nodeType === Node.TEXT_NODE` would be better. (3) Rather than returning an empty string when there is no value, returning `null` is more truthful if no *text node* is found. If no text node is found, one may need to create one! If you return an empty string, `""`, you may give the false impression that a text node exists and can be manipulated normally. In essence, returning an empty string is a white lie and best avoided. – Anthony Rutledge May 15 '18 at 18:33
(4) If there is more than one text node in a nodeList, there is no way here to specify which text node you want. You may want the *first* text node, but you very well may want the *last* text node. – Anthony Rutledge May 15 '18 at 18:36
What do you suggest to replace the Array.from ? – jujule May 19 '18 at 00:15
@Snowman please add your own answer for such substantive changes, or make recommendations for OP to give them the opportunity to incorporate them into their answer. – TylerH May 29 '19 at 21:56
@jujule - Better to use `[...node.childNodes]` to convert the *NodeList* into an array – vsync Dec 03 '19 at 11:59
@jujule, great answer, thanks; not sure if it's a typo or specs changed, but now `child.NodeType` should be `child.nodeType` (with lowercase nodeType). – Vaviloff Apr 21 '20 at 08:57

score 19 · Answer 5 · answered Jun 29 '11 at 11:58

If you mean get the value of the first text node in the element, this code will work:

var oDiv = document.getElementById("MyDiv");
var firstText = "";
for (var i = 0; i < oDiv.childNodes.length; i++) {
    var curNode = oDiv.childNodes[i];
    if (curNode.nodeName === "#text") {
        firstText = curNode.nodeValue;
        break;
    }
}

You can see this in action here: http://jsfiddle.net/ZkjZJ/

answered Jun 29 '11 at 11:58

Shadow The GPT Wizard

I think you could use `curNode.nodeType == 3` instead of `nodeName` as well. – Nilloc Jul 21 '17 at 18:57
@Nilloc probably, but what's the gain? – Shadow The GPT Wizard Jul 21 '17 at 19:52
@ShadowWizard @Nilloc The recommended way is to use constants: `curNode.nodeType == Node.TEXT_NODE` (numeric comparison is faster, but `curNode.nodeType == 3` is not readable - what node has number 3?) – mikep Aug 08 '17 at 10:31
@ShadowWizard Use `curNode.nodeType === Node.TEXT_NODE`. This comparison is occurring within a loop of unknown possible iterations. Comparing two small numbers is better than comparing strings of various lengths (time and space considerations). The correct question to ask in this situation is "what kind / type of node do I have?", and not "what name do I have?" https://developer.mozilla.org/en-US/docs/Web/API/Node/nodeType – Anthony Rutledge May 15 '18 at 08:21
@ShadowWizard Also, if you are going to use a loop to sift through `childNodes`, know that an element node can have *more than one* text node. In a generic solution, one might need to specify which instance of a text node within an element node that you want to target (the first, second, third, etc...). – Anthony Rutledge May 15 '18 at 08:49

Anthony Rutledge · Answer 6 · 2020-02-02T05:56:02.200

Pure JavaScript: Minimalist

First off, always keep this in mind when looking for text in the DOM.

MDN - Whitespace in the DOM

This issue will make you pay attention to the structure of your XML / HTML.

In this pure JavaScript example, I account for the possibility of multiple text nodes that could be interleaved with other kinds of nodes. However, initially, I do not pass judgment on whitespace, leaving that filtering task to other code.

In this version, I pass a NodeList in from the calling / client code.

/**
* Gets strings from text nodes. Minimalist. Non-robust. Pre-test loop version.
* Generic, cross platform solution. No string filtering or conditioning.
*
* @author Anthony Rutledge
* @param nodeList The child nodes of a Node, as in node.childNodes.
* @param target A positive whole number >= 1: which text node to read (1 = first).
* @return String The text you targeted, or null if there is no such text node.
*/
function getText(nodeList, target)
{
    var textNodesSeen = 0,
        length = nodeList.length; // Because you may have many child nodes.

    for (var i = 0; i < length; i++) {
        if (nodeList[i].nodeType === Node.TEXT_NODE) {
            textNodesSeen++; // Count text nodes only, not every child node.

            if (textNodesSeen === target) {
                return nodeList[i].nodeValue;  // Done! No need to keep going.
            }
        }
    }

    return null;
}

Of course, by testing node.hasChildNodes() first, there would be no need to use a pre-test for loop.

/**
* Gets strings from text nodes. Minimalist. Non-robust. Post-test loop version.
* Generic, cross platform solution. No string filtering or conditioning.
*
* @author Anthony Rutledge
* @param nodeList The child nodes of a Node, as in node.childNodes. Must not be
*                 empty, so test node.hasChildNodes() before calling.
* @param target A positive whole number >= 1: which text node to read (1 = first).
* @return String The text you targeted, or null if there is no such text node.
*/
function getText(nodeList, target)
{
    var textNodesSeen = 0,
        length = nodeList.length,
        i = 0;

    do {
        if (nodeList[i].nodeType === Node.TEXT_NODE) {
            textNodesSeen++; // Count text nodes only, not every child node.

            if (textNodesSeen === target) {
                return nodeList[i].nodeValue;  // Done! No need to keep going.
            }
        }

        i++;
    } while (i < length);

    return null;
}

Pure JavaScript: Robust

Here the function getTextById() uses two helper functions: getStringsFromChildren() and filterWhitespaceLines().

getStringsFromChildren()

/**
* Collects strings from child text nodes.
* Generic, cross platform solution. No string filtering or conditioning.
*
* @author Anthony Rutledge
* @version 7.0
* @param parentNode An instance of the Node interface, such as an Element object.
* @return Array of strings, or null.
* @throws TypeError if the parentNode is not a Node object.
*/
function getStringsFromChildren(parentNode)
{
    var strings = [],
        nodeList,
        length,
        i = 0;

    if (!(parentNode instanceof Node)) { // Parentheses needed: ! binds tighter than instanceof.
        throw new TypeError("The parentNode parameter expects an instance of a Node.");
    }

    if (!parentNode.hasChildNodes()) {
        return null; // We are done. Node may resemble <element></element>
    }

    nodeList = parentNode.childNodes;
    length = nodeList.length;

    do {
        if ((nodeList[i].nodeType === Node.TEXT_NODE)) {
            strings.push(nodeList[i].nodeValue);
         }

        i++;
    } while (i < length);

    if (strings.length > 0) {
        return strings;
    }

    return null;
}

filterWhitespaceLines()

/**
* Filters an array of strings to remove whitespace lines.
* Generic, cross platform solution.
*
* @author Anthony Rutledge
* @version 6.0
* @param textArray An Array of strings, such as the output of getStringsFromChildren().
* @return Array of strings that are not lines of whitespace, or null.
* @throws TypeError if the textArray param is not of type Array.
*/
function filterWhitespaceLines(textArray) 
{
    var filteredArray = [],
        whitespaceLine = /(?:^\s+$)/; // Non-capturing Regular Expression.

    if (!(textArray instanceof Array)) { // Parentheses needed: ! binds tighter than instanceof.
        throw new TypeError("The textArray parameter expects an instance of Array.");
    }

    for (var i = 0; i < textArray.length; i++) {
        if (!whitespaceLine.test(textArray[i])) {  // If it is not a line of whitespace.
            filteredArray.push(textArray[i].trim());  // Trimming here is fine. 
        }
    }

    if (filteredArray.length > 0) {
        return filteredArray ; // Leave selecting and joining strings for a specific implementation. 
    }

    return null; // No text to return.
}

getTextById()

/**
* Gets strings from text nodes. Robust.
* Generic, cross platform solution.
*
* @author Anthony Rutledge
* @version 6.0
* @param id A String associated with the id property of an Element.
* @return Array of strings, or null.
* Logs a TypeError and returns null if the id param is not of type String.
* Logs a TypeError and returns null if the id param cannot be used to find a node by id.
*/
function getTextById(id) 
{
    var textArray = null;             // The hopeful output.
    var idDatatype = typeof id;       // Only used in a TypeError message.
    var node;                         // The parent node being examined.

    try {
        if (idDatatype !== "string") {
            throw new TypeError("The id argument must be of type String! Got " + idDatatype);
        }

        node = document.getElementById(id);

        if (node === null) {
            throw new TypeError("No element found with the id: " + id);
        }

        textArray = getStringsFromChildren(node);

        if (textArray === null) {
            return null; // No text nodes found. Example: <element></element>
        }

        textArray = filterWhitespaceLines(textArray);

        if (textArray !== null) { // filterWhitespaceLines() returns null, never an empty array.
            return textArray; // Leave selecting and joining strings for a specific implementation.
        }
    } catch (e) {
        console.log(e.message);
    }

    return null; // No text to return.
}

Next, the return value (Array, or null) is sent to the client code where it should be handled. Hopefully, the array should have string elements of real text, not lines of whitespace.

Empty strings ("") are not returned because you need a text node to properly indicate the presence of valid text. Returning ("") may give the false impression that a text node exists, leading someone to assume that they can alter the text by changing the value of .nodeValue. This is false, because a text node does not exist in the case of an empty string.

Example 1:

<p id="bio"></p> <!-- There is no text node here. Return null. -->

Example 2:

<p id="bio">

</p> <!-- There is a whitespace-only text node here (newline characters). -->

The problem comes in when you want to make your HTML easy to read by spacing it out. Now, even though there is no human readable valid text, there are still text nodes with newline ("\n") characters in their .nodeValue properties.

Humans see examples one and two as functionally equivalent--empty elements waiting to be filled. The DOM is different than human reasoning. This is why the getStringsFromChildren() function must determine if text nodes exist and gather the .nodeValue values into an array.

for (var i = 0; i < length; i++) {
    if (nodeList[i].nodeType === Node.TEXT_NODE) {
        strings.push(nodeList[i].nodeValue);
    }
}

In example two, a whitespace-only text node does exist, and getStringsFromChildren() will return its .nodeValue (newline characters). However, filterWhitespaceLines() uses a regular expression to filter out lines of pure whitespace characters.

Is returning null instead of newline ("\n") characters a form of lying to the client / calling code? In human terms, no. In DOM terms, yes. However, the issue here is getting text, not editing it. There is no human text to return to the calling code.

One can never know how many newline characters might appear in someone's HTML. Creating a counter that looks for the "second" newline character is unreliable. It might not exist.

Of course, further down the line, the issue of editing text in an empty <p></p> element with extra whitespace (example 2) might mean destroying (maybe, skipping) all but one text node between a paragraph's tags to ensure the element contains precisely what it is supposed to display.

Regardless, except for cases where you are doing something extraordinary, you will need a way to determine which text node's .nodeValue property has the true, human readable text that you want to edit. filterWhitespaceLines() gets us halfway there.

var whitespaceLine = /(?:^\s+$)/; // Non-capturing Regular Expression.

for (var i = 0; i < textArray.length; i++) {
    if (!whitespaceLine.test(textArray[i])) {  // If it is not a line of whitespace.
        filteredArray.push(textArray[i].trim());  // Trimming here is fine.
    }
}

At this point you may have output that looks like this:

["Dealing with text nodes is fun.", "Some people just use jQuery."]

There is no guarantee that these two strings are adjacent to each other in the DOM, so joining them with .join() might make an unnatural composite. Instead, in the code that calls getTextById(), you need to choose which string you want to work with.
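One way the calling code could tell whether two strings were neighbours is to keep each string's position among the parent's child nodes. The following is only a sketch under that assumption; `getIndexedStrings` is a hypothetical helper, not one of the functions above, and it uses the literal `3` for `Node.TEXT_NODE`.

```javascript
// Sketch: return the trimmed text of each non-whitespace direct text node,
// paired with its index in parentNode.childNodes.
function getIndexedStrings(parentNode) {
  var TEXT_NODE = 3; // numeric value of Node.TEXT_NODE
  var children = parentNode.childNodes;
  var results = [];
  for (var i = 0; i < children.length; i++) {
    var child = children[i];
    // /\S/ matches any non-whitespace character, so blank nodes are skipped.
    if (child.nodeType === TEXT_NODE && /\S/.test(child.nodeValue)) {
      results.push({ index: i, text: child.nodeValue.trim() });
    }
  }
  return results;
}
```

If two entries have indexes that differ by more than one, some other node (an element, a comment, or a whitespace-only text node) sits between them, which is a hint that joining them may not read naturally.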

Test the output.

try {
    var strings = getTextById("bio");

    if (strings === null) {
        // Do something.
    } else if (strings.length === 1) {
        // Do something with strings[0]
    } else { // Could be another else if
        // Do something. It all depends on the context.
    }
} catch (e) {
    console.log(e.message);
}

One could add .trim() inside of getStringsFromChildren() to get rid of leading and trailing whitespace (or to turn a bunch of spaces into a zero-length string, ""), but how can you know a priori what every application may need to have happen to the text (string) once it is found? You can't, so leave that to a specific implementation, and let getStringsFromChildren() be generic.

There may be times when this level of specificity (the target and such) is not required. That is great. Use a simple solution in those cases. However, a generalized algorithm enables you to accommodate simple and complex situations.

Pranay Rana · Answer 7 · 2011-06-29T12:29:17.213

3

Using jQuery's .text():

$('.title').clone()    //clone the element
.children() //select all the children
.remove()   //remove all the children
.end()  //again go back to selected element
.text();    //get the text of element

edited Jun 29 '11 at 12:29

answered Jun 29 '11 at 11:53

Pranay Rana

I think the method for standard javascript must be 'innerText' – Reporter Jun 29 '11 at 11:55
This doesn't work the way the OP wants - it will get the text within the `a` element too: http://jsfiddle.net/ekHJH/ – James Allardice Jun 29 '11 at 12:00
@James Allardice - I am done with the jquery solution now this will work................. – Pranay Rana Jun 29 '11 at 12:19
That will nearly work, but you are missing the `.` at the start of your selector, meaning you actually get the text of the `title` element, not elements with `class="title"` – James Allardice Jun 29 '11 at 12:23
@reporter `.innerText` is an old IE convention only recently adopted. In terms of standard DOM scripting, `node.nodeValue` is how one grabs the text of a text node. – Anthony Rutledge May 15 '18 at 18:43
Not recommended if you have something to do inside of a loop, or desire flexibility. – Anthony Rutledge May 15 '18 at 18:57
@AnthonyRutledge Take a look on the date from answer and my comment :-) – Reporter May 17 '18 at 15:09
@reporter My comment was more for the casual reader. I noticed the 2011 date when I commented. – Anthony Rutledge May 17 '18 at 16:42

score 3 · Answer 8 · answered May 09 '22 at 19:02

Simply via Vanilla JavaScript:

const el = document.querySelector('.title');
const text = el.firstChild.textContent.trim();

answered May 09 '22 at 19:02

AliN11

This is hardcoded to OP's single use case but doesn't generalize if the text node(s) is anywhere else in the children. – ggorlen Sep 12 '22 at 17:15

score 2 · Answer 9 · edited Aug 23 '17 at 18:35

This also ignores whitespace, so you never get blank text nodes. The code uses core JavaScript.

var oDiv = document.getElementById("MyDiv");
var firstText = "";
var whitespace = /^\s*$/; // Matches empty or whitespace-only strings.
for (var i = 0; i < oDiv.childNodes.length; i++) {
    var curNode = oDiv.childNodes[i];
    if (curNode.nodeName === "#text" && !(whitespace.test(curNode.nodeValue))) {
        firstText = curNode.nodeValue;
        break;
    }
}

Check it on jsfiddle: http://jsfiddle.net/webx/ZhLep/

edited Aug 23 '17 at 18:35

Nakilon

answered Nov 12 '13 at 01:53

webx

`curNode.nodeType === Node.TEXT_NODE` would be better. Using string comparison and a regular expression within a loop is a low performing solution, especially as the magnitude of `oDiv.childNodes.length` increases. This algorithm solves the OP's specific question, but, potentially, at a terrible performance cost. If the arrangement, or number, of text nodes changes, then this solution cannot be guaranteed to return accurate output. In other words, you cannot target the exact text node you want. You are at the mercy of the HTML structure and the arrangement of text therein. – Anthony Rutledge May 18 '18 at 21:00

score 1 · Answer 10 · answered Feb 09 '16 at 10:35

You can also use XPath's text() node test to get the text nodes only. For example

var target = document.querySelector('div.title');
var iter = document.evaluate('text()', target, null, XPathResult.ORDERED_NODE_ITERATOR_TYPE, null);
var node;
var want = '';

while (node = iter.iterateNext()) {
    want += node.data;
}

score 1 · Answer 11 · answered Apr 20 '23 at 09:29

Here's the non-robust one liner:

Array.from(document.querySelector(".title").childNodes).find(n => n.nodeType == Node.TEXT_NODE).textContent
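The one-liner throws a TypeError when the selector matches nothing or when the element has no text-node child. A slightly more defensive variant might look like the sketch below; `firstTextOrNull` is a hypothetical name, the literal `3` stands in for `Node.TEXT_NODE`, and the code assumes an environment with optional chaining (`?.`) and nullish coalescing (`??`).

```javascript
// Sketch: return the text of the first direct text-node child, or null when
// the element is missing or has no text-node child.
const firstTextOrNull = (el) =>
  Array.from(el?.childNodes ?? [])
    .find((n) => n.nodeType === 3 /* Node.TEXT_NODE */)
    ?.textContent ?? null;

// Hypothetical browser usage:
// const text = firstTextOrNull(document.querySelector(".title"));
```

Returning `null` rather than an empty string keeps "no text node" distinguishable from "a text node that happens to be empty".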

answered Apr 20 '23 at 09:29

br4nnigan

ggorlen · Answer 12 · 2022-09-12T17:53:44.860

There are some overcomplicated solutions here, but the operation is as straightforward as using .childNodes to get children of all node types and .filter to keep those where e.nodeType === Node.TEXT_NODE. Optionally, we may want to do it recursively and/or ignore "empty" text nodes (all whitespace).

These examples convert the nodes to their text content for display purposes, but this is technically a separate step from filtering.

const immediateTextNodes = el =>
  [...el.childNodes].filter(e => e.nodeType === Node.TEXT_NODE);

const immediateNonEmptyTextNodes = el =>
  [...el.childNodes].filter(e =>
    e.nodeType === Node.TEXT_NODE && e.textContent.trim()
  );

const firstImmediateTextNode = el =>
  [...el.childNodes].find(e => e.nodeType === Node.TEXT_NODE);

const firstImmediateNonEmptyTextNode = el =>
  [...el.childNodes].find(e =>
    e.nodeType === Node.TEXT_NODE && e.textContent.trim()
  );

// example usage:
const text = el => el.textContent;
const p = document.querySelector("p");
console.log(immediateTextNodes(p).map(text));
console.log(immediateNonEmptyTextNodes(p).map(text));
console.log(text(firstImmediateTextNode(p)));
console.log(text(firstImmediateNonEmptyTextNode(p)));

// if you want to trim whitespace:
console.log(immediateNonEmptyTextNodes(p).map(e => text(e).trim()));

<p>
  <span>IGNORE</span>
  <b>IGNORE</b>
  foo
  <br>
  bar
</p>

Recursive alternative to a NodeIterator:

const deepTextNodes = el => [...el.childNodes].flatMap(e => 
  e.nodeType === Node.TEXT_NODE ? e : deepTextNodes(e)
);

const deepNonEmptyTextNodes = el =>
  [...el.childNodes].flatMap(e =>
    e.nodeType === Node.TEXT_NODE && e.textContent.trim()
    ? e : deepNonEmptyTextNodes(e)
  );

// example usage:
const text = el => el.textContent;
const p = document.querySelector("p");
console.log(deepTextNodes(p).map(text));
console.log(deepNonEmptyTextNodes(p).map(text));

<p>
  foo
  <span>bar</span>
  baz
  <span><b>quux</b></span>
</p>

Finally, feel free to join the text node array into a string if you wish using .join(""). But as with trimming and text content extraction, I'd probably not bake this into the core filtering function and leave it to the caller to handle as needed.
