299

How can I get a tag in an HTML page if I know the text that the tag contains? E.g.:

<a ...>SearchingText</a>
James Douglas
Anton Kandybo
  • 3
    clean, functional approach returning array https://stackoverflow.com/a/45089849/696535 – Pawel Aug 09 '17 at 18:41

17 Answers

351

You could use XPath to accomplish this:

var xpath = "//a[text()='SearchingText']";
var matchingElement = document.evaluate(xpath, document, null, XPathResult.FIRST_ORDERED_NODE_TYPE, null).singleNodeValue;

You can also search for an element containing some text using this XPath:

var xpath = "//a[contains(text(),'Searching')]";
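
If the search text comes from a variable, a quote character inside it will break the XPath string. XPath 1.0 has no escape sequence inside string literals, so the usual workaround is to switch to the other quote style or fall back to concat(). The sketch below shows one way to do that. `xpathLiteral` and `buildTextXPath` are hypothetical helper names, not part of any browser API, and using normalize-space(.) (which compares the element's whole text, descendants included, with whitespace collapsed) is an assumption about what you want to match.

```javascript
// Hypothetical helper: turn an arbitrary string into an XPath 1.0 string literal.
function xpathLiteral(text) {
  if (!text.includes("'")) return "'" + text + "'";
  if (!text.includes('"')) return '"' + text + '"';
  // Both quote kinds present: glue single-quoted pieces together with concat().
  const pieces = text.split("'").map(piece => "'" + piece + "'");
  return "concat(" + pieces.join(", \"'\", ") + ")";
}

// Hypothetical helper: build an expression matching <tagName> elements by text.
function buildTextXPath(tagName, text, exact = true) {
  const literal = xpathLiteral(text);
  return exact
    ? "//" + tagName + "[normalize-space(.)=" + literal + "]"
    : "//" + tagName + "[contains(normalize-space(.)," + literal + ")]";
}

// Browser-only usage (skipped where there is no DOM):
if (typeof document !== "undefined" && typeof XPathResult !== "undefined") {
  const xpath = buildTextXPath("a", "SearchingText");
  const match = document.evaluate(xpath, document, null,
    XPathResult.FIRST_ORDERED_NODE_TYPE, null).singleNodeValue;
  console.log(match);
}
```

Passing `false` as the third argument produces the contains() form instead of the exact comparison.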
carlin.scott
  • 16
    This should be the top answer. XPath can do much more, like select node by attribute value, select node sets ... Simple intro: https://www.w3schools.com/xml/xpath_syntax.asp – Timathon Dec 02 '17 at 02:43
  • 2
    Question is, what is the performance penalty for this trickery – vsync Mar 01 '18 at 09:43
  • 4
    @vsync I think this will be faster than any of the other answers as the xpath is executed by a browser provided algorithm rather than executing in javascript like all the other answers here. It's an interesting question though. – carlin.scott Mar 02 '18 at 17:46
  • 1
    Seems `Document.evaluate()` [isn't supported](https://developer.mozilla.org/en-US/docs/Web/API/Document/evaluate#Browser_compatibility) in the *IE* browser – vsync Mar 03 '18 at 10:56
  • Why do I get `null` as a result, if I am here at Stackoverflow and I execute in console: `document.evaluate("a[text()='share']", document, null, XPathResult.FIRST_ORDERED_NODE_TYPE, null).singleNodeValue;`? Firefox 63. I see many 'share' links. – Alexander C Nov 02 '18 at 23:36
  • @AlexanderChzhen you can add '//' to the beginning of the xpath to find all nodes regardless of their parent node. I almost always include that in my xpaths so I'm not sure why I left it out of my answer. Perhaps simplicity? It's definitely faster to only search the root document for an element. – carlin.scott Nov 04 '18 at 00:46
  • @carlin.scott, yes, it works. But why do i need to do this? You wrote the original answer without `//`. – Alexander C Nov 04 '18 at 06:54
  • 1
    I don't know why, but somehow ```var xpath = "//a[text()='SearchingText']";``` does not work, while ```var xpath = "//a[contains(text(),'Searching')]";``` does. Be aware of escaped characters, such as \' \'. – Joey Cho Oct 04 '19 at 12:35
  • I've found the `//a[contains(text(),'needle')]` to be a safer option, because any hidden characters / whitespace around the needle will thwart the more explicit `[text()='needle']` (which would return `null` if there's a trailing space, for instance) – ptim Jan 16 '20 at 06:44
  • You could use the normalize-space XPath function to strip the extra invisible whitespace. For example, `//a[normalize-space(text())='SearchingText']`. – carlin.scott Jan 17 '20 at 16:24
  • 1
    This only returns a single node, how to return all nodes? – Daniel Sep 12 '20 at 16:17
  • 6
    @Daniel You would need to change the call to this: ```js var result = document.evaluate(xpath, document, null, XPathResult.ORDERED_NODE_ITERATOR_TYPE, null); var element; while ((element = result.iterateNext())) { // do something with each element } ``` https://developer.mozilla.org/en-US/docs/Web/API/XPathResult/iterateNext – carlin.scott Sep 14 '20 at 21:26
  • Is there any way to make it work on `shadow-root` elements? – Radical Edward Mar 17 '21 at 13:45
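
Regarding the shadow-root question: as far as I know, document.evaluate() does not descend into shadow trees, so a manual walk is one way around it. The following is only a sketch under stated assumptions: `findByTextDeep` is a hypothetical name, it can only reach open shadow roots (element.shadowRoot is null for closed ones), and it matches on an element's direct text nodes rather than on an XPath expression.

```javascript
// Hypothetical helper: collect elements that directly contain `text`,
// descending into open shadow roots as well as regular children.
function findByTextDeep(root, text, results = []) {
  for (const node of Array.from(root.childNodes || [])) {
    if (node.nodeType === 3) { // 3 === text node
      const isElement = root.nodeType === 1;
      if (isElement && node.nodeValue.includes(text) && !results.includes(root)) {
        results.push(root);
      }
    } else if (node.nodeType === 1) { // 1 === element node
      findByTextDeep(node, text, results);
      if (node.shadowRoot) {
        findByTextDeep(node.shadowRoot, text, results);
      }
    }
  }
  return results;
}

// Browser-only usage (skipped where there is no DOM):
if (typeof document !== "undefined") {
  console.log(findByTextDeep(document.body, "SearchingText"));
}
```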
213

You'll have to traverse by hand.

var aTags = document.getElementsByTagName("a");
var searchText = "SearchingText";
var found;

for (var i = 0; i < aTags.length; i++) {
  if (aTags[i].textContent == searchText) {
    found = aTags[i];
    break;
  }
}

// Use `found`.
August Lilleaas
92

Using the most modern syntax available at the moment, it can be done very cleanly like this:

for (const a of document.querySelectorAll("a")) {
  if (a.textContent.includes("your search term")) {
    console.log(a.textContent)
  }
}

Or with a separate filter:

[...document.querySelectorAll("a")]
   .filter(a => a.textContent.includes("your search term"))
   .forEach(a => console.log(a.textContent))

Naturally, legacy browsers won't handle this, but you can use a transpiler if legacy support is needed.
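
Note that includes() is case-sensitive and also sensitive to the exact whitespace in the markup, since line breaks and indentation inside the tag end up in textContent. A minimal sketch of a more forgiving comparison follows. `normalizeText` and `textMatches` are hypothetical names introduced here, and case-insensitive matching is an assumption about the desired behavior.

```javascript
// Hypothetical helper: collapse whitespace runs, trim, and lower-case.
function normalizeText(value) {
  return String(value == null ? "" : value)
    .replace(/\s+/g, " ")
    .trim()
    .toLowerCase();
}

// Hypothetical helper: true when `needle` occurs in `haystack` after normalization.
function textMatches(haystack, needle) {
  return normalizeText(haystack).includes(normalizeText(needle));
}

// Browser-only usage (skipped where there is no DOM):
if (typeof document !== "undefined") {
  const hits = [...document.querySelectorAll("a")]
    .filter(a => textMatches(a.textContent, "your search term"));
  console.log(hits);
}
```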

47

You can use the jQuery :contains() selector:

var element = $( "a:contains('SearchingText')" );
Mouneer
  • I get: `Error: <![EX[["Tried to get element with id of \"%s\" but it is not present on the page","a:contains('SearchingText')"]]]> TAAL[1]` though I have elements with "SearchingText" in them. – Rishabh Agrahari Dec 15 '18 at 17:27
23

A functional approach: it returns an array of all matching elements and trims surrounding whitespace before comparing.

function getElementsByText(str, tag = 'a') {
  return Array.prototype.slice.call(document.getElementsByTagName(tag)).filter(el => el.textContent.trim() === str.trim());
}

Usage

getElementsByText('Text here'); // second parameter is optional tag (default "a")

If you're looking through different tags, e.g. span or button:

getElementsByText('Text here', 'span');
getElementsByText('Text here', 'button');

The default parameter value tag = 'a' will need Babel for old browsers.
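
Keep in mind that textContent also includes the text of descendant elements, so <a><span>Text here</span></a> matches as well. If only the element's own text should count, one option is to compare its direct text nodes instead. This is a sketch, not part of the function above: `ownText` and `getElementsByOwnText` are hypothetical names, and the optional third parameter `root` (any object with getElementsByTagName) is an assumption added for illustration.

```javascript
// Hypothetical helper: concatenate only the direct text-node children of `el`.
function ownText(el) {
  return Array.from(el.childNodes)
    .filter(node => node.nodeType === 3) // 3 === Node.TEXT_NODE
    .map(node => node.nodeValue)
    .join("");
}

// Hypothetical variant of getElementsByText that ignores text inside child elements.
function getElementsByOwnText(str, tag = "a", root = document) {
  return Array.from(root.getElementsByTagName(tag))
    .filter(el => ownText(el).trim() === str.trim());
}
```

Whether nested text should match depends on the use case, so treat this as an alternative rather than a replacement.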

Pawel
  • This is incorrect because it also includes results for all child nodes. I.e. if child node of `a` will contain `str` - `el` will be included into `getElementsByText` result; which is wrong. – avalanche1 Oct 13 '19 at 15:13
  • @avalanche1 it depends if that's undesirable. It might be needed to select by text even if it's wrapped in another tag i.e. – Pawel Oct 13 '19 at 15:28
  • I made `document` into a passed variable `elm` so I could narrow down before calling func, and no reason I can't just pass `document`, but I prefer it that way. Also removed the default `tag = 'a'`. Great answer though! I love how you used the name convention of existing methods. – FreeSoftwareServers Apr 05 '22 at 23:17
22

function findByTextContent(needle, haystack, precise) {
  // needle: String, the string to be found within the elements.
  // haystack: String, a selector to be passed to document.querySelectorAll(),
  //           NodeList, Array - to be iterated over within the function:
  // precise: Boolean, true - searches for that precise string, surrounded by
  //                          word-breaks,
  //                   false - searches for the string occurring anywhere
  var elems;

  // no haystack we quit here, to avoid having to search
  // the entire document:
  if (!haystack) {
    return false;
  }
  // if haystack is a string, we pass it to document.querySelectorAll(),
  // and turn the results into an Array:
  else if ('string' == typeof haystack) {
    elems = [].slice.call(document.querySelectorAll(haystack), 0);
  }
  // if haystack has a length property, we convert it to an Array
  // (if it's already an array, this is pointless, but not harmful):
  else if (haystack.length) {
    elems = [].slice.call(haystack, 0);
  }

  // work out whether we're looking at innerText (IE), or textContent 
  // (in most other browsers)
  var textProp = 'textContent' in document ? 'textContent' : 'innerText',
    // creating a regex depending on whether we want a precise match, or not:
    reg = precise === true ? new RegExp('\\b' + needle + '\\b') : new RegExp(needle),
    // iterating over the elems array:
    found = elems.filter(function(el) {
      // returning the elements in which the text is, or includes,
      // the needle to be found:
      return reg.test(el[textProp]);
    });
  return found.length ? found : false;
}


findByTextContent('link', document.querySelectorAll('li'), false).forEach(function(elem) {
  elem.style.fontSize = '2em';
});

findByTextContent('link3', 'a').forEach(function(elem) {
  elem.style.color = '#f90';
});
<ul>
  <li><a href="#">link1</a>
  </li>
  <li><a href="#">link2</a>
  </li>
  <li><a href="#">link3</a>
  </li>
  <li><a href="#">link4</a>
  </li>
  <li><a href="#">link5</a>
  </li>
</ul>
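
One caveat with the function above: needle is passed to new RegExp() unescaped, so a needle such as "1.5" is interpreted as a pattern, and a needle such as "C++" throws a SyntaxError. If literal matching is intended, the needle can be escaped first. The sketch below uses the hypothetical helper names `escapeRegExp` and `buildNeedleRegExp`; neither is part of the original function.

```javascript
// Hypothetical helper: escape characters that have a special meaning in a RegExp.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper mirroring the `precise` flag of findByTextContent():
// true  -> the needle must be surrounded by word boundaries,
// false -> the needle may occur anywhere.
function buildNeedleRegExp(needle, precise) {
  const source = escapeRegExp(needle);
  return precise === true
    ? new RegExp("\\b" + source + "\\b")
    : new RegExp(source);
}
```

Inside findByTextContent(), the `reg` assignment could then call `buildNeedleRegExp(needle, precise)` instead of building the expression inline.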

Of course, a somewhat simpler way still is:

var textProp = 'textContent' in document ? 'textContent' : 'innerText';

// directly converting the found 'a' elements into an Array,
// then iterating over that array with Array.prototype.forEach():
[].slice.call(document.querySelectorAll('a'), 0).forEach(function(aEl) {
  // if the text of the aEl Node contains the text 'link1':
  if (aEl[textProp].indexOf('link1') > -1) {
    // we update its style:
    aEl.style.fontSize = '2em';
    aEl.style.color = '#f90';
  }
});
<ul>
  <li><a href="#">link1</a>
  </li>
  <li><a href="#">link2</a>
  </li>
  <li><a href="#">link3</a>
  </li>
  <li><a href="#">link4</a>
  </li>
  <li><a href="#">link5</a>
  </li>
</ul>


Let Me Tink About It
David Thomas
15

Simply pass your substring into the following line:

Outer HTML

document.documentElement.outerHTML.includes('substring')

Inner HTML

document.documentElement.innerHTML.includes('substring')

You can use these to search through the entire document and retrieve the tags that contain your search term:

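function get_elements_by_inner(word) {
    // Declare locals explicitly so they don't leak into the global scope.
    const res = [];
    const elems = [...document.getElementsByTagName('a')];
    elems.forEach((elem) => {
        // outerHTML also covers attributes, e.g. the href of a link.
        if (elem.outerHTML.includes(word)) {
            res.push(elem);
        }
    });
    return res;
}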

Usage:

How many times is the user "T3rm1" mentioned on this page?

get_elements_by_inner("T3rm1").length

1

How many times is jQuery mentioned?

get_elements_by_inner("jQuery").length

3

Get all elements containing the word "Cybernetic":

get_elements_by_inner("Cybernetic")


Cybernetic
14

To get the filter method from user1106925 working in IE11 and older, if needed, you can replace the spread operator with:

[].slice.call(document.querySelectorAll("a"))

and the includes call with a.textContent.match("your search term")

which works pretty neatly:

[].slice.call(document.querySelectorAll("a"))
   .filter(function (a) { return a.textContent.match("your search term"); })
   .forEach(function (a) { console.log(a.textContent); });
Alkie
  • 3
    I like this method. You can also `Array.from` instead of `[].slice.call`. For example: `Array.from(document.querySelectorAll('a'))` – Richard May 15 '20 at 18:16
8

You can do this; I'm not sure whether it is recommended, but it works for me:

[... document.querySelectorAll('a')].filter(el => el.textContent.includes('sometext'));
RJ Jaictin
6

I find the newer syntax a little shorter than the other answers, so here's my proposal:

const callback = element => element.innerHTML == 'My research'

const elements = Array.from(document.getElementsByTagName('a'))
// [a, a, a, ...]

const result = elements.filter(callback)

console.log(result)
// [a]


Amin NAIRI
6
document.querySelectorAll('a').forEach(function (item) {
    if (item.innerText == 'SearchingText') {
        console.dir(item);
    }
});
  • 2
    Please don't post code-only answers but add a little textual explanation about how and why your approach works and what makes it different from the other answers given. You may also have a look at our ["How to write a good answer"](https://stackoverflow.com/help/how-to-answer) entry. – ahuemmer Jul 26 '22 at 08:48
5

You can use a TreeWalker to go over the DOM nodes, and locate all text nodes that contain the text, and return their parents:

const findNodeByContent = (text, root = document.body) => {
  const treeWalker = document.createTreeWalker(root, NodeFilter.SHOW_TEXT);

  const nodeList = [];

  while (treeWalker.nextNode()) {
    const node = treeWalker.currentNode;

    if (node.nodeType === Node.TEXT_NODE && node.textContent.includes(text)) {
      nodeList.push(node.parentNode);
    }
  }

  return nodeList;
}

const result = findNodeByContent('SearchingText');

console.log(result);
<a ...>SearchingText</a>
Ori Drori
2

While it's possible to get by the inner text, I think you are heading the wrong way. Is that inner string dynamically generated? If so, you can give the tag a class or -- better yet -- ID when the text goes in there. If it's static, then it's even easier.

Zack Marrapese
2

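This does the job: it returns an array of the elements matching selector whose first child node has text matching text (which is treated as a regular expression).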

function get_nodes_containing_text(selector, text) {
  const elements = [...document.querySelectorAll(selector)];

  return elements.filter(
    (element) =>
      element.childNodes[0]
      && element.childNodes[0].nodeValue
      && RegExp(text, "u").test(element.childNodes[0].nodeValue.trim())
  );
}
avalanche1
2

const el = Array.from(document.body.querySelectorAll('a')).find(elm => elm.textContent.toLowerCase().includes('searching text'));
const el2 = document.evaluate('//a[contains(text(), "text5")]', document, null, XPathResult.FIRST_ORDERED_NODE_TYPE, null).singleNodeValue;
console.log(el, el2);
<a href="#">text1</a>
<a href="#">text2</a>
<a href="#">Searching Text</a>
<a href="#">text3</a>
<a href="#">text4</a>
<a href="#">text5</a>
0

I think you'll need to be a bit more specific for us to help you.

  1. How are you finding this? JavaScript? PHP? Perl?
  2. Can you apply an ID attribute to the tag?

If the text is unique (or really, if it's not, but you'd have to run through an array) you could run a regular expression to find it. Using PHP's preg_match() would work for that.

If you're using JavaScript and can insert an ID attribute, then you can use getElementById('id'). You can then access the returned element's attributes through the DOM: https://developer.mozilla.org/en/DOM/element.1.

Jeff Meyers
0

I just needed a way to get the element that contains a specific text, and this is what I came up with.

Use document.getElementsByInnerText() to get multiple elements (multiple elements might have the same exact text), and use document.getElementByInnerText() to get just one element (first match).

Also, you can localize the search by using an element (e.g. someElement.getElementByInnerText()) instead of document.

You might need to tweak it in order to make it cross-browser or satisfy your needs.

I think the code is self-explanatory, so I'll leave it as it is.

HTMLElement.prototype.getElementsByInnerText = function (text, escape) {
    var nodes  = this.querySelectorAll("*");
    var matches = [];
    for (var i = 0; i < nodes.length; i++) {
        if (nodes[i].innerText == text) {
            matches.push(nodes[i]);
        }
    }
    if (escape) {
        return matches;
    }
    var result = [];
    for (var i = 0; i < matches.length; i++) {
        var filter = matches[i].getElementsByInnerText(text, true);
        if (filter.length == 0) {
            result.push(matches[i]);
        }
    }
    return result;
};
document.getElementsByInnerText = HTMLElement.prototype.getElementsByInnerText;

HTMLElement.prototype.getElementByInnerText = function (text) {
    var result = this.getElementsByInnerText(text);
    if (result.length == 0) return null;
    return result[0];
}
document.getElementByInnerText = HTMLElement.prototype.getElementByInnerText;

console.log(document.getElementsByInnerText("Text1"));
console.log(document.getElementsByInnerText("Text2"));
console.log(document.getElementsByInnerText("Text4"));
console.log(document.getElementsByInnerText("Text6"));

console.log(document.getElementByInnerText("Text1"));
console.log(document.getElementByInnerText("Text2"));
console.log(document.getElementByInnerText("Text4"));
console.log(document.getElementByInnerText("Text6"));
<table>
    <tr>
        <td>Text1</td>
    </tr>
    <tr>
        <td>Text2</td>
    </tr>
    <tr>
        <td>
            <a href="#">Text2</a>
        </td>
    </tr>
    <tr>
        <td>
            <a href="#"><span>Text3</span></a>
        </td>
    </tr>
    <tr>
        <td>
            <a href="#">Special <span>Text4</span></a>
        </td>
    </tr>
    <tr>
        <td>
            Text5
            <a href="#">Text6</a>
            Text7
        </td>
    </tr>
</table>
akinuri