I'm trying to do a case-insensitive search of inner text using Puppeteer.

I've read this: case insensitive xpath contains() possible?

For example, I have these elements:

<div>
 <span>Test One</span>
 <span>Test Two</span>
 <span>Test Three</span>
</div>

I've tried this unsuccessfully:

const element = await page.$x("//span//text()[contains(translate(., 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz'), 'two')]");
vsemozhebuty
kurtko

3 Answers

Your XPath expression is valid, but it selects text nodes (text()) rather than the elements that contain them. page.$x expects the XPath to return elements, so your code does not work. To get the node, query for the span element itself:

const element = await page.$x("//span[contains(translate(text(), 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz'), 'two')]");

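Please note that text() only works reliably for text-only elements: when a node-set is passed to translate(), XPath 1.0 uses only its first node, so only the span's first text node is examined. If you have mixed content (containing elements and text), you should use the string value (. instead of text()):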

const element = await page.$x("//span[contains(translate(., 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz'), 'two')]");
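The difference between text() and . can be illustrated with a toy model in plain JavaScript. This is only a sketch of the idea, not an XPath engine: the node objects, the helper names firstTextChild and stringValue, and the sample markup are all assumptions made up for illustration.

```javascript
// Toy model of a DOM node: an element has `children`, a text node has `text`.
// Models the hypothetical markup <span>Test <b>number</b> Two</span>.
const span = {
  children: [
    { text: 'Test ' },
    { children: [{ text: 'number' }] }, // the <b> element
    { text: ' Two' },
  ],
};

// Roughly what translate(text(), ...) sees: only the first text-node child.
function firstTextChild(node) {
  const first = node.children.find(child => 'text' in child);
  return first ? first.text : '';
}

// Roughly what translate(., ...) sees: the string value, i.e. the text of
// all descendants concatenated in document order.
function stringValue(node) {
  if ('text' in node) return node.text;
  return node.children.map(stringValue).join('');
}

console.log(firstTextChild(span).toLowerCase().includes('two')); // false
console.log(stringValue(span).toLowerCase().includes('two'));    // true
```

In this model the first text node is just "Test ", so the text() variant misses the match, while the string value "Test number Two" contains it.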

To compare the expressions I put them below each other:

//span//text()[contains(translate(., 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz'), 'two')]
//span[contains(translate(text(), 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz'), 'two')]
//span[contains(translate(., 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz'), 'two')]

The first one is the expression (given by you) for the text of the span node. The second one queries the node itself by using text(). The last one uses the string value to query the node.
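If the search term varies, the third expression could be generated rather than typed by hand. The following is a minimal sketch: caseInsensitiveContains is a hypothetical helper, not part of Puppeteer, and it assumes the search term contains no single quote, because XPath 1.0 string literals have no escape syntax.

```javascript
// Hypothetical helper (not part of Puppeteer): builds an XPath 1.0 expression
// that matches <tagName> elements whose string value contains `needle`,
// ignoring ASCII case.
const UPPER = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ';
const LOWER = 'abcdefghijklmnopqrstuvwxyz';

function caseInsensitiveContains(tagName, needle) {
  // XPath 1.0 string literals cannot escape quotes, so this sketch simply
  // rejects needles that contain a single quote.
  if (needle.includes("'")) {
    throw new Error('needle must not contain a single quote');
  }
  // The page text is lowercased by translate(), so the needle must be
  // lowercase as well for contains() to match.
  const lowered = needle.toLowerCase();
  return `//${tagName}[contains(translate(., '${UPPER}', '${LOWER}'), '${lowered}')]`;
}

console.log(caseInsensitiveContains('span', 'Two'));
```

The resulting string can be passed to page.$x in place of the literal expression. Note that translate() only folds the letters listed in its arguments, so this handles A to Z and nothing else; accented or other non-ASCII letters in the page text are not lowercased.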

Thomas Dondorf
  • Do note that whenever you have mixed content (text and markup), it's better to use the string value rather than text nodes: the expression `//span[contains(translate(.,'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz'),'two')]` will select this ` this IS two.` – Alejandro Mar 28 '19 at 21:49
  • @Alejandro Thanks for the comment. I added the expression to the answer. – Thomas Dondorf Mar 29 '19 at 15:50

Not as pretty, but you can use page.evaluateHandle along with a regex to find the element:

const element = await page.evaluateHandle(() =>
    Array.from(document.querySelectorAll("div > span")).find(a => /test two/i.test(a.innerText))
);
spb

Similar to spb's answer, I would do:

const element = await page.evaluateHandle(() =>
 [...document.querySelectorAll('span')].find(s => s.innerText.toLowerCase().match('two'))
)
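One caveat with match('two'): String.prototype.match converts a string argument into a regular expression, so a search term containing characters such as . or ( would be interpreted as regex syntax. A plain substring comparison avoids that. The sketch below uses a hypothetical helper, findByTextIgnoreCase, and plain objects standing in for the span elements so that it runs outside a browser.

```javascript
// Hypothetical helper: returns the first item whose innerText contains
// `needle`, ignoring case. Works on any object with a string `innerText`.
function findByTextIgnoreCase(items, needle) {
  const lowered = needle.toLowerCase();
  // includes() does a literal substring check, so regex metacharacters in
  // the needle have no special meaning.
  return items.find(item => item.innerText.toLowerCase().includes(lowered));
}

// Plain objects stand in for the <span> elements from the question.
const spans = [
  { innerText: 'Test One' },
  { innerText: 'Test Two' },
  { innerText: 'Test Three' },
];

console.log(findByTextIgnoreCase(spans, 'TWO').innerText); // Test Two
console.log(findByTextIgnoreCase(spans, 'four'));          // undefined
```

Inside the callback passed to page.evaluateHandle, the same toLowerCase().includes() comparison could be used directly on the elements returned by document.querySelectorAll.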
pguardiario