
I have the following HTML code:

<html>

<span class='whatever'>

  <div id='xyz'>
    "text1"
    "text2"   <=== I am trying to extract this text
  </div>
</span>
</html>

Is it possible to write an XPath expression that points to the text node containing text2? If so, I can then extract it via .text (Python).

Mugen

2 Answers

1

That really depends on which HTML parser you are using. Most parsers give you a way to read an element's inner HTML or inner text. You can use that and, if you only want text2, filter the result with a regular expression or something similar.

There is another option if you also control the HTML: wrap text2 in a span tag and select that span directly.
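As a rough illustration of the first suggestion, here is a minimal sketch that uses only Python's standard library `html.parser`. The class name `DivTextCollector` and the sample markup (with `id='xyz'` as a normal attribute) are assumptions made for the example, and the sketch does not handle void tags such as `<br>` inside the target element.

```python
from html.parser import HTMLParser

# Sample markup modeled on the question (assumed, with a plain id attribute).
PAGE = """<html>
<span class='whatever'>
  <div id='xyz'>
    "text1"
    "text2"
  </div>
</span>
</html>"""


class DivTextCollector(HTMLParser):
    """Collects the text found inside the element with the given id."""

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.depth = 0      # greater than 0 while inside the target element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if self.depth:
            self.depth += 1             # nested element inside the target
        elif dict(attrs).get("id") == self.target_id:
            self.depth = 1              # entering the target element

    def handle_endtag(self, tag):
        if self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.chunks.append(data)


collector = DivTextCollector("xyz")
collector.feed(PAGE)
collector.close()

inner_text = "".join(collector.chunks)
# Drop blank lines and surrounding whitespace, then take the last entry.
lines = [line.strip() for line in inner_text.splitlines() if line.strip()]
last_line = lines[-1]
print(last_line)
```

The filtering step here is a plain line split; a regular expression would work as well if the wanted text follows a recognizable pattern.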

1

You can extract the complete text with the XPath //div[@id='xyz']/text() and then get the required text with

text.strip().split('\n')[-1].strip()
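For reference, here is a small sketch of that approach using lxml. The library choice is an assumption, since the answer does not name one, and the sample markup is modeled on the question. The second part shows a hypothetical variant in which the two strings are separate text nodes (split by a `<br>`); in that case the XPath predicate `[last()]` can select the final text node directly.

```python
from lxml import html

# Sample markup modeled on the question (assumed, with a plain id attribute).
PAGE = """<html>
<span class='whatever'>
  <div id='xyz'>
    "text1"
    "text2"
  </div>
</span>
</html>"""

tree = html.fromstring(PAGE)

# A trailing /text() step makes xpath() return a list of strings,
# one per text node that is a direct child of the div.
text = "".join(tree.xpath("//div[@id='xyz']/text()"))

# Ignore blank lines so trailing whitespace does not become the "last line".
lines = [line.strip() for line in text.splitlines() if line.strip()]
last_line = lines[-1]
print(last_line)

# Hypothetical variant: the strings are separate text nodes split by <br>.
PAGE_WITH_BR = """<div id='xyz'>
    "text1"
    <br>
    "text2"
</div>"""

tree_br = html.fromstring(PAGE_WITH_BR)
# [last()] keeps only the final text node of the div.
last_node = tree_br.xpath("//div[@id='xyz']/text()[last()]")[0].strip()
print(last_node)
```

Whether `[last()]` helps depends on how the real page splits its text nodes; with a single text node, as in the first sample, it would return the whole block and the line split is still needed.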
JaSON
  • I've been using this method but was wondering if there is a better one. Looks like there isn't. One thing I want to add is that ending the XPath with `/text()` doesn't work with Selenium: Selenium can only return HTML elements, so locating a text node this way raises an error. But I realize that I didn't mention Selenium anywhere in my question, so I guess we can leave this as it stands. – Mugen Feb 23 '21 at 03:54