40

I have a div element:

<div>
   This is some text
   <h1>This is a title</h1>
   <div>Some other content</div>
</div>

What XPath expression should I use to get only the div's own content, without its child elements h1 and div?

//div[not(h1)&not(div)]

Something like that? I cannot figure it out.

nticaric
  • Good question, +1. See my answer for three XPath expressions that probably give you the "content" of the `div` element, which the question does not define. – Dimitre Novatchev Dec 15 '10 at 22:57

4 Answers

62

To get the string value of div use:

string(/div)

This is the concatenation of all text nodes that are descendants of the (top) div element.

To select all text node descendants of div use:

/div//text()

To get only the text nodes that are direct children of div use:

/div/text()

Finally, get the first (and hopefully only) non-whitespace-only text node child of div:

/div/text()[normalize-space()][1]
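
To make the difference between these four expressions concrete, here is a minimal Python sketch run against the markup from the question. It rests on one assumption: the standard library's ElementTree cannot evaluate `text()` node tests, so the sketch does not run the XPath itself. It mirrors each expression with ElementTree's `.text`, `.tail` and `itertext()`, and the variable names are purely illustrative.

```python
import xml.etree.ElementTree as ET

# The markup from the question, parsed with the standard library.
root = ET.fromstring(
    "<div>\n"
    "   This is some text\n"
    "   <h1>This is a title</h1>\n"
    "   <div>Some other content</div>\n"
    "</div>"
)

# string(/div): concatenation of every descendant text node.
string_value = "".join(root.itertext())

# /div//text(): every descendant text node, in document order.
descendant_text = list(root.itertext())

# /div/text(): text nodes that are direct children of div. ElementTree stores
# them as root.text (before the first child) and each child's .tail.
child_text = [t for t in [root.text] + [c.tail for c in root] if t]

# /div/text()[normalize-space()][1]: first child text node that is not
# whitespace-only.
first_real_text = next(t for t in child_text if t.strip(" \t\r\n"))

print(repr(string_value))
print(descendant_text)
print(child_text)
print(repr(first_real_text))
```

Note that the selected node still carries its surrounding line breaks and indentation. Wrapping the expression as `normalize-space(/div/text()[normalize-space()][1])` in XPath, or calling `.strip()` in this sketch, trims them.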
Dimitre Novatchev
  • Your answer only shows how to get the text of either the parent node AND its descendants, or the descendants only. I'm sure the question was how to get ONLY the parent node's text, without descendants (e.g. `This is some text` in the example given in the question). – Eduard Sukharev Mar 24 '20 at 10:26
  • 1
    @EduardSukharev, the answer correctly provides this expression: `/div/text()`. In general, one may not know exactly which of the text-node children of `div` is the wanted one. There may be other, whitespace-only nodes, so one has to provide the position of the node of interest. If it is specified that only one text-node child of `div` is non-whitespace-only, then the expression to select this node is: `/div/text()[normalize-space()][1]`. I have edited the answer. Please reverse your downvote and consider upvoting now. – Dimitre Novatchev Mar 24 '20 at 13:16
  • 1
    My bad, I misunderstood `text nodes` as just `nodes`, which is obviously not what you meant. And, yes, your answer is correct, full and thorough. Thank you. – Eduard Sukharev Mar 24 '20 at 14:13
5

An expression like ./text() retrieves only the text that belongs directly to the context element, not the content of its child elements.
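
As a rough illustration of how the result depends on the context node, the sketch below applies the same lookup to three different elements. It is an approximation, not an XPath evaluation: Python's standard ElementTree does not support `text()`, so the hypothetical helper `own_text` imitates `./text()` using `.text` and `.tail`. The markup is the question's example with the indentation removed to keep the output short.

```python
import xml.etree.ElementTree as ET

def own_text(context):
    """Approximate ./text(): the text nodes directly under the context node."""
    pieces = [context.text] + [child.tail for child in context]
    return [p for p in pieces if p]

# The question's markup, with indentation removed for brevity.
root = ET.fromstring(
    "<div>This is some text<h1>This is a title</h1>"
    "<div>Some other content</div></div>"
)

# The same lookup gives different results for different context nodes.
print(own_text(root))              # context node: outer div
print(own_text(root.find("h1")))   # context node: h1
print(own_text(root.find("div")))  # context node: inner div
```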

Regards, Nitin

Nitin
5

What XPath expression should I use to get only the div's own content, without its child elements h1 and div?

Use this XPath expression:

/div/node()[not(self::h1|self::div)]

It selects every child node of the root div element except the h1 and div elements.
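
Because `node()` matches comments and processing instructions as well as text, this expression can return more than `/div/text()` does. The following Python sketch shows the difference under two assumptions: it imitates both expressions by filtering `childNodes` with the standard library's `xml.dom.minidom` rather than evaluating XPath, and the comment in the markup is added here only for illustration.

```python
from xml.dom import minidom

# The question's markup plus a comment, added here only for illustration.
doc = minidom.parseString(
    "<div>This is some text<!-- a note --><h1>This is a title</h1>"
    "<div>Some other content</div></div>"
)
top = doc.documentElement

# /div/node()[not(self::h1|self::div)]: keep every child node of the root div
# unless it is an h1 or div element.
kept = [
    n for n in top.childNodes
    if not (n.nodeType == n.ELEMENT_NODE and n.tagName in ("h1", "div"))
]

# /div/text(): keep only the text-node children.
text_only = [n for n in top.childNodes if n.nodeType == n.TEXT_NODE]

print([n.nodeName for n in kept])
print([n.nodeName for n in text_only])
```

In this sketch the comment node survives the first filter but not the second, so `text()` is the narrower choice when only text is wanted.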

2

You can use this XPath expression:

./div[1]/text()[1]

To test, I use this online tester: http://xpather.com/
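
One caveat with the positional predicate `text()[1]`: it picks the first text node by position, even if that node is whitespace-only. The Python sketch below illustrates this on a hypothetical variant of the question's markup in which the text follows the `h1`. As an assumption, it imitates `text()` with ElementTree's `.text` and `.tail`, since the standard library does not evaluate that node test, and the helper name `child_text_nodes` is made up for this example.

```python
import xml.etree.ElementTree as ET

def child_text_nodes(elem):
    """Text-node children of elem, in document order (like elem/text())."""
    return [t for t in [elem.text] + [c.tail for c in elem] if t]

# Hypothetical variant of the question's markup: the text follows the h1.
variant = ET.fromstring(
    "<div>\n"
    "   <h1>This is a title</h1>\n"
    "   This is some text\n"
    "</div>"
)
nodes = child_text_nodes(variant)

# text()[1]: first text node by position, here only indentation.
first_by_position = nodes[0]

# text()[normalize-space()][1]: first text node with visible content.
first_non_blank = next(t for t in nodes if t.strip(" \t\r\n"))

print(repr(first_by_position))
print(repr(first_non_blank))
```

For the exact markup in the question the two agree, because the wanted text comes before the first child element.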