I would like to select the innermost div in an HTML document whose id or class contains "content".

What I have tried:

//div[@id[contains(.,'content') and not(*)]]

This works for getting the innermost div with an id containing "content".

Now I want to get the innermost div whose id or class contains "content", whichever of the two attributes the innermost div happens to use.

Sample data:

<body>
<div class="outerContent">
    <div id="moreContent">
        <div class="anotherContent">
            This is what I am looking for.
        </div>
    </div>
</div>
</body>

or

<body>
<div class="outerContent">
    <div id="moreContent">
        <div id="anotherContent">
            This is what I am looking for.
        </div>
    </div>
</div>
</body>

Note that "This is what I am looking for" could be inside a div whose class contains "content" or a div whose id contains "content".

Thank you!

jimbo

3 Answers

Updated.

If I'm understanding your question correctly, this is how I would do it: //descendant::div[last()][contains(@id,'Content')]

If the match on @id must be case-insensitive, wrap @id in the translate function.

JWiley

I'm not totally clear about your exact question, so here are two interpretations.

No other <div/> elements fulfilling the predicates

The <div/> with either @class or @id containing 'content' which does not contain any other <div/> fulfilling this predicate. This allows other markup inside the <div/>.

//descendant::div
  (: either @id or @class contain 'content' :)
  [contains(lower-case(@id), 'content') or contains(lower-case(@class), 'content')]
  (: only inner-most div fulfilling this condition :)
  [last()]

Nothing but text

There must not be any element nodes inside the <div/>.

//descendant::*
  (: only inner-most elements :)
  [last()]
  (: which are a div :)
  [local-name(.) eq 'div']
  (: and either @id or @class contain 'content' :)
  [contains(lower-case(@id), 'content') or contains(lower-case(@class), 'content')]

If you don't have XPath 2.0 support, fn:lower-case() will not be available. In that case, remove it and replace 'content' with 'Content' to match the data you provided.

If you do have XPath 2.0 support, you could also use this predicate, which is easier to extend (adding attributes to the list is simple, and there is less redundant code):

[some $attribute in (@id, @class) satisfies contains(lower-case($attribute), 'content')]
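For readers who want to try the "any attribute in a list" idea outside an XPath 2.0 engine, here is a rough Python analogue using only the standard library. It is a sketch, not part of the answer: the names `ATTRIBUTES` and `has_content_attr` are made up for illustration, and `any(...)` plays the role of `some ... satisfies`.

```python
import xml.etree.ElementTree as ET

# Hypothetical attribute list; extend the tuple to check more attributes.
ATTRIBUTES = ("id", "class")

def has_content_attr(el, needle="content", attributes=ATTRIBUTES):
    """True if any listed attribute of el contains needle, ignoring case."""
    # A missing attribute is treated as an empty string, like XPath does.
    return any(needle in el.get(name, "").lower() for name in attributes)

div = ET.fromstring('<div id="moreContent" title="x"/>')
print(has_content_attr(div))                         # checks id and class
print(has_content_attr(div, attributes=("class",)))  # checks class only
```

As in the XPath predicate, supporting another attribute only means adding its name to the list.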
Jens Erat

This answer uses only XPath 1.0 expressions. My understanding is that XPath 2.0 isn't available.

Use:

//div[contains(@id, 'Content') or contains(@class, 'Content')]
       [not(descendant::div[contains(@id, 'Content') or contains(@class, 'Content')])]

This selects any div element whose id attribute or class attribute has a string value containing "Content", and that has no descendant div elements with these properties.

Note that "the most inner div" may not be unique: many div elements may satisfy the conditions set in the question.

If this is the case, and you need just one such div element (say, the 1st), you can use:

(//div[contains(@id, 'Content') or contains(@class, 'Content')]
        [not(descendant::div[contains(@id, 'Content') or contains(@class, 'Content')])]
)[1]
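To see that the result need not be unique, the same selection can be written procedurally. The sketch below uses only Python's standard library, whose ElementTree XPath subset has no contains(), so the predicate is coded by hand. The helper names and the extra "sideContent" div are assumptions added for illustration, and the match is case-sensitive on "Content" as in the first expression above.

```python
import xml.etree.ElementTree as ET

def matches(el, needle="Content"):
    """True if el is a div whose id or class contains needle (case-sensitive)."""
    return el.tag == "div" and (
        needle in el.get("id", "") or needle in el.get("class", "")
    )

def innermost_matches(root, needle="Content"):
    """Matching divs that have no matching div among their descendants."""
    result = []
    for el in root.iter("div"):  # iter() walks the tree in document order
        if not matches(el, needle):
            continue
        # el.iter() also yields el itself, so exclude it explicitly.
        descendants = (d for d in el.iter("div") if d is not el)
        if not any(matches(d, needle) for d in descendants):
            result.append(el)
    return result

# Illustrative document with two innermost matches (not from the question).
SAMPLE = """
<body>
  <div class="outerContent">
    <div id="moreContent">
      <div class="anotherContent">first</div>
    </div>
    <div id="sideContent">second</div>
  </div>
</body>
"""

root = ET.fromstring(SAMPLE)
found = innermost_matches(root)
labels = [el.get("id") or el.get("class") for el in found]
first = found[0] if found else None  # counterpart of the trailing [1]
print(labels)
```

Here both "anotherContent" and "sideContent" qualify, and taking the first element of the list corresponds to wrapping the expression in parentheses and appending [1].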

XSLT-based verification:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

 <xsl:template match="/">
     <xsl:copy-of select=
     "//div[contains(@id, 'Content') or contains(@class, 'Content')]
       [not(descendant::div[contains(@id, 'Content') or contains(@class, 'Content')])]"/>
 </xsl:template>
</xsl:stylesheet>

When this transformation is applied on the first provided XML document:

<body>
    <div class="outerContent">
        <div id="moreContent">
            <div class="anotherContent">
            This is what I am looking for.
            </div>
        </div>
    </div>
</body>

the XPath expression is evaluated and the result of this evaluation is copied to the output:

<div class="anotherContent">
            This is what I am looking for.
            </div>

With the second document, again the correct result is produced:

<div id="anotherContent">
            This is what I am looking for.
        </div>

Finally, in case the comparison for "Content" should be case-independent, use:

  //div[contains(translate(@id,'CONTE','conte'), 'content')
      or contains(translate(@class,'CONTE','conte'), 'content')
       ]
         [not(descendant::div
               [contains(translate(@id,'CONTE','conte'), 'content')
               or contains(translate(@class,'CONTE','conte'), 'content')
               ]
              )
         ] 
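The translate() call only needs to map the letters that occur in "CONTENT", which is why 'CONTE' and 'conte' are enough. A small Python sketch of the same folding, using str.translate as a stand-in for XPath's translate() (the names `FOLD` and `contains_content` are made up for this illustration):

```python
# Map only the letters of "CONTENT", as in translate(@id,'CONTE','conte').
FOLD = str.maketrans("CONTE", "conte")

def contains_content(value):
    """Case-insensitive test for the substring 'content' in an attribute value."""
    return "content" in value.translate(FOLD)

for value in ("moreContent", "MAIN-CONTENT", "sidebar", ""):
    print(repr(value), contains_content(value))
```

Other uppercase letters are left untouched, which does not matter here because only the substring "content" is being searched for.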
Dimitre Novatchev