3

I have an HTML document from which I need to grab all table elements that are nested five tables deep in the DOM (not to be confused with the fifth child table). The problem is that this five-table-deep structure could be wrapped in any number of div elements, so I can't use an absolute path such as:

/html/body/table/tbody/tr/td/table/tbody/tr/td/table/tbody/tr/td/table/tbody/tr/td/table

For example:

<body>
    <table>
        <table>
            <table>
                <table>
                    <!-- Grab this one -->
                    <table>
                    </table>
                </table>
            </table>
        </table>
    </table>
</body>

Or this:

<body>
    <div> <!-- Could be wrapped more than just once though -->
        <table>
            <table>
                <table>
                    <table>
                        <!-- Grab this one -->
                        <table>
                        </table>
                    </table>
                </table>
            </table>
        </table>
    </div>
</body>
MisterIsaak

4 Answers

4

Use:

(//table[count(ancestor::table) = 4])[1]

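This selects the first table in the document that has exactly four ancestors named table. To select every such table rather than only the first, drop the parentheses and the [1] predicate: //table[count(ancestor::table) = 4]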
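For readers who want to check the "exactly four table ancestors" rule outside an XPath engine, here is a minimal Python sketch that does the same count by hand. It is an illustration, not part of the original answer: the standard-library ElementTree module does not support the ancestor axis or count(), so the walk tracks the number of enclosing tables itself. The function name `tables_at_depth`, the `id` attributes, and the sample markup are all hypothetical, and the input is assumed to be well-formed XML.

```python
import xml.etree.ElementTree as ET


def tables_at_depth(root, ancestor_count=4):
    """Return <table> elements that have exactly `ancestor_count`
    <table> ancestors, in document order (hypothetical helper)."""
    matches = []

    def walk(elem, tables_above):
        if elem.tag == "table":
            if tables_above == ancestor_count:
                matches.append(elem)
            # Children of this table have one more table ancestor.
            tables_above += 1
        for child in elem:
            walk(child, tables_above)

    walk(root, 0)
    return matches


# Sample input modelled on the question; the ids are added for illustration.
SAMPLE = """
<body>
  <div>
    <table id="t1"><table id="t2"><table id="t3"><table id="t4">
      <table id="t5"><table id="t6"/></table>
    </table></table></table></table>
  </div>
</body>
"""

root = ET.fromstring(SAMPLE)
print([t.get("id") for t in tables_at_depth(root)])  # ['t5']
```

The wrapping div does not affect the result, because only elements named table are counted on the way down.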

Dimitre Novatchev
3

I believe you'd want the // expression between each element, making the full expression:

//table//table//table//table//table

This will select any table that has at least four table ancestors anywhere in its path, so tables nested more than five deep will match as well.
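The difference between "at least four table ancestors" (what the chained // steps match) and "exactly four" can be seen on a six-deep nesting. The following Python sketch is only an illustration under stated assumptions: it counts table ancestors by hand rather than evaluating the XPath, and the helper name `table_ancestor_counts`, the `id` attributes, and the sample markup are hypothetical.

```python
import xml.etree.ElementTree as ET


def table_ancestor_counts(root):
    """Map each <table>'s id attribute to its number of <table> ancestors
    (hypothetical helper; assumes every table carries a unique id)."""
    counts = {}
    stack = [(root, 0)]
    while stack:
        elem, tables_above = stack.pop()
        if elem.tag == "table":
            counts[elem.get("id")] = tables_above
            tables_above += 1
        stack.extend((child, tables_above) for child in elem)
    return counts


# Six nested tables; ids are added for illustration only.
SIX_DEEP = (
    '<body><table id="t1"><table id="t2"><table id="t3"><table id="t4">'
    '<table id="t5"><table id="t6"/></table>'
    '</table></table></table></table></body>'
)

counts = table_ancestor_counts(ET.fromstring(SIX_DEEP))
at_least_four = sorted(i for i, n in counts.items() if n >= 4)
exactly_four = sorted(i for i, n in counts.items() if n == 4)
print(at_least_four)  # ['t5', 't6']
print(exactly_four)   # ['t5']
```

If the documents never nest tables more than five deep, the two conditions select the same elements.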

Thymine
1
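using System.Linq;      // ElementAt
using System.Xml.Linq;  // XElement, Descendants
XElement doc = XElement.Parse(yourXml); // yourXml must be well-formed XML
var requiredTable = doc.Descendants("table").ElementAt(4); // 5th <table> in document order, which is not necessarily the 5th level of nesting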
Dmitry Khryukin
1

For mshtml (since your question is tagged C# and HTML), the child nodes of an HTML element can be accessed in a way similar to the one described here: How can I retrieve all the text nodes of a HTMLDocument in the fastest way in C#?

Maybe this helps!

Community
Zameer Ansari