I am working on an XPath generator (using absolute paths). The idea is a function that takes an HtmlElement (found in the WebBrowser control) and returns its XPath, for example:

/html/body/div[3]/div[1]/a

The function to generate the xpath looks something like this:

HtmlElement node = ...;
var xpath = new StringBuilder();
while (node != null)
{
    // 1-based position of the current node among its parent's children with the same tag name
    int i = FindElementIndex(node);
    if (i == 1)
        xpath.Insert(0, "/" + node.TagName.ToLower());
    else
        xpath.Insert(0, "/" + node.TagName.ToLower() + "[" + i + "]");
    node = node.Parent; // move one level up; Parent is null at the root
}

The idea is this:

a) take the element

b) find the index position of the element among the children of element.Parent that share its tag name

c) prepend the resulting step to the XPath
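
FindElementIndex is not shown above, so here is a minimal sketch of how steps a) to c) could fit together. It is an illustration, not my actual implementation: Node is a hypothetical stand-in for the browser's element type so that the code runs without a WebBrowser, and XPathSketch and BuildXPath are names made up for this example. With the real control, HtmlElement.Parent and HtmlElement.Children would play the role of Node.Parent and Node.Children.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical stand-in for the browser's element type (an assumption for this sketch).
public sealed class Node
{
    public string TagName { get; }
    public Node Parent { get; private set; }
    public List<Node> Children { get; } = new List<Node>();

    public Node(string tagName) { TagName = tagName; }

    // Appends a child and returns it, so a tree can be built step by step.
    public Node Add(Node child)
    {
        child.Parent = this;
        Children.Add(child);
        return child;
    }
}

public static class XPathSketch
{
    // Step b): 1-based position of node among the siblings that share its tag name.
    public static int FindElementIndex(Node node)
    {
        if (node.Parent == null) return 1; // the root has no siblings

        int index = 0;
        foreach (Node sibling in node.Parent.Children)
        {
            if (string.Equals(sibling.TagName, node.TagName, StringComparison.OrdinalIgnoreCase))
                index++;
            if (ReferenceEquals(sibling, node)) return index;
        }
        throw new InvalidOperationException("node is not listed among its parent's children");
    }

    // Steps a) to c): walk up to the root, prepending one step per level.
    public static string BuildXPath(Node node)
    {
        var xpath = new StringBuilder();
        for (Node current = node; current != null; current = current.Parent)
        {
            int i = FindElementIndex(current);
            string step = "/" + current.TagName.ToLowerInvariant();
            xpath.Insert(0, i == 1 ? step : step + "[" + i + "]");
        }
        return xpath.ToString();
    }
}
```

For a tree where body has three div children and the third div contains a single a, BuildXPath on that a gives /html/body/div[3]/a. The sketch only works if Parent and Children describe the real nesting, which is exactly what goes wrong in the case below.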

The problem appears when the parent is a custom (non-standard) HTML tag such as <layer>. Example:

<html>
  <body>
    <div>
      <layer>
        <a href="http://site.com">aaa</a>
      </layer>
    </div>
  </body>
</html>

If our HtmlElement is <a href="http://site.com">aaa</a> and we call ourelement.Parent, it returns the <div> element and NOT the <layer> element.

So instead of the expected path: /html/body/div/layer/a

we get the incorrect path: /html/body/div/a

How can this be solved? Really hope someone can help figure this out.

EDIT 1: Just for testing purposes, I implemented the function from "Get the full path of a node, after get it with an XPath query" in JavaScript.

The result: when a page containing a "custom" tag (like <layer>) was opened in Firefox, the XPath was generated correctly.

When the same page was opened in Internet Explorer (the engine the WebBrowser control uses), <layer> was not included as a parent.

So the issue is that Internet Explorer does not build the DOM correctly for such tags. What is the solution? What function can create a correct XPath in cases like this when working with a WebBrowser HtmlElement?

BlasterGod

1 Answer

This is not a direct answer to your question, but have you considered using http://htmlagilitypack.codeplex.com/ to load the HTML? It will not have the problem of ignoring the custom element.

Richard Schneider
  • I am using the HtmlAgilityPack, but I have a DOM selector (similar to Firebug) that selects an HtmlElement from the WebBrowser. I then need to get the HtmlNode (the HtmlAgilityPack node) that corresponds to the selected HtmlElement. I first tried comparing the OuterHtml of the element, but the WebBrowser's OuterHtml sometimes differs from the HtmlAgilityPack OuterHtml because the browser reformats the text. That's why I need the XPath of the HtmlElement, so I can find the matching HtmlAgilityPack HtmlNode. Hope it makes sense. – BlasterGod Jun 08 '12 at 01:12