How can I get the XPath of a clicked HtmlElement in the WebBrowser control?

This is how I retrieve the clicked HtmlElement:

System.Windows.Forms.HtmlDocument document = this.webBrowser1.Document;
document.MouseUp += new HtmlElementEventHandler(this.htmlDocument_Click);

private void htmlDocument_Click(object sender, HtmlElementEventArgs e)
{
    HtmlElement element = this.webBrowser1.Document.GetElementFromPoint(e.ClientMousePosition);
}

I want to click specific elements (price, article number, description, etc.) on a website and get their XPath expressions.

Thank you!

jimbo

1 Answer

XPath is not a standard feature of HTML (unlike XML), so the WebBrowser control's HtmlElement does not expose one. If you're looking to get an XPath for an element that you can later use with Html Agility Pack, you have at least two options:

  1. Walk up the element's DOM ancestry tree using HtmlElement.Parent and construct the XPath manually.

  2. Use Html Agility Pack itself and do something like this (untested):

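HtmlElement element = this.webBrowser1.Document.GetElementFromPoint(e.ClientMousePosition);

// Temporarily tag the clicked element with a unique id so it can be found again.
var savedId = element.Id;
var uniqueId = Guid.NewGuid().ToString();
element.Id = uniqueId;

// Parse a snapshot of the live DOM with Html Agility Pack, then restore the original id.
var doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml(element.Document.GetElementsByTagName("html")[0].OuterHtml);
element.Id = savedId;

// Look the element up by the temporary id; node is null if it was not found.
var node = doc.GetElementbyId(uniqueId);
var xpath = node != null ? node.XPath : null;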
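
For option 1, a minimal sketch (also untested) could look like the following. `XPathHelper` and `GetXPath` are hypothetical names used only for illustration; they are not part of WinForms or Html Agility Pack. The sketch builds a purely positional path such as /html[1]/body[1]/div[2] by counting preceding siblings with the same tag name. Keep in mind that the DOM built by the browser can differ from what Html Agility Pack parses from the raw source (browsers insert tbody elements into tables, for example), so the resulting path may need adjusting.

```csharp
using System;
using System.Collections.Generic;
using System.Windows.Forms;

// Hypothetical helper, not part of any library.
static class XPathHelper
{
    // Walks from the element up to the root via HtmlElement.Parent and
    // builds a positional XPath, e.g. /html[1]/body[1]/div[2]/span[1].
    public static string GetXPath(HtmlElement element)
    {
        var segments = new Stack<string>();

        for (HtmlElement current = element; current != null; current = current.Parent)
        {
            // XPath positions are 1-based and count only siblings with the same tag.
            int index = 1;
            HtmlElement parent = current.Parent;
            if (parent != null)
            {
                foreach (HtmlElement sibling in parent.Children)
                {
                    if (sibling == current)
                        break;
                    if (string.Equals(sibling.TagName, current.TagName,
                                      StringComparison.OrdinalIgnoreCase))
                        index++;
                }
            }

            segments.Push(current.TagName.ToLowerInvariant() + "[" + index + "]");
        }

        // The stack enumerates from the root down to the clicked element.
        return "/" + string.Join("/", segments.ToArray());
    }
}
```

Inside the click handler from the question this would be called as `string xpath = XPathHelper.GetXPath(element);`.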
carla
noseratio