I have a large set of XPaths for selecting content in webpages and I want users to be able to use them in the browser (including IE).

What do you recommend? Should I try to interpret the XPaths with JavaScript, or perhaps convert them to regex?

Some existing JavaScript XPath work:
http://js-xpath.sourceforge.net/xpath-example.html
http://goog-ajaxslt.sourceforge.net

hoju
  • Seems like you answered your own question doesn't it? – MooGoo Sep 12 '10 at 14:39
  • What makes you think XPath could be converted to regex? – Tomalak Sep 12 '10 at 14:41
  • About using regex to parse HTML: http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454 – Hendrik Brummermann Sep 12 '10 at 16:02
  • PLEASE STOP LINKING THAT ANSWER IT MAKES ME ANGRIER THAN THE GUY WHO WROTE IT /.*<\/html>/g KAJSHDKAJSHDKASHD – MooGoo Sep 12 '10 at 16:14
  • @MooGoo: I haven't tested them though. Was hoping for feedback from anyone with experience here. – hoju Sep 14 '10 at 09:00
  • Agreed, that post is linked too much. http://blog.sitescraper.net/2010/06/web-scraping-with-regular-expressions.html – hoju Sep 14 '10 at 09:06
  • @Richard: What browsers are you targeting? I ask because, as of today, `selectNodes` works for IE and `evaluate` works for Firefox, Chrome, Opera and Safari... –  Sep 14 '10 at 22:48
  • Interesting. I am only targeting IE and Firefox. – hoju Sep 15 '10 at 00:26
  • @Tomalak: the full picture is that I have been asked to manually convert the XPaths to regex, but I want to come up with an automated alternative. – hoju Sep 15 '10 at 00:33
  • @Richard: That does not answer my question. – Tomalak Sep 15 '10 at 08:00
  • @Richard: You can even parse an XPath expression with RegExp... –  Sep 16 '10 at 20:47
  • @Tomalak: your question was why do I think I can convert XPaths to regex. Because I can. At least for the XPaths I need to deal with. But as I said, I don't want to spend my time doing that. Got it? – hoju Sep 17 '10 at 04:15

2 Answers

I would look for an XSLT JavaScript library. Since most modern browsers have built-in XSLT support, and XSLT includes support for XPath, it is possible to use that engine to power your XPath selectors.
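As a rough illustration of that idea, an XPath can be wrapped in a tiny stylesheet whose only template copies whatever the expression selects. This is a minimal sketch, not how any particular library does it: the helper names `escapeXmlAttribute`, `buildXPathStylesheet` and `selectWithXslt` are made up for this example, and the last function assumes a browser that provides `DOMParser` and `XSLTProcessor` (IE does not, and exposes XSLT through MSXML instead).

```javascript
// Escape characters that are special inside a double-quoted XML attribute.
function escapeXmlAttribute(value) {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Hypothetical helper: build a stylesheet that copies the nodes the XPath selects.
function buildXPathStylesheet(xpath) {
  return [
    '<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">',
    '  <xsl:template match="/">',
    '    <xsl:copy-of select="' + escapeXmlAttribute(xpath) + '"/>',
    '  </xsl:template>',
    '</xsl:stylesheet>'
  ].join("\n");
}

// Browser-only sketch (assumes DOMParser and XSLTProcessor are available).
// Returns a DocumentFragment holding copies of the selected nodes.
function selectWithXslt(xpath, sourceNode) {
  var stylesheet = new DOMParser().parseFromString(
    buildXPathStylesheet(xpath), "application/xml");
  var processor = new XSLTProcessor();
  processor.importStylesheet(stylesheet);
  return processor.transformToFragment(sourceNode, document);
}
```

Note that `xsl:copy-of` produces copies, so this suits extracting content rather than modifying the selected nodes in place. How well HTML (as opposed to XML) input behaves is likely to vary between browsers, which is one reason to lean on a library.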

Personally, I've used Sarissa and the Glyphix jQuery.xslTransform libraries successfully.

This looks interesting too:

Docunext

Nowadays browsers support DOM 3 XPath, which is based on XPath 1.0, out of the box. The main API is the `document.evaluate` function, which is available in all major desktop browsers except IE.

There are also polyfills if you want to use it in older browser versions or in IE.
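A minimal feature-detection sketch of how this might look in practice. The function name `selectNodesByXPath` is made up for this example. It prefers `evaluate` and falls back to `selectNodes`, which, as far as I know, IE only provides on MSXML XML documents and not on the HTML page itself, so for HTML pages in IE a polyfill would still be needed.

```javascript
// Numeric value of XPathResult.ORDERED_NODE_SNAPSHOT_TYPE in DOM 3 XPath.
// Hard-coded so the function does not depend on a global XPathResult.
var ORDERED_NODE_SNAPSHOT_TYPE = 7;

// Hypothetical helper: return an array of nodes matching `xpath`.
// `doc` is the document to query; `contextNode` defaults to the document.
function selectNodesByXPath(doc, xpath, contextNode) {
  var context = contextNode || doc;
  var nodes = [];
  var i;

  // Standard DOM 3 XPath path (Firefox, Chrome, Opera, Safari).
  if (typeof doc.evaluate === "function") {
    var result = doc.evaluate(xpath, context, null, ORDERED_NODE_SNAPSHOT_TYPE, null);
    for (i = 0; i < result.snapshotLength; i++) {
      nodes.push(result.snapshotItem(i));
    }
    return nodes;
  }

  // MSXML-style fallback: assumes `context` is an XML node exposing selectNodes.
  if (typeof context.selectNodes !== "undefined") {
    var list = context.selectNodes(xpath);
    for (i = 0; i < list.length; i++) {
      nodes.push(list.item(i));
    }
    return nodes;
  }

  throw new Error("No XPath support found; a polyfill is needed here.");
}
```

In a page this could be called as `selectNodesByXPath(document, "//div[@id='content']//a")`. Returning a plain array keeps the calling code identical whichever branch ran.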

user7610