
Could anyone point me to a comprehensive performance comparison between XPath and DOM in different scenarios? I've read some questions on SO, such as "xPath vs DOM API, which one has a better performance" and "XPath or querySelector?". None of them mentions specific cases. Here are some cases I could start with.

  1. No iteration involved. getElementById('foobar') vs //*[@id='foobar']. Is the former consistently faster than the latter? What if the latter is optimized, e.g. /html/body/div[@id='foo']/div[@id='foobar']?
  2. Iteration involved. getElementByX followed by traversing the child nodes vs an XPath query that generates a snapshot, followed by traversing the snapshot items.
  3. Axis involved. getElementByX followed by walking the next siblings vs //following-sibling::foobar.
  4. Different implementations. Different browsers and libraries implement XPath and DOM differently. Which browser's XPath implementation is better?
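
To make the first scenario concrete, here is a minimal sketch of how the two lookups could be timed side by side. The helper names (`byId`, `byXPath`, `timeIt`) are hypothetical and not from any library, the id `foobar` is just the placeholder used above, and the numbers it prints will vary by browser and document, so treat it as a starting point for your own profiling rather than a benchmark.

```javascript
// Sketch only: byId, byXPath and timeIt are hypothetical helper names.
// Numeric value of XPathResult.FIRST_ORDERED_NODE_TYPE.
const FIRST_ORDERED_NODE_TYPE = 9;

// Scenario 1, DOM way.
function byId(doc, id) {
  return doc.getElementById(id);
}

// Scenario 1, XPath way. Assumes the id contains no single quote.
function byXPath(doc, id) {
  const result = doc.evaluate(
    "//*[@id='" + id + "']", doc, null, FIRST_ORDERED_NODE_TYPE, null);
  return result.singleNodeValue;
}

// Average milliseconds per call over `runs` calls.
function timeIt(fn, runs) {
  const start = performance.now();
  for (let i = 0; i < runs; i++) fn();
  return (performance.now() - start) / runs;
}

// Only runs where a global `document` exists, i.e. in a browser.
if (typeof document !== "undefined") {
  console.log("getElementById:", timeIt(() => byId(document, "foobar"), 10000));
  console.log("XPath:", timeIt(() => byXPath(document, "foobar"), 10000));
}
```

Passing the document in as a parameter keeps the helpers reusable for frames or documents parsed with DOMParser.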

As the answer to "xPath vs DOM API, which one has a better performance" says, an average programmer may get complicated tasks (e.g. ones involving multiple axes) wrong when implementing them the DOM way, while XPath is guaranteed to be optimized. My question therefore only concerns simple selections that can be done both ways.

Thanks for any comment.

Community
Reci
  • As with almost any question regarding performance and optimization, it's going to depend on your specific circumstances and content. The answer is "profile your app with your data, and choose whichever works best for you". Also, you've asked too many general questions. This should probably be closed as "not a real question", and if anyone else thinks so I'll join them in voting to do so. – Ken White Mar 12 '11 at 00:43
  • In my personal experience, DOM is usually more than 10 times faster than XPath or selector API implementations (e.g. in Firefox). However, since XPath accepts a context node, it may be best to select a "stable" parent node with DOM and use XPath for the rest of the job. This can be both fast and robust. – Reci Mar 25 '11 at 19:37
  • XPath can be built on a non-DOM API, for example, vtd-xml's xpath implementation is built on top of Virtual token descriptors... – vtd-xml-author Jun 13 '13 at 23:42
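
The context-node idea from the comments (pick a stable parent with DOM, then run a relative XPath from it) could look roughly like the sketch below. `selectWithin` is a hypothetical helper name, and the id and expression in the usage lines are placeholders, not values from any real page.

```javascript
// Numeric value of XPathResult.ORDERED_NODE_SNAPSHOT_TYPE.
const ORDERED_NODE_SNAPSHOT_TYPE = 7;

// Find a stable ancestor by id (DOM), then evaluate a relative
// XPath with that ancestor as the context node.
function selectWithin(doc, parentId, relativeXPath) {
  const parent = doc.getElementById(parentId);
  if (parent === null) return [];
  const snapshot = doc.evaluate(
    relativeXPath, parent, null, ORDERED_NODE_SNAPSHOT_TYPE, null);
  const nodes = [];
  for (let i = 0; i < snapshot.snapshotLength; i++) {
    nodes.push(snapshot.snapshotItem(i));
  }
  return nodes;
}

// Only runs where a global `document` exists, i.e. in a browser.
if (typeof document !== "undefined") {
  // "foo" and the expression are placeholder values.
  console.log(selectWithin(document, "foo", ".//div[@class='item']"));
}
```

The leading `.` in the expression matters: without it, `//div` would search the whole document again instead of only the subtree under the context node.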

3 Answers


XPath and DOM are both specifications, not implementations. You can't ask questions about the performance of a spec, only about specific implementations. There's at least a ten-to-one difference between a fast XPath engine and a slow one, and they may be optimized for different things: some, for example, spend a lot of time optimizing a query on the assumption that it will be executed multiple times, which might be the wrong thing to do for single-shot execution. The one thing one can say is that the performance of XPath depends more on the engine you are using, and the performance of DOM depends more on the competence of the application programmer, because it's a lower-level interface. Of course all programmers consider themselves to be much better than average...
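
In browser terms, the single-shot vs repeated-execution distinction roughly maps onto `document.evaluate` with a string versus `document.createExpression`, which returns a reusable expression object. The sketch below only shows the two call shapes; the function names are hypothetical, and whether the compiled form is actually faster depends entirely on the engine, which may already cache parsed expressions.

```javascript
// Numeric value of XPathResult.ANY_UNORDERED_NODE_TYPE.
const ANY_UNORDERED_NODE_TYPE = 8;

// Single-shot style: the expression string is handed to the engine
// on every call, so it may be parsed again each time.
function evaluateEachTime(doc, xpath, contexts) {
  return contexts.map((ctx) =>
    doc.evaluate(xpath, ctx, null, ANY_UNORDERED_NODE_TYPE, null)
      .singleNodeValue);
}

// Reuse style: compile once with createExpression, then evaluate the
// resulting expression object against each context node.
function evaluateCompiled(doc, xpath, contexts) {
  const compiled = doc.createExpression(xpath, null);
  return contexts.map((ctx) =>
    compiled.evaluate(ctx, ANY_UNORDERED_NODE_TYPE, null).singleNodeValue);
}

// Only runs where a global `document` exists, i.e. in a browser.
if (typeof document !== "undefined") {
  const divs = Array.from(document.getElementsByTagName("div"));
  console.log(evaluateEachTime(document, "./span", divs).length);
  console.log(evaluateCompiled(document, "./span", divs).length);
}
```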

Michael Kay
  • How about the most common implementations: Firefox, Chrome, IE, Opera? I assume they optimize the engine in a fairly balanced way. Under that assumption, is there any answer to the question? – Reci Mar 25 '11 at 19:31

This page has a section where you can run tests to compare the two and see the results in different browsers. For instance, in Chrome, XPath is 100% slower than getElementById.

See getElementById vs QuerySelector for more information.

jamesmortensen
  • Hi Claudiu, welcome to StackOverflow! While the link you posted may be helpful, the goal of StackOverflow is to become a repository of knowledge for years to come so that others who visit this page could benefit from your answer. If the link ever were to break, your answer would be useless. Consider editing your answer to include the example from the link, so that if the link dies, your answer has value. Good luck, and welcome to StackOverflow! :) – jamesmortensen May 29 '12 at 04:42
  • I went ahead and made some improvements. Good luck! – jamesmortensen May 29 '12 at 04:48

I agree with Michael that it depends on the implementation, but I would generally say that DOM is faster. The reason is that I don't see any way to optimize the parsed document to make XPath faster.

If you're traversing HTML rather than XML, a specialized parser can index all the ids and classes in the document. This makes getElementById and getElementsByClassName much faster.

With XPath, there's only one way to find the element with a given id: by traversing, either top down or bottom up. You may be able to memoize repeated queries (or partial queries), but I don't see any other optimization that can be done.
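
The contrast drawn here, an id index versus a traversal, can be illustrated on a toy tree. This is only a sketch under stated assumptions: the tree is made of plain objects rather than real DOM nodes, the ids are assumed to be unique, and the function names are hypothetical. It says nothing about how any particular browser or XPath engine is actually implemented.

```javascript
// Toy tree of plain objects standing in for parsed elements
// (not real DOM nodes). Ids are assumed to be unique.
const tree = {
  id: "root",
  children: [
    { id: "foo", children: [{ id: "foobar", children: [] }] },
    { id: "bar", children: [] },
  ],
};

// Traversal: depth-first search that visits nodes until the id matches.
function findByTraversal(node, id) {
  if (node.id === id) return node;
  for (const child of node.children) {
    const hit = findByTraversal(child, id);
    if (hit !== null) return hit;
  }
  return null;
}

// Index: one pass over the tree builds a Map from id to node;
// every later lookup is a single Map read.
function buildIdIndex(root) {
  const index = new Map();
  const stack = [root];
  while (stack.length > 0) {
    const node = stack.pop();
    index.set(node.id, node);
    stack.push(...node.children);
  }
  return index;
}

const idIndex = buildIdIndex(tree);
// Both approaches find the same node object.
console.log(findByTraversal(tree, "foobar") === idIndex.get("foobar")); // true
```

The index costs one full pass up front and has to be kept in sync when the tree changes, which is the kind of bookkeeping a parser that knows about ids can do while it builds the document.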

ming_codes