Say I have this XML with about 1000+ bookinfo nodes.

<results>
  <books>
    <bookinfo>
      <name>1</name>
    </bookinfo>
    <bookinfo>
      <name>2</name>
    </bookinfo>
    <bookinfo>
      <name>3</name>
    </bookinfo>
  </books>
</results>

I'm currently using this to get the name of each book:

var books = this.req.responseXML.getElementsByTagName("books")[0].getElementsByTagName("bookinfo");

Then I use a for loop to do something with each book name:

var bookName = books[i].getElementsByTagName("name")[0].firstChild.nodeValue;

I'm finding this really slow when books is really big. Unfortunately, there's no way to limit the result set or to specify a different return type.

Is there a faster way?

doremi
  • You can store the `getElementsByTagName` call as a variable/array so it only gets called once. –  Feb 28 '11 at 04:13
  • 100 XML nodes is **nothing**. Show how you're doing it for more than one book at a time. – Matt Ball Feb 28 '11 at 04:16
  • It's actually more like 1000+ and there's A LOT more data than what I've presented in the sample XML output. – doremi Feb 28 '11 at 04:16
  • 1000+ is also not that many. If you want help optimizing your code, I suggest you show us a real example to work with. – Matt Ball Feb 28 '11 at 04:17
  • How can you say it isn't a problem? It's the slowest part of my app. – doremi Feb 28 '11 at 04:18
  • Are you sure it's the parsing that's slow? Are you using a JavaScript profiler? How are you sure that it's not something else caused by a large XML file, like transporting it to the browser? – Matt Ball Feb 28 '11 at 04:20
  • @Joshua I think @Matt is right - you should run your application in a browser that has profiler capabilities. – Pointy Feb 28 '11 at 04:22
  • Running profiler in Chrome. 49% is send. Another 35% is 'Program' (whatever that means). I take it send includes the entire request/response for the xml request? Also, what is this 'program' business taking 35%? – doremi Feb 28 '11 at 04:27
  • Found my answer regarding 'program': http://stackoverflow.com/questions/3847954/chrome-debugger-what-is-program-in-profiler – doremi Feb 28 '11 at 04:29
  • Any chance of converting that XML to JSON on the server? You will save both bytes transferred and CPU time. – gonchuki Feb 28 '11 at 04:33
  • Agreed with @gonchuki. JSON tends to be much more concise and much quicker to parse. – Matt Ball Feb 28 '11 at 04:34

3 Answers

You can try fast-xml-parser, a parser implemented in JavaScript that converts XML data to JSON. The project publishes a benchmark against other parsers.

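var parser = require('fast-xml-parser');

// xmlData is the XML string to parse
var jsonObj = parser.parse(xmlData);

// when a tag has attributes
// (option names differ between versions of the library; check its documentation)
var options = {
    attrPrefix: "@_"
};
var jsonObjWithAttrs = parser.parse(xmlData, options);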

If you don't want to use the npm package, you can include parser.js in your HTML directly.

Disclaimer: I'm the author of this library.

Amit Kumar Gupta
  • Unfortunately, as of 2021, this library still doesn't preserve the order of elements in nodes. It merges any node of same name into an array, without respecting the order of these nodes. – Robin Jun 06 '21 at 22:17
  • This didn't work for me. Is fast-xml-parser still supported? – Danielle Apr 08 '22 at 12:56
  • fast-xml-parser (fxp) v4 supports preserving order. Please check which version you're using for the correct syntax; see its home page. – Amit Kumar Gupta Apr 09 '22 at 01:23

Presumably you are using XMLHttpRequest, in which case the XML is parsed before you call any methods of responseXML (i.e. the XML has already been parsed and turned into a DOM). If you want a faster parser, you'll probably need a different user agent or a different JavaScript engine for your current UA.

If you want a faster way to access content in the XML document, consider XPath:

Mozilla documentation

MSDN documentation

I used an XPath expression (like //parentNode/node/text()) on a 134KB local file to extract the text nodes of 439 elements and put them into an array (because that's what my standard evalXPath() function does). I then iterated over that array to put the nodeValue of each text node into another array, made two replace calls with regular expressions to format the text, and showed the result with alert() using join('\n'). It took 3ms.

A 487KB file with 529 nodes took 4ms (IE 6 reported 15ms but its clock has very poor resolution). Of course my network latency will be nearly zero, but it shows that the XML parser, XPath evaluator and script in general can process that size file quickly.
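
As a rough sketch of the XPath approach applied to the sample XML in the question (this is not the evalXPath() function mentioned above, which isn't shown): the function name getBookNames and the path /results/books/bookinfo/name/text() are assumptions for illustration only. It relies on the W3C DOM evaluate() method; older IE XML documents don't provide it and use selectNodes() instead.

```javascript
// Sketch: collect the text of every <name> under <bookinfo> with a single XPath query.
// `doc` is assumed to be an XML Document, e.g. the responseXML of an XMLHttpRequest.

// Numeric value of XPathResult.ORDERED_NODE_SNAPSHOT_TYPE in the DOM XPath spec.
var ORDERED_NODE_SNAPSHOT_TYPE = 7;

function getBookNames(doc) {
  // One query returns all matching text nodes in document order.
  var result = doc.evaluate(
    "/results/books/bookinfo/name/text()", // path assumed from the sample XML
    doc,   // context node
    null,  // no namespace resolver needed for this sample
    ORDERED_NODE_SNAPSHOT_TYPE,
    null   // let the browser create a new result object
  );

  var names = [];
  // Read the snapshot length once instead of on every iteration.
  for (var i = 0, n = result.snapshotLength; i < n; i++) {
    names.push(result.snapshotItem(i).nodeValue);
  }
  return names;
}
```

In the question's code this would be called as getBookNames(this.req.responseXML), replacing the nested getElementsByTagName calls inside the loop. Whether it is actually faster for a given document is worth confirming with a profiler.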

BenMorel
RobG

If you want to parse the information from that XML much faster, try txml. It is very easy to use, and for the type of XML you have shown you can use its simplify method, which gives you very clean objects to work with.

https://www.npmjs.com/package/txml

Disclaimer: I'm the author of this library.

Tobias Nickel