XML parsing - algorithm

Question

I have the following schema:

<dataset>
   <record>
      <A> </A>
      <B> </B>
   </record>

   <record>
      <A> </A>
      <B> </B>
   </record>
</dataset>

Can you suggest an efficient algorithm to parse the `record` elements and store them in a C structure?

Straightforward parsing takes a long time, as there are around 1500 records. Suggestions for changes to the schema are also welcome.

Use a library, there are a few to pick from. Parsing XML is *hard*. — Some programmer dude, Oct 30 '13 at 06:24
Algorithm: 1. Download an XML library. 2. Use the XML library with XPath to access elements. — nurettin, Oct 30 '13 at 07:19
When you say "parsing is taking a long time", are you saying you have written your own parser, rather than using one off-the-shelf? In that case, it's not surprising that it takes a long time, and the answer is to use an off-the-shelf parser, which will be far more efficient than anything you are likely to write yourself. — Michael Kay, Oct 30 '13 at 08:57
Try an algorithm similar to evaluating a postfix expression with a stack. — Atul, Oct 30 '13 at 12:38
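
For what it's worth, the stack idea from the last comment could look roughly like the sketch below: push each opening tag name, and pop and compare on each closing tag. `tags_balanced`, `MAX_DEPTH` and `MAX_NAME` are hypothetical names made up for this illustration. It only checks nesting and assumes plain tags with no attributes, comments, CDATA or self-closing elements, so it is not a replacement for a real XML library.

```c
#include <string.h>

#define MAX_DEPTH 32   /* assumed maximum nesting depth for this sketch */
#define MAX_NAME  32   /* assumed maximum tag-name length (incl. NUL)   */

/* Returns 1 if every opening tag in `xml` is closed in the right order,
 * 0 otherwise. Hypothetical helper: it ignores attributes, comments,
 * CDATA, self-closing tags and entities, which a real parser must handle. */
int tags_balanced(const char *xml)
{
    char stack[MAX_DEPTH][MAX_NAME];
    int depth = 0;

    for (const char *p = xml; (p = strchr(p, '<')) != NULL; ) {
        const char *end = strchr(p, '>');
        if (end == NULL)
            return 0;                 /* unterminated tag */

        int closing = (p[1] == '/');
        const char *name = p + 1 + closing;
        size_t len = (size_t)(end - name);
        if (len == 0 || len >= MAX_NAME)
            return 0;                 /* empty or over-long tag name */

        if (!closing) {
            if (depth == MAX_DEPTH)
                return 0;             /* nesting too deep for this sketch */
            memcpy(stack[depth], name, len);
            stack[depth][len] = '\0';
            depth++;                  /* push the opening tag */
        } else {
            if (depth == 0)
                return 0;             /* close without a matching open */
            depth--;                  /* pop and compare with the close */
            if (strlen(stack[depth]) != len ||
                strncmp(stack[depth], name, len) != 0)
                return 0;             /* mismatched closing tag */
        }
        p = end + 1;
    }
    return depth == 0;                /* everything opened was closed */
}
```

Calling `tags_balanced` on the `<dataset>` document from the question should return 1, since every tag there is closed in order. Extracting the text between `<A>` and `</A>` would be a further step on top of this.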

score 0 · Accepted Answer · edited May 23 '17 at 10:31


I suggest using an XML parser rather than reinventing the algorithm. For the reasons, see this masterly answer on why not to use regex on XHTML: RegEx match open tags except XHTML self-contained tags. (I admit parsing XHTML is even harder, but issues such as occasionally occurring attributes are the same, and the accepted answer is really worth reading.)


answered Oct 30 '13 at 07:04

Andreas


score 0 · Answer 2 · answered Oct 30 '13 at 07:22


You are concerned about runtime. Are you on an embedded device? If so, you could preprocess the XML into a format that is easy to parse on the microcontroller.
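
As a rough sketch of what that could look like: assume the XML is converted offline (on a PC, with a proper XML library) into one line per record, with the two fields separated by `|`, for example `foo|1`. That format, the `struct record` layout, the `FIELD_LEN` limit and the function names are all assumptions made up for this illustration. They also assume the field values never contain `|` or a newline.

```c
#include <stddef.h>
#include <string.h>

#define FIELD_LEN 32   /* assumed maximum field size, including the NUL */

struct record {
    char a[FIELD_LEN];   /* content of <A> */
    char b[FIELD_LEN];   /* content of <B> */
};

/* Copy at most FIELD_LEN - 1 bytes of [src, src + len) into dst and
 * NUL-terminate it; longer values are truncated. */
static void copy_field(char *dst, const char *src, size_t len)
{
    if (len >= FIELD_LEN)
        len = FIELD_LEN - 1;
    memcpy(dst, src, len);
    dst[len] = '\0';
}

/* Parse text of the form "A|B\nA|B\n..." into out[0..max).
 * Returns the number of records stored; stops at the first line
 * that has no '|' separator. */
size_t parse_records(const char *text, struct record *out, size_t max)
{
    size_t n = 0;

    while (*text != '\0' && n < max) {
        const char *eol = strchr(text, '\n');
        if (eol == NULL)
            eol = text + strlen(text);        /* last line, no newline */

        const char *sep = memchr(text, '|', (size_t)(eol - text));
        if (sep == NULL)
            break;                            /* malformed line */

        copy_field(out[n].a, text, (size_t)(sep - text));
        copy_field(out[n].b, sep + 1, (size_t)(eol - (sep + 1)));
        n++;

        text = (*eol == '\n') ? eol + 1 : eol;
    }
    return n;
}
```

This makes a single pass over the input and does no allocation, so 1500 records should not be a problem even on a small device. The trade-off is that the device no longer accepts XML directly.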


Dill

