
I have an XML file with a structure like this:

<index>
    <compound kind="file">
        <name>file.c</name>
        <member kind="variable"><name>foo</name></member>
        <member kind="variable"><name>bar</name></member>
        ...
    </compound>
    <compound kind="file">
        <name>file.h</name>
        <member>...
    </compound>
</index>

I need to search by file member name but I can't figure out if there is a way that avoids iterating the entire tree. My solution currently looks like:

for f in xmlroot.iter("compound"):
    for m in f.iter("member"):
        if m.find("name").text == my_var_name:
            print("Found")

Is there a way to use a dict()-style lookup to improve efficiency? I actually have another for loop above this one that goes through the list of variables to search for, so every variable triggers a full pass over the tree and performance is really poor.

Would changing those two for loops to a single XPath search improve performance?
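
For reference, here is a minimal sketch of what the single-expression version of the two loops could look like. It uses the standard library's `xml.etree.ElementTree`, whose `findall()` accepts a limited XPath subset; with lxml, the element's `xpath()` method could be used instead. The sample document, the second file's `foo` member, and the helper name `find_members` are assumptions made for illustration only. Note that the expression still walks the tree once per name, so this shortens the code but is not shown here to be faster.

```python
import xml.etree.ElementTree as ET

# Sample data modelled on the structure above. The member inside file.h is
# an assumption, since the original listing elides it.
SAMPLE = """
<index>
    <compound kind="file">
        <name>file.c</name>
        <member kind="variable"><name>foo</name></member>
        <member kind="variable"><name>bar</name></member>
    </compound>
    <compound kind="file">
        <name>file.h</name>
        <member kind="variable"><name>foo</name></member>
    </compound>
</index>
"""


def find_members(xmlroot, member_name):
    """Return every <member> whose <name> child has exactly this text.

    Hypothetical helper: the name is interpolated directly into the path,
    so it must not contain a single quote.
    """
    return xmlroot.findall(".//member[name='%s']" % member_name)


xmlroot = ET.fromstring(SAMPLE)
for my_var_name in ["foo", "baz"]:
    if find_members(xmlroot, my_var_name):
        print("Found " + my_var_name)
```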

Makis
  • Couldn't you just parse the whole XML once and store the data in, say, a dictionary? The dictionary could be of the form {'member_name': [files_in_which_it_occurs]}. This assumes different files can have the same member names. – RedBaron Oct 13 '11 at 08:10
  • This is answer to your question: http://stackoverflow.com/questions/247135/using-xpath-to-search-text-containing#247903 – Mikko Ohtamaa Oct 13 '11 at 08:14
  • @RedBaron: yes, that's possible, it just feels a bit silly since lxml already has dictionaries. – Makis Oct 13 '11 at 12:22
  • @Mikko: I don't understand how that answer applies here. – Makis Oct 13 '11 at 12:23
  • lxml provides an API for XPath selectors. You can select nodes by their content by using the XPath text() function in the expression. – Mikko Ohtamaa Oct 13 '11 at 21:33
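
The dictionary idea from the first comment can be sketched roughly as follows: parse once, build a mapping from member name to the files that contain it, then answer each lookup from the mapping instead of re-walking the tree. This is only an illustration under stated assumptions. The sample data (including the `foo` member in file.h), the helper name `build_member_index`, and the list of names to search are all made up. It uses the standard library's `xml.etree.ElementTree`; `lxml.etree` exposes the same `iter()` and `findtext()` calls.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Sample data modelled on the question. The member inside file.h is an
# assumption, since the original listing elides it.
SAMPLE = """
<index>
    <compound kind="file">
        <name>file.c</name>
        <member kind="variable"><name>foo</name></member>
        <member kind="variable"><name>bar</name></member>
    </compound>
    <compound kind="file">
        <name>file.h</name>
        <member kind="variable"><name>foo</name></member>
    </compound>
</index>
"""


def build_member_index(xmlroot):
    """Map each member name to the list of file names that contain it."""
    index = defaultdict(list)
    for compound in xmlroot.iter("compound"):
        file_name = compound.findtext("name")
        for member in compound.iter("member"):
            member_name = member.findtext("name")
            if member_name is not None:  # skip members without a <name>
                index[member_name].append(file_name)
    return index


xmlroot = ET.fromstring(SAMPLE)
member_index = build_member_index(xmlroot)  # one pass over the tree

# Each lookup is now a dictionary access rather than a tree traversal.
for my_var_name in ["foo", "bar", "baz"]:
    files = member_index.get(my_var_name, [])
    print(my_var_name, files)
```

The tree is traversed once regardless of how many names are searched, which is where the saving over the nested loops would come from when the list of variables is long.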

0 Answers