I failed to parse an XML file (a GC history log). A sample of the XML is shown below.
<?xml version="1.0" ?>
<verbosegc xmlns="http://www.ibm.com/j9/verbosegc" version="R28_jvm.28_20150612_0201_B252774_CMPRSS">
<initialized id="1" timestamp="2015-12-04T20:17:07.219">
<attribute name="gcPolicy" value="-Xgcpolicy:gencon" />
<attribute name="maxHeapSize" value="0x20000000" />
<attribute name="initialHeapSize" value="0x400000" />
</initialized>
<cycle-start id="4" type="scavenge" contextid="0" timestamp="2015-12-04T20:17:10.677" intervalms="3457.977" />
<gc-start id="5" type="scavenge" contextid="4" timestamp="2015-12-04T20:17:10.677">
<mem-info id="6" free="3037768" total="4194304" percent="72">
</mem-info>
</gc-start>
<gc-end id="8" type="scavenge" contextid="4" durationms="0.807" usertimems="0.000" systemtimems="0.000" timestamp="2015-12-04T20:17:10.678" activeThreads="2">
<mem-info id="9" free="3163968" total="4194304" percent="75">
</mem-info>
</gc-end>
<cycle-end id="10" type="scavenge" contextid="4" timestamp="2015-12-04T20:17:10.678" />
<cycle-start id="16" type="scavenge" contextid="0" timestamp="2015-12-04T20:17:10.742" intervalms="64.838" />
<gc-start id="17" type="scavenge" contextid="16" timestamp="2015-12-04T20:17:10.742">
<mem-info id="18" free="3037664" total="4194304" percent="72">
</mem-info>
</gc-start>
<gc-end id="20" type="scavenge" contextid="16" durationms="0.649" usertimems="0.000" systemtimems="0.000" timestamp="2015-12-04T20:17:10.743" activeThreads="2">
<mem-info id="21" free="3110592" total="4194304" percent="74">
</mem-info>
</gc-end>
<cycle-end id="22" type="scavenge" contextid="16" timestamp="2015-12-04T20:17:10.743" />
<allocation-satisfied id="23" threadId="0000000002E10500" bytesRequested="416" />
</verbosegc>
I want to extract the free attribute of the mem-info elements inside gc-start and gc-end, both of which sit between the cycle-start and cycle-end tags and share the same contextid. For example, the first two mem-info free values are 3037768 and 3163968; their contextid is 4, which equals the id of the preceding cycle-start. With this data I can plot the memory footprint.
The main problem is that I cannot parse the XML successfully with the method in XML parse python. getroot() works, but every other find/findall call returns an empty result. Is there another way to do this? Thanks.
Here is what I tried:
>>> import xml.etree.ElementTree as ET
>>> tree = ET.parse('gc.trace')
>>> tree
<xml.etree.ElementTree.ElementTree object at 0x7fdfaddc19d0>
>>> root=tree.getroot()
>>> root
<Element '{http://www.ibm.com/j9/verbosegc}verbosegc' at 0x7fdfaddc1a90>
>>> cycle_start = root.findall('cycle-start')
>>> cycle_start
[]    # empty?
>>> cycle_start = root.findall('mem-info')
>>> print cycle_start
[]    # empty?
>>>
>>> cycle_start = root.find('mem-info')
>>> cycle_start
>>> print cycle_start
None
I also tried lxml:
>>> from lxml import etree
>>> tree = etree.parse("gc.log")
>>> root = tree.getroot()
>>> root.findall('mem-info', root.nsmap)
>>> root.nsmap
{None: 'http://www.ibm.com/j9/verbosegc'}
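A likely explanation, judging from the root tag printed above ({http://www.ibm.com/j9/verbosegc}verbosegc): every element is in the default namespace, so ElementTree only matches namespace-qualified names, and find/findall with a bare tag only look at direct children, which mem-info is not. Below is a minimal sketch of the extraction described above, under those assumptions. The prefix gc, the helper name free_by_cycle, and the trimmed inline SAMPLE are illustrative choices, not part of the log format; with the real file, ET.parse('gc.trace').getroot() would replace ET.fromstring(SAMPLE).

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample above (only the attributes used here are kept).
SAMPLE = """<?xml version="1.0" ?>
<verbosegc xmlns="http://www.ibm.com/j9/verbosegc">
<cycle-start id="4" type="scavenge" contextid="0" />
<gc-start id="5" type="scavenge" contextid="4">
<mem-info id="6" free="3037768" total="4194304" percent="72">
</mem-info>
</gc-start>
<gc-end id="8" type="scavenge" contextid="4">
<mem-info id="9" free="3163968" total="4194304" percent="75">
</mem-info>
</gc-end>
<cycle-end id="10" type="scavenge" contextid="4" />
<cycle-start id="16" type="scavenge" contextid="0" />
<gc-start id="17" type="scavenge" contextid="16">
<mem-info id="18" free="3037664" total="4194304" percent="72">
</mem-info>
</gc-start>
<gc-end id="20" type="scavenge" contextid="16">
<mem-info id="21" free="3110592" total="4194304" percent="74">
</mem-info>
</gc-end>
<cycle-end id="22" type="scavenge" contextid="16" />
</verbosegc>"""

# Map an arbitrary prefix ('gc' is just a name picked here) to the namespace URI.
NS = {'gc': 'http://www.ibm.com/j9/verbosegc'}


def free_by_cycle(root):
    """Return {cycle-start id: (free at gc-start, free at gc-end)}."""
    result = {}
    for cycle in root.findall('gc:cycle-start', NS):
        cid = cycle.get('id')
        # gc-start / gc-end are children of the root; mem-info is one level deeper.
        before = root.find("gc:gc-start[@contextid='%s']/gc:mem-info" % cid, NS)
        after = root.find("gc:gc-end[@contextid='%s']/gc:mem-info" % cid, NS)
        if before is not None and after is not None:
            result[int(cid)] = (int(before.get('free')), int(after.get('free')))
    return result


root = ET.fromstring(SAMPLE)
pairs = free_by_cycle(root)
for cid, (before, after) in sorted(pairs.items()):
    print("%d %d %d" % (cid, before, after))
```

The unqualified call root.findall('cycle-start') still returns an empty list on this sample, which matches the behaviour shown in the sessions above; only the qualified form 'gc:cycle-start' (or the equivalent '{http://www.ibm.com/j9/verbosegc}cycle-start') matches.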