I want to parse an XML file using Python. The file has the following format:

<root><dm_log_packet>
    <pair key ="type_id">LTE_PHY_Serv_Cell_Measurement</pair>
  </dm_log_packet>
</root>

I tried to parse it using ElementTree but failed.

Here is my code:

from xml.etree import ElementTree

class Log:
    def __init__(self,type_id=None):
        self.type_id=type_id
    def __str__(self):
        return self.type_id

roota=ElementTree.parse("file.xml")
log_file = roota.findall("dm_log_packet")

lo = []
for aa in log_file:
    log = Log()
    log.type_id = aa.find("type_id").text
    lo.append(log)

I want to read each <pair> element, but I can't look it up the way I could if the file contained a <type_id>...</type_id> element.

gcc17
    You need to describe your expected output and the error you see, if any, or the output you currently get instead. – Tomalak Oct 17 '19 at 16:40

2 Answers

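You can use BeautifulSoup:

from bs4 import BeautifulSoup

xml = """
  <root>
     <dm_log_packet>
         <pair key ="type_id">LTE_PHY_Serv_Cell_Measurement</pair>
    </dm_log_packet>
  </root>
  """

# html.parser ships with Python, so no extra parser needs to be installed
soup_obj = BeautifulSoup(xml, "html.parser")
# Select the <pair> element whose key attribute is "type_id" and read its text
soup_obj.find("pair", attrs={"key": "type_id"}).text

The output will be:

'LTE_PHY_Serv_Cell_Measurement'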

More Descriptive Answer

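.find() and .findall() take a path expression (ElementTree supports a subset of XPath). A plain tag name such as "dm_log_packet" matches only direct child elements with that tag, so aa.find("type_id") returns None here: type_id is the value of the key attribute on <pair>, not an element name. Calling .text on None then raises an AttributeError. Select the <pair> by its attribute instead: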

from xml.etree import ElementTree

class Log:
    def __init__(self, type_id=None):
        self.type_id=type_id

    def __str__(self):
        return self.type_id

tree = ElementTree.parse("file.xml")
lo = []

for dm_log_packet in tree.findall(".//dm_log_packet"):
    pair = dm_log_packet.find("./pair[@key='type_id']")
    if pair is not None:
        lo.append(Log(pair.text))

Note that dm_log_packet.find("./pair[@key='type_id']") will return None when there is no <pair key="type_id">, hence the extra check.
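
If the goal is to read every <pair> in a packet rather than only type_id, one option is to collect the pairs into a dictionary keyed by the key attribute. The following is a minimal sketch, assuming keys are unique within a packet. The helper name parse_packets is made up for illustration, and the sample from the question is parsed from a string instead of file.xml so the snippet runs on its own.

```python
from xml.etree import ElementTree

# Sample document from the question, inlined as a string (assumption:
# the real file.xml has the same structure).
XML = """
<root><dm_log_packet>
    <pair key ="type_id">LTE_PHY_Serv_Cell_Measurement</pair>
  </dm_log_packet>
</root>
"""


def parse_packets(root):
    """Return one dict per <dm_log_packet>, mapping each pair's key to its text."""
    packets = []
    for packet in root.iter("dm_log_packet"):
        # findall("pair") matches only the direct <pair> children of the packet.
        packets.append(
            {pair.get("key"): pair.text for pair in packet.findall("pair")}
        )
    return packets


packets = parse_packets(ElementTree.fromstring(XML))
print(packets)  # [{'type_id': 'LTE_PHY_Serv_Cell_Measurement'}]
```

With the data in a dictionary, packets[0].get("type_id") returns None for a missing key instead of raising, which plays the same role as the "is not None" check above.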

Tomalak