How to parse data inside XML?

Question

Any idea how to parse this kind of record? The record has another record embedded in it, inside the CDATA section of the email_body field.

       <record id="1" model="custom.model">
            <field name="name">Create</field>
            <field name="email_from">dummy@mail.com</field>
            <field name="email_to">todummy@mail.com</field>
            <field name="email_subject">Create new company</field>
            <field name="email_body">
                <![CDATA[
                <record>
                    <field name="process">Create</field>
                    <field name="model">res.company</field>
                    <field name="name">XYZ Company</field>
                    <field name="currency_id">base.USD</field>
                </record>
                ]]>
            </field>
            <field name="email_read">False</field>
        </record>

Hello, would answers from this question be any help to you? https://stackoverflow.com/questions/2784183/what-does-cdata-in-xml-mean — Tomasz Plaskota, Oct 08 '21 at 10:13
Hello, thanks for your comment. I am looking for how to read the cdata so that I know how to put it on records automatically — Kai Ning, Oct 08 '21 at 10:16

score 1 · Accepted Answer · answered Oct 08 '21 at 11:13

Assuming you are looking for the data inside the CDATA section, the code below finds that section and parses it as XML.

import xml.etree.ElementTree as ET


xml = '''<record id="1" model="custom.model">
            <field name="name">Create</field>
            <field name="email_from">dummy@mail.com</field>
            <field name="email_to">todummy@mail.com</field>
            <field name="email_subject">Create new company</field>
            <field name="email_body">
                <![CDATA[
                <record>
                    <field name="process">Create</field>
                    <field name="model">res.company</field>
                    <field name="name">XYZ Company</field>
                    <field name="currency_id">base.USD</field>
                </record>
                ]]>
            </field>
            <field name="email_read">False</field>
        </record>'''
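# Parse the outer document.
outer_root = ET.fromstring(xml)
# The CDATA content is exposed as the plain text of the email_body field.
email = outer_root.find('.//field[@name="email_body"]')
# Parse that text as a second, independent XML document.
inner_root = ET.fromstring(email.text)
for field in inner_root.findall('field'):
    print(f'{field.attrib["name"]} -> {field.text}')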

Output:

process -> Create
model -> res.company
name -> XYZ Company
currency_id -> base.USD

score 0 · Answer 2 · edited Oct 08 '21 at 11:18

To parse an XML document with xml.dom.minidom:

Import xml.dom.minidom.
Parse the document with the parse function: doc = xml.dom.minidom.parse(file_name).
Get the list of elements with a given tag name: doc.getElementsByTagName("tag name").
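
Applied to the record from the question, these minidom steps might look like the sketch below. It is not part of the original answer: the shortened sample string, the helper name text_of, and the use of parseString (instead of parse on a file) are assumptions made to keep the example self-contained. The CDATA content is read as text and then parsed a second time.

```python
from xml.dom import Node, minidom

# Shortened version of the record from the question (assumed sample data).
xml_text = '''<record id="1" model="custom.model">
    <field name="name">Create</field>
    <field name="email_body"><![CDATA[
        <record>
            <field name="process">Create</field>
            <field name="model">res.company</field>
        </record>
    ]]></field>
</record>'''


def text_of(element):
    """Join the text and CDATA children of an element (hypothetical helper)."""
    return ''.join(
        child.data
        for child in element.childNodes
        if child.nodeType in (Node.TEXT_NODE, Node.CDATA_SECTION_NODE)
    )


# Parse the outer document and pick the field that holds the CDATA section.
outer = minidom.parseString(xml_text)
body = next(
    f for f in outer.getElementsByTagName('field')
    if f.getAttribute('name') == 'email_body'
)

# Parse the CDATA content as a second document and collect its fields.
inner = minidom.parseString(text_of(body).strip())
inner_fields = {
    f.getAttribute('name'): text_of(f)
    for f in inner.getElementsByTagName('field')
}
print(inner_fields)
```

The fields inside the CDATA section are plain text to the outer parser, so getElementsByTagName on the outer document only returns the outer fields.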

answered Oct 08 '21 at 10:17 by shashi sharma · edited Oct 08 '21 at 11:18 by Timus

score 0 · Answer 3 · answered Oct 08 '21 at 14:36

Basically, you parse the outer document with an XML parser in the normal way, navigate to the field with name="email_body", extract the string value of that element, and then pass that string to a new XML parser to be parsed again, to get the inner document.
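
The two-stage parse described above could be wrapped in a small function that returns the inner fields as a dictionary, which may be a convenient starting point for creating records from them. This is only a sketch: the function name parse_embedded_record, its field_name parameter, and the choice to return an empty dict when the field is missing or empty are assumptions, not something stated in the thread.

```python
import xml.etree.ElementTree as ET


def parse_embedded_record(outer_xml, field_name="email_body"):
    """Return the fields of the record embedded in the named field as a dict.

    Hypothetical helper: returns an empty dict if the field is missing or empty.
    """
    # First pass: parse the outer document and locate the holder field.
    outer_root = ET.fromstring(outer_xml)
    holder = outer_root.find(f'.//field[@name="{field_name}"]')
    if holder is None or not (holder.text or "").strip():
        return {}
    # Second pass: the CDATA content is plain text, so parse it again.
    inner_root = ET.fromstring(holder.text.strip())
    return {f.get("name"): f.text for f in inner_root.findall("field")}


# Shortened version of the record from the question (assumed sample data).
sample = '''<record id="1" model="custom.model">
    <field name="email_body"><![CDATA[
        <record>
            <field name="model">res.company</field>
            <field name="name">XYZ Company</field>
        </record>
    ]]></field>
</record>'''

values = parse_embedded_record(sample)
print(values)
```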
