XML generation from python reading from stdout

Question

I have the stdout put in the below form and I need to generate xml for the below output> How can I generate the xml in python. I have the subelements inside the loop reading the stdout. But it is not generating right xml. it is generating xml with only one subelement.

MAC             : 00:19:ec;dc;bc
IP              : 192.111.111.111
NAME            : 900, Charles
Download        : 36MB
Upload          : 12MB
comments        : Since total througput is very less, we cannot continue

MAC             : 00:19:ac:bc:cd:
IP              : 192.222.222.222
NAME            : 800, Babbage
Download        : 36MB
Upload          : 24MB
comments        : Since total througput is high, we can continue

I need xml in below format

<results>
   <machine>
     <MAC>00:19:ec;dc;bc</MAC>
     <ip>192.111.111.111</ip>
     <name>900, Charles</name>
     <upload>36MB</upload>
     <download>12MB</download>
     <comments>Since total througput is very less, we cannot continue</comments>
   </machine>
   <machine>
     <MAC>00:19:ac:bc:cd:</MAC>
     <ip>192.222.222.222</ip>
     <name>800, Babbage</name>
     <upload>36MB</upload>
     <download>24MB</download>
     <comments>Since total througput is high, we can continue</comments>
   </machine>
</results>

code is

results = ET.Element("results")
machine = ET.SubElement(results,"machine")
mac = ET.SubElement(machine, "mac")
ip = ET.SubElement(machine,"ip")
name = ET.SubElement(machine,"name")
download = ET.SubElement(machine, "download")
upload = ET.SubElement(machine, "upload")
comment = ET.SubElement(machine, "comment")

for line in lines.split("\n"):
         if 'MAC' in line:
                mac = line.split(":")
                stnmac.text = str(mac[1].strip())
        if 'IP' in line:
                ip = line.split(":")
                stnip.text = str(ip[1].strip())
        if 'NAME' in line:
                name = line.split(":")
                apidname.text = str(name[1].strip())
        if 'Download' in line:
                down = line.split(":")
                download.text = str(down[1].strip())
        if 'Upload' in line:
                up = line.split(":")
                upload.text = str(up[1].strip())
        if 'Comment' in line:
                user = line.split(":")
                userexp.text = str(user[1].strip())

tree = ET.ElementTree(results)
tree.write('machine.xml')

The same attributes get repeated inside the xml and if I put inside the loop, subelements, it each machine would have only one attribute in xml.

It would be easier to help you if you provided the non-working code you only describe instead of a specification for your problem that would require us to write the whole program for you. — Two-Bit Alchemist, Mar 28 '14 at 21:05

alecxe · Answer 1 · 2014-03-28T21:13:22.733

You can use BeautifulSoup. The idea is to split the string into new-lines (or read the source file line-by-line) and create a machine tag on every empty line. On every non-empty line, split the line by first :, create a tag and append to the machine tag.

Here's an working example you may start to use:

from bs4 import Tag


data = """
MAC             : 00:19:ec;dc;bc
IP              : 192.111.111.111
NAME            : 900, Charles
Download        : 36MB
Upload          : 12MB
comments        : Since total througput is very less, we cannot continue

MAC             : 00:19:ac:bc:cd:
IP              : 192.222.222.222
NAME            : 800, Babbage
Download        : 36MB
Upload          : 24MB
comments        : Since total througput is high, we can continue
"""

results = Tag(name='results')

machine = None
for line in data.split('\n'):
    if not line:
        if machine:
            results.append(machine)
        machine = Tag(name='machine')
    else:
        name, value = line.split(':', 1)
        tag = Tag(name=name.strip())
        tag.string = value.strip()
        machine.append(tag)

print results.prettify()

prints:

<results>
 <machine>
  <MAC>
   00:19:ec;dc;bc
  </MAC>
  <IP>
   192.111.111.111
  </IP>
  <NAME>
   900, Charles
  </NAME>
  <Download>
   36MB
  </Download>
  <Upload>
   12MB
  </Upload>
  <comments>
   Since total througput is very less, we cannot continue
  </comments>
 </machine>
 <machine>
  <MAC>
   00:19:ac:bc:cd:
  </MAC>
  <IP>
   192.222.222.222
  </IP>
  <NAME>
   800, Babbage
  </NAME>
  <Download>
   36MB
  </Download>
  <Upload>
   24MB
  </Upload>
  <comments>
   Since total througput is high, we can continue
  </comments>
 </machine>
</results>

PS: I don't really like the "grouping" part - doesn't look pythonic, but works.

Hope that helps.

Thanks for quick response. I am using python 2.4 and elementtree, I am bound to use 2.4 and cannot update, hence I prefered elementtree — NoviceInPython, Mar 28 '14 at 21:21
@NoviceInPython omg, you just need to update. `2.4` is very-very old. — alecxe, Mar 28 '14 at 21:21
I know, i am just using the one which comes default with centos. If I update it, it would break packages and I do not want to spend time on that — NoviceInPython, Mar 28 '14 at 21:35
@NoviceInPython you can install standalone python, or/and use virtual environments, see http://stackoverflow.com/questions/1534210/use-different-python-version-with-virtualenv — alecxe, Mar 28 '14 at 21:38

score 0 · Answer 2 · answered Mar 29 '14 at 21:34

Here's a solution using xml.etree.ElementTree:

import xml.etree.ElementTree as ET

data = """
MAC             : 00:19:ec;dc;bc
IP              : 192.111.111.111
NAME            : 900, Charles
Download        : 36MB
Upload          : 12MB
comments        : Since total througput is very less, we cannot continue

MAC             : 00:19:ac:bc:cd:
IP              : 192.222.222.222
NAME            : 800, Babbage
Download        : 36MB
Upload          : 24MB
comments        : Since total througput is high, we can continue
"""

results = ET.Element('results')

machine = None
for line in data.split('\n'):
    if not line:
        machine = ET.SubElement(results, 'machine')
    else:
        name, value = line.split(':', 1)
        tag = ET.SubElement(machine, name.strip())
        tag.text = value.strip()

print ET.tostring(results)

it prints:

<results>
    <machine>
        <MAC>00:19:ec;dc;bc</MAC>
        <IP>192.111.111.111</IP>
        <NAME>900, Charles</NAME>
        <Download>36MB</Download>
        <Upload>12MB</Upload>
        <comments>Since total througput is very less, we cannot continue</comments>
    </machine>
    <machine>
        <MAC>00:19:ac:bc:cd:</MAC>
        <IP>192.222.222.222</IP>
        <NAME>800, Babbage</NAME>
        <Download>36MB</Download>
        <Upload>24MB</Upload>
        <comments>Since total througput is high, we can continue</comments>
    </machine>
    <machine/>
</results>

XML generation from python reading from stdout

2 Answers2