I am trying to parse an XML file and print the contents of a tag as a list. Below is the XML file:

<?xml version="1.0" encoding="UTF-8"?>
<metadata>
  <groupId>org.caltesting.mt.caaspp</groupId>
  <artifactId>mt-web</artifactId>
  <version>1.0.365-20150828.172422-3</version>
  <versioning>
    <latest>1.0.373-SNAPSHOT</latest>
    <versions>
      <version>1.0.365-SNAPSHOT</version>
      <version>1.0.366-SNAPSHOT</version>
      <version>1.0.367-SNAPSHOT</version>
      <version>1.0.368-SNAPSHOT</version>
      <version>1.0.369-SNAPSHOT</version>
      <version>1.0.370-SNAPSHOT</version>
      <version>1.0.372-SNAPSHOT</version>
      <version>1.0.373-SNAPSHOT</version>
    </versions>
    <lastUpdated>20150925021611</lastUpdated>
  </versioning>
</metadata>

My Python code to parse this and print the version numbers (1.0.3XX) from the version tags inside the versions tag:

from xml.dom import minidom
xmldoc = minidom.parse('/Users/Downloads/metadata.xml')
itemlist = xmldoc.getElementsByTagName('version')
for s in itemlist:
    print(s)

Thanks!!

alecxe
Deepak Thota

1 Answer

If I understand you correctly, you may just construct a list of versions via a list comprehension:

# Each item is already an Element, so it must not be indexed with [0].
versions = [version.firstChild.nodeValue
            for version in xmldoc.getElementsByTagName('version')]
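Note that getElementsByTagName('version') on the document also matches the top-level version tag (1.0.365-20150828.172422-3), not only the ones inside versions. If only the nested ones are wanted, one option is to look up the versions element first and search within it. A minimal sketch, assuming a trimmed copy of the XML from the question inlined as a string; the helper name list_versions is hypothetical:

```python
from xml.dom import minidom

# Trimmed copy of the metadata from the question, inlined so the sketch
# runs without the file on disk (assumption: same structure as the file).
XML = """<metadata>
  <groupId>org.caltesting.mt.caaspp</groupId>
  <artifactId>mt-web</artifactId>
  <version>1.0.365-20150828.172422-3</version>
  <versioning>
    <latest>1.0.373-SNAPSHOT</latest>
    <versions>
      <version>1.0.365-SNAPSHOT</version>
      <version>1.0.366-SNAPSHOT</version>
      <version>1.0.373-SNAPSHOT</version>
    </versions>
    <lastUpdated>20150925021611</lastUpdated>
  </versioning>
</metadata>
"""


def list_versions(doc):
    """Return the text of every <version> nested inside a <versions> element."""
    result = []
    for versions in doc.getElementsByTagName("versions"):
        # Searching from the <versions> element skips the top-level <version>.
        for version in versions.getElementsByTagName("version"):
            result.append(version.firstChild.nodeValue)
    return result


doc = minidom.parseString(XML)

# Document-wide search: includes the top-level <version> as the first match.
all_versions = [v.firstChild.nodeValue
                for v in doc.getElementsByTagName("version")]

# Scoped search: only the entries listed under <versions>.
nested_versions = list_versions(doc)
print(nested_versions)
```

With the file from the question, minidom.parse('/Users/Downloads/metadata.xml') can be used in place of minidom.parseString(XML).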

alecxe
  • When I use the single line of code you mentioned here, I get an AttributeError: Element instance has no attribute '__getitem__'. But I was able to use the version[0].firstChild.nodeValue bit and solve my problem, thanks :) – Deepak Thota Sep 29 '15 at 03:47