I am new to Python, so I apologize if this has been covered before and I simply failed to apply the solution.

Here is the XML:

<?xml version="1.0" encoding="UTF-8"?>
<project xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd" xmlns="http://maven.apache.org/POM/4.0.0"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <modelVersion>4.0.0</modelVersion>
  <groupId>DEFAULT</groupId>
  <artifactId>ADP_ServiceTechnology-JRG_Testing</artifactId>
  <version>2.0.31</version>
  <dependencies>
    <dependency>
      <groupId>DEFAULT</groupId>
      <artifactId>ADP Standard Operations</artifactId>
      <version>2.2.86.17-SNAPSHOT</version>
    </dependency>
    <dependency>
      <groupId>DEFAULT</groupId>
      <artifactId>Base</artifactId>
      <version>1.9.0-SNAPSHOT</version>
    </dependency>
    <dependency>
      <groupId>DEFAULT</groupId>
      <artifactId>Databases</artifactId>
      <version>[1.1.0]</version>
    </dependency>
    <dependency>
      <groupId>DEFAULT</groupId>
      <artifactId>HPE Solutions</artifactId>
      <version>[1.8.2]</version>
    </dependency>
    <dependency>
      <groupId>DEFAULT</groupId>
      <artifactId>Business Applications</artifactId>
      <version>[1.3.0]</version>
    </dependency>
    <dependency>
      <groupId>DEFAULT</groupId>
      <artifactId>Operating Systems</artifactId>
      <version>[1.3.0]</version>
    </dependency>
  </dependencies>
</project>

I successfully import the data with:

import xml.etree.ElementTree as ET
tree = ET.parse('pom.xml')
root = tree.getroot()

I just need to iterate through the tree and retrieve the <artifactId> and <version> values. I have tried numerous methods found on the web with no luck. It was simple for me with PHP and XPath, but Python has me stumped.

This:

for elem in tree.iter():
  print("%s: '%s'" % (elem.tag, elem.text))

prints every element's tag and text, but I want to navigate to just the two elements I mentioned.

Thanks in advance!

meejo57

1 Answer

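The document declares a default namespace (xmlns="http://maven.apache.org/POM/4.0.0"), so ElementTree stores every tag as {http://maven.apache.org/POM/4.0.0}name. You can find the elements if you prepend that namespace to the tag name: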

import xml.etree.ElementTree as ET
tree = ET.parse("t.xml")
root = tree.getroot()

namesp = root.tag.replace("project", "")  # get the "{namespace-uri}" prefix from the root tag

version = root.find(namesp+"version")
artifactId = root.find(namesp+"artifactId")

print(version.text)
print(artifactId.text)

Output:

2.0.31
ADP_ServiceTechnology-JRG_Testing 
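
The lookup above returns only the project's own <artifactId> and <version>. If the values of each <dependency> are wanted as well, something along these lines should work. It is a minimal sketch, not the only way to do it: the XML is a trimmed copy of the question's pom.xml inlined as a string so the example runs on its own, and the prefix "m" and the helper name list_dependencies are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the pom.xml from the question, inlined so the sketch is self-contained.
POM = """<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>
  <groupId>DEFAULT</groupId>
  <artifactId>ADP_ServiceTechnology-JRG_Testing</artifactId>
  <version>2.0.31</version>
  <dependencies>
    <dependency>
      <groupId>DEFAULT</groupId>
      <artifactId>Base</artifactId>
      <version>1.9.0-SNAPSHOT</version>
    </dependency>
    <dependency>
      <groupId>DEFAULT</groupId>
      <artifactId>Databases</artifactId>
      <version>[1.1.0]</version>
    </dependency>
  </dependencies>
</project>"""

root = ET.fromstring(POM)

# "m" is an arbitrary prefix chosen for this example; only the URI must match the document.
ns = {"m": "http://maven.apache.org/POM/4.0.0"}


def list_dependencies(root):
    """Return (artifactId, version) pairs for every <dependency> element."""
    pairs = []
    for dep in root.findall("m:dependencies/m:dependency", ns):
        pairs.append((dep.findtext("m:artifactId", namespaces=ns),
                      dep.findtext("m:version", namespaces=ns)))
    return pairs


for artifact_id, version in list_dependencies(root):
    print("%s: %s" % (artifact_id, version))
```

With a file on disk, root = ET.parse("pom.xml").getroot() can replace the ET.fromstring(POM) line; the rest stays the same.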

You can find more general information here: Parsing XML in Python using ElementTree example

and in the documentation: https://docs.python.org/3/library/xml.etree.elementtree.html (switch to Python 2 at the top of the page)

If you want to strip the namespace from your data, see https://stackoverflow.com/a/25920989/7505395, an answer to "Python ElementTree module: How to ignore the namespace of XML files to locate matching element when using the method 'find', 'findall'".
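
As a rough illustration of what stripping the namespace can look like, the sketch below rewrites the tags in place after parsing. It is not taken from the linked answer; the helper name strip_namespaces and the shortened sample document are assumptions made for this example, and it only touches element tags, not namespaced attributes.

```python
import xml.etree.ElementTree as ET


def strip_namespaces(root):
    """Remove the "{uri}" prefix from every element tag in the tree, in place."""
    for elem in root.iter():
        # Namespaced tags are stored as "{uri}name"; keep only the part after "}".
        if isinstance(elem.tag, str) and elem.tag.startswith("{"):
            elem.tag = elem.tag.split("}", 1)[1]
    return root


# Shortened stand-in for the pom.xml from the question.
SAMPLE = """<project xmlns="http://maven.apache.org/POM/4.0.0">
  <artifactId>ADP_ServiceTechnology-JRG_Testing</artifactId>
  <version>2.0.31</version>
  <dependencies>
    <dependency>
      <artifactId>Base</artifactId>
      <version>1.9.0-SNAPSHOT</version>
    </dependency>
  </dependencies>
</project>"""

root = strip_namespaces(ET.fromstring(SAMPLE))

# After stripping, plain tag names and paths work without any prefix.
print(root.tag)
print(root.find("version").text)
print(root.find("dependencies/dependency/artifactId").text)
```

The trade-off is that the namespace information is lost, so this is best kept to read-only processing where the tree is not written back out.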

Patrick Artner