
I am trying to get specific data from an XML file, namely X, Y coordinates that appear, to my beginner's eyes, to be attributes of an element called "Point" in my file. I cannot get to that data with anything other than a sledgehammer approach and would gratefully accept some help.

I have used the following successfully:

for Shooter in root.iter('Shooter'):
    print(Shooter.attrib)

But if I try the same with "Point" (or "Points") there is no output. I cannot even see "Point" when I use the following:

for child in root:
    print(child.tag, child.attrib)

So: the sledgehammer

print([elem.attrib for elem in root.iter()])

Which gives me the attributes for every element. This file is a single collection of data and could contain hundreds of data points and so I would rather try to be a little more subtle and home in on exactly what I need.

My XML file https://pastebin.com/abQT3t9k

UPDATE: Thanks for the answers so far. I tried the solution posted and ended up with 7000 lines of output, which wasn't quite what I was after. I should have explained in more detail. I also tried (as suggested):

def find_rec(node, element, result):
    for item in node.findall(element):
        result.append(item)
        find_rec(item, element, result)
        return result

print(find_rec(ET.parse(filepath_1), 'Shooter', [])) #Returns <Element 'Shooter' at 0x125b0f958>
print(find_rec(ET.parse(filepath_1), 'Point', []))   #Returns None

I admit I have never worked with XML files before, and I am new to Python (but enjoying it). I wanted to get the solution myself but I have spent days getting nowhere.

I perhaps should have just asked from the beginning how to extract the XY data for each ShotNbr (in this file there is just one) but I didn't want code written for me.

I've managed to get the XY from this file but my code will never work if there is more than one shot, or if I want to specifically look at, say, shot number 20.

How can I find shot number 2 (ShotNbr="2") and extract only its XY data points?

owen b
  • Welcome to SO! In order to get all the help you need, you need to give full context: include your imports. It is not clear if you are using xml.etree.ElementTree. – xor007 Apr 11 '20 at 05:42
  • see https://stackoverflow.com/questions/30097949/elementtree-findall-to-recursively-select-all-child-elements/45588388 for a solution – xor007 Apr 11 '20 at 05:43

1 Answer


Assuming that you are using xml.etree.ElementTree, a loop such as "for child in root" only looks at the direct children of root.

You need to recurse into the tree to access elements lower in the hierarchy.

This seems to be the same problem as ElementTree - findall to recursively select all child elements

which has an excellent answer that I am not going to plagiarize.

Just apply it.
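For illustration, here is a minimal sketch of that kind of recursion. The sample XML is invented (I have not reproduced your file), so the element names Match, Shot and Points and the attribute values are assumptions; only Shooter, Point, X, Y and ShotNbr come from your question. Note that the find_rec in the question calls node.findall(element), which only returns direct children with that tag, and it returns from inside the loop, so it can never reach a Point nested under other elements. The variant below walks every child instead:

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the real file; the layout is an assumption.
SAMPLE = """<Match>
  <Shooter Name="A">
    <Shot ShotNbr="1">
      <Points>
        <Point X="1.0" Y="2.0"/>
        <Point X="3.0" Y="4.0"/>
      </Points>
    </Shot>
  </Shooter>
</Match>"""


def find_rec(node, tag, result=None):
    """Collect every descendant of `node` whose tag equals `tag`."""
    if result is None:
        result = []
    for child in node:                # visit every direct child, whatever its tag
        if child.tag == tag:
            result.append(child)
        find_rec(child, tag, result)  # then descend into that child
    return result                     # return only after the loop has finished


root = ET.fromstring(SAMPLE)
direct = [child.tag for child in root]  # direct children of root only
points = find_rec(root, "Point")        # Point elements at any depth

print(direct)
print([p.attrib for p in points])
```

In this sample, the direct children of root do not include Point at all, which matches what you describe seeing with "for child in root".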

Alternatively,

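import xml.etree.ElementTree as ET
root = ET.parse("file.xml").getroot()  # parse() returns an ElementTree; getroot() gives its root element
print(root.findall('.//Point'))        # './/Point' matches Point elements at any depth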

Should work.

See: https://docs.python.org/2/library/xml.etree.elementtree.html#supported-xpath-syntax
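To pick out a single shot, the same XPath subset also supports attribute predicates. The following is only a sketch under assumptions: I am guessing that ShotNbr is an attribute on some element that encloses the Point elements, and that X and Y are numeric attributes of Point. The sample XML and the helper name points_for_shot are made up for the example. Using "*" in the path avoids having to know the name of the element that carries ShotNbr:

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; the real file's layout may differ.
SAMPLE = """<Match>
  <Shooter Name="A">
    <Shot ShotNbr="1">
      <Points>
        <Point X="1.0" Y="2.0"/>
      </Points>
    </Shot>
    <Shot ShotNbr="2">
      <Points>
        <Point X="5.0" Y="6.0"/>
        <Point X="7.0" Y="8.0"/>
      </Points>
    </Shot>
  </Shooter>
</Match>"""


def points_for_shot(root, shot_nbr):
    """Return (x, y) tuples for the Points below the element with this ShotNbr."""
    # './/*[@ShotNbr="2"]' finds the first element at any depth with that attribute value.
    shot = root.find(f".//*[@ShotNbr='{shot_nbr}']")
    if shot is None:
        return []  # no such shot in the file
    # iter('Point') walks all descendants of the shot, not just direct children.
    return [(float(p.get("X")), float(p.get("Y"))) for p in shot.iter("Point")]


root = ET.fromstring(SAMPLE)
shot_two = points_for_shot(root, 2)
print(shot_two)
```

With a real file you would replace ET.fromstring(SAMPLE) with ET.parse(path).getroot(). If the attribute or tag names differ in your file, adjust the path accordingly.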

xor007