Python XML parsing with etree

Question

I'm trying to follow some of the other XML parsing questions already posted here, but my XML seems to be somewhat unusual. I'm trying to parse https://packages.cloud.google.com/yum/repos/kubernetes-el7-x86_64/repodata/primary.xml

I tried to do something like:

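import urllib.request
# Assumption: the standard-library ElementTree (the original post does not show its imports)
import xml.etree.ElementTree as etree

url = 'https://packages.cloud.google.com/yum/repos/kubernetes-el7-x86_64/repodata/primary.xml'

# Download the document and parse it straight from the response object
opener = urllib.request.build_opener()
tree = etree.parse(opener.open(url))
root = tree.getroot()

# Print the tag and attributes of every direct child of the root element
for child in root:
    print(child.tag, child.attrib)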

But this prints a line like the following for every child: {http://linux.duke.edu/metadata/common}package {'type': 'rpm'}

I don't understand why the child's tag includes the "{http://linux.duke.edu/metadata/common}" part.

*"I dont get why the child's tag includes the "{http://linux.duke.edu/metadata/common}" part."* - Because the elements are in the `http://linux.duke.edu/metadata/common` namespace, and that prefix is ElementTree's way of telling you. (Before you ask, no, you can't get rid of it.) Ask the question you really want to ask. — Tomalak, Sep 08 '20 at 08:16
OK, thanks. Then the question would be: how do I iterate over the package elements? What I really need from that XML is the location URL. I tried to get it with `for location in root.iter('location')`, but that doesn't seem to work. — embedded, Sep 08 '20 at 08:41
One way: you can prefix the tag name with the namespace, just as ElementTree does. There are countless examples on this site alone of how to work with a default namespace (that is what this is called) in ElementTree; take a look around. — Tomalak, Sep 08 '20 at 08:47
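
The suggestion in the last comment could look roughly like the sketch below. It is not taken from the thread: the inline `SAMPLE` document is a hypothetical, heavily trimmed stand-in for primary.xml, the `href` attribute on `location` and the helper name `location_hrefs` are assumptions for illustration, and only the namespace URI comes from the output shown in the question. It uses the standard-library ElementTree and shows two ways to name a namespaced element: writing the `{namespace}tag` form directly, or passing a prefix map to `findall`.

```python
import xml.etree.ElementTree as ET

# Namespace URI copied from the tag printed in the question.
COMMON = "http://linux.duke.edu/metadata/common"
# The prefix "common" is an arbitrary label chosen for this sketch.
NS = {"common": COMMON}

# Hypothetical stand-in for primary.xml; element and attribute names are assumed.
SAMPLE = """
<metadata xmlns="http://linux.duke.edu/metadata/common" packages="2">
  <package type="rpm">
    <name>example-a</name>
    <location href="pool/example-a.rpm"/>
  </package>
  <package type="rpm">
    <name>example-b</name>
    <location href="pool/example-b.rpm"/>
  </package>
</metadata>
"""


def location_hrefs(root):
    """Collect the href of every location element, in two equivalent ways."""
    # Way 1: spell out the namespace in braces, the same form ElementTree prints.
    braces = [loc.get("href") for loc in root.iter("{" + COMMON + "}location")]
    # Way 2: use a prefix in the path and let the NS mapping resolve it.
    mapped = [
        loc.get("href")
        for loc in root.findall("common:package/common:location", NS)
    ]
    return braces, mapped


root = ET.fromstring(SAMPLE.strip())
braces, mapped = location_hrefs(root)
print(braces)
print(mapped)
```

With the real file, `root` would come from `tree.getroot()` as in the question. A plain `root.iter('location')` finds nothing because no element has the bare tag `location`; every tag carries the namespace.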
