0

I'm parsing an xml file and using xml.etree.ElementTree module. Excerpt of the file I'm using is below.

I have the absolute path for ZIP Code, and many others. The path for ZIP Code is: "Return/ReturnHeader/Filer/USAddress/ZIPCd". I'm trying to get the ZIP Code out of this xml, but couldn't get that. (I'm preferring this way of writing the path, as there are duplicates and it makes easier for me to get all other values as well).

Sample XML looks like this:

<?xml version="1.0" encoding="utf-8"?>
<Return xmlns="http://www.irs.gov/efile" xsi:schemaLocation="http://www.irs.gov/efile" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" returnVersion="2014v5.0">
    <ReturnHeader binaryAttachmentCnt="0">
    <ReturnTs>2016-11-14T11:59:09-06:00</ReturnTs>
      <Filer>
        <EIN>942788907</EIN>
        <PhoneNum>9162866665</PhoneNum>
        <USAddress>
           <CityNm>SACRAMENTO</CityNm>
           <StateAbbreviationCd>CA</StateAbbreviationCd>
           <ZIPCd>95833</ZIPCd>
        </USAddress>
      </Filer>
    </ReturnHeader>
</Return>

Any help is appreciated.

import xml.etree.ElementTree as ET
tree = ET.parse('201533089349301428.xml')
root = tree.getroot()

zipn = root.find("Return/ReturnHeader/Filer/USAddress/ZIPCd")
zip = zipn.text

print(zip)
Tomalak
  • 332,285
  • 67
  • 532
  • 628
  • 1
    I see what you mean. I have added missing lines to the xml. – user18379466 Apr 08 '22 at 14:18
  • Thanks for the comment. This is how all my xmls are structured, and I can't edit them. I'm not sure what you mean by consider the namespace. – user18379466 Apr 08 '22 at 15:02
  • Even if I take the name space out, it wasn't returning the text. – user18379466 Apr 08 '22 at 15:25
  • To your point, I've also excluded name space from the xml as well. – user18379466 Apr 08 '22 at 16:00
  • Going back to your namespace comment, is there any module that I can use other than ETree to avoid worrying about namespace. – user18379466 Apr 08 '22 at 19:24
  • You don't want to "avoid worrying about the namespace". You want to write proper code that uses the namespace correctly. There are - literally - thousands of answers on this site that show you how to do that with Python and ElemenTree. You have not searched. Also, next time you ask an XML question, stop making arbitrary changes to your XML just because you think it doesn't matter. If you want answers that work (and not waste your, but **much** more importantly: everybody else's) time, show the XML you actually have. – Tomalak Apr 09 '22 at 06:24

0 Answers0