
I am trying to loop through an XML document, find some tags, combine them into a new tag, and then write the result back to the XML document using the ElementTree module in Python.

I have the code to the point where I believe it would work, but when I get to the part that writes the file I receive an error:

AttributeError: '_IterParseIterator' object has no attribute 'write'

The file I am trying to parse is 120 MB, so I figured that using iterparse would be more efficient. It is also what I am more familiar with.

import xml.etree.ElementTree as ET  # imports the ElementTree module for working with XML
import pprint
from collections import defaultdict


def is_tigerbase(elem):
    return (elem.tag =="tag") and (elem.attrib['k'] == "tiger:name_base")

def is_tigertype(elem):
    return (elem.tag =="tag") and (elem.attrib['k'] == "tiger:name_type")




def audit():
    tree = ET.iterparse('map')
    base = 0
    t_type = 0
    for event, elem in tree:
        #look for all nodes and ways 
        if elem.tag == "node" or elem.tag == "way":
            #loop through all the tags
            for tag in elem.iter('tag'):
                #if the tag is a tiger base then change the base value to 1
                #also get the v attribute and put it in the basedetail var
                #then stop the loop
                if is_tigerbase(tag):
                    base == 1
                    if 'v' in elem.attrib:
                        basedetail = elem.attrib['v']
                    break
            #loop through all the tags again
            for tag in elem.iter('tag'):
                #look for the tiger type tag, if there is one change the base
                #value to 1 and get the v attribute for the detail
                #stop the loop
                if is_tigertype(tag):
                    t_type == 1
                    if 'v' in elem.attrib:
                        t_typedetail = elem.attrib['v']
                    break
            #look to see if you had a base and a type and get ready to create
            #the new tag
            if base == 1 and t_type == 1:
                new = basedetail + " " + t_typedetail
                ET.SubElement(elem, "tag", k="addr:street", v=new)
                print(new)
            elif base == 1 and ttype == 0:
                new = basedetail
                ET.SubElement(elem, "tag", k="addr:street", v=new)
                print(new)
            base = 0
            ttype = 0

    tree.write('map')

audit()

A small sample of the XML file I am parsing:

<?xml version="1.0" encoding="UTF-8"?>
<osm version="0.6" generator="Overpass API 0.7.55.3 9da5e7ae">
<note>The data included in this document is from www.openstreetmap.org. The data is made available under ODbL.</note>
<meta osm_base="2018-06-22T21:32:02Z"/>

  <bounds minlat="28.3156000" minlon="-81.6952000" maxlat="28.4497000" maxlon="-81.4257000"/>

  <node id="26794208" lat="28.3306444" lon="-81.5475040" version="14" timestamp="2014-07-07T10:17:59Z" changeset="24000940" uid="14293" user="KindredCoda"/>
  <node id="26794209" lat="28.3612495" lon="-81.5194078" version="17" timestamp="2014-07-05T01:17:25Z" changeset="23960255" uid="14293" user="KindredCoda"/>
  <node id="26794210" lat="28.3822849" lon="-81.5005573" version="25" timestamp="2018-02-26T21:48:01Z" changeset="56704055" uid="4018842" user="Stephen214">
    <tag k="highway" v="motorway_junction"/>
    <tag k="old_ref" v="27"/>
    <tag k="ref" v="68"/>
  </node>
  <way id="596852739" version="1" timestamp="2018-06-12T09:57:29Z" changeset="59771511" uid="5659851" user="marthaleena">
    <nd ref="5289076747"/>
    <nd ref="5126801577"/>
    <tag k="HFCS" v="Urban Collector"/>
    <tag k="highway" v="unclassified"/>
    <tag k="lanes" v="2"/>
    <tag k="name" v="Polynesian Isles Boulevard"/>
    <tag k="tiger:cfcc" v="A41"/>
    <tag k="tiger:county" v="Osceola, FL"/>
    <tag k="tiger:name_base" v="Polynesian Isles"/>
    <tag k="tiger:name_type" v="Blvd"/>
    <tag k="tiger:reviewed" v="no"/>
    <tag k="tiger:zip_left" v="34746"/>
    <tag k="tiger:zip_right" v="34746"/>
  </way>
Sam L
  • I'm not so familiar with this library, but looking at the Python document, `ET.iterparse()` returns a tuple `(event, elem)` whereas `write` needs to act on an element tree. The docs show that write can be used on the value returned by `ET.parse()`. [Documentation](https://docs.python.org/2/library/xml.etree.elementtree.html) – Miket25 Jun 24 '18 at 23:04
  • Is this related to your previous [question](https://stackoverflow.com/q/50998718/1422451) with same desired output? Consider XSLT whenever you need to manipulate XML which Python can run with its third-party module, `lxml`. See [XSLT Fiddle](https://xsltfiddle.liberty-development.net/6qVRKwb). – Parfait Jun 25 '18 at 03:05
  • @Parfait yes it is in relation. I've kinda fixed this issue by switching the way I am doing it. – Sam L Jun 25 '18 at 12:08
  • Again, I would recommend XSLT when manipulating XML files which uses no `for` looping and `if` logic and can handle 120 MB (GBs is a different story!). Did you see the fiddle's result (lower left)? – Parfait Jun 25 '18 at 14:17
  • @parfait, thanks, while that is great this is for a project where I have to use python scripts to manipulate the data. and have to attach the .py files – Sam L Jun 25 '18 at 20:44
  • Python's third-party module, `lxml`, can run XSLT scripts. See [example here](https://stackoverflow.com/a/48978545/1422451). – Parfait Jun 25 '18 at 21:40

1 Answer

iterparse() returns an iterator that yields (event, elem) tuples, not an ElementTree, so it has no write() method and you cannot write to a document the same way as with the object returned by parse(). Switching my code to use parse() resolved the issue.

    import xml.etree.ElementTree as ET

    # parse() returns an ElementTree, which has write(); 'map' is the input file from the question
    tree = ET.parse('map')
    root = tree.getroot()
    for way in root.findall(".//way"):
        # reset the flags for every way so values never carry over from a previous one
        kbool = False
        tbool = False
        for key in way.iterfind(".//tag"):
            if key.attrib['k'] == "tiger:name_base":
                kbool = True
                base = key.attrib['v']
            if key.attrib['k'] == "tiger:name_type":
                tbool = True
                ttype = key.attrib['v']
        # pass k and v as attributes; the element name itself must be just 'tag'
        if kbool and tbool:
            ET.SubElement(way, 'tag', k="addr:street", v="{} {}".format(base, ttype))
        elif kbool:
            ET.SubElement(way, 'tag', k="addr:street", v=base)

    tree.write('maps')
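
If you would rather keep iterparse(), one possible approach is to capture the root element from the first "start" event and wrap it in an ElementTree once the loop finishes, which gives you an object that does have write(). The sketch below is only an illustration under a few assumptions: the `add_street_tags` helper and the inline `SAMPLE` data are made up for the example, and it handles both node and way elements as the question's code intended. Because no elements are cleared along the way, the whole tree still ends up in memory, so this is unlikely to save memory compared with parse().

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical miniature of the OSM file from the question.
SAMPLE = b"""<osm version="0.6">
  <way id="596852739">
    <tag k="tiger:name_base" v="Polynesian Isles"/>
    <tag k="tiger:name_type" v="Blvd"/>
  </way>
  <way id="1">
    <tag k="tiger:name_base" v="Main"/>
  </way>
  <node id="26794210">
    <tag k="highway" v="motorway_junction"/>
  </node>
</osm>"""


def add_street_tags(source):
    """Stream through source and return an ElementTree with addr:street tags added."""
    root = None
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            if root is None:
                root = elem  # the first "start" event belongs to the document root
            continue
        if elem.tag not in ("node", "way"):
            continue
        # Map k -> v for the direct <tag> children of this node or way.
        tags = {t.get("k"): t.get("v") for t in elem.findall("tag")}
        base = tags.get("tiger:name_base")
        if base is not None:
            name_type = tags.get("tiger:name_type")
            street = base if name_type is None else base + " " + name_type
            ET.SubElement(elem, "tag", k="addr:street", v=street)
    # Wrapping the root gives an object with a write() method.
    return ET.ElementTree(root)


tree = add_street_tags(io.BytesIO(SAMPLE))
streets = [t.get("v") for t in tree.getroot().iter("tag")
           if t.get("k") == "addr:street"]
print(streets)
```

With a real file you would pass the file name instead of the BytesIO object and then call tree.write() with the output path.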
Sam L