
I downloaded some data from OpenStreetMap and have been sorting it so that I only keep the nodes and ways I need for my project (highways and the nodes they reference). To sort the XML file and create a new one, I use the Pyosmium library. Everything works, except that I can't parse the new XML file with xml.etree.ElementTree. When I sort my data into a new file, I am not carrying over the bounds that contain the min and max longitude and latitude. If I manually copy in the bounds, it parses.

I read through the Pyosmium docs and only found osmium.io.Reader and osmium.io.Header, as well as some geometry attributes that describe the box (which contains what I need), but I found nothing on how to get it from my file and write it to the new one with my writer.

So far, this is what I have in my main method, which just handles the nodes and ways using SimpleHandlers:

    wayHandler = XMLhandlers.StreetHandler()
    nodeHandler = XMLhandlers.NodeHandler()
    wayHandler.apply_file('data/map_2.osm')
    nodeHandler.apply_file('data/map_2.osm')

    if os.path.exists('data/map_2_TEST.osm'):
        os.remove('data/map_2_TEST.osm')


    writer = XMLhandlers.wayWriter('data/map_2_TEST.osm')
    writer.apply_file('data/map_2.osm')

    tree = ET.parse('data/map_2_TEST.osm')

This produces the following error:

xml.etree.ElementTree.ParseError: no element found: line 1, column 0

Pastebin of the original XML file: https://pastebin.com/i8uyCneC

Pastebin of the sorted XML file that won't parse: https://pastebin.com/WZUcsZg4

EDIT: The error is not in the parsing itself. If I comment out the part that generates the new XML and only parse the new XML file (generated beforehand), it works for some reason.

EDIT 2: The error was that I forgot to call close() on my SimpleWriter to flush the remaining buffers and close the writer.

Skovgaard
  • Sorry, I added the line for parsing the tree, the error it produces, as well as pastebins of the two XML files. – Skovgaard Feb 15 '23 at 15:18
  • I can't find anything wrong with any of the two XML files. – mzjn Feb 15 '23 at 15:34
  • That's the issue. For some reason it parses if I manually copy the line from the original XML: '' into the new XML, but I can't find a way to do that with the libraries used. – Skovgaard Feb 15 '23 at 18:59
  • It's valid XML, and `chardet` said {'encoding': 'utf-8', 'confidence': 0.99, 'language': ''}, but if you try to parse it with `xml.etree.ElementTree`, it tells you the rows and columns where ET has problems. If you remove the special characters (# Uhrhøj, “Margueritruten”, "Sønderborg"), the parsing works. I don't have pyosmium, but maybe it's the same issue? – Hermann12 Feb 15 '23 at 20:20
  • Those are all part of the original XML file. I'm trying to parse the second one, where those are already sorted out. – Skovgaard Feb 15 '23 at 20:27

1 Answer


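The issue occurs because the code never closes the writer when it is done. Calling writer.close() flushes the remaining buffers and closes the writer. Until then, the output file can still be empty or incomplete on disk, which matches the "no element found: line 1, column 0" error.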

The following code has the line added, and the tree parses as expected.

    wayHandler = XMLhandlers.StreetHandler()
    nodeHandler = XMLhandlers.NodeHandler()
    wayHandler.apply_file('data/map_2.osm')
    nodeHandler.apply_file('data/map_2.osm')

    if os.path.exists('data/map_2_TEST.osm'):
        os.remove('data/map_2_TEST.osm')

    writer = XMLhandlers.wayWriter('data/map_2_TEST.osm')
    writer.apply_file('data/map_2.osm')
    writer.close()  # flush remaining buffers before parsing the output file
    
    tree = ET.parse('data/map_2_TEST.osm')
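
The same effect can be reproduced without Pyosmium. The following is only a minimal sketch using the standard library, under the assumption that a buffered writer behaves like an ordinary buffered Python file: a small write stays in memory until close() is called, so parsing the file before that sees an empty file. The function name `parse_before_and_after_close` and the sample XML are made up for illustration and are not part of the original code.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def parse_before_and_after_close(xml_text):
    """Return (parse error before close() or None, root tag after close())."""
    path = os.path.join(tempfile.mkdtemp(), "demo.osm")

    # Open with default buffering; a small write is held in memory.
    out = open(path, "w", encoding="utf-8")
    out.write(xml_text)

    # Parsing now reads whatever is on disk, which is typically nothing yet.
    try:
        ET.parse(path)
        error_before = None
    except ET.ParseError as exc:
        error_before = str(exc)

    # close() flushes the buffer, so the complete document reaches the file.
    out.close()
    root_tag = ET.parse(path).getroot().tag
    return error_before, root_tag


sample = '<osm version="0.6"><node id="1" lat="55.0" lon="9.0"/></osm>'
result = parse_before_and_after_close(sample)
print(result)
```

On CPython the first element of the result should be a "no element found" message like the one in the question, and the second element the root tag once the file has been closed.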
Skovgaard