
How do I modify the following XML snippet in Python

<routes xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="http://sumo.dlr.de/xsd/routes_file.xsd">
    <vType id="car1_73" length="4.70" minGap="1.00" maxSpeed="12.76" probability="0.00" vClass="passenger" guiShape="passenger/van">
        <carFollowing-Krauss accel="2.40" decel="4.00" sigma="0.55"/>
    </vType>
    <vehicle id="0" type="vTypeDist" depart="0.00" departLane="best" departPos="random" departSpeed="random">
        <routeDistribution last="1">
            <route cost="108.41" probability="0.44076116" edges="bottom7to7/0 7/0to6/0 6/0to6/1 6/1to5/1 5/1to5/2 5/2to6/2"/>
            <route cost="76.56" probability="0.55923884" edges="bottom7to7/0 7/0to6/0 6/0to5/0 5/0to5/1 5/1to5/2 5/2to6/2"/>
        </routeDistribution>
    </vehicle>
</routes>

so that the resulting one looks like this:

<routes xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="http://sumo.dlr.de/xsd/routes_file.xsd">
    <vehicle id="0" type="vTypeDist" depart="0.00" departLane="best" departPos="random" departSpeed="random">
        <route edges="bottom7to7/0 7/0to6/0 6/0to5/0 5/0to5/1 5/1to5/2 5/2to6/2"/>
    </vehicle>
</routes>

Basically, the following has to be done:

  • remove the <vType> element (and the <carFollowing...> elements within it) completely,
  • remove the <routeDistribution...> element,
  • create a <route> element that keeps only the edges attribute of the last <route> within the <routeDistribution...> element.

EDIT: Here I provide my version using xml.etree.ElementTree. Why all the downvotes though... :/

import xml.etree.ElementTree as ET


if __name__ == "__main__":

    tree = ET.parse('total-test.xml')
    root = tree.getroot()

    # remove each <vType>; its <carFollowing-*> children are removed with it
    # (findall returns a list, so removing from root while looping is safe)
    for vType in root.findall("vType"):
        root.remove(vType)

    # from root get into <vehicle>
    for vehicle in root.findall("vehicle"):
        # for each <vehicle> get references to its <routeDistribution>s
        for routeDist in vehicle.findall("routeDistribution"):
            # for each route distribution get references to its <route>s
            routes = routeDist.findall("route")
            if routes:
                # pick the route with the lowest cost (compare as floats, not as strings)
                best = min(routes, key=lambda route: float(route.get('cost')))

                # remove the other attributes of the <route>, we only want the edges
                edges = best.get('edges')
                best.attrib.clear()
                best.set('edges', edges)

                # move the route one level up to <vehicle>
                routeDist.remove(best)
                vehicle.append(best)

            # remove the <routeDistribution> (and the routes left in it)
            vehicle.remove(routeDist)
        vehicle.set('type', 'vTypeDist')

    tree.write('output.xml')
Kristof Pal
  • You can have a look here: https://wiki.python.org/moin/MiniDom – Stefano Apr 22 '16 at 11:29
  • @Stefano I have not tried much so far, as I am not familiar with XML-related things in Python. So I am open to suggestions. – Kristof Pal Apr 22 '16 at 11:31
  • Even though I think you should have made a bit more effort before simply asking around for someone to write the script for you, I have posted below some "quick and dirty" code to get you started. – Stefano Apr 22 '16 at 12:05
  • I agree with you... I started playing around with ElementTree... any reason you chose minidom over it? – Kristof Pal Apr 22 '16 at 13:33
  • Not really... dom / minidom are just the more classic libraries. I also used to use lxml2, which is very fast. LXML is yet another; I think it is based on libxml2 but should give a nicer interface. – Stefano Apr 22 '16 at 13:48

1 Answer


You probably need something a bit more generic. The following script takes your input (in.xml) and generates the new output (out.xml). This is certainly not great code, but it can get you started with the syntax and help you generalize it for your needs.

from xml.dom.minidom import parse

dom = parse("in.xml")   # parse an XML file
docRoot = dom.documentElement

# delete all vType nodes (their carFollowing children go with them)
for vTypeNode in docRoot.getElementsByTagName('vType'):
    vTypeNode.parentNode.removeChild(vTypeNode)

# keep only the last route node, which is the one in the expected output...
# but I am not sure this will always be the right one
routeNode = docRoot.getElementsByTagName('route')[-1]

# remove all old element children of the vehicle node
# (loop over a copy: removing from childNodes while iterating skips nodes)
vehicleNode = docRoot.getElementsByTagName('vehicle')[0]
for child in list(vehicleNode.childNodes):
    if child.nodeType == child.ELEMENT_NODE:
        vehicleNode.removeChild(child)

# create a new route node that carries only the edges attribute
newRouteNode = dom.createElement("route")
newRouteNode.setAttribute("edges", routeNode.getAttribute("edges"))

# append new node
vehicleNode.appendChild(newRouteNode)

# print output
# print(dom.toprettyxml())

# write to file (text mode, because writexml writes strings)
with open("out.xml", "w") as outFile:
    dom.writexml(outFile)

N.B.: this is just a quick and dirty version to get you started!
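For files with more than one vehicle, the same three steps could be sketched as a small function. This is only a sketch using xml.etree.ElementTree: the function name simplify_routes is made up for illustration, and it assumes that the last attribute of <routeDistribution> is a zero-based index of the route to keep, which matches the example in the question but is not confirmed there.

```python
import xml.etree.ElementTree as ET


def simplify_routes(xml_text):
    """Return the routes XML with vTypes dropped and one plain route per vehicle.

    Hypothetical helper for illustration only.
    """
    root = ET.fromstring(xml_text)

    # Drop every <vType>; nested <carFollowing-*> elements go with it.
    for v_type in root.findall("vType"):
        root.remove(v_type)

    for vehicle in root.findall("vehicle"):
        for dist in vehicle.findall("routeDistribution"):
            routes = dist.findall("route")
            vehicle.remove(dist)
            if not routes:
                continue
            # Assumption: 'last' is a zero-based index into the routes.
            # Fall back to the final route if it is missing or out of range.
            index = int(dist.get("last", len(routes) - 1))
            chosen = routes[index] if 0 <= index < len(routes) else routes[-1]
            # New <route> element carrying only the edges attribute.
            ET.SubElement(vehicle, "route", {"edges": chosen.get("edges", "")})

    return ET.tostring(root, encoding="unicode")
```

Reading the file with open("in.xml").read(), passing the text to simplify_routes and writing the returned string to out.xml would give the same workflow as the script above. Building a fresh <route> element avoids having to strip the cost and probability attributes from the old one.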

EDIT:

minidom output is always quite dirty, as it contains a lot of useless whitespace. This is a well-known problem, but it can easily be fixed in different ways. You might be interested in having a look here:

problem with the new lines when I use toprettyxml()
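One simple way to deal with the whitespace, as a rough sketch: pretty-print with toprettyxml() and then drop the lines that contain only whitespace. The helper name pretty_without_blank_lines is made up for illustration, and this is just one of several possible fixes, not necessarily the one described in the linked question.

```python
from xml.dom.minidom import parseString


def pretty_without_blank_lines(xml_text, indent="    "):
    """Pretty-print XML with minidom and drop whitespace-only lines.

    Hypothetical helper for illustration only.
    """
    dom = parseString(xml_text)
    pretty = dom.toprettyxml(indent=indent)
    # Whitespace text nodes from the input end up as whitespace-only lines
    # in the pretty-printed output, so filter those lines out.
    return "\n".join(line for line in pretty.splitlines() if line.strip())
```

For the script above, this would mean writing pretty_without_blank_lines(dom.toxml()) to the file instead of calling dom.writexml(outFile).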

Stefano