UPDATE - XML PARSER IMPLEMENTATION : since replace a specific <Location>
tag require to modify the regex i'm providing a more general and safer alternative implementation based upon ElementTree parser (as stated above by @stribizhev and @Saket Mittal).
I've to add a root element <Emps>
(to make a valid xml doc, requiring root element), i've also chosen to filter the location to edit by the <city>
tag (but may be everyfield):
#!/usr/bin/python
# Alternative Implementation with ElementTree XML Parser
xml = '''\
<Emps>
<Emp>
<Name>Raja</Name>
<Location>
<city>ABC</city>
<geocode>123</geocode>
<state>XYZ</state>
</Location>
<sal>100</sal>
<type>temp</type>
</Emp>
<Emp>
<Name>GsusRecovery</Name>
<Location>
<city>Torino</city>
<geocode>456</geocode>
<state>UVW</state>
</Location>
<sal>120</sal>
<type>perm</type>
</Emp>
</Emps>
'''
from xml.etree import ElementTree as ET
# tree = ET.parse('input.xml') # decomment to parse xml from file
tree = ET.ElementTree(ET.fromstring(xml))
root = tree.getroot()
for location in root.iter('Location'):
if location.find('city').text == 'Torino':
location.set("isupdated", "1")
location.find('city').text = 'MyCity'
location.find('geocode').text = '10.12'
location.find('state').text = 'MyState'
print ET.tostring(root, encoding='utf8', method='xml')
# tree.write('output.xml') # decomment if you want to write to file
Online runnable version of the code here
PREVIOUS REGEX IMPLEMENTATION
This is a possible implementation using the lazy modifier .*?
and dot all (?s)
:
#!/usr/bin/python
import re
xml = '''\
<Emp>
<Name>Raja</Name>
<Location>
<city>ABC</city>
<geocode>123</geocode>
<state>XYZ</state>
</Location>
</Emp>'''
locUpdate = '''\
<Location isupdated=1>
<city>MyCity</city>
<geocode>10.12</geocode>
<state>MyState</state>
</Location>'''
output = re.sub(r"(?s)<Location>.*?</Location>", r"%s" % locUpdate, xml)
print output
You can test the code online here
Caveat: if there are more than one <Location>
tag in the xml input the regex replace them all with locUpdate
. You have to use:
# (note the last ``1`` at the end to limit the substitution only to the first occurrence)
output = re.sub(r"(?s)<Location>.*?</Location>", r"%s" % locUpdate, xml, 1)