0

I’ve got a master .xml file generated by an external application and want to create several new .xmls by adapting and deleting some rows with python. The search strings and replace strings for these adaptions are stored within an array, e.g.:

replaceArray = [
[u'ref_layerid_mapping="x4049" lyvis="off" toc_visible="off"',
u'ref_layerid_mapping="x4049" lyvis="on" toc_visible="on"'],
[u'<TOOL_BUFFER RowID="106874" id_tool_base="3651" use="false"/>',
u'<TOOL_BUFFER RowID="106874" id_tool_base="3651" use="true"/>'],
[u'<TOOL_SELECT_LINE RowID="106871" id_tool_base="3658" use="false"/>',
u'<TOOL_SELECT_LINE RowID="106871" id_tool_base="3658" use="true"/>']]

So I'd like to iterate through my file and replace all occurences of 'ref_layerid_mapping="x4049" lyvis="off" toc_visible="off"' with 'ref_layerid_mapping="x4049" lyvis="on" toc_visible="on"' and so on. Unfortunately the ID values of "RowID", “id_tool_base” and “ref_layerid_mapping” might change occassionally. So what I need is to search for matches of the whole string in the master file regardless which id value is inbetween the quotation mark and only to replace the substring that is different in both strings of the replaceArray (e.g. use=”true” instead of use=”false”). I’m not very familiar with regular expressions, but I think I need something like that for my search?

re.sub(r'<TOOL_SELECT_LINE RowID="\d+" id_tool_base="\d+" use="false"/>', "", sentence)

I'm happy about any hint that points me in the right direction! If you need any further information or if something is not clear in my question, please let me know.

ateko
  • 3
  • 3
  • so basically you want ignore `ref_layerid_mapping` and `RowID` while searching and replace other parts of the string without touching the ID. Right ? – Kavin Eswaramoorthy Jun 09 '15 at 07:13
  • why are you using regex for this you can parse it using python xml parser and then change appropriate attribute – Kavin Eswaramoorthy Jun 09 '15 at 07:18
  • If you want to know how to parse xml you can follow the link http://stackoverflow.com/questions/1912434/how-do-i-parse-xml-in-python – Kavin Eswaramoorthy Jun 09 '15 at 07:20
  • The master xml file is being generated by an external application which writes very complex and sometimes not well structured xml files. So parsing would be an elegant way, but I'd like to avoid generating xml output my application cannot read in the worst case... But regarding your first question, yes, this is what I basically need. – ateko Jun 09 '15 at 10:29

1 Answers1

0

One way to do this is to have a function for replacing text. The function would get the match object from re.sub and insert id captured from the string being replaced.

import re

s  = 'ref_layerid_mapping="x4049" lyvis="off" toc_visible="off"'
pat = re.compile(r'ref_layerid_mapping=(.+) lyvis="off" toc_visible="off"')

def replacer(m):
    return "ref_layerid_mapping=" + m.group(1) + 'lyvis="on" toc_visible="on"';

re.sub(pat, replacer, s)

Output:

'ref_layerid_mapping="x4049"lyvis="on" toc_visible="on"'

Another way is to use back-references in replacement pattern. (see http://www.regular-expressions.info/replacebackref.html)

For example:

import re
s = "Ab ab"
re.sub(r"(\w)b (\w)b", r"\1d \2d", s)

Output:

'Ad ad'
Ashwinee K Jha
  • 9,187
  • 2
  • 25
  • 19
  • After a few tests I'm still not sure how to apply this in my case. My replaceArray has about 50 search strings and corresponding replace strings, each pair with a completely different syntax, so this replacer definition in your example wouldn't really work unfortunately. – ateko Jun 09 '15 at 20:08
  • Hi, I have updated my answer. Probably second approach will work better for you as you can specify both search and replacement patterns in a file or something. I wasn't aware of this approach when I wrote the first version of the answer. – Ashwinee K Jha Jun 10 '15 at 14:55
  • Sometimes life can be so easy! Thanks a lot for the link to the back references, it's exactly what I was looking for! I will replace the IDs in my Excel file once with the corresponding regular expressions and don't have to worry about ID changes in the future... – ateko Jun 11 '15 at 10:38