I am currently trying to scan a file for a specific pattern and capture pieces of the matched pattern to use in a replacement string.
My current Python 3 script uses the pattern below, which captures the data in simple cases:
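import mmap
import re

def readFile(filename):
    # Raw bytes literal so the regex escapes are not treated as string escapes
    pattern = re.compile(rb"(<%InsertIf expression=\"\$\{\(\((.*?)\[\'(.*?)\'\].*?\'(.*?)\'\)\)\}\".*?\/InsertIf%>)", re.DOTALL)
    with open(filename, 'r+') as f:
        # Memory-map the whole file so the bytes pattern can scan it directly
        data = mmap.mmap(f.fileno(), 0)
        for match in pattern.finditer(data):
            print(match.groups())
            print()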
For example, when matching this snippet of the file:
<%InsertIf expression="${((user.MemberAttribute['treatmentcode'] == 'NM'))}" %>some random text goes here<sup>®</sup> membership<%/InsertIf%><%InsertIf expression="${((user.MemberAttribute['treatmentcode'] == 'N1'))}" %>some random text goes here<sup>®</sup> upgrade.<%/InsertIf%><br />
I obtain the desired output from the regex I have in place for these patterns:
(b'<%InsertIf expression="${((user.MemberAttribute[\'treatmentcode\'] == \'NM\'))}" %>some random text goes here<sup>\xc2\xae</sup> membership<%/InsertIf%>', b'user.MemberAttribute', b'treatmentcode', b'NM')
(b'<%InsertIf expression="${((user.MemberAttribute[\'treatmentcode\'] == \'N1\'))}" %>some random text goes here<sup>\xc2\xae</sup> upgrade.<%/InsertIf%>', b'user.MemberAttribute', b'treatmentcode', b'N1')
However, when the InsertIf expression has additional conditionals, I cannot figure out the appropriate pattern to use for the regex.
Here are two more complex snippets that I am trying to handle. In one case there is an additional '||' conditional; in the other there is an "and" conditional.
<%InsertIf expression="${((user.MemberAttribute['country'] == 'US') || (user.MemberAttribute['country'] == 'CA'))}" %>
In the above case I would expect a second set of captures:
- Full InsertIf captured string
- user.MemberAttribute
- country
- US
- user.MemberAttribute
- country
- CA
But since the pattern doesn't account for the conditional, the 4th capture returns: US') || (user.MemberAttribute['country'] == 'CA
Here is the "and" example:
<%InsertIf expression="${((user.MemberAttribute['country']=='US') and (user.MemberAttribute['treatmentcode']=='NM'))}" %><%InsertCSE id="XXXXX"%><%/InsertIf%>
I have the same expectations here, and get the same bad result, as in the '||' example above.
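To illustrate the captures expected above: a single pattern has a fixed number of groups, so one possible way to get a variable number of conditions is to work in two passes. The first pass captures each InsertIf block together with the raw text of its expression, and the second runs findall over that expression. This is only a minimal sketch, assuming every condition has the form name['key'] == 'value' and that conditions are joined by ||, &&, and, or or. The names BLOCK_RE, COND_RE, OP_RE and parse_insert_ifs are hypothetical.

```python
import re

# Pass 1: each full InsertIf block plus the raw text inside ${ ... }.
BLOCK_RE = re.compile(
    rb'<%InsertIf\s+expression="\$\{(.*?)\}"\s*%>.*?<%/InsertIf%>',
    re.DOTALL,
)
# Pass 2: every  name['key'] == 'value'  condition within one expression.
COND_RE = re.compile(rb"([\w.]+)\['(.*?)'\]\s*==\s*'(.*?)'")
# Operators joining two parenthesised conditions (assumed set).
OP_RE = re.compile(rb"\)\s*(\|\||&&|and|or)\s*\(")


def parse_insert_ifs(data):
    """Yield (full_block, conditions, operators) for each InsertIf in data.

    data may be bytes or an mmap object, as in the original script.
    """
    for match in BLOCK_RE.finditer(data):
        expr = match.group(1)
        yield match.group(0), COND_RE.findall(expr), OP_RE.findall(expr)
```

For the '||' snippet above (with a body and a closing <%/InsertIf%> added), the conditions come back as a list of two (name, key, value) tuples and the operators as a one-element list, so the number of conditions no longer has to be known in advance.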
Any assistance with the pattern is greatly appreciated. I am still learning regular expressions and this one is just a tad out of my depth.
Thanks.
Adding additional details as requested: I am essentially trying to perform a conversion of one syntax to another within a file.
Example: I want to find this pattern...
<%InsertIf
expression="${((user.MemberAttribute['treatmentcode']=='NM'))}" %>
<%InsertCSE id="4000116068"%><%/InsertIf%>
<%InsertElse expression="${((user.MemberAttribute['treatmentcode']=='N1'))}" %>
<%InsertCSE id="4000116069"%>
<%/InsertElse%>
and convert it to this pattern while preserving the variable values:
%%[ if treatmentcode == "NM" then ]%%
%%=contentArea("4000116068")=%%
%%[ elseif treatmentcode == "N1" then ]%%
%%=contentArea("4000116069")=%%
%%[ endif ]%%
The challenge comes into play when there are additional conditionals as part of the expression itself. The original snippets above show more of the details for the input. I can get simple expressions working as desired, but my approach falls apart on the more complex statements.
I was initially trying to take a simple InsertIf case and get it working. I could then loop the file to handle the InsertElse and other cases.
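The conversion described above could be sketched with re.sub and a replacement function, so that each tag is rewritten in place instead of being captured first. This is only a rough sketch under several assumptions: the file has been decoded to str, each condition looks like name['key'] == 'value', '||' and '&&' map to "or" and "and" in the target syntax (a guess), and nested grouping of conditions is not preserved. All names here (convert, convert_expression, the *_RE patterns, OPERATORS) are hypothetical.

```python
import re

COND_RE = re.compile(r"[\w.]+\['(\w+)'\]\s*==\s*'([^']*)'")
OP_RE = re.compile(r"\)\s*(\|\||&&|and|or)\s*\(")
OPEN_RE = re.compile(r'<%Insert(If|Else)\s+expression="\$\{(.*?)\}"\s*%>', re.DOTALL)
CLOSE_BEFORE_ELSE_RE = re.compile(r"<%/Insert(?:If|Else)%>(?=\s*<%InsertElse\b)")
CLOSE_RE = re.compile(r"<%/Insert(?:If|Else)%>")
CSE_RE = re.compile(r'<%InsertCSE\s+id="([^"]*)"\s*%>')
# Assumed operator spelling in the target syntax.
OPERATORS = {"||": "or", "&&": "and", "and": "and", "or": "or"}


def convert_expression(expr):
    """Turn the text inside ${...} into e.g. 'country == "US" or country == "CA"'."""
    conds = COND_RE.findall(expr)
    if not conds:
        return None  # unrecognised expression; the caller leaves the tag alone
    ops = OP_RE.findall(expr)
    parts = [f'{conds[0][0]} == "{conds[0][1]}"']
    for op, (key, value) in zip(ops, conds[1:]):
        parts.append(OPERATORS[op])
        parts.append(f'{key} == "{value}"')
    return " ".join(parts)


def _convert_open_tag(match):
    keyword = "if" if match.group(1) == "If" else "elseif"
    cond = convert_expression(match.group(2))
    if cond is None:
        return match.group(0)
    return f"%%[ {keyword} {cond} then ]%%"


def convert(text):
    # A closing tag directly followed by an InsertElse is dropped;
    # any other closing tag ends the whole if/elseif chain.
    text = CLOSE_BEFORE_ELSE_RE.sub("", text)
    text = CLOSE_RE.sub("%%[ endif ]%%", text)
    text = OPEN_RE.sub(_convert_open_tag, text)
    return CSE_RE.sub(r'%%=contentArea("\1")=%%', text)
```

Calling convert() on the InsertIf/InsertElse example above should produce the five target lines shown, and convert_expression() on the '||' expression gives country == "US" or country == "CA". The closing-tag handling is done with a lookahead before the opening tags are rewritten, which is why the order of the four substitutions matters.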