I am trying to use re.sub() to manipulate LaTeX math expressions. Specifically, I want to replace strings such as:

string1 = r"- \frac{2}{- 4 \sqrt{2} + 2}" # with r"\frac{2}{4 \sqrt{2} - 2}"

string2 = r"\frac{2}{- 4 \sqrt{2} + 2}" # with r"\frac{2}{2 - 4 \sqrt{2}}"

Here is the Python code that raised an error ("unmatched group"):

pattern = r"(?P<neg>- )?\\frac{(?P<numer>\d*)}{- (?P<denom1>\d* ?\\sqrt{\d*}) \+ (?P<denom2>\d*)\}"
replacement = r"\\frac{\g<numer>}{(?(\g<neg>)(\g<denom2> - \g<denom1>)|(\g<denom1> - \g<denom2>))}"
key = sub(pattern, replacement, string)

I am sure the pattern matches correctly, because re.sub() works fine when the replacement argument contains no conditional. Of course, in that case the code works for either string1 or string2, but not both:

pattern = r"(?P<neg>- )?\\frac{(?P<numer>\d*)}{- (?P<denom1>\d* ?\\sqrt{\d*}) \+ (?P<denom2>\d*)\}"
replacement = r"\\frac{\g<numer>}{\g<denom1> - \g<denom2>}"
key = sub(pattern, replacement, string)

So is this a syntax problem, and if so, what is wrong? Or are if-then-else conditionals not allowed in the replacement argument?

Wiktor Stribiżew
K. Ali
  • A replacement string is a simple string with eventual placeholders for the capture, that's all. There is no "if-then-else conditional" feature in the Python re module. If you want to make a conditional replacement, use a lambda function as replacement, not a string. – Casimir et Hippolyte Jun 01 '17 at 11:24
  • I found on "regular-expressions.info", that it's supported by python regex engine. "Conditionals are supported by the JGsoft engine, Perl, PCRE, Python, and the .NET framework. Ruby supports them starting with version 2.0. Languages such as Delphi, PHP, and R that have regex features based on PCRE also support conditionals." http://www.regular-expressions.info/conditional.html – K. Ali Jun 01 '17 at 11:31
  • You are right it exists in the re module, I didn't know that! However you can't use it in a replacement string (as all other regex tokens except reference to capture groups). – Casimir et Hippolyte Jun 01 '17 at 11:36
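To illustrate the point made in the comments, here is a minimal sketch. The pattern and the sample strings are made up for the illustration and do not come from the question. It shows a conditional being honored inside a pattern, while the same syntax inside a replacement template is copied through as plain text.

```python
import re

# Conditional inside the *pattern*: require ">" only if "<" was matched.
# (Hypothetical example pattern, chosen only for this illustration.)
tag = re.compile(r"^(?P<open><)?\w+(?(open)>)$")
pattern_results = [bool(tag.match(s)) for s in ("<abc>", "abc", "<abc", "abc>")]

# The same syntax inside a *replacement template* is not interpreted:
# it contains no backslash escapes, so it is inserted as literal text.
template_result = re.sub(r"(?P<a>x)", "(?(a)y|z)", "x")

print(pattern_results)
print(template_result)
```

In other words, only backslash escapes such as \g<name> have a special meaning in a replacement string. Any branching has to happen in Python code.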

1 Answer

You can pass a function as the replacement argument. re.sub() calls it with each match object, so you can check whether a certain group matched and then build the replacement dynamically, applying your conditions with standard Python means:

import re

def repl(x):
    # x is the match object; x.group("neg") is None when the leading "- " is absent.
    # Doubled braces {{ and }} produce literal braces in str.format().
    return r"\frac{{{0}}}{{{1} - {2}}}".format(x.group("numer"),
        (x.group("denom1") if x.group("neg") else x.group("denom2")),
        (x.group("denom2") if x.group("neg") else x.group("denom1")))

string1 = r"- \frac{2}{- 4 \sqrt{2} + 2}"
string2 = r"\frac{2}{- 4 \sqrt{2} + 2}"
pattern = r"(?P<neg>- )?\\frac{(?P<numer>\d*)}{- (?P<denom1>\d* ?\\sqrt{\d*}) \+ (?P<denom2>\d*)\}"
print(re.sub(pattern, repl, string1)) # => \frac{2}{4 \sqrt{2} - 2}
print(re.sub(pattern, repl, string2)) # => \frac{2}{2 - 4 \sqrt{2}}

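As a side note on the "unmatched group" error, here is a small sketch assuming Python 3.5 or later. From that version on, an optional group that did not participate in the match expands to an empty string in a replacement template, whereas earlier versions raised the error. In a replacement function, group() returns None for such a group, which is what makes the conditional above possible. The pattern and inputs are simplified stand-ins, not the ones from the question.

```python
import re

# Simplified, hypothetical pattern: an optional "- " followed by digits.
pattern = re.compile(r"(?P<neg>- )?(?P<num>\d+)")

# Template replacement: on Python 3.5+, the unmatched group "neg" expands to "".
template_result = pattern.sub(r"[\g<neg>|\g<num>]", "7")

# Callable replacement: group("neg") is None when the group did not participate.
def repl(m):
    return "negative" if m.group("neg") else "positive"

callable_results = [pattern.sub(repl, s) for s in ("- 7", "7")]

print(template_result)
print(callable_results)
```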

Wiktor Stribiżew