I would like to replace all double backslashes \\ enclosed between $ symbols or $$ symbols with four backslashes \\\\.

E.g., I want to convert `some \\ random text $ 5\\ 6$` to `some \\ random text $ 5\\\\ 6$`, and `some $5x^2 \$ random text $$ 5 \\ 6$$` to `some $5x^2 \$ random text $$ 5 \\\\ 6$$`.

How can I do this using regex and Python?

  • What have you tried? What didn't work? What did you get? What did you expect? – Toto Oct 02 '18 at 19:03
  • See whether [this](https://stackoverflow.com/questions/5658369/how-to-input-a-regex-in-string-replace) answers your question. – Irfanuddin Oct 02 '18 at 19:05
  • Could this be an instance of the [XY problem](https://meta.stackexchange.com/questions/66377/what-is-the-xy-problem)? What are you actually trying to achieve? Maybe you could use [raw strings](https://stackoverflow.com/questions/2081640/what-exactly-do-u-and-r-string-flags-do-and-what-are-raw-string-literals). – mkrieger1 Oct 02 '18 at 19:07
  • I need to replace double backslashes with 4 backslashes so that mathjax will render properly after rendering markdown. –  Oct 02 '18 at 19:08
  • To make sure this is not an escape problem: what is the source of the text? How does it get into Python? – Klaus D. Oct 02 '18 at 19:13
  • I've created a website using Django, and I allow users to create posts in markdown (similar to SO). When I try to render certain posts which contain \\ between $ symbols, the markdown renderer I am using converts \\ to \. –  Oct 02 '18 at 19:15
  • @Jack Don't try to escape your users' markup for them. You'll create more problems than you'll solve. Give them a preview and let them deal with it. A little documentation about which flavor (maybe even the exact renderer) of Markdown you're using might also help. – jpmc26 Oct 02 '18 at 19:47
  • Also, you can't do this properly with regex. (Or if you can, it'll be a nightmare regex. It's the wrong tool for the job.) You need a parser to handle matching delimiters that can contain other nested elements correctly. – jpmc26 Oct 02 '18 at 19:51

1 Answer

Try something like this:

Pattern: (?<!\\)([$]+)([^$]*?)\\([^$]*?)(?<!\\)\1

Substitution: \1\2\\\\\\\\\3\1

Examples using your provided tests: https://regex101.com/r/X9lGCF/2

Rough explanation of pattern:

(?<!\\)([$]+) - Match and capture at least one unescaped $; the (?<!\\) is a negative lookbehind to make sure the $'s aren't prefixed with a backslash

([^$]*?)\\([^$]*?) - Capture the text on either side of the \\, i.e. everything between the opening $ sequence and the matching closing $ sequence; [^$] keeps each capture from running past a $

(?<!\\)\1 - Reuse the initially matched $ sequence in our pattern (this enforces the surrounding $ sequences to be the same length; e.g. not matching things like $\\$$), ensuring that the last sequence is also unescaped

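The substitution rebuilds the match as the captured $ sequence, the text before the backslashes, four literal backslashes, the text after them, and the $ sequence again. Backslashes must be escaped in the substitution string, which is why it takes 8 of them to produce 4.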
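As a rough illustration, here is how the pattern and substitution above might be applied with Python's `re` module. This is a minimal sketch, not a definitive implementation. The names `MATH_BACKSLASHES`, `REPLACEMENT` and `double_math_backslashes` are made up for this example. One adaptation is an assumption about the input: in Python's `re`, `\\` in a pattern matches a single literal backslash, so the sketch writes `\\\\` in the middle of the pattern to match the two-backslash sequence as a unit.

```python
import re

# Pattern from the answer, with one adaptation (an assumption, see above):
# the middle part is written as \\\\ so that it matches two literal backslashes.
MATH_BACKSLASHES = re.compile(r'(?<!\\)([$]+)([^$]*?)\\\\([^$]*?)(?<!\\)\1')

# Eight backslashes in the replacement template produce four in the output.
REPLACEMENT = r'\1\2\\\\\\\\\3\1'


def double_math_backslashes(text: str) -> str:
    """Turn \\\\ into four backslashes inside $...$ or $$...$$ spans.

    Hypothetical helper for illustration; it rewrites at most one
    two-backslash sequence per delimited span on each call.
    """
    return MATH_BACKSLASHES.sub(REPLACEMENT, text)


if __name__ == "__main__":
    # Raw strings keep the backslashes exactly as typed.
    print(double_math_backslashes(r'some \\ random text $ 5\\ 6$'))
    # prints: some \\ random text $ 5\\\\ 6$
```

Because each match consumes the whole delimited span, a span holding several `\\` sequences only gets its first one rewritten per call, and nested or unbalanced delimiters are not handled. That is in line with the comment above that a real parser is the more robust tool.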

John