
I am trying to write a regex to find comment lines in LaTeX. I created the following example. The last regex does not work. Can I have a single regex for all of the cases?

Before:

\usepackage{test}%COMMENT1

TEXT
%COMMENT2
TEXT

Value is 10\%, this should not be removed. %COMMENT3

begin{tikz}[
important 1,
%COMMENT4
important 2, %COMMENT5
]

TEXT
%COMMENT 6

TEXT

Table: value1&value2\\%COMMENT7

After:

\usepackage{test}

TEXT
TEXT

Value is 10\%, this should not be removed.

begin{tikz}[
important 1,
important 2,
]

TEXT

TEXT

Table: value1&value2\\

This is what I have so far:

(^%(.*?)\r?\n)

This works for comments 2, 4, and 6 when replaced with nothing.

([\\]{2}%(.*?)\r\n)

This works for comment 7 when replaced with \\\r\n

([^\\]%(.*?)\r?\n)

This does NOT work for comment 1 because it also selects the '}'.

NuminousName

1 Answer


You may use

Regex.Replace(s, @"(?m)(?<=(?<!\\)(?:\\{2})*)%.*(?:\r?\n(?!\r?$))?", "")

See the regex demo

Details

  • (?m) - the inline form of the RegexOptions.Multiline option, so $ also matches before a newline, not only at the end of the string.
  • (?<=(?<!\\)(?:\\{2})*) - a positive lookbehind that requires the % to be preceded by an even number of backslashes (zero included): it matches a location preceded by zero or more double backslashes that are themselves not preceded by another \. An escaped \% is therefore skipped, while \\% still starts a comment.
  • % - a % sign
  • .* - zero or more characters other than LF, as many as possible
  • (?:\r?\n(?!\r?$))? - an optional non-capturing group matching
    • \r?\n - an optional CR and then an LF...
    • (?!\r?$) - ...that is not immediately followed by an optional CR and the end of a line, that is, not followed by an empty line.
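A minimal sketch of how the pattern above could be wrapped for reuse. The class name LatexCommentStripper and the method name Strip are hypothetical and chosen only for illustration. The pattern itself is copied unchanged from the answer.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper class; the name is an assumption made for this sketch.
public static class LatexCommentStripper
{
    // Pattern taken verbatim from the answer:
    //   (?m)                     multiline mode, so $ matches before each LF
    //   (?<=(?<!\\)(?:\\{2})*)   % must follow an even number of backslashes
    //   %.*                      the comment up to the end of the line
    //   (?:\r?\n(?!\r?$))?       the line break, unless an empty line follows
    private static readonly Regex CommentRegex = new Regex(
        @"(?m)(?<=(?<!\\)(?:\\{2})*)%.*(?:\r?\n(?!\r?$))?");

    // Returns the input with every unescaped % comment removed.
    public static string Strip(string latex)
    {
        return CommentRegex.Replace(latex, "");
    }
}
```

For example, LatexCommentStripper.Strip("TEXT\n%COMMENT2\nTEXT") should return "TEXT\nTEXT", while the escaped percent sign in "10\%" is left alone. Note that when a trailing comment is followed directly by a non-empty line, the line break is consumed as well, so that next line is joined to the current one.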
Wiktor Stribiżew
  • Just in case: `(?:\r?\n(?!\r?$))?` matches a line break sequence if it is not followed with an empty line. If there can be *blank* lines, you may use `(?!\s*$)` instead of `(?!\r?$)`. – Wiktor Stribiżew Feb 16 '19 at 19:19
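The variant from this comment can be sketched the same way. The class name BlankLineAwareStripper is hypothetical. The only change to the answer's pattern is the lookahead, which is (?!\s*$) instead of (?!\r?$), so a line break is kept when the following line contains only whitespace.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper class; the name is an assumption made for this sketch.
public static class BlankLineAwareStripper
{
    // Same pattern as in the answer, except that the final lookahead is
    // (?!\s*$): the line break after a comment is consumed only when the
    // text that follows is not whitespace-only up to a line end.
    private static readonly Regex CommentRegex = new Regex(
        @"(?m)(?<=(?<!\\)(?:\\{2})*)%.*(?:\r?\n(?!\s*$))?");

    // Returns the input with every unescaped % comment removed.
    public static string Strip(string latex)
    {
        return CommentRegex.Replace(latex, "");
    }
}
```

With this variant, BlankLineAwareStripper.Strip("a%c\n  \nb") should return "a\n  \nb", leaving the whitespace-only line and the line break before it in place.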