
I know some basic concepts of regex, but I fail when I need to put them together. IMHO this calls for a negative lookahead, but I'm not even sure whether that is the right approach.

The input string is `#+foo: bar: lore`.

The regex (Python 3) is `^#\+(.*): *(.*)$`, and it extracts these two groups.

#+foo: bar: lore
  ^^^^^^^^  ^^^^

The output I want is

#+foo: bar: lore
  ^^^  ^^^^^^^^^

In plain language, I would describe the first capturing group as "everything after `#+` up to the first `:`". The problem is that the last `:` is used instead of the first.

So I thought a negative lookahead would be the solution. I tried several things but none of them worked. Here is one approach.

^#\+(.*(?!:)): *(.*)$
       ^^^^^
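
For reference, a minimal Python sketch that reproduces both behaviours described above. The variable names are illustrative only. It shows the greedy pattern stopping at the last colon, and why this lookahead attempt cannot match at all: `(?!:)` requires that the next character is not a colon, while the token right after it requires a colon.

```python
import re

line = "#+foo: bar: lore"

# Greedy `.*` first consumes the rest of the line, then backtracks only as
# far as necessary, so the first group ends at the LAST colon.
greedy = re.match(r"^#\+(.*): *(.*)$", line)
print(greedy.groups())  # ('foo: bar', 'lore')

# `(?!:)` asserts "the next character is not a colon", but the very next
# token in the pattern is a literal colon. Both can never hold at the same
# position, so the whole pattern fails.
attempt = re.match(r"^#\+(.*(?!:)): *(.*)$", line)
print(attempt)  # None
```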
buhtz
  • 2
    You don't need a lookahead. Use a lazy quantifier: `^#\+(.*?): *(.*)$` – markalex Apr 16 '23 at 21:52
  • 1
    Try `^#\+(.*?): *(.*)$` where `?` causes as few characters as possible to be matched. [Demo](https://regex101.com/r/aUexZn/1). Alternatively, `^#\+([^:]*): *(.*)$`. [Demo](https://regex101.com/r/V40Gip/1). – Cary Swoveland Apr 16 '23 at 21:53
  • 1
    The non-greedy quantifier is the best bet. The assertion equivalent is `((?:(?!:).)*)`, which checks before each character that it is not a colon, preventing a match past it. This is usually slower but is 100% assured not to go past the colon, whereas `.*?` is dependent upon the next expression in the regex. Cautionary tale ..! – sln Apr 16 '23 at 22:23
  • 1
    I expect this to be marked as a duplicate, regardless of the hidden dangers of `.*?` that won't be mentioned or considered. – sln Apr 16 '23 at 22:32
  • 1
    `^#\+([^:]*): *(.*)$` [demo](https://regex101.com/r/tBsCVs/1) – dawg Apr 16 '23 at 23:16
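
The three suggestions from the comments can be compared side by side with a short Python sketch. The dictionary keys and variable names are illustrative, and the second input string (`#+foo: bar: 42`) together with the `(\d+)` variant is a hypothetical example, not taken from the question, chosen to show the caveat about `.*?` depending on what follows it.

```python
import re

line = "#+foo: bar: lore"

patterns = {
    "lazy": r"^#\+(.*?): *(.*)$",
    "negated class": r"^#\+([^:]*): *(.*)$",
    "lookahead per char": r"^#\+((?:(?!:).)*): *(.*)$",
}

# All three stop the first group at the FIRST colon for this input.
results = {name: re.match(p, line).groups() for name, p in patterns.items()}
for name, groups in results.items():
    print(name, groups)  # each prints ('foo', 'bar: lore')

# Caveat (hypothetical input): when the rest of the pattern is stricter,
# the lazy quantifier expands past the first colon to make the match work,
# while the negated class refuses to cross a colon and simply fails.
caveat_line = "#+foo: bar: 42"
lazy_caveat = re.match(r"^#\+(.*?): (\d+)$", caveat_line)
strict_caveat = re.match(r"^#\+([^:]*): (\d+)$", caveat_line)
print(lazy_caveat.group(1))  # foo: bar
print(strict_caveat)         # None
```

For the pattern in the question the second group is `(.*)`, which accepts anything, so the lazy version and the negated class give the same result; the difference only shows up once the trailing part can fail.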
