
Here's my regex to match a word after "define" or "define:":

((?<=define |define: )\w+)

That part works fine. But when I add an alternative that should also match a word between {}, it matches everything.

((?<=define |define: )\w+)|([^{][A-Z]+[^}])

The regex with the examples

I noticed that adding ^ to the first [{] (turning it into [^{]) is what breaks everything, and I don't understand why.

AlexINF
Boy pro

2 Answers


Why does using [^{] not work?

By using [^{], your regex becomes:

[^{][A-Z]+[^}]

In words, this translates to:

  • one character that's not a {
  • one or more uppercase letters
  • one character that's not a }

Note that nothing in your regex requires the uppercase letters to be between {}s. It only says they must come after a character that is not { and before a character that is not }, and both of those characters become part of the match. By this logic, even something like ABC matches: A is not {, B is the run of uppercase letters, and C is not }.
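To see this concretely, here is a small sketch using Python's re module (the sample strings are made up for illustration). It shows that the pattern matches text with no braces at all, and that the characters on either side get pulled into the match:

```python
import re

pattern = re.compile(r"[^{][A-Z]+[^}]")

# No braces anywhere, yet the whole string matches:
# A satisfies [^{], B satisfies [A-Z]+, C satisfies [^}].
no_braces = pattern.findall("ABC")             # ['ABC']

# The neighbouring characters are consumed too, so the spaces
# around FOO end up inside the match.
with_spaces = pattern.findall("define FOO bar")  # [' FOO ']

print(no_braces, with_spaces)
```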

How to match a word between {}?

You can use this regex:

{([A-Z]+)}

And get group 1.
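In Python, "get group 1" could look roughly like this. The sample text and variable names are only for illustration, and the braces are escaped here just to make it explicit that they are literal characters:

```python
import re

brace_word = re.compile(r"\{([A-Z]+)\}")
text = "set {WIDTH} and {HEIGHT}"  # made-up sample input

m = brace_word.search(text)
whole = m.group(0)   # '{WIDTH}'  (the full match, braces included)
inner = m.group(1)   # 'WIDTH'    (only the captured word)

# With exactly one capturing group, findall returns the group contents.
all_inner = brace_word.findall(text)  # ['WIDTH', 'HEIGHT']

print(whole, inner, all_inner)
```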

I don't think that you should combine this with the regex that matches a word after define. You should use 2 separate regexes because these are two completely different things.

So split it into two regexes:

(?<=define |define: )\w+

and

{([A-Z]+)}
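A minimal sketch of using the two regexes side by side in Python. One caveat: Python's re module only accepts fixed-width lookbehinds, and "define " and "define: " have different lengths, so the first pattern is rewritten below as two separate lookbehinds. That rewrite, the helper name extract_words, and the sample input are my own assumptions, not part of the original regex:

```python
import re

# Python's re rejects (?<=define |define: ) because the two alternatives
# differ in length, so each prefix gets its own fixed-width lookbehind.
after_define = re.compile(r"(?:(?<=define )|(?<=define: ))\w+")
in_braces = re.compile(r"\{([A-Z]+)\}")

def extract_words(text):
    """Return (words after define, words inside braces) for one string."""
    return after_define.findall(text), in_braces.findall(text)

defined, braced = extract_words("define: width uses {WIDTH}")
print(defined, braced)  # ['width'] ['WIDTH']
```

Keeping the patterns separate also means each result list tells you which rule produced the match, which a single combined alternation would not do without extra group checks.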
Sweeper
  • But that regex matches {} too. Sure I can do replace, but is there any way to just make it match stuff between {} and not {} too? – Boy pro Jun 23 '19 at 14:29
  • @Boypro You _could_ use `(?<={)[A-Z]+(?=})` but that's a lot slower, so my recommendation is to just use the regex in the answer, and just get group 1 from it instead. See [this](https://stackoverflow.com/questions/1327369/extract-part-of-a-regex-match) if you don't know how. – Sweeper Jun 23 '19 at 14:31
  • Cool. Thank you! – Boy pro Jun 23 '19 at 14:32
  • @Boy pro, if you use PCRE engine, you can match only what is between `{}` without using any group constructs with the following regex `{\K[^}]+`. See a demo [here](https://regex101.com/r/XsKuDb/2). – Junitar Jun 23 '19 at 14:38
  • @Junitar In a previous revision of the question, it's tagged python, so I assumed OP is using python. – Sweeper Jun 23 '19 at 14:39
  • @Sweeper Yeah I use Python – Boy pro Jun 23 '19 at 15:21

You are using negated character classes the way we would use positive lookbehind (?<=) and positive lookahead (?=). They are fundamentally different and, as opposed to lookbehind or lookahead, character classes consume characters.

Hence:

  • [^{][A-Z] matches a capital letter that is preceded by a character other than {.
  • [A-Z][^}] matches a capital letter that is followed by a character other than }.

So if you try to match the letters in {OO} with the regex [^{][A-Z]+[^}], it is expected that your regex won't match anything: the pattern needs at least three characters, the first not a { and the last not a }, but here you only have two letters, one preceded by a { and the other followed by a }.
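The difference between consuming and zero-width matching can be checked with a short Python sketch (the test strings are illustrative only):

```python
import re

text = "{OO}"

# Negated classes each consume one character, so "{OO}" offers no
# position where all three parts of the pattern fit.
consuming = re.findall(r"[^{][A-Z]+[^}]", text)        # []

# Lookarounds only assert what sits next to the match; the match
# itself contains just the letters.
zero_width = re.findall(r"(?<=\{)[A-Z]+(?=\})", text)  # ['OO']

# On a string without braces, the consumed neighbours show up in the result.
neighbours = re.search(r"[^{][A-Z]+[^}]", "xOOy").group()  # 'xOOy'

print(consuming, zero_width, neighbours)
```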

Junitar