I'm trying to use ST2's regex capability in search & replace, but can't figure out how to properly make a non-capturing group. For this example, I want to find instances of "DEAN" which are not followed by "UMBER", i.e. to distinguish "DEANCARE" from "DEANUMBER".

From what I've read and used in the past, the syntax with a non-capture should be:

DEAN(?:UMBER)

Which should match "DEANCARE" but not "DEANUMBER". Yet instead, Sublime Text only finds "DEANUMBER" as if I had typed:

DEAN(UMBER)

Using square brackets on the first (or each) of the unwanted letters does work:

DEAN[^U] 

But I'd still prefer a group-based non-match, both so I can reuse it for other purposes and to avoid having to explicitly not-match each individual character. Do I have a syntax mistake, or maybe a conceptual error in how ST2's regex works?

TCAllen07

1 Answer

A non-capturing group is the same as a capturing group, except that it does not store the text it matched for use as a back-reference. It still has to match for the overall regex to match.

If you were to use the regex DEAN(?:UMBER) on the string DEANUMBER then you would have a match, but referencing \1 in, e.g. a search and replace would give you nothing, because the group is non-capturing.

Using DEAN(UMBER), on the other hand, you could do a search and replace with the replacement string "made of L\1", which would produce "made of LUMBER", because the match of the first (capturing) group is back-referenced by \1. This is of course a rather pointless example; if you want to learn more about groups and back-referencing, I'd suggest you read this or some other documentation/tutorial on the matter.

As suggested in the comments, what you want is a negative lookahead: DEAN(?!UMBER) matches "DEAN" only when it is not followed by "UMBER".
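To make the difference concrete, here is a minimal sketch using Python's `re` module rather than Sublime Text itself. It assumes that ST's Boost-based engine treats these constructs the same way for this input; the variable names are only for illustration.

```python
import re

words = ["DEANCARE", "DEANUMBER"]

# Non-capturing group: UMBER must still follow DEAN for a match.
non_capturing = [w for w in words if re.search(r"DEAN(?:UMBER)", w)]

# Negative lookahead: DEAN matches only when UMBER does NOT follow.
lookahead = [w for w in words if re.search(r"DEAN(?!UMBER)", w)]

# The lookahead consumes no characters, so only DEAN itself is matched.
matched_text = re.search(r"DEAN(?!UMBER)", "DEANCARE").group(0)

print(non_capturing)  # ['DEANUMBER']
print(lookahead)      # ['DEANCARE']
print(matched_text)   # DEAN
```

Unlike DEAN[^U], the lookahead does not consume the character after "DEAN", it still matches when "DEAN" is the very last thing in the text, and it does not reject other words that merely start with U, such as "DEANUP".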

rvalvik
  • Hello. Sorry for opening such an old question, but i think i better write a comment to an old question instead of creating a new one. The question is like in the subject: What's the syntax of a non capturing group in Sublime Text? I use ST3, it uses Boost regex engine, i hope ST2 uses it too. When i use your regex (DEAN(?:UMBER)) on the string DEANUMBER, it captures DEANUMBER. So how to use it properly? I need exactly what i ask about, not lookaheads/behinds. Thanks in advance! – lucifer63 Aug 18 '16 at 07:41
  • @lucifer63: The syntax for non-capturing group is `(?:)`. In your regex however you are enclosing `DEAN(?:UMBER)` in a capturing group, that's why it captures `DEANUMBER`. The purpose of non-capturing groups is to group things together but not have them be back referenced. To illustrate the difference try the following two regexes in ST3 on the string DEANUMBER - `(DEAN)(?:UMBER)` replace with `G1: \1 - G2: \2` and `(DEAN)(UMBER)` replace with `G1: \1 - G2: \2`, that should illustrate the difference: `G1: DEAN - G2:` and `G1: DEAN - G2: UMBER` respectively. – rvalvik Aug 18 '16 at 10:08
  • @rvalvik: that's the reason i asked the question - for me it's G1: DEAN - G2: UMBER and G1: DEAN - G2: UMBER respectively. Look -> http://i.imgur.com/t56P2Jo.png – lucifer63 Aug 18 '16 at 12:05
  • Your screenshot is correct, both of them match the same string. The **only** difference is that `(DEAN)(UMBER)` has `UMBER` back-referenced as `\2` whereas `(DEAN)(?:UMBER)` has no `\2` back-reference. Did you try to do a search and replace? If search and replace as I described results in `G1: DEAN - G2: UMBER` for both then something is wrong. – rvalvik Aug 18 '16 at 12:20
  • @rvalvik: I create a new document with only the text DEANUMBER. Then I press Ctrl+Shift+F and select the "Regular expression" option; I also have the "Show context" and "Use buffer" options selected. Then I paste "(DEAN)(UMBER)" in the first input field ("Find"), type "lol" in the "Replace" field, then click the "Replace" button, and DEANUMBER gets replaced with lol. The same happens if I first paste "(DEAN)(?:UMBER)" – lucifer63 Aug 18 '16 at 12:36
  • @lucifer63: As stated previously, in terms of the string they match, the two regexes are the same. They are both the same as just `DEANUMBER` the difference is what they are capturing as back-references (see http://www.regular-expressions.info/backref.html). If you are not using the back-referenced groups then the whole thing becomes moot and there is no difference between the two expressions, or any other permutation of groups for that matter. – rvalvik Aug 18 '16 at 13:30
  • Ok, but what about your words "If search and replace as I described results in G1: DEAN - G2: UMBER for both then something is wrong" ? And what about this - http://stackoverflow.com/questions/3512471/what-is-a-non-capturing-group? It states clearly: non capturing expression doesn't capture the text it matches. But it does. In ST3 at least. – lucifer63 Aug 18 '16 at 13:38
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/121255/discussion-between-rvalvik-and-lucifer63). – rvalvik Aug 18 '16 at 13:50
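The point argued in the comments above can be checked outside the editor. This is a rough sketch in Python's `re` module, not in Sublime Text, so it only assumes that the Boost engine numbers groups the same way: both patterns match the whole string, and they differ only in which groups are stored.

```python
import re

text = "DEANUMBER"

with_capture = re.fullmatch(r"(DEAN)(UMBER)", text)
without_capture = re.fullmatch(r"(DEAN)(?:UMBER)", text)

# Both patterns match exactly the same overall text...
print(with_capture.group(0), without_capture.group(0))  # DEANUMBER DEANUMBER

# ...but only the first pattern stores UMBER as group 2.
print(with_capture.groups())     # ('DEAN', 'UMBER')
print(without_capture.groups())  # ('DEAN',)
```

Replacing the match with a fixed string such as "lol" therefore looks identical for both patterns; the difference only shows up when the replacement refers to `\2`. Note that Python raises an error for a reference to a group that does not exist, whereas the comment above describes Sublime Text as leaving it empty.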