background:

For syntax highlighting in Sublime Text,
you can write a tmLanguage file with a corresponding tmTheme file.

The tmLanguage file contains regular expressions to which you give names,
and the tmTheme file then uses those names to style what was captured.

I want to colorize the same pattern differently depending on how many duplicate patterns came before it. Or, to put it another way, I want to style the nth match of each pattern on each line differently.

the problem:

So for example,
How can I write 3 regular expressions to match the groups marked below?

< foo >< bar >< baz >    (expression 1 matches < foo >)
< foo >< bar >< baz >    (expression 2 matches < bar >)
< foo >< bar >< baz >    (expression 3 matches < baz >)

Anything could be inside of < >.

expression 1 would capture the first instance of <.*?>
expression 2 would capture the second instance of <.*?>
expression 3 would capture the third instance of <.*?>

Assume the three examples above are actually the same line.
My goal is to give each group a different color:

<this would be red> <this would be orange> <this would be yellow> <etc..>

The regular expression language is Oniguruma.


My attempts so far:

I can capture the first group like this:

^<.*?>

I can't figure out how to capture only the second group:

^<.*?>{2}            captures nothing
<.*?>{2}             captures nothing
<.*?>{2,}            captures nothing
^(?:<.*?>)<.*?>      captures 1st and 2nd 
^(?!<.*?>)<.*?>      captures nothing
^(?=<.*?>)(<.*?>)    captures 1st
^(?=<.*?>)(<.*?>){1} captures 1st
^(?=<.*?>)(<.*?>){2} captures 1st and 2nd
(?=<.*?>)(<.*?>)     captures everything
Trevor Hickey
  • There could be any number on each line. That expression does seem to work on my example, but it looks like it's going backwards to get the 2nd match. Can I modify this to skip over the first and match the second, instead of skipping over the last to match the second to last? – Trevor Hickey Jul 18 '15 at 19:14
  • This `^(?:.*?(<.*?>)){N}` matches the n'th `<>` in capture group 1. You can use separate ones. The problem is you would need a variable length lookbehind, to get it without capture groups. –  Jul 18 '15 at 19:28
  • With a variable length lookbehind, it's `(?<=^.*?(?:<.*?>.*?){N-1})<.*?>` –  Jul 18 '15 at 19:32
  • Fwiw, I guess you could use the `\K` construct (poor man's variable length lookbehind) with `^.*?(?:<.*?>.*?){N-1}\K<.*?>` but, I don't see that construct available from the syntax page you linked. –  Jul 18 '15 at 19:45
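
The capture-group pattern from the comments can be tried outside Sublime. Below is a minimal sketch in Python, assuming that Python's `re` module behaves like Oniguruma for this particular pattern (it is not Oniguruma itself); `nth_angle_group` is a hypothetical helper name. It relies on the fact that a repeated capture group keeps only its last iteration.

```python
import re

def nth_angle_group(line, n):
    """Return the n-th <...> group on the line (1-based), or None."""
    # The outer group repeats n times; capture group 1 is overwritten on
    # each repetition, so it ends up holding the n-th <...>.
    pattern = r"^(?:.*?(<.*?>)){%d}" % n
    match = re.search(pattern, line)
    return match.group(1) if match else None

line = "< foo >< bar >< baz >"
for n in (1, 2, 3, 4):
    print(n, nth_angle_group(line, n))
```

The whole match still starts at the beginning of the line, so only capture group 1 is the part to style, not the full match.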

2 Answers

1

You can use

(?m)^(?:<[^>]*>[[:blank:]]*){1}\K<[^>]*>

to match the second value. Then just increment the 1 to get further values.

Here is a demo

The third value will be matched with (?m)^(?:<[^>]*>[[:blank:]]*){2}\K<[^>]*>, etc.

Wiktor Stribiżew
  • This is exactly the behaviour I'm looking for! It seems to do nothing in Sublime, since it is PCRE flavor and Sublime expects Oniguruma. But I'm hoping the tweaks here aren't difficult. – Trevor Hickey Jul 18 '15 at 19:39
  • There is no way, I am afraid. I suggest using [`(?m)^(?:<[^>]*>[[:blank:]]*){1}(<[^>]*>)`](https://regex101.com/r/eX4gT4/4) together with the hints provided at [`Fine Tuning Matches`](http://sublimetext.info/docs/en/extensibility/syntaxdefs.html). – Wiktor Stribiżew Jul 18 '15 at 21:13
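
The capture-group variant suggested in the last comment can be checked with a short Python sketch. Two things here are assumptions of the sketch and not part of the answer: `[ \t]` stands in for `[[:blank:]]`, which Python's `re` module does not support, and `nth_group_spans` is a hypothetical helper name. It reports where capture group 1 lands on each line, which is the region a highlighter would need to scope.

```python
import re

def nth_group_spans(text, n):
    """Return (start, end, matched_text) of the n-th <...> on every line."""
    # n - 1 leading groups are consumed without capturing; group 1 is the target.
    pattern = re.compile(
        r"^(?:<[^>]*>[ \t]*){%d}(<[^>]*>)" % (n - 1),
        re.MULTILINE,  # same effect as the inline (?m) flag in the answer
    )
    return [(m.start(1), m.end(1), m.group(1)) for m in pattern.finditer(text)]

text = "<a> <b> <c>\n<x><y>\n<only>"
print(nth_group_spans(text, 2))
```

Lines that have fewer than n groups simply produce no entry.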
0

You can do:

(?:(?:\s*<\s*(?!TGT)\w+\s*>\s*)*(<\s*TGT\s*>)){N}

where TGT is what you seek and N is the match.

Demo (cycle through the 3 versions to see all your examples...)


OK, you can do:

/^((<[^>]*>){N-1})((<[^>]*>))/gm

Where N is the one you seek.

Demo
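
As a rough illustration of how one pattern per position could drive the coloring described in the question, here is a sketch in Python. The color names come from the question's example; the `COLORS` list and the `color_groups` helper are hypothetical, and Python's `re` module is used as a stand-in for Oniguruma. Capture group 3 of the pattern above is the group of interest.

```python
import re

# Hypothetical palette, taken from the example line in the question.
COLORS = ["red", "orange", "yellow"]

def color_groups(text):
    """Pair every <...> group with a color based on its position in the line."""
    colored = []
    for n, color in enumerate(COLORS, start=1):
        # Group 1 swallows the first n - 1 groups; group 3 is the n-th one.
        pattern = re.compile(
            r"^((<[^>]*>){%d})((<[^>]*>))" % (n - 1), re.MULTILINE
        )
        for m in pattern.finditer(text):
            colored.append((m.start(3), color, m.group(3)))
    # Sort by offset so the result reads in document order.
    return [(color, group) for _, color, group in sorted(colored)]

print(color_groups("<a><b><c>\n<d><e>"))
```

Note that this pattern expects the groups to be directly adjacent; lines with spaces between groups would need optional whitespace added between them.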

dawg
  • Reading over my question, you did exactly what I asked for, but it's not what I had intended! One regular expression would match the first <.*?> on every line. The second regular expression would match the second <.*?> on every line. And the third regular expression would match the 3rd <.*?> on every line. – Trevor Hickey Jul 18 '15 at 19:24