I am currently looking to use regex for a find/replace operation within my text editor, SublimeText 3.

What I have are a lot of lines that look like this:

array(self::var1, self::var2, class::var_1, class::var_2, self::varCaps)

What I would like to do is match each item in the array. The only thing I know for sure is that each item has the :: characters in the middle.

I can match the string before the :: characters pretty easily using

[a-zA-z0-9\_\-]+(?=::)
//should match 'self' in self::var1

I can also match after the :: characters using

(?<=::)[a-zA-z0-9\_\-]+
//should match 'var1' in self::var1

How would I combine these two things to create an expression which matches the entire thing?

EDIT: My text editor is SublimeText 3

Steven Doggart
KroniK907
  • 1
    Can't you use the comma delimiter? Seems simpler. – mbroshi Jan 21 '16 at 20:21
  • 1
    What is your text editor? Vim? The regex flavor differs between editors. – kennytm Jan 21 '16 at 20:31
  • 1) I could use the comma delimiter for this case, however I am ultimately wanting to set up a little custom command for SublimeText that will quickly match any `string::string2` using regex. 2) I am using SublimeText 3 as my editor. – KroniK907 Jan 21 '16 at 20:38
  • 2
    What about just `\w+::\w+`? – Steven Doggart Jan 21 '16 at 20:40
  • Steven, you are a genius... or I am just dense :P Not sure why I didn't think of that. Somehow I got it in my head that I needed to match the `::` first – KroniK907 Jan 21 '16 at 21:15

1 Answer

You have an issue in your patterns: the [A-z] range matches more than just lower- and uppercase letters. It also covers the six ASCII characters that sit between Z and a: [, \, ], ^, _ and ` (see [A-z] and [a-zA-Z] difference).
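A quick way to see the range problem is to test the six in-between characters against both classes. This is only a sketch using Python's `re` module as a stand-in; it is a different engine from the one in Sublime Text, but ASCII ranges in a character class behave the same way in both. The variable names are made up for illustration.

```python
import re

# The six ASCII characters between 'Z' (code 90) and 'a' (code 97).
between = "[\\]^_`"

# Character class from the question, typo included: A-z spans codes 65..122.
loose = re.compile(r"[a-zA-z0-9_-]")
# The class that was presumably intended.
strict = re.compile(r"[a-zA-Z0-9_-]")

loose_hits = [ch for ch in between if loose.fullmatch(ch)]
strict_hits = [ch for ch in between if strict.fullmatch(ch)]

print(loose_hits)   # ['[', '\\', ']', '^', '_', '`']
print(strict_hits)  # ['_']
```

Only the underscore survives with the corrected class, and only because it is listed explicitly.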

To combine (?<=::)[a-zA-Z0-9_-]+ and [a-zA-Z0-9_-]+(?=::) (note that _ never needs escaping, and - needs no escape when it is the last character in a character class), you can use [a-zA-Z0-9_-]+::[a-zA-Z0-9_-]+. The :: becomes part of the match. This cannot be avoided, because a regex cannot match discontinuous text in a single match operation.
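To check how the three patterns behave on the sample line from the question, here is a small sketch in Python. Python's `re` is used only as a convenient test bed and is not the engine Sublime Text uses, though lookarounds of this simple form work alike in both. The lookahead version yields the part before ::, the lookbehind version yields the part after it, and the combined pattern yields the whole item.

```python
import re

line = "array(self::var1, self::var2, class::var_1, class::var_2, self::varCaps)"

# Lookahead: text that is followed by "::" (the part before the separator).
before = re.findall(r"[a-zA-Z0-9_-]+(?=::)", line)
# Lookbehind: text that is preceded by "::" (the part after the separator).
after = re.findall(r"(?<=::)[a-zA-Z0-9_-]+", line)
# Combined: the separator is consumed, so each hit is one contiguous item.
whole = re.findall(r"[a-zA-Z0-9_-]+::[a-zA-Z0-9_-]+", line)

print(before)  # ['self', 'self', 'class', 'class', 'self']
print(after)   # ['var1', 'var2', 'var_1', 'var_2', 'varCaps']
print(whole)   # ['self::var1', 'self::var2', 'class::var_1', 'class::var_2', 'self::varCaps']
```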

Now, [a-zA-Z0-9_] is NOT the same as \w in Sublime Text as \w also matches all Unicode letters and digits! If you do not mind that, you can use \w+::\w+.
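The difference shows up as soon as the text contains non-ASCII letters. The sketch below uses Python 3, where \w in a str pattern is also Unicode-aware by default; treat it as an analogy for the Sublime Text behaviour described above, not as a test of Sublime's own engine. The sample text is invented.

```python
import re

# "naïve::café self::var1", written with escapes to keep the source ASCII-only.
text = "na\u00efve::caf\u00e9 self::var1"

# \w matches Unicode letters and digits, so the accented item matches whole.
unicode_hits = re.findall(r"\w+::\w+", text)
# The explicit ASCII class stops at the accented letters and matches a fragment.
ascii_hits = re.findall(r"[a-zA-Z0-9_]+::[a-zA-Z0-9_]+", text)

print(unicode_hits)  # ['naïve::café', 'self::var1']
print(ascii_hits)    # ['ve::caf', 'self::var1']
```

Note that the ASCII class does not skip the accented item; it silently matches a piece of it, which may matter in a replace operation.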

Also, if you want - to appear only as a single separator between word characters, use \w+(?:-\w+)*::\w+(?:-\w+)* (Unicode-aware), or use [a-zA-Z0-9_]+(?:-[a-zA-Z0-9_]+)*::[a-zA-Z0-9_]+(?:-[a-zA-Z0-9_]+)* to match only ASCII letters/digits.
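Here is a sketch of how the ASCII-only hyphen pattern treats leading, trailing and doubled hyphens, again in Python as a stand-in for the editor. Each hyphen group carries a + so that a hyphen can be followed by more than one word character. The sample strings are invented for illustration.

```python
import re

# One or more word chars, then zero or more "-word" groups, on each side of "::".
pattern = re.compile(
    r"[a-zA-Z0-9_]+(?:-[a-zA-Z0-9_]+)*::[a-zA-Z0-9_]+(?:-[a-zA-Z0-9_]+)*"
)

samples = "my-class::some-var  bad--name::x  -lead::trail-"
hits = pattern.findall(samples)

print(hits)  # ['my-class::some-var', 'name::x', 'lead::trail']
```

A doubled, leading or trailing hyphen is never part of a match, so the pattern falls back to the longest well-formed piece around the ::.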

Community
Wiktor Stribiżew
  • Thanks. Great explanation! Somehow I had it in my head that I had to match the `::` before everything else which is obviously not correct. – KroniK907 Jan 21 '16 at 21:49