I want to extract URLs from text such as this:

this is text

bla bla bla


http://dev.pricewombat.com/d/4
http://www.pricewombat.com/d/12/Spalding-Premier-Excel-Basketball-15-Free-Store-Pickup

I wrote the following regex:

^(https?:\/\/(dev|www).pricewombat.com\/d\/[^ \n]+)$

http://regex101.com/r/iJ1fZ0/1

However, the alternation (dev|www) has to be wrapped in parentheses, and those parentheses create a capture group where I don't want one.

Is it possible to use alternation without creating a capture group?

Note that this is not the same question as this "similar question": Can I use an OR in regex without capturing what's enclosed?

EDIT: Apparently it actually is the same question as the one above; I simply misunderstood how the ?: operator works.

Nate

1 Answer

Yes, you want a non-capturing group. Placing ?: immediately after the opening parenthesis tells the regex engine to group the enclosed expressions without capturing them.

(?:dev|www)  # group, but do not capture: 'dev' OR 'www'
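
To see the difference in practice, here is a minimal Python sketch that runs both versions of the pattern against the sample text from the question. The names TEXT, CAPTURING, and NON_CAPTURING are made up for illustration, and the pattern departs from the original in two small ways: the dots are escaped as \. so they match only a literal dot, and the slashes are left unescaped because Python does not require it.

```python
import re

# Sample input copied from the question.
TEXT = """this is text

bla bla bla

http://dev.pricewombat.com/d/4
http://www.pricewombat.com/d/12/Spalding-Premier-Excel-Basketball-15-Free-Store-Pickup
"""

# Capturing alternation: (dev|www) becomes group 2.
CAPTURING = re.compile(
    r"^(https?://(dev|www)\.pricewombat\.com/d/[^ \n]+)$", re.MULTILINE
)

# Non-capturing alternation: (?:dev|www) groups the alternatives only.
NON_CAPTURING = re.compile(
    r"^(https?://(?:dev|www)\.pricewombat\.com/d/[^ \n]+)$", re.MULTILINE
)

# Each match yields (full URL, subdomain) with the capturing version.
capturing_groups = [m.groups() for m in CAPTURING.finditer(TEXT)]

# The outer group still contains the whole URL, subdomain included.
urls = [m.group(1) for m in NON_CAPTURING.finditer(TEXT)]

print(CAPTURING.groups)      # number of capture groups: 2
print(NON_CAPTURING.groups)  # number of capture groups: 1
print(urls)
```

Note that the text matched by (?:dev|www) is still part of the outer group 1. Only the inner group stops producing a capture of its own.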
hwnd
  • Oops, apparently I completely misunderstood the `?:` operator. When I read about it I thought it meant it would exclude the contents from being in any capture group (in my case, I still wanted it in the outer capture group). Thanks. – Nate Oct 12 '14 at 22:22