
There are some regular expressions here:

  1. r.+
  2. re.+
  3. re\w+
  4. re\w{2,4}
  5. [a-zA-Z]+
  6. [a-zA-Z]{6}
  7. ...

All of the regular expressions above match the word "result", but is there any precedence among them? For example, re\w{2,4} might have level 5 and re\w+ level 3, where the level measures priority. When both match a word, I would, in some scenarios, prefer the one with the higher priority.
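
To make the premise concrete, here is a minimal sketch in Python (the question names no regex engine, so the language and the helper name `matching_patterns` are assumptions). It checks patterns 1 to 6 against a word using whole-string matching; item 7 ("...") is left out because it is elided in the list.

```python
import re

# Patterns 1-6 from the list above; item 7 ("...") is elided there and omitted here.
PATTERNS = [r"r.+", r"re.+", r"re\w+", r"re\w{2,4}", r"[a-zA-Z]+", r"[a-zA-Z]{6}"]

def matching_patterns(word):
    """Return every pattern that matches the whole word (hypothetical helper)."""
    return [p for p in PATTERNS if re.fullmatch(p, word)]

# Each pattern either matches or it does not; the engine reports no score.
print(matching_patterns("result"))
print(matching_patterns("red"))
```

For "result" every pattern is returned, while a shorter word such as "red" drops the two patterns with length constraints. Nothing in the result says which of the surviving patterns is "better"; any ranking has to come from outside the regexes.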

Alan Moore
Ggicci
  • 5
    No, there's no natural ordering to regular expressions. You'll have to define your own. – NPE Sep 26 '14 at 13:51
  • 1
    There is no 'precedence' in regex, it has a binary state: it either matches or not. There is no 'better matching'. – Nir Alfasi Sep 26 '14 at 13:51
  • 3
    You just might want to put your regexes in an ordered list and use the first one that matches – Bergi Sep 26 '14 at 13:52
  • You can use the following to profile your regexes: http://regex101.com – hek2mgl Sep 26 '14 at 13:57
  • 3
    Out of curiosity: what is the use case? – Tensibai Sep 26 '14 at 14:12
  • 1
    @Tensibai A router that supports regex path. Maybe I should reconsider what to do next. I think I will take Bergi's advice. – Ggicci Sep 26 '14 at 14:16
  • 1
    Indeed, but for a router, you have a weight in your route, so you'll give each regex its weight through the definition of the route with each next hop. Indeed, it's @Bergi's advice – Tensibai Sep 26 '14 at 14:19
  • `[a-zA-Z]{5}` doesn't match `result` – elias Sep 26 '14 at 14:20
  • 1
    @Elias How's that? `result` has more than 5 letters, yes, but it matches because it contains 5 and there's no anchor enforcing exactly 5 chars – Tensibai Sep 26 '14 at 14:24
  • Indeed. But it may cause trouble depending on the use and/or implementation. For example, the Java `"result".matches("[a-zA-Z]{5}")` will return `false` – elias Sep 26 '14 at 14:31
  • @Elias Yes, I mean 6, too. – Ggicci Sep 26 '14 at 14:33
  • Well, without any knowledge of the regex interpreter, saying it will not match is quite a guess. The question is about regexes, and without word boundaries those regexes are correct. I'm even wondering whether Java is the only one with this behavior (as PHP, JavaScript or Python will match according to [this](http://regex101.com/r/wW9hM8/1)). – Tensibai Sep 26 '14 at 14:35
  • @Tensibai I will cut the path to several parts and define weights on them :) – Ggicci Sep 26 '14 at 14:36
  • 1
    Read this: [Why won't a longer token in an alternation be matched?](http://stackoverflow.com/q/25511528/3622940) – Unihedron Sep 26 '14 at 14:38
  • Well, there's still something I don't understand. When talking about a router, do you mean a network device routing network packets, or something else like a webapp where you wish to choose an action based on an ORed regex? – Tensibai Sep 26 '14 at 15:39
  • With the string `result`, in your example regex list, 1 through 4 are of increasing specificity, while 5 and 6 are general. Regex does have a priority within itself: it's always left to right. So it's not a good idea to combine weighted expressions into a single regex. It's better to simulate weight, say as an array of indexes into a general list of regexes that are to be executed. This way the regexes are reusable, and the only variability is the order in which they are executed. –  Sep 26 '14 at 16:54
  • @Tensibai It's like a webapp. My regex path looks like `/archive/{year}\d{4}+/{month}\d{2}`. In which, `{year}` and `{month}` are just tags. – Ggicci Sep 28 '14 at 07:32
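
The suggestion in the comments (keep the regexes in an ordered list and use the first one that matches) could be sketched as follows. This is only an illustration in Python: the `RULES` table, its labels, and the `first_match` helper are hypothetical, the levels 5 and 3 are borrowed from the question's example, and the level 1 fallback is made up. Whole-string matching via `re.fullmatch` is also an assumption, chosen because, as the comments about Java's `matches` show, search and full-match semantics can disagree.

```python
import re

# Hypothetical (priority, pattern, label) entries. Levels 5 and 3 come from the
# question's example; the level-1 fallback and all labels are invented here.
RULES = [
    (3, r"re\w+", "general"),
    (5, r"re\w{2,4}", "specific"),
    (1, r"[a-zA-Z]+", "fallback"),
]

# Sort once by priority (highest first) and precompile the patterns.
ORDERED = [
    (re.compile(pattern), label)
    for _, pattern, label in sorted(RULES, key=lambda rule: rule[0], reverse=True)
]

def first_match(word):
    """Return the label of the highest-priority pattern matching the whole word."""
    for regex, label in ORDERED:
        if regex.fullmatch(word):
            return label
    return None  # no rule matched

print(first_match("result"))
print(first_match("results"))
```

Here "result" is claimed by the level 5 rule even though the level 3 rule also matches it, while "results" falls through to level 3 because `re\w{2,4}` cannot cover the whole word. The priorities live in the table, not in the regexes, so reordering is a data change only.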
