
I am trying to create a regex that accepts the following strings:

proj_asdasd_000.gz.xml
proj_asdasd.gz.xml

Basically, the second underscore is optional, and if anything follows it, it must consist only of digits.

This is the regex I am trying:

^proj([a-zA-z0-9]?)+_[a-zA-z]+(_[0-9]?)+\.[a-z]+.[a-z]

Any suggestions on how to make it accept the strings above?

noobie-php
    Try switching `(_[0-9]?)+` to `(_[0-9]+)?`. The way you have it now matches `_1_2_3_4` – dvo Oct 15 '19 at 13:49

1 Answer


You may use either of these equivalent patterns:

^proj[a-zA-Z0-9]*_[a-zA-Z]+(?:_[0-9]+)?\.[a-z]+\.[a-z]+$
^proj[a-zA-Z0-9]*_[a-zA-Z]+(?:_[0-9]+)?(?:\.[a-z]+){2}$

See the regex demo

Details

  • ^ - start of string
  • proj - a literal substring
  • [a-zA-Z0-9]* - 0 or more alphanumeric chars
  • _ - a _ char
  • [a-zA-Z]+ - 1+ ASCII letters
  • (?:_[0-9]+)? - an optional sequence of an underscore followed by 1+ digits
  • \.[a-z]+\.[a-z]+ (equivalent to (?:\.[a-z]+){2}) - two occurrences of . followed by 1+ lowercase ASCII letters
  • $ - end of string.
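
To check the pattern against the two file names from the question, here is a minimal Python sketch. The helper name `is_valid` and the rejected sample names are my own additions for illustration; the question does not say which regex flavor is in use, so Python is only an assumption here.

```python
import re

# First pattern from the answer above, unchanged.
PATTERN = re.compile(r"^proj[a-zA-Z0-9]*_[a-zA-Z]+(?:_[0-9]+)?\.[a-z]+\.[a-z]+$")

def is_valid(name: str) -> bool:
    """Hypothetical helper: True if the whole name matches the pattern."""
    # fullmatch also rejects a trailing newline, which `$` alone would allow in Python.
    return PATTERN.fullmatch(name) is not None

samples = [
    "proj_asdasd_000.gz.xml",   # from the question: numeric suffix present
    "proj_asdasd.gz.xml",       # from the question: no second underscore
    "proj_asdasd_.gz.xml",      # illustrative reject: underscore with no digits
    "proj_asdasd_12a.gz.xml",   # illustrative reject: non-digit after the underscore
    "proj_asdasd_1_2.gz.xml",   # illustrative reject: more than one numeric group
]

for s in samples:
    print(f"{s!r}: {is_valid(s)}")
```

The first two names are accepted and the remaining three are rejected.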

Notes:

  • [A-z] matches more than just ASCII letters: the range also covers the six characters between Z and a in the ASCII table ([, \, ], ^, _ and `)
  • ([a-zA-z0-9]?)+ matches an optional character 1 or more times, which makes little sense. Either match a char 1 or more times with + or 0 or more times with *; the parentheses are not needed
  • (_[0-9]?)+ matches 1 or more sequences of _ followed by a single optional digit (so it matches _9___1_, for example). The quantifiers must be swapped to match an optional sequence of _ and 1+ digits.
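
The notes above can be verified with a short Python sketch. The variable names `extra`, `loose` and `strict` are just illustrative, and Python is assumed only for demonstration purposes.

```python
import re

# [A-z] spans code points 65..122, so it includes the non-letters between 'Z' and 'a'.
extra = [chr(c) for c in range(ord("A"), ord("z") + 1) if not chr(c).isalpha()]
print(extra)  # ['[', '\\', ']', '^', '_', '`']
print(bool(re.fullmatch(r"[A-z]", "_")))     # True: underscore slips through
print(bool(re.fullmatch(r"[A-Za-z]", "_")))  # False

# Original group: one or more "_" each followed by at most one digit.
loose = re.compile(r"(_[0-9]?)+")
# Swapped quantifiers: an optional "_" followed by one or more digits.
strict = re.compile(r"(?:_[0-9]+)?")

print(bool(loose.fullmatch("_9___1_")))   # True
print(bool(strict.fullmatch("_9___1_")))  # False
print(bool(strict.fullmatch("_000")))     # True
print(bool(strict.fullmatch("")))         # True: the whole group is optional
```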
Wiktor Stribiżew