I'm new to writing regular expressions.

I have this working regex:

^([a-zA-Z0-9\d]+-)*[a-zA-Z0-9\d]+$

Example:

-test : false 
test- : false
te--st : false
test : true
test-test : true 
te-st-t : true 

I would like to add support for _ (underscores), so that the examples above give the same results when - is replaced with _. However, a string may use only one of the two delimiters, never a mix of both.

Example:

te-st_test : false
te_st_test : true

The solutions I tried:

^([a-zA-Z0-9\d]+(-|_))*[a-zA-Z0-9\d]+$

^(([a-zA-Z0-9\d]+-)|([a-zA-Z0-9\d]+_))*[a-zA-Z0-9\d]+$

Bad result:

te_st-test : true

I would like to have this result:

-test : false
test- : false
--test : false
__test : false
test-- : false
test__ : false
-_test : false
test-_ : false
test--test : false
test__test : false
test-_test : false
te-st_test : false
te-st : true
te_st : true
te_st_test : true
te-st-test : true
test : true

Thanks & have a nice day!

Kopale

1 Answer

You may capture the first delimiter (if any) and then use a backreference to that value in the repeated non-capturing group:

^[a-zA-Z\d]+(?=([-_])?)(?:\1[a-zA-Z\d]+)*$
\A[a-zA-Z\d]+(?=([-_])?)(?:\1[a-zA-Z\d]+)*\z

See the regex demo.

Note: when validating strings, I prefer the \A (start of string) and \z (the very end of string) anchors to ^/$, since $ also matches just before a trailing newline. Also, if you are worried about \d matching all Unicode digits (e.g. ३৫৬૦૧௮೪൮໘), either pass the RegexOptions.ECMAScript option when compiling the regex object, or replace \d with 0-9 inside the character class.
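To make the two points in the note concrete, here is a small sketch. The class and method names are made up for illustration; only Regex.IsMatch and RegexOptions.ECMAScript come from .NET.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class; the names are illustrative only.
public static class AnchorAndDigitDemo
{
    // "$" also matches just before a final newline, so "test\n" passes.
    public static bool DollarAcceptsTrailingNewline() =>
        Regex.IsMatch("test\n", @"^[a-zA-Z\d]+$");

    // "\z" only matches at the very end of the string, so "test\n" fails.
    public static bool ZAcceptsTrailingNewline() =>
        Regex.IsMatch("test\n", @"\A[a-zA-Z\d]+\z");

    // By default "\d" matches any Unicode decimal digit,
    // here U+0969 (DEVANAGARI DIGIT THREE).
    public static bool DefaultDigitMatchesDevanagari() =>
        Regex.IsMatch("\u0969", @"\A\d\z");

    // With RegexOptions.ECMAScript, "\d" is limited to ASCII 0-9.
    public static bool EcmaScriptDigitMatchesDevanagari() =>
        Regex.IsMatch("\u0969", @"\A\d\z", RegexOptions.ECMAScript);

    public static void Main()
    {
        Console.WriteLine(DollarAcceptsTrailingNewline());     // True
        Console.WriteLine(ZAcceptsTrailingNewline());          // False
        Console.WriteLine(DefaultDigitMatchesDevanagari());    // True
        Console.WriteLine(EcmaScriptDigitMatchesDevanagari()); // False
    }
}
```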

Details:

  • \A - start of string
  • [a-zA-Z\d]+ - one or more letters or digits
  • (?=([-_])?) - a positive lookahead that, without consuming any text, captures the next char into Group 1 if it is - or _ (the group stays unset otherwise)
  • (?:\1[a-zA-Z\d]+)* - zero or more sequences of Group 1 value and one or more letters or digits
  • \z - the very end of string.
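If it helps to see what Group 1 actually holds, the sketch below inspects the capture after a match. DelimiterCaptureDemo and CapturedDelimiter are hypothetical names used only for this illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class; the names are illustrative only.
public static class DelimiterCaptureDemo
{
    private static readonly Regex Pattern =
        new Regex(@"\A[a-zA-Z\d]+(?=([-_])?)(?:\1[a-zA-Z\d]+)*\z");

    // Returns the delimiter captured into Group 1, or an empty string when
    // the input does not match or contains no delimiter at all.
    public static string CapturedDelimiter(string input)
    {
        Match match = Pattern.Match(input);
        if (!match.Success || !match.Groups[1].Success)
        {
            return "";
        }
        return match.Groups[1].Value;
    }

    public static void Main()
    {
        Console.WriteLine(CapturedDelimiter("te-st-test")); // -
        Console.WriteLine(CapturedDelimiter("te_st_test")); // _
        Console.WriteLine(CapturedDelimiter("test") == ""); // True
    }
}
```

The first delimiter seen fixes the value of \1, so every later delimiter has to be the same character.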

In C#, you can declare it as

var Pattern = new Regex(@"\A[a-zA-Z\d]+(?=([-_])?)(?:\1[a-zA-Z\d]+)*\z");
// Or, to make \d match ASCII digits only (use one declaration or the other):
var AsciiPattern = new Regex(@"\A[a-zA-Z\d]+(?=([-_])?)(?:\1[a-zA-Z\d]+)*\z", RegexOptions.ECMAScript);
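
As a quick check, the pattern can be run against the strings from the question. This is only a sketch: the DelimiterValidator class and its IsValid method are hypothetical names, not part of any library.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical wrapper class; the names are illustrative only.
public static class DelimiterValidator
{
    private static readonly Regex Pattern =
        new Regex(@"\A[a-zA-Z\d]+(?=([-_])?)(?:\1[a-zA-Z\d]+)*\z");

    // True when the input is alphanumeric chunks joined by a single
    // delimiter type (either "-" or "_", never both).
    public static bool IsValid(string input) => Pattern.IsMatch(input);

    public static void Main()
    {
        string[] samples =
        {
            "-test", "test-", "--test", "__test", "test--", "test__",
            "-_test", "test-_", "test--test", "test__test", "test-_test",
            "te-st_test", "te-st", "te_st", "te_st_test", "te-st-test", "test"
        };

        foreach (string sample in samples)
        {
            // Prints each sample in the same "value : true/false" form as the question.
            Console.WriteLine($"{sample} : {(IsValid(sample) ? "true" : "false")}");
        }
    }
}
```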
Wiktor Stribiżew