How can I write a regexp that checks whether a string starts and ends with digits and contains only digits and commas in between? Commas must also be separated from each other by at least one digit. For the conditions above I have the following regexp: ^\d(,?\d)*$ but I have the following additional condition: all comma-separated integers, each made up of a sequence of digits, must be different from each other. What regexp would allow only this kind of string?

Thank you

Tornike Shavishvili

1 Answer

First of all, your regex contains unquantified \d, and that matches only single digits. You need to add + after \d to match 1 or more digits.

To avoid having duplicate values, you may use

^(?!.*\b(\d+)\b.*\b\1\b)\d+(?:,\d+)*$
 ^^^^^^^^^^^^^^^^^^^^^^^

See the regex demo

The (?!.*\b(\d+)\b.*\b\1\b) is a negative lookahead that fails the match if, after any 0+ chars other than line break chars, there is a whole group of digits that appears again later in the string (after another 0+ chars other than line break chars). The word boundaries (\b) make sure whole numbers are compared, so 1 is not treated as a duplicate of the 1 inside 12.

Details

  • ^ - start of string
  • (?!.*\b(\d+)\b.*\b\1\b) - a negative lookahead that fails the match if identical values appear in the text
  • \d+ - 1+ digits
  • (?:,\d+)* - zero or more occurrences of
    • , - a comma
    • \d+ - 1+ digits
  • $ - end of string.
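
As a quick check of the pattern above, here is a minimal Python sketch. The function name `is_unique_int_list` and the sample strings are made up for illustration and are not part of the original answer; the regex itself is copied unchanged.

```python
import re

# The pattern from the answer, copied verbatim.
PATTERN = re.compile(r"^(?!.*\b(\d+)\b.*\b\1\b)\d+(?:,\d+)*$")


def is_unique_int_list(text: str) -> bool:
    """Hypothetical helper: True if text is a comma-separated list of
    digit sequences with no sequence repeated."""
    return PATTERN.match(text) is not None


# Illustrative inputs (assumed, not taken from the question).
samples = ["1,2,3", "123", "12,1,2", "1,2,1", "7,7", "1,,2", ",1", "1,2,"]
for sample in samples:
    print(f"{sample!r}: {is_unique_int_list(sample)}")
```

Note that the comparison is textual, not numeric: `01,1` passes because `01` and `1` are different digit sequences.
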
Wiktor Stribiżew
  • Dear Wiktor, I am currently trying to understand this by writing the regex myself, and your example helps me greatly. I have written the following simple regex: [(\d++)(?!(,\1))](https://regex101.com/r/YaZa0M/1). As you can see, it should not match the whole first appearance of 123, but it only skips the first character 1 and matches 23. Why is this happening? What would be the correct way to write this regex? – Tornike Shavishvili Jul 16 '18 at 10:44
  • @TornikeShavishvili The correct regex is in my answer. `(\d++)(?!(,\1))` matches 1+ digits that are not followed by `,` and this same string of digits. – Wiktor Stribiżew Jul 16 '18 at 11:13
  • Dear Wiktor, I was finally able to construct my regex. It looks like this: [^\b([1-9]\d*+|0)(?!([0-9,]*,\1(?!\d)))((,([1-9]\d*+|0))(?!([0-9,]*\4(?!\d))))*$](https://regex101.com/r/hXo5xY/2/) It matches only comma-separated integers, does not match duplicates, and also does not match numbers that are preceded by zeros (009, 09, 00...). Could you please give me your experienced opinion? Maybe you see a bug, or you would write something (or everything) differently? Thank you – Tornike Shavishvili Jul 17 '18 at 10:02
  • @TornikeShavishvili You are heading down a blind alley. **WHY** don't you use my pattern? – Wiktor Stribiżew Jul 17 '18 at 10:12
  • I have made minor modifications to your pattern so that it does not match numbers that are preceded by 0, like 09, 009, 00 etc. The regex looks like this: `^(?!.*\b(\d+)\b.*\b\1\b)([1-9]\d*|0)(?:,[1-9]\d*|,0)*$` When modifying your answer I noticed that your pattern took **3341** steps to match strings while my pattern took **1795** steps. Could there be some bug in my pattern? Interesting... I am accepting your answer, thank you. – Tornike Shavishvili Jul 17 '18 at 11:41
  • @TornikeShavishvili [`^(?!.*\b(\d+)\b.*\b\1\b)([1-9]\d*|0)(?:,[1-9]\d*|,0)*$`](https://regex101.com/r/xbxBV7/3) is a good pattern, and it takes more steps than mine because there are alternation groups. Alternation inside grouping constructs always eats up some extra computing power. However, since the patterns match different strings, you should not compare their performance. We only compare apples to apples, you know. – Wiktor Stribiżew Jul 17 '18 at 11:44
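
To illustrate the last point, that the two patterns accept different strings, here is a small Python sketch. The names `ANY_DIGITS`, `NO_LEADING_ZEROS` and `compare`, and the sample inputs, are assumptions made for illustration; both regexes are copied from the thread.

```python
import re

# Pattern from the answer: any digit sequences, no duplicates.
ANY_DIGITS = re.compile(r"^(?!.*\b(\d+)\b.*\b\1\b)\d+(?:,\d+)*$")

# Variant from the comments: additionally rejects leading zeros.
NO_LEADING_ZEROS = re.compile(
    r"^(?!.*\b(\d+)\b.*\b\1\b)([1-9]\d*|0)(?:,[1-9]\d*|,0)*$"
)


def compare(text: str) -> tuple[bool, bool]:
    """Hypothetical helper: (answer pattern matches, variant matches)."""
    return (
        ANY_DIGITS.match(text) is not None,
        NO_LEADING_ZEROS.match(text) is not None,
    )


# Illustrative inputs (assumed): the patterns disagree on "09,1" and "00".
for sample in ["0,10,9", "09,1", "00", "9,9"]:
    print(sample, compare(sample))
```
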