
What would be the best regular expression in C# to validate the input below?

Example: 1, 2-10,5-10,6,9-100. These are page numbers, specified as ranges or as individual numbers, separated by commas.

Soner Gönül
user2325247
  • Is this a `string`? If it is, you can also use `String.Split(',')`. – Soner Gönül Dec 09 '13 at 20:34
  • The requirements seem unclear. Do you allow individual pages mixed with ranges separated by commas? – Mike Perrenoud Dec 09 '13 at 20:37
  • Ignoring that it's for javascript and the problem was about escaping the backslashes, this is what showed up in google for me: http://stackoverflow.com/questions/4468336/javascript-regular-expression-page-range-validation – moritzpflaum Dec 09 '13 at 20:41

3 Answers


Try the following expression:

\d+(?:-\d+)?(?:,\d+(?:-\d+)?)*

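Note that, as written, the pattern does not allow any whitespace. It is also unanchored, so to validate an entire string rather than find a match somewhere inside it, wrap it in ^ and $.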


The idea is built around the main subpattern \d+(?:-\d+)?:

  1. \d+ : match one or more consecutive digits (either a stand-alone page number or the left boundary of a range)
  2. -\d+ : match a hyphen followed by one or more digits (the right boundary of a range)

The question mark after the group makes the hyphen and the right range boundary optional, which is what lets the pattern also match single page numbers. The (?:...) syntax denotes a non-capturing group.
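
As a minimal sketch of how this pattern could be used for validation in C#, the snippet below wraps it in ^ and $ so that the whole string must match. The anchors, the class name PageListValidator, and the method name IsValid are assumptions added for illustration and are not part of the original answer.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PageListValidator
{
    // The answer's pattern, wrapped in ^...$ (an addition) so the entire
    // input has to be a comma-separated list of numbers or ranges.
    private static readonly Regex Pattern =
        new Regex(@"^\d+(?:-\d+)?(?:,\d+(?:-\d+)?)*$");

    // Hypothetical helper: true when the whole input matches the pattern.
    public static bool IsValid(string input) => Pattern.IsMatch(input);

    public static void Main()
    {
        Console.WriteLine(IsValid("1,2-10,5-10,6,9-100")); // True
        Console.WriteLine(IsValid("1, 2-10"));             // False: whitespace is not allowed
        Console.WriteLine(IsValid("1-2-3"));               // False: a range has at most two numbers
        Console.WriteLine(IsValid("1,,2"));                // False: empty item between commas
    }
}
```

Because whitespace is rejected, the sample string from the question (which has a space after the first comma) would fail this check unless the input is trimmed of spaces first or the pattern is extended.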

Marius Schulz

When I copied and pasted your sample string, I noticed that a space follows the 1, at the very beginning:

1, 2-10,5-10,6,9-100
  ^

I don't know if that was intentional, but I think it's reasonable to allow one or more space characters to surround a comma.

That said, here's a regex that will meet your requirements:

^[0-9]+(?:(?:\s*,\s*|-)[0-9]+)*$
 ^^^^^^      ^^^^^^^ ^ ^^^^^^ ^
    A           B1   B2   C   D
             ^^^^^^^^^
                 B

A - One or more digits
B1 - A comma with optional space characters on either side, *OR*
B2 - A dash (without whitespace on either side)
C - One or more digits
D - Repeat B and C zero or more times

Note: in .NET, \d and [0-9] are not equivalent by default; \d matches any Unicode decimal digit unless RegexOptions.ECMAScript is specified. I have presumed that only the digits 0 through 9 are of interest to you.
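
The following is a small sketch, not a definitive implementation, that exercises this pattern in C# and illustrates the \d versus [0-9] difference. The class name PageListDemo and the method name IsValid are hypothetical, and the Arabic-Indic test string is chosen only as an example of non-ASCII digits.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PageListDemo
{
    // The pattern from the answer above, unchanged.
    private const string AsciiPattern = @"^[0-9]+(?:(?:\s*,\s*|-)[0-9]+)*$";

    // Hypothetical helper: true when the whole input matches the pattern.
    public static bool IsValid(string input) => Regex.IsMatch(input, AsciiPattern);

    public static void Main()
    {
        Console.WriteLine(IsValid("1, 2-10,5-10,6,9-100")); // True: spaces around commas are allowed
        Console.WriteLine(IsValid("1 - 2"));                // False: no whitespace around the dash
        Console.WriteLine(IsValid("1-2-3"));                // True: comma and dash are interchangeable separators

        // Two Arabic-Indic digits (U+0661, U+0662), used here as sample non-ASCII digits.
        string arabicIndic = "\u0661\u0662";
        Console.WriteLine(Regex.IsMatch(arabicIndic, @"^\d+$"));                          // True
        Console.WriteLine(Regex.IsMatch(arabicIndic, @"^[0-9]+$"));                       // False
        Console.WriteLine(Regex.IsMatch(arabicIndic, @"^\d+$", RegexOptions.ECMAScript)); // False
    }
}
```

One consequence worth noticing is the third call: since B1 and B2 are alternatives inside the same repeated group, the pattern also accepts chained dashes such as 1-2-3. If that input should be rejected, the range part needs to be matched as an optional suffix of each item instead.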

Community
DavidRR

This regex will match each page range individually:

\d+-\d+|\d+

I used alternation to accomplish this. If \d+-\d+ (a page number range) does not match, the regex falls back to \d+, which matches a single number of any length.

If the two numbers of a range can be separated by any character other than -, you will need to change the regex.
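
To show what matching each range individually looks like in C#, here is a brief sketch that collects every match with Regex.Matches. The class name PageRangeExtractor and the method name Extract are hypothetical names introduced for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class PageRangeExtractor
{
    // Hypothetical helper: returns every range or single number found in the input.
    public static List<string> Extract(string input)
    {
        var tokens = new List<string>();
        // The range alternative comes first so "2-10" is taken as one token.
        foreach (Match m in Regex.Matches(input, @"\d+-\d+|\d+"))
        {
            tokens.Add(m.Value);
        }
        return tokens;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(" | ", Extract("1, 2-10,5-10,6,9-100")));
        // 1 | 2-10 | 5-10 | 6 | 9-100

        Console.WriteLine(string.Join(" | ", Extract("1;;x2-3")));
        // 1 | 2-3
    }
}
```

The second call shows the trade-off of this approach: anything between the tokens is skipped silently, so it extracts page ranges but does not tell you whether the string as a whole is well formed.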

Vasili Syrakis
  • This regex does not allow for the possibility of a comma between numbers as requested. – DavidRR Dec 09 '13 at 21:13
  • He said "separated by commas". If I matched commas, the page ranges would not be separate. If he needs every page range to be suffixed with a comma, he can use the regex: `\d+-\d+,|\d+,` – Vasili Syrakis Dec 09 '13 at 21:40
  • That will match `1` or `12` or `12,` or `12-345,` individually. My interpretation of the question is that the OP wants to match a **sequence** of comma-separated individual numbers or ranges (e.g. `1,3-5,6,7-12`) and determine whether the **entire sequence** is valid. – DavidRR Dec 09 '13 at 22:16