1

I want both the formats xxxxxx-xxxx and xxxxxxxx-xxxx to be accepted. I've managed to handle the section before the dash, but the last four digits are troublesome. The pattern does require at least 4 digits after the dash, but I also want the match to fail if there are more than 4. How do I do that?

This is how it looks so far:

var regex = new Regex(@"^\d{6,8}[-|(\s)]{0,1}\d{4}");

And these are the results:

var regex = new Regex(@"^\d{6,8}[-|(\s)]{0,1}\d{4}");

Match m = regex.Match("840204-2344");
Console.WriteLine(m.Success); // Outputs True

m = regex.Match("19840204-2344");
Console.WriteLine(m.Success); // Outputs True

m = regex.Match("19840204-23");
Console.WriteLine(m.Success); // Outputs False

m = regex.Match("19840204-2323423423");
Console.WriteLine(m.Success); // Outputs True, and this is what I don't want
Jesper

2 Answers

2

The \d{6,8} pattern matches 6, 7 or 8 digits, so it also accepts a 7-digit prefix, which your requirements do not allow. Besides, [-|(\s)]{0,1} matches one or zero -, |, (, ) or whitespace characters, so the pattern also matches strings like 19840204|2323, 19840204(2323 and 19840204)2323.
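To see both problems in isolation, here is a minimal sketch that runs the question's pattern with a trailing $ added (an assumption made here so that only the digit count and separator are being tested). The class name and the 7-digit sample input are illustrative, not from the question.

```csharp
using System;
using System.Text.RegularExpressions;

class CharacterClassDemo
{
    static void Main()
    {
        // The question's pattern plus a trailing $ (added for this demo only).
        var regex = new Regex(@"^\d{6,8}[-|(\s)]{0,1}\d{4}$");

        // Inside [...], the characters |, ( and ) are literals, not operators.
        string[] inputs =
        {
            "19840204|2323",
            "19840204(2323",
            "19840204)2323",
            "1984020-2323" // 7-digit prefix, allowed by {6,8}
        };

        foreach (var input in inputs)
        {
            // Each line prints True: all four unwanted inputs are accepted.
            Console.WriteLine($"{input} -> {regex.IsMatch(input)}");
        }
    }
}
```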

You may use

^\d{6}(?:\d{2})?[-\s]?\d{4}$

See the regex demo.

Details

  • ^ - start of string
  • \d{6} - 6 digits
  • (?:\d{2})? - optional 2 digits
  • [-\s]? - an optional - or whitespace character
  • \d{4} - 4 digits
  • $ - end of string.
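As a quick check, the pattern above can be run against the inputs from the question. This is only a sketch: the class name and the last three inputs (no separator, a 7-digit prefix, a | separator) are added here for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

class PatternCheck
{
    static void Main()
    {
        var regex = new Regex(@"^\d{6}(?:\d{2})?[-\s]?\d{4}$");

        Console.WriteLine(regex.IsMatch("840204-2344"));         // True
        Console.WriteLine(regex.IsMatch("19840204-2344"));       // True
        Console.WriteLine(regex.IsMatch("19840204-23"));         // False
        Console.WriteLine(regex.IsMatch("19840204-2323423423")); // False

        // Extra inputs, not from the question:
        Console.WriteLine(regex.IsMatch("8402042344"));          // True, the separator is optional
        Console.WriteLine(regex.IsMatch("1984020-2344"));        // False, 7-digit prefix
        Console.WriteLine(regex.IsMatch("19840204|2323"));       // False, | is not a valid separator
    }
}
```

Note that the separator is optional in this pattern; if a dash must always be present, dropping the ? after [-\s] would be one way to require it.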

To make \d match only ASCII digits, pass the RegexOptions.ECMAScript option. Example:

var res = Regex.IsMatch(s, @"^\d{6}(?:\d{2})?[-\s]?\d{4}$", RegexOptions.ECMAScript);
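The difference is easiest to see with a non-ASCII input. The sketch below writes 840204-2344 in Arabic-Indic digits (U+0660 to U+0669) using escape sequences; the sample string and class name are assumptions chosen for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

class AsciiDigitsDemo
{
    static void Main()
    {
        const string pattern = @"^\d{6}(?:\d{2})?[-\s]?\d{4}$";

        // "840204-2344" written with Arabic-Indic digits.
        string s = "\u0668\u0664\u0660\u0662\u0660\u0664-\u0662\u0663\u0664\u0664";

        // By default, \d in .NET matches any Unicode decimal digit.
        Console.WriteLine(Regex.IsMatch(s, pattern));                          // True

        // With RegexOptions.ECMAScript, \d is restricted to [0-9].
        Console.WriteLine(Regex.IsMatch(s, pattern, RegexOptions.ECMAScript)); // False

        // ASCII input is accepted either way.
        Console.WriteLine(Regex.IsMatch("840204-2344", pattern, RegexOptions.ECMAScript)); // True
    }
}
```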
Wiktor Stribiżew
  • Yes, this is better actually, although ASCII digits are for american social numbers right? – Jesper Oct 02 '17 at 19:58
  • @Jesper: ASCII digits are those that you match with `[0-9]`. There are a lot of Unicode digits, like [in this answer](https://stackoverflow.com/a/16621778/3832970). – Wiktor Stribiżew Oct 02 '17 at 20:02
-1

You are forgetting the $ at the end:

        var regex = new Regex(@"^(\d{6}|\d{8})-\d{4}$");

If you want to match the social security number anywhere in a string, you can also use \b to test for word boundaries:

        var regex = new Regex(@"\b(\d{6}|\d{8})-\d{4}\b");
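A minimal sketch of how the \b version might be used to pull numbers out of a longer text follows. The sample sentence and class name are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

class BoundaryDemo
{
    static void Main()
    {
        var regex = new Regex(@"\b(\d{6}|\d{8})-\d{4}\b");

        // Hypothetical sample text with two valid numbers and one overlong one.
        string text = "Valid: 840204-2344 and 19840204-2344. Invalid: 19840204-2323423423.";

        // \b rejects the last candidate because a digit follows the four digits.
        foreach (Match m in regex.Matches(text))
        {
            Console.WriteLine(m.Value);
        }
        // Prints:
        // 840204-2344
        // 19840204-2344
    }
}
```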

Edit: I corrected the regex to fix the problems mentioned in the comments. The commenters are right, of course. In my earlier post I just wanted to explain why the regex matched the longer string.

GHN
  • Note that `\d{6,8}` matches 6, 7 or 8 digits, and that might be too "permissive" for the current requirements. – Wiktor Stribiżew Oct 02 '17 at 19:33
  • Actually, the [`^\d{6,8}[-|(\s)]{0,1}\d{4}$`](http://regexstorm.net/tester?p=%5e%5cd%7b6%2c8%7d%5b-%7c%28%5cs%29%5d%7b0%2c1%7d%5cd%7b4%7d%5cr%3f%24&i=19840204%7c2323%0d%0a19840204%282323%0d%0a19840204%292323%0d%0a&o=m) pattern matches strings like `19840204|2323`, `19840204(2323` and `19840204)2323`. – Wiktor Stribiżew Oct 02 '17 at 20:01