I need to validate commands with a format similar to this:

"/a foo.bar /b bar.foo /c 01(.01)"

where the final (.01) is optional (the brackets only mark the contents as optional; they are not part of the command). Any digits can appear in place of the 0's and 1's. The switches /a, /b and /c are fixed. So far, I've developed this regular expression:

@"/a\s*([\w\W]*)\s*/b\s*([\w\W]*)\s*/c\s*[0-9,0-9,(.,0-9,0-9){0,1}]"

but for some reason, a command such as

"/a foo.bar /b bar.foo /c 01."

still validates against the regex. Valid commands should end with either two digits, a dot, and two digits (e.g. 01.01) or just two digits (e.g. 01).

Can someone help me to get this fixed?

Cheers,

Alex Barac

  • Isn't it easier to parse the command line (http://stackoverflow.com/questions/491595/best-way-to-parse-command-line-arguments-in-c), letting a library take care of parsing and type checking? Then you only have to check that options a, b and c are set. – CompuChip Nov 15 '13 at 09:27
  • Good suggestion, but I'm in a sort of time crisis and I need this fixed ASAP... – Alex Barac Nov 15 '13 at 09:32

2 Answers

Try this one:

^/a\s*(.*)\s*/b\s*(.*)\s*/c\s*((\d{2}\.\d{2})|(\d{2}))$
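
As a quick check of the pattern above, here is a minimal sketch in Python (an assumption; the question is about .NET, but the pattern only uses syntax that Python's re module and .NET's Regex share). The helper name is_valid_command is made up for illustration.

```python
import re

# Pattern copied from the answer; ^ and $ force the whole command to match.
COMMAND_RE = re.compile(
    r"^/a\s*(.*)\s*/b\s*(.*)\s*/c\s*((\d{2}\.\d{2})|(\d{2}))$"
)

def is_valid_command(command: str) -> bool:
    """Hypothetical helper: True if the command has the /a .. /b .. /c NN[.NN] shape."""
    return COMMAND_RE.match(command) is not None

commands = [
    "/a foo.bar /b bar.foo /c 01.01",  # two digits, dot, two digits
    "/a foo.bar /b bar.foo /c 01",     # two digits only
    "/a foo.bar /b bar.foo /c 01.",    # trailing dot, should be rejected
]
for command in commands:
    print(command, "->", is_valid_command(command))
```

The first two commands print True and the last prints False, because the trailing dot can satisfy neither alternative before the $ anchor.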

Alex Filipovici

You're misunderstanding character classes. For example, this:

[0-9,0-9,(.,0-9,0-9){0,1}]

and this:

[0-9,.(){}]

mean the same thing. In other words, no matter what you put within the square brackets, that expression matches a single character, unless you quantify the character class. What you want, instead, is this:

@"/a\s*([\w\W]*)\s*/b\s*([\w\W]*)\s*/c\s*[0-9][0-9](\.[0-9][0-9]){0,1}"
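
To make the character-class point concrete, here is a small sketch in Python (an assumption; the same class syntax applies in .NET). It checks that the two classes accept the same single characters, and shows what the regex from the question actually matches on the problem command. The probe string and variable names are illustrative only.

```python
import re

asker_class = re.compile(r"[0-9,0-9,(.,0-9,0-9){0,1}]")
short_class = re.compile(r"[0-9,.(){}]")

# Both classes accept exactly the same single characters.
probe = "0123456789,.(){}ab /"
same = all(
    bool(asker_class.fullmatch(ch)) == bool(short_class.fullmatch(ch))
    for ch in probe
)
print(same)  # True

# A character class matches one character, never two.
print(asker_class.fullmatch("01"))  # None

# The regex from the question, unanchored: it is satisfied by a prefix.
original = re.compile(
    r"/a\s*([\w\W]*)\s*/b\s*([\w\W]*)\s*/c\s*[0-9,0-9,(.,0-9,0-9){0,1}]"
)
m = original.search("/a foo.bar /b bar.foo /c 01.")
print(repr(m.group(0)))  # '/a foo.bar /b bar.foo /c 0'
```

The last line shows why "01." validated: the class consumed only the first digit, and nothing required the rest of the input to match.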

But let me help you simplify and fix this further:

  1. You can probably use . (a wildcard) rather than [\w\W], unless you're doing that specifically to match newline characters as well; by default, . doesn't match newlines.
  2. Because an unescaped . is a wildcard, I escaped the . near the end that you mean as a literal dot.
  3. Parentheses are special characters too: unescaped, they form a group. They would need escaping (\( and \)) only to match literal parentheses, which your commands don't contain.
  4. \d is shorthand for [0-9] (in .NET it also matches other Unicode digits unless RegexOptions.ECMAScript is set).
  5. The parentheses around the optional part are there so you can apply the {0,1} quantifier to a group.
  6. ? is shorthand for {0,1}.

Taking all that into account:

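@"^/a\s*(.*)\s*/b\s*(.*)\s*/c\s*\d\d(\.\d\d)?$"

Last three things. First, the ^ and $ anchors make the pattern match the whole command; without them, "/c 01." still validates, because the regex is satisfied by the part of the input that ends at the second digit. Second, if there should be at least one space between arguments, use \s+ instead of \s*. Third, if you want to allow more than two digits, use \d+ instead of \d\d.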
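
A quick sketch of how ^/$ anchors and \s+ change what is accepted, written in Python as an assumption (the pattern syntax used here is the same in .NET). It also assumes the optional part is a plain .NN with no literal parentheses, as the question describes. The names BODY, unanchored, anchored and strict_spaces are made up for illustration.

```python
import re

BODY = r"/a\s*(.*)\s*/b\s*(.*)\s*/c\s*\d\d(\.\d\d)?"

unanchored = re.compile(BODY)
anchored = re.compile("^" + BODY + "$")
# Same shape, but at least one whitespace character around each switch.
strict_spaces = re.compile(r"^/a\s+(.*)\s+/b\s+(.*)\s+/c\s+\d\d(\.\d\d)?$")

bad = "/a foo.bar /b bar.foo /c 01."
good = "/a foo.bar /b bar.foo /c 01.01"
cramped = "/afoo.bar/bbar.foo/c01"

print(unanchored.search(bad) is not None)     # True: a prefix matches
print(anchored.search(bad) is not None)       # False: trailing dot rejected
print(anchored.search(good) is not None)      # True
print(anchored.search(cramped) is not None)   # True: \s* allows no spaces
print(strict_spaces.search(cramped) is not None)  # False: \s+ needs spaces
print(strict_spaces.search(good) is not None)     # True
```

Whether the cramped form should be accepted is a design choice; \s* permits it and \s+ does not.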

Andrew Cheong