Regular expressions, groups issue

Question

I have an issue with a regexp in Java. I am writing a chat application in which the client can send the server either a plain "message" or "\command value". I want to parse this string and check whether the line contains a command. For login a value is required, but for logout no value should be required. I created this regexp:

"^\\\\(?<comm>login|status|((?=logout)))\\s(?<content>\\w+)"

What should I write for the logout case? I want logout to end up in the comm group. I need something like: if the command is logout, do not read anything that follows it.

score 2 · Answer 1 · answered Jun 11 '12 at 17:36

You are probably better off making the parameter optional:

^\\(?<command>login|status|logout)(?:\s(?<param>\w+))?

and checking the logic of the results with Java.
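
A minimal sketch of that approach, assuming a hypothetical `CommandParser` class. The rule sets are an assumption too: the question only says that login requires a value and logout does not, so status is left optional here. The regex is the one above, with backslashes doubled for the Java string literal.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original answer.
class CommandParser {
    // The regex from the answer; every backslash is doubled inside the Java literal.
    private static final Pattern COMMAND =
            Pattern.compile("^\\\\(?<command>login|status|logout)(?:\\s(?<param>\\w+))?");

    // Assumed rules: login must have a value, logout must not, status may.
    private static final Set<String> REQUIRES_VALUE = new HashSet<>(Arrays.asList("login"));
    private static final Set<String> FORBIDS_VALUE = new HashSet<>(Arrays.asList("logout"));

    /** Returns {command, param} (param may be null), or null for a plain message or an invalid command. */
    static String[] parse(String line) {
        Matcher m = COMMAND.matcher(line);
        if (!m.matches()) {
            return null; // plain chat message, or not a known command
        }
        String command = m.group("command");
        String param = m.group("param"); // null when the optional part did not match
        if (param == null && REQUIRES_VALUE.contains(command)) {
            return null; // required value is missing
        }
        if (param != null && FORBIDS_VALUE.contains(command)) {
            return null; // value given where none is allowed
        }
        return new String[] {command, param};
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(parse("\\login igor")));
        System.out.println(Arrays.toString(parse("\\logout")));
        System.out.println(Arrays.toString(parse("\\logout now")));
    }
}
```

Keeping the rules in plain sets means a new command only needs a new alternative in the regex and, if it has a constraint, one set entry.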

If you really want to do it all in the regex, you could do:

^\\(?<command>login|status|logout(?=$))(?:\s(?<param>\w+))?$

answered Jun 11 '12 at 17:36

Qtax


Yeah, sure, I can do this using Java. But all day I have been looking for a way to do this with a regexp. =) And why did you delete the slashes? Pattern does not understand \s like that. And it doesn't understand $ either; it does not match properly. – Igor Masternoy Jun 11 '12 at 17:43
@Igor, I write the regex string as it would look if you printed it; the extra backslashes are only needed when quoting. – Qtax Jun 11 '12 at 17:45
Maybe it is because I use readline? Or because I trim the string and that trims the "\n"? – Igor Masternoy Jun 11 '12 at 17:47
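
To illustrate the escaping point from these comments, here is a small sketch assuming a hypothetical `EscapingDemo` class. Every backslash of the printed regex is doubled inside the Java string literal. `BufferedReader.readLine()` returns the line without its terminator, and `$` matches at the end of such a string, so neither `readLine()` nor `trim()` should be what stops the second regex from matching. A `StringReader` stands in for the real connection here.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original thread.
class EscapingDemo {
    // As printed: ^\\(?<command>login|status|logout(?=$))(?:\s(?<param>\w+))?$
    static final String REGEX =
            "^\\\\(?<command>login|status|logout(?=$))(?:\\s(?<param>\\w+))?$";

    private static final Pattern COMMAND = Pattern.compile(REGEX);

    static boolean isCommand(String line) {
        return COMMAND.matcher(line).matches();
    }

    public static void main(String[] args) throws IOException {
        // Printing the literal shows the regex with the backslashes undoubled.
        System.out.println(REGEX);

        // Stand-in for the client stream; readLine() strips the trailing "\n".
        BufferedReader in = new BufferedReader(
                new StringReader("\\logout\n\\login igor\n\\logout now\n"));
        for (String line = in.readLine(); line != null; line = in.readLine()) {
            System.out.println(line + " -> " + isCommand(line));
        }
    }
}
```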

Damian Powell · Accepted Answer · 2012-06-11T17:59:07.610

I would probably check whether a given command should have a parameter in general-purpose code (i.e. Java) rather than in the regular expression. That is to say, regular expressions are good for tokenizing, but not so much for parsing.

However, if you've got a good reason to do it that way, then I would probably split the expression into two parts: one where the commands require a parameter, and one where they do not. For example:

^\\(?:(?<command>login)\s+(?<param>\w+)|(?<command>status|logout)(?<param>))$

Note that the final (?<param>) is not strictly necessary for the regular expression to operate correctly; it is merely there so that any subsequent code can rely on two named groups in the result: command and param.

Logically, this could be extended to three groups where the third group contains commands with an optional parameter.

This can be written more clearly in free-spacing mode, for example in a Groovy multi-line string (the named groups themselves need Java 7 or later):

^\\(?x:
    (?<command>login) \s+ (?<param>\w+)   # Commands that require a parameter
    |                                     # -or-
    (?<command>status|logout) (?<param>)  # Commands that do not require a parameter
)$

Nope, Pattern does not allow me to declare a group name twice. Should I use two patterns? – Igor Masternoy Jun 11 '12 at 18:00
Ahhh, sorry, Igor. If multiple named groups like this don't work in Java, then I would suggest going with @Qtax's solution. Regular expressions make perfect sense when they're simple, but as soon as you have to start embedding logic in them, you're better off moving up to a higher-level language. – Damian Powell Jun 12 '12 at 07:44
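
One way to keep the split inside a single pattern, sketched under assumptions: java.util.regex throws a `PatternSyntaxException` when a group name is defined twice, so each alternative needs its own name. The class `SplitCommandPattern` and the group names `withParam` and `noParam` are hypothetical, and the grouping of status with logout follows the answer above.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical variant of the split pattern, using a distinct name per alternative.
class SplitCommandPattern {
    private static final Pattern COMMAND = Pattern.compile(
            "^\\\\(?:(?<withParam>login)\\s+(?<param>\\w+)|(?<noParam>status|logout))$");

    /** Returns {command, param} (param may be null), or null if the line is not a valid command. */
    static String[] parse(String line) {
        Matcher m = COMMAND.matcher(line);
        if (!m.matches()) {
            return null;
        }
        // Exactly one of the two command groups took part in the match.
        String command = m.group("withParam") != null ? m.group("withParam") : m.group("noParam");
        return new String[] {command, m.group("param")};
    }

    /** True if java.util.regex rejects a pattern that repeats a group name. */
    static boolean duplicateNamesRejected() {
        try {
            Pattern.compile("(?<command>a)|(?<command>b)");
            return false;
        } catch (PatternSyntaxException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(parse("\\login igor")));
        System.out.println(Arrays.toString(parse("\\logout")));
        System.out.println(duplicateNamesRejected());
    }
}
```

Two separate `Pattern` objects tried in turn would work as well; the single pattern just keeps the list of commands in one place.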
