322

I'm currently programming a vocabulary algorithm that checks whether a user has typed a word correctly. I have the following situation: the correct solution for the word is "part1, part2". The user should be able to enter either "part1" (answer 1), "part2" (answer 2), or "part1, part2" (answer 3). I'm trying to match the string given by the user against the following automatically generated regular expression:

^(part1|part2)$

This accepts only answers 1 and 2 as correct, while answer 3 is rejected. I'm wondering whether there's an operator similar to | that means and/or instead of either...or.

Can anyone help me solve this problem?

Ed The ''Pro''
Jonathan
  • 2
    Regular expressions might not be the best solution for this. I'd use normal string methods. – Felix Kling Nov 05 '11 at 14:44
  • 3
    This problem is poorly specified. Why are you using pattern matching when all you need is an exact string comparison against a set of legal strings? Unless your regex compiler optimizes alternatives into an O(1) trie structure the way Perl’s does, you should probably be doing a test against hash membership instead. Other regex engines just aren’t very clever at this. – tchrist Nov 05 '11 at 16:26
  • @tchrist The use case could be a mongodb `$or` regex match – Abbas Aug 26 '19 at 15:25
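
The comments above suggest plain string methods or a hash-membership test instead of a regex. A minimal sketch of that idea in Python, assuming the solution is stored as a single string whose parts are separated by ", " (the function names are hypothetical, not from the question):

```python
def accepted_answers(solution):
    """Build the set of legal answers: each single part, plus the full solution."""
    parts = solution.split(", ")      # assumption: parts are separated by ", "
    return set(parts) | {solution}


def is_correct(user_input, solution):
    """Exact string comparison against the set of legal answers."""
    return user_input in accepted_answers(solution)


print(is_correct("part1", "part1, part2"))
print(is_correct("part1, part2", "part1, part2"))
print(is_correct("part3", "part1, part2"))
```

This version accepts only the exact strings "part1", "part2" and "part1, part2"; it does not accept the parts in a different order.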

5 Answers

375

I'm going to assume you want to build the regex dynamically so it can contain words other than part1 and part2, and that you want order not to matter. If so, you can use something like this:

((^|, )(part1|part2|part3))+$

Positive matches:

part1
part2, part1
part1, part2, part3

Negative matches:

part1,           //with and without trailing spaces.
part3, part2, 
otherpart1
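
One caveat: the pattern above is anchored only at the end, so with an unanchored search a string such as "foo, part1" also matches, because the first repetition may start at ", ". A sketch of building a fully anchored variant dynamically in Python, assuming the vocabulary comes as a list of words (`build_pattern` and `is_accepted` are hypothetical names):

```python
import re


def build_pattern(words):
    """Build a regex that accepts one or more of the words, separated by ', '."""
    # re.escape guards against words containing regex metacharacters
    alternatives = "|".join(re.escape(w) for w in words)
    return re.compile(rf"(?:{alternatives})(?:, (?:{alternatives}))*")


pattern = build_pattern(["part1", "part2", "part3"])


def is_accepted(answer):
    # fullmatch anchors the pattern at both ends of the input
    return pattern.fullmatch(answer) is not None


for answer in ["part1", "part2, part1", "part1,", "otherpart1", "foo, part1"]:
    print(repr(answer), is_accepted(answer))

# The pattern from the answer, used with an unanchored search, for comparison
loose = re.compile(r"((^|, )(part1|part2|part3))+$")
print(loose.search("foo, part1") is not None)
```

As in the answer, repeated words ("part1, part1") are still accepted.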
Gaute Løken
  • 8
    Note that "part1, part"1 will be also positive. Which is not always desirable – dimaaan Dec 15 '16 at 21:06
  • 1
    @dimaaan Did you misplace your quotes? "part1, part1" will be a match, but "part1, part" won't be. Though you're correct that such a scenario is not covered by this solution, for the application of the OP where he's checking if the test-string consists of words in a vocabulary, I believe he does want a positive match even when a word is repeated. The word will still be part of the vocabulary no matter how many instances of it you've got. – Gaute Løken Dec 17 '16 at 06:02
51
'^(part1|part2|part1, part2)$'

Does it work?

Kent
9

Not an expert in regex, but you can do ^((part1|part2)|(part1, part2))$. In words: "part1 or part2 or both".

BlackBear
5

Does this work without alternation?

^((part)1(, \22)?)?(part2)?$

or why not this?

^((part)1(, (\22))?)?(\4)?$

The first works for all conditions; the second works for all but part2 (using GNU sed 4.1.5).

potong
4

Or you can use this:

^(?:part[12]|(part)1, (?:\1)2)$
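
A note on the backreference: a digit written directly after \1 is ambiguous in several engines. Python's re module, for example, reads \12 as a reference to group 12 and refuses to compile the pattern. A sketch of one way around this, wrapping the backreference in a non-capturing group (the ", " separator is taken from the question; the helper names are hypothetical):

```python
import re

# (?:\1)2 means "whatever group 1 matched, followed by a literal 2"
pattern = re.compile(r"(?:part[12]|(part)1, (?:\1)2)")


def matches(answer):
    # fullmatch plays the role of the ^...$ anchors
    return pattern.fullmatch(answer) is not None


def compiles(regex):
    """Return True if Python's re module accepts the pattern."""
    try:
        re.compile(regex)
        return True
    except re.error:
        return False


for answer in ["part1", "part2", "part1, part2", "part2, part1"]:
    print(repr(answer), matches(answer))

print(compiles(r"(part)1, \12"))
```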
FailedDev