
I've got a regex that I'm trying to use to detect if a certain input is valid. The syntax of the input should be {A|B|C}. {A|B|} should fail.

(?:
 (
  \{{1}
  (?:[A-Z0-1-_.*]+ \| [A-Z0-1-_.*]+)*
  \}{1}
 )
)

This is what I have so far, but I'm starting to think this isn't the way to go. Even if it did work properly, it wouldn't allow {A} which should be valid.

So basically what I'm trying to do is check if each [A-Z0-1-_.*] element is split by | and that there are no empty elements within the {} brackets.

One concept I'm really struggling with, which feels relevant here, is handling an arbitrary number of elements. Say the string to validate is Foo{A}Bar{B|C}Test. I would check it with two kinds of element: one that matches alphabetical characters and one that matches a bracketed group. So to check the string above, I would write alphaElem*|BracketElem*|alphaElem*|BracketElem*|alphaElem*. But that's a lot to write out, and it doesn't scale as the number of elements grows. Is there some way I can solve this with regex?

PaulG
Olli
  • I am confused about your actual input. Is it `{A|B|C}` form or `Foo{A}Bar{B|C}Test` ? And which one do you have to check? – Fildor Apr 04 '18 at 12:11
  • Probably you want [`{[A-Z0-1-_.*]+(?:\|[A-Z0-1-_.*]+)*}`](http://regexstorm.net/tester?p=%7b%5bA-Z0-1-_.*%5d%2b%28%3f%3a%5c%7c%5bA-Z0-1-_.*%5d%2b%29*%7d&i=%7bA%7cB%7cC%7d.+%7bA%7cB%7c%7d&o=x). The last `*` can be replaced with `{0,2}` to match 0, 1 or 2 times (to match 1, 2 or 3 elements inside `{...}`). – Wiktor Stribiżew Apr 04 '18 at 12:11
  • @Fildor the final input might be any combination of strings and {A|B|C} brackets – Olli Apr 04 '18 at 12:37
  • So what about my suggestion? – Wiktor Stribiżew Apr 04 '18 at 13:31
  • yep, it's what I was looking for – Olli Apr 04 '18 at 13:38

2 Answers


You may use

{[A-Z0-1-_.*]+(?:\|[A-Z0-1-_.*]+)*}

Note that the last * quantifier can be replaced with a limiting quantifier, e.g. {0,2}, which matches 0, 1 or 2 times (so 1, 2 or 3 elements inside {...}).

See the regex demo.
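To make the effect of the limiting quantifier concrete, here is a minimal sketch in Python. It assumes Python's `re` module as a stand-in for the .NET engine (the syntax used here is shared by both), and the helper name `bracket_pattern` and its `max_extra` parameter are hypothetical. The braces are escaped to keep the example unambiguous.

```python
import re

# Element class copied verbatim from the pattern above.
ELEM = r"[A-Z0-1-_.*]+"


def bracket_pattern(max_extra=None):
    """Build the bracket pattern (hypothetical helper).

    max_extra caps how many '|element' repetitions may follow the
    first element; None keeps the unbounded '*' quantifier.
    """
    quantifier = "*" if max_extra is None else "{0,%d}" % max_extra
    return re.compile(r"\{" + ELEM + r"(?:\|" + ELEM + ")" + quantifier + r"\}")


unbounded = bracket_pattern()               # any number of elements
up_to_three = bracket_pattern(max_extra=2)  # 1, 2 or 3 elements

for text in ["{A}", "{A|B|C}", "{A|B|C|D}"]:
    print(text, bool(unbounded.fullmatch(text)), bool(up_to_three.fullmatch(text)))
```

With `{0,2}`, a group of four elements such as `{A|B|C|D}` is no longer accepted as a whole, while the unbounded version still accepts it.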

Details

  • { - a { char
  • [A-Z0-1-_.*]+ - 1 or more chars defined in the character class (uppercase ASCII letters, 0, 1, -, _, . or * chars)
  • (?: - a non-capturing group matching 0 or more occurrences of:
    • \| - a | char
    • [A-Z0-1-_.*]+ - 1 or more chars defined in the character class
  • )* - end of the grouping construct
  • } - a } char.
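The breakdown above can be written out as a commented pattern. This is only a sketch, assuming Python's `re` module with `re.VERBOSE` in place of the .NET engine; the names `BRACKETS` and `is_valid_group` are made up for illustration.

```python
import re

# Same pattern as above, spread out so each part carries its explanation.
BRACKETS = re.compile(
    r"""
    \{                      # a literal opening brace
    [A-Z0-1-_.*]+           # first element: one or more allowed chars
    (?:                     # non-capturing group, repeated 0 or more times:
        \|                  #   a literal pipe
        [A-Z0-1-_.*]+       #   another non-empty element
    )*
    \}                      # a literal closing brace
    """,
    re.VERBOSE,
)


def is_valid_group(text):
    """Return True if text is exactly one well-formed bracket group."""
    return BRACKETS.fullmatch(text) is not None


for sample in ["{A}", "{A|B|C}", "{A|B|}", "{}", "{|A}"]:
    print(sample, is_valid_group(sample))
```

Because every element is matched with `+`, an empty element (as in `{A|B|}` or `{}`) cannot match, which is the behaviour the question asks for.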

Note that you do not need to escape the { and } chars in a .NET regex: the engine is "intelligent" enough to parse { as a literal { when it does not start a valid quantifier, i.e. when it is not followed by a min or min,max value and a closing }.

Wiktor Stribiżew

This solution will validate everything you (seem to) want in one pass (see on regex101):

^\w+({[A-Z0-1-_.*]+(\|[A-Z0-1-_.*]+)*}\w+)*$

It's several layers of possibly-repeating sections.

Here's the breakdown:

^ anchor matching start of text

\w+ matches one or more "word" characters

{[A-Z0-1-_.*]+(\|[A-Z0-1-_.*]+)*} matches an element in brackets, possibly followed by any number of pipes and other elements within the brackets

({[A-Z0-1-_.*]+(\|[A-Z0-1-_.*]+)*}\w+)* this is the previously described bracket match, allowed to repeat zero or more times, each time followed by another run of "word" characters

$ anchor matching end of text
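As a rough check of the breakdown, the sketch below assembles the same pattern from its parts and runs it on a few inputs. It assumes Python's `re` module rather than .NET, swaps the capturing groups for non-capturing ones (equivalent for pure validation), and uses the hypothetical names `ELEM`, `GROUP`, `FULL` and `is_valid`.

```python
import re

ELEM = r"[A-Z0-1-_.*]+"                           # one non-empty element
GROUP = r"\{" + ELEM + r"(?:\|" + ELEM + r")*\}"  # {A}, {A|B}, {A|B|C}, ...
# Word characters, then any number of (bracket group + word characters).
FULL = re.compile(r"^\w+(?:" + GROUP + r"\w+)*$")


def is_valid(text):
    """Return True if the whole string fits the word/group/word layout."""
    return FULL.match(text) is not None


for sample in ["Foo{A}Bar{B|C}Test", "Foo", "Foo{A|}Bar", "{A|B|C}", "Foo{A}"]:
    print(sample, is_valid(sample))
```

The outer `(...)*` is what removes the need to write out one alternative per element. Note that, as written, the pattern requires word characters before the first group and after every group, so a bare `{A|B|C}` or a trailing group as in `Foo{A}` is rejected; changing `\w+` to `\w*` would relax that, if that is what the input format needs.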

Brian Stephens