I need to validate an input string with a regex. Every word must start with a capital letter, and the string may end with an expression of two words joined by "-". The string must contain at least two words, or an expression with a dash.

e.g.

correct:

  • Apple Banana Couonut-Dates

  • Apple Banana

  • Banana Couonut-Dates

  • Couonut-Dates

incorrect:

  • Apple

  • Apple Banana Couonut-dates

  • BanAna couonut-Dates

Pattern pattern = Pattern.compile("([A-Z][a-z]++ )*([A-Z][a-z]++-[A-Z][a-z]++)");
pattern.matcher("Apple Banana Couonut-Dates").matches();

For the input "Apple Banana Couonut-Dates", my expression returns false.

Emma
  • I just put your regex and the test pattern into https://www.freeformatter.com/java-regex-tester.html#ad-output and it works... – BretC May 23 '19 at 10:10
  • Thanks BretC, I checked it one more time. It seems the issue was that I was using special characters from my language. When I omit them, it works fine. – Jarosław Kaznodzieja May 23 '19 at 10:16
  • You could use the [Unicode properties](https://stackoverflow.com/questions/10894122/java-regex-for-support-unicode) for your (Polish?) language characters. – Sascha May 23 '19 at 12:13
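
The Unicode-property suggestion in the comments could look roughly like the sketch below. It is only an illustration under assumptions: the class name `UnicodeCapitalizedWordsCheck` and the helper `isValid` are made up, the pattern is a restructured variant written for this example (not the asker's original), and `\p{Lu}` / `\p{Ll}` (uppercase / lowercase letter) replace `[A-Z]` / `[a-z]`. It also assumes the input uses precomposed characters rather than combining marks.

```java
import java.util.regex.Pattern;

public class UnicodeCapitalizedWordsCheck {

    // Assumed variant of the rule: two or more capitalized words, the last one
    // optionally followed by "-Word", or a single "Word-Word" expression.
    // \p{Lu} = any uppercase letter, \p{Ll} = any lowercase letter.
    private static final Pattern WORDS = Pattern.compile(
            "^(?:\\p{Lu}\\p{Ll}+(?: \\p{Lu}\\p{Ll}+)+(?:-\\p{Lu}\\p{Ll}+)?"
            + "|\\p{Lu}\\p{Ll}+-\\p{Lu}\\p{Ll}+)$");

    // Hypothetical helper: true when the whole input matches the pattern.
    static boolean isValid(String input) {
        return WORDS.matcher(input).matches();
    }

    public static void main(String[] args) {
        // Polish sample ("Zolw Lodz" with diacritics), written with Unicode
        // escapes so the source file stays plain ASCII.
        String polish = "\u017B\u00F3\u0142w \u0141\u00F3d\u017A";

        System.out.println(isValid(polish));
        System.out.println(isValid("Apple Banana Couonut-Dates"));
        System.out.println(isValid("Apple"));
    }
}
```

An ASCII-only class such as `[A-Z][a-z]+` does not match letters with diacritics, which would explain the `false` result described above.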

1 Answer

To match at least two capitalized words, optionally followed by a final expression joined with -, or such an expression on its own (optionally preceded by one capitalized word), you might use:

^(?:[A-Z][a-z]+(?: [A-Z][a-z]+\b(?!-))+(?: [A-Z][a-z]+-[A-Z][a-z]+)?|(?:[A-Z][a-z]+ )?[A-Z][a-z]+-[A-Z][a-z]+)$
  • ^ Start of string
  • (?: Non-capturing group
    • [A-Z][a-z]+ Match a capitalized word
    • (?: [A-Z][a-z]+\b(?!-))+ Repeat 1+ times a space and a capitalized word, asserting that what is on the right is not a -
    • (?: [A-Z][a-z]+-[A-Z][a-z]+)? Optional part, match a space and two capitalized words joined by -
    • | Or
    • (?:[A-Z][a-z]+ )? Match an optional capitalized word followed by a space
    • [A-Z][a-z]+-[A-Z][a-z]+ Match two capitalized words joined by -
  • )$ Close the group, end of string

Note that in a Java string literal the backslash has to be doubled, so \b is written as \\b.
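
As a minimal sketch of how the pattern might be used from Java, the snippet below compiles it with the backslash doubled and runs it against the sample strings from the question. The class name `CapitalizedWordsCheck` and the helper `isValid` are made up for this illustration.

```java
import java.util.regex.Pattern;

public class CapitalizedWordsCheck {

    // The pattern from this answer, split over two lines for readability.
    // Inside the Java string literal, \b is written as \\b.
    private static final Pattern WORDS = Pattern.compile(
            "^(?:[A-Z][a-z]+(?: [A-Z][a-z]+\\b(?!-))+(?: [A-Z][a-z]+-[A-Z][a-z]+)?"
            + "|(?:[A-Z][a-z]+ )?[A-Z][a-z]+-[A-Z][a-z]+)$");

    // Hypothetical helper: true when the whole input matches the pattern.
    static boolean isValid(String input) {
        return WORDS.matcher(input).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "Apple Banana Couonut-Dates",
            "Apple Banana",
            "Banana Couonut-Dates",
            "Couonut-Dates",
            "Apple",
            "Apple Banana Couonut-dates",
            "BanAna couonut-Dates"
        };
        // Print each sample together with the validation result.
        for (String sample : samples) {
            System.out.println(sample + " -> " + isValid(sample));
        }
    }
}
```

The first four samples (the ones listed as correct in the question) should print `true` and the last three `false`. Since `matches()` already requires the whole input to match, the `^` and `$` anchors are redundant here but harmless.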

The fourth bird