^\\p{Alpha}[\\p{Alnum}_]{8,30}$

As I understand it, this expression should match a word of at least 8 and at most 30 characters that starts with an alphabetic character and contains only alphanumeric characters and/or underscores.

But it also matches the following word: "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaab"

Can someone help me understand this?

1 Answer

The regex matches 9 to 31 characters: one leading letter plus 8 to 30 further characters.

^\\p{Alpha}[\\p{Alnum}_]{8,30}$
|--- 1 ---||---- 8 to 30 ----|  => 9 to 31

Use

^\\p{Alpha}[\\p{Alnum}_]{7,29}$

to only match 8 to 30 characters.

Just a note on the usage in Java:

String pat = "^\\p{Alpha}[\\p{Alnum}_]{7,29}$";
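
The off-by-one can be checked directly. The following is a minimal sketch rather than part of the original answer; the class name `LengthBoundsDemo` and the helpers `word` and `accepts` are made up for illustration.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

final class LengthBoundsDemo {
    // Pattern from the question: 1 leading letter + 8..30 more chars = 9..31 in total.
    static final Pattern ORIGINAL = Pattern.compile("^\\p{Alpha}[\\p{Alnum}_]{8,30}$");
    // Adjusted pattern: 1 leading letter + 7..29 more chars = 8..30 in total.
    static final Pattern FIXED = Pattern.compile("^\\p{Alpha}[\\p{Alnum}_]{7,29}$");

    // Hypothetical helper: builds a word of the given length made of the letter 'a'.
    static String word(int length) {
        char[] chars = new char[length];
        Arrays.fill(chars, 'a');
        return new String(chars);
    }

    // True only if the whole input matches the pattern.
    static boolean accepts(Pattern pattern, String input) {
        return pattern.matcher(input).matches();
    }

    public static void main(String[] args) {
        // Probe the boundaries on both sides of the intended 8..30 range.
        for (int length : new int[] {7, 8, 30, 31}) {
            System.out.printf("length %2d: original=%b fixed=%b%n",
                    length,
                    accepts(ORIGINAL, word(length)),
                    accepts(FIXED, word(length)));
        }
    }
}
```

With the original quantifier an 8-character word is rejected and a 31-character word is accepted; with `{7,29}` the accepted lengths are exactly 8 to 30.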
Wiktor Stribiżew
  • @T.J.Crowder: If you mean [this one](http://stackoverflow.com/questions/22937618/reference-what-does-this-regex-mean), it is a canonical one for closing "please explain me this regex" questions. Sometimes, it is abused to close other questions, too. – Wiktor Stribiżew Sep 30 '16 at 07:45
  • Can it be written in some way using the quantifier {8,30} for the whole regular expression? – Yashoda Agrawal Sep 30 '16 at 07:47
  • @YashodaAgrawal: No, not with the difference between what the first character can be and what the subsequent ones can be. Well, not *reasonably*. This expression is nice and clear. It may be possible with lookaheads to do it with a single quantifier around a noncapturing group or something, but it would be much more complicated if it's possible. – T.J. Crowder Sep 30 '16 at 07:47
  • Well, you can actually do this with a lookahead, but why? `^(?=\\p{Alpha})[\\p{Alnum}_]{8,30}$` – Wiktor Stribiżew Sep 30 '16 at 07:48
  • Just forgot to add: if you are using the regex in `String#matches`, you can remove `^` and `$` from the pattern. – Wiktor Stribiżew Sep 30 '16 at 07:52
  • @WiktorStribiżew `(^\\p{Alpha}[\\p{Alnum}_]){8,30}$`: if I put parentheses around the whole regular expression, the quantifier is applied to the whole expression, so in that case it should work. – Yashoda Agrawal Sep 30 '16 at 08:07
  • You should understand that a quantifier applied to a group quantifies the whole *sequence* of subpatterns, so [`(^\\p{Alpha}[\\p{Alnum}_]){8,30}$`](https://regex101.com/r/PjBvRz/1) actually matches 8 to 30 *sequences* of an alpha char followed by an alnum/`_` char. This pattern does not make sense: it will never match anything, as there can only be one start of a string. – Wiktor Stribiżew Sep 30 '16 at 08:12
  • I didn’t test it, but I suppose that it should work when using a look-ahead, i.e. `^(?=\\p{Alpha}[\\p{Alnum}_]*$).{8,30}$`; this pattern states that the match must have the desired form *and* consist of 8 to 30 characters. – Holger Sep 30 '16 at 08:41
  • This one is correct. But it might be a good idea to first check the length: `^(?=.{8,30}$)\\p{Alpha}[\\p{Alnum}_]*$` – Wiktor Stribiżew Sep 30 '16 at 08:45
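
The variants discussed in the comments above can be compared side by side. This is a minimal sketch under a few assumptions: the class name `LookaheadVariantsDemo`, the constant names, and the sample words are invented for illustration, and the patterns are compiled with default flags, so `\p{Alpha}` and `\p{Alnum}` are the ASCII-only POSIX classes.

```java
import java.util.regex.Pattern;

final class LookaheadVariantsDemo {
    // Lookahead checks the first character; one quantifier then covers the whole word.
    static final Pattern FIRST_CHAR_LOOKAHEAD =
            Pattern.compile("^(?=\\p{Alpha})[\\p{Alnum}_]{8,30}$");
    // Lookahead checks the shape; .{8,30} then checks the length.
    static final Pattern SHAPE_LOOKAHEAD =
            Pattern.compile("^(?=\\p{Alpha}[\\p{Alnum}_]*$).{8,30}$");
    // Lookahead checks the length first; the shape is matched afterwards.
    static final Pattern LENGTH_LOOKAHEAD =
            Pattern.compile("^(?=.{8,30}$)\\p{Alpha}[\\p{Alnum}_]*$");
    // Quantified group: every repetition would need ^ to match again, so it never matches.
    static final Pattern GROUPED =
            Pattern.compile("(^\\p{Alpha}[\\p{Alnum}_]){8,30}$");

    // True only if the whole input matches the pattern.
    static boolean accepts(Pattern pattern, String input) {
        return pattern.matcher(input).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"abcdefgh", "a_b2c3d4", "abcdefg", "1abcdefgh", "_abcdefgh"};
        for (String sample : samples) {
            System.out.printf("%-10s firstChar=%b shape=%b length=%b grouped=%b%n",
                    sample,
                    accepts(FIRST_CHAR_LOOKAHEAD, sample),
                    accepts(SHAPE_LOOKAHEAD, sample),
                    accepts(LENGTH_LOOKAHEAD, sample),
                    accepts(GROUPED, sample));
        }
        // String#matches must match the entire string, so ^ and $ can be dropped.
        System.out.println("abcdefgh".matches("\\p{Alpha}[\\p{Alnum}_]{7,29}"));
    }
}
```

The three lookahead patterns agree on every sample here, while the grouped pattern rejects all of them. The plain `^\\p{Alpha}[\\p{Alnum}_]{7,29}$` form remains the easiest to read.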