
I've already gone through these questions: "Regex to match four repeated letters in a string using a Java pattern" and "Regular expression to match any character being repeated more than 10 times".

But they aren't useful in my case. They are fine if I just want to check whether a string contains repeated characters (like 1111, abccccd, 12aaaa3b, etc.). What I want is to check whether a string consists entirely of repeated characters, i.e. aabb111, 1111222, 11222aaa, etc.

Can anyone help me out with this?

Kiran Parmar

1 Answer


Use ((.)\2+)+ as the pattern:

String pattern = "((.)\\2+)+";
System.out.println("a".matches(pattern));        // false
System.out.println("1aaa".matches(pattern));     // false
System.out.println("aa".matches(pattern));       // true
System.out.println("aabb111".matches(pattern));  // true
System.out.println("1111222".matches(pattern));  // true
System.out.println("11222aaa".matches(pattern)); // true
System.out.println("etc.".matches(pattern));     // false

About the pattern:

  • (...): captures the matched part as a group. Groups are numbered from 1, in the order of their opening parentheses.

    ((.)\2+)+
    ^^ 
    |+----- group 2
    +----- group 1
    
  • (.): matches any character (except a line terminator) and captures it as group 2, because its opening parenthesis comes after that of the enclosing group.
  • \2: a backreference to group 2. If (.) matched a character x, \2 matches only another x, not an arbitrary character.
  • PATTERN+: matches one or more repetitions of PATTERN.
  • (.)\2+: greedily matches a run of two or more identical characters.
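
To see what the inner part does on its own, here is a minimal sketch that pulls out each run with Matcher.find(). The class name RunDemo and the helper runs are made up for illustration, and the group is written as \1 here because the outer parentheses are dropped. If some character is not covered by any run, the full pattern cannot match the whole string.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original answer.
class RunDemo {
    // (.)\1+ : capture one character as group 1, then require the same
    // character at least once more, so every match is a run of length >= 2.
    private static final Pattern RUN = Pattern.compile("(.)\\1+");

    // Returns every run of two or more identical characters, in order.
    static List<String> runs(String input) {
        List<String> result = new ArrayList<>();
        Matcher m = RUN.matcher(input);
        while (m.find()) {
            result.add(m.group());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(runs("aabb111")); // [aa, bb, 111]
        System.out.println(runs("1aaa"));    // [aaa]  (the leading 1 is not in any run)
        System.out.println(runs("etc."));    // []
    }
}
```

For "aabb111" the runs cover the whole string, which is why matches() returns true there. For "1aaa" the leading 1 is left over, so matches() returns false.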
falsetru
  • Good stuff +1, but might be an idea to explain; you're using a back inserter. – Bathsheba Jan 23 '14 at 08:25
  • Does this really match `aaa`? As in, why not `(.)(\1)*`? – Emil Lundberg Jan 23 '14 at 08:27
  • @EmilLundberg, It will return true for `1`. Also it will return false for `aaa111`. – falsetru Jan 23 '14 at 08:28
  • The pattern would perhaps have been more obvious with a non-capturing outer group. – Marko Topolnik Jan 23 '14 at 08:30
  • @MarkoTopolnik, Yes it would. But I wished to make a shorter one. – falsetru Jan 23 '14 at 08:31
  • @falsetru: Looks great.. gotta try it out. Before that, a small query.. if i replace (.) with ([a-zA-Z0-9]), it will match only those characters right? OR is there another shorter way to denote characters in range [a-zA-Z0-9]? – Kiran Parmar Jan 23 '14 at 08:49
  • @KiranParmar you can shorten `[0-9]` to `[\d]`, but beware that this doesn't work everywhere. For instance, in `grep` the corresponding "shortcut" is `[[:digit:]]`. – Emil Lundberg Jan 23 '14 at 08:53
  • @KiranParmar, You can use word character \w. But it also matches an underscore (_). See [Java Platform SE documentation - Summary of regular-expression constructs](http://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html#sum). – falsetru Jan 23 '14 at 09:03
  • @falsetru: In that case, ([a-zA-Z0-9]) works well for my requirements. Thanks for the answer :) I wish I knew how to come up with such regular expressions. I've tried understanding them multiple times, but I always get confused when the information becomes too much. Any helpful pointers for people like me? – Kiran Parmar Jan 24 '14 at 07:30
  • @KiranParmar, How about reading the documentation I linked in my previous comment? – falsetru Jan 24 '14 at 07:35
  • @falsetru: I've seen it a lot of times, but I forget it for lack of use, and as I said, it gets confusing as I move down. Anyway, thanks once again. – Kiran Parmar Jan 24 '14 at 07:39
  • @falsetru: Out of curiosity, I tried ([a-zA-Z0-9])\2+ assuming it will return true if string contains only a single character being repeated for 2 or more times, like aaaa. But it returns false! So what exactly will this regex return true for? – Kiran Parmar Jan 27 '14 at 10:12
  • @KiranParmar, As I said in the answer, capturing group number starts from `1`. So it should be `([a-zA-Z0-9])\1+`. BTW, did you escape the backslash(`\ `) in the string literal? – falsetru Jan 27 '14 at 10:13
  • @falsetru Yes I did. I thought the number after \ meant "match that many characters". I just checked the Javadoc and found that it refers to the group. Thanks for the help! :) By the way, how come you are always online when I ask something? :D – Kiran Parmar Jan 27 '14 at 10:17
  • @KiranParmar, Hmm. Maybe my wording was not clear in the answer. – falsetru Jan 27 '14 at 10:29
  • @KiranParmar, Because your waking time overlaps with mine. And SO notify me when I get a mention. ;) – falsetru Jan 27 '14 at 10:31
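
The variants discussed in the comments can be put side by side in a short sketch. The class name RepeatedOnly and its method names are hypothetical. It assumes the non-capturing outer group (?:...) that Marko Topolnik hinted at, which makes the inner group number 1, plus the [a-zA-Z0-9] restriction and the single-run pattern ([a-zA-Z0-9])\1+ from the later comments.

```java
import java.util.regex.Pattern;

// Hypothetical helper class collecting the variants from the comment thread.
class RepeatedOnly {
    // Outer group is non-capturing, so the only capturing group is number 1.
    private static final Pattern ANY_CHAR = Pattern.compile("(?:(.)\\1+)+");
    // Same idea, but only ASCII letters and digits are allowed.
    private static final Pattern ALNUM = Pattern.compile("(?:([a-zA-Z0-9])\\1+)+");
    // A single letter or digit repeated two or more times, nothing else.
    private static final Pattern SINGLE_RUN = Pattern.compile("([a-zA-Z0-9])\\1+");

    static boolean onlyRepeats(String s) {
        return ANY_CHAR.matcher(s).matches();
    }

    static boolean onlyAlnumRepeats(String s) {
        return ALNUM.matcher(s).matches();
    }

    static boolean singleRun(String s) {
        return SINGLE_RUN.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(onlyRepeats("aabb111"));   // true
        System.out.println(onlyRepeats("aa__"));      // true
        System.out.println(onlyAlnumRepeats("aa__")); // false, underscore is excluded
        System.out.println(singleRun("aaaa"));        // true
        System.out.println(singleRun("aabb"));        // false, two different runs
    }
}
```

Precompiling the patterns as constants is a design choice for repeated use. String.matches with the same pattern strings gives the same results for one-off checks.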