3

I am trying to find all three-letter substrings of a string in Java.

For example from the string "example string" I should get "exa", "xam", "amp", "mpl", "ple", "str", "tri", "rin", "ing".

I tried using the Java Regular expression "([a-zA-Z]){3}" but I only got "exa", "mpl", "str", "ing".

Can someone suggest a regex or another method that returns all of them?

Juvanis
MLD_Saturn
  • 9
    This is hammer/nail syndrome. You have a brand new hammer (regex) and everything looks like a nail. This is a case where regex is the wrong tool to use. Just iterate from position 0 to length-3, taking the substring at each index. If you need to ignore spaces, build a temp string with spaces removed first. – Jim Garrison Aug 29 '13 at 02:58
  • 2
    @JimGarrison Just removing spaces won't work. You'll get the invalid results `les` and `est` in the example. – jpmc26 Aug 29 '13 at 03:24

4 Answers

10

Implementing Juvanis' idea somewhat, iterate to get your substrings, then use a regular expression to make sure the substring is all letters:

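    String s = "example string";
    // Slide a 3-character window over the string.
    for (int i = 0; i <= s.length() - 3; i++) {
        String substr = s.substring(i, i + 3);
        // Keep the window only if every character is a letter.
        if (substr.matches("[a-zA-Z]+")) {
            System.out.println(substr);
        }
    }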
jpmc26
  • 1
    Apart from minor parentheses error in the if and print statement this is in my opinion the simplest correct solution thus far. – MLD_Saturn Aug 29 '13 at 03:58
  • @MLD_Saturn Thanks. The idea is mainly Juvanis'; I just implemented it and added verification that it contains letters. Please at least give her/him an upvote. – jpmc26 Aug 29 '13 at 16:19
3

Once a character has been consumed by one match, it cannot be part of a later match from the same scan. In your example, `x` and `a` are consumed by the match `exa`, so `xam` and `amp` are never reported. You should try a traditional iterative approach instead. It is easier to implement.
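To see the consumption behaviour concretely, here is a minimal sketch. The class and method names are made up for illustration; only the pattern comes from the question. It collects what a plain `find()` loop returns:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
class ConsumedMatchDemo {
    // Collects every match of [a-zA-Z]{3} using the default find() behaviour,
    // where each search resumes after the end of the previous match.
    static List<String> nonOverlapping(String input) {
        Matcher m = Pattern.compile("[a-zA-Z]{3}").matcher(input);
        List<String> result = new ArrayList<>();
        while (m.find()) {
            result.add(m.group());
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints [exa, mpl, str, ing]
        System.out.println(nonOverlapping("example string"));
    }
}
```

Each search resumes where the previous match ended, which is why the overlapping windows never show up.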

Juvanis
3

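Try this. `Matcher.find(int)` resets the matcher and searches from the given index, so restarting one character after the start of each match yields the overlapping matches: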

    Matcher m = Pattern.compile("([a-zA-Z]){3}").matcher("example string");
    for (int i = 0; m.find(i); i = m.start() + 1) {
        System.out.print(m.group() + " ");
    }

Output:

exa xam amp mpl ple str tri rin ing 
Evgeniy Dorofeev
1

This can be done using regex as follows:

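  1. Find the position of all matches in the string using the regex `\w(?=\w\w)`. The lookahead checks the next two characters without consuming them, so each match gives you the start index of one required substring. Note that `\w` also matches digits and underscores; use `[a-zA-Z]` in its place if only letters should count.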

    In this case, you would get: 0, 1, 2, 3, 4, 8, 9, 10 and 11.

  2. Take the three-character substring starting at each position, that is, from that index up to and including index + 2.

    In this case, that would mean, my_string.substring(0,3), my_string.substring(1,4) and so on, as the begin index parameter is inclusive while the end index parameter is exclusive.
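The two steps above can be sketched as follows. This is only an illustration: the class and method names are hypothetical, and it assumes `[a-zA-Z]` in place of `\w` because the question asks for letters only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the names are illustrative only.
class LookaheadSubstrings {
    static List<String> threeLetterSubstrings(String input) {
        // Step 1: match one letter that is followed by two more letters.
        // The lookahead does not consume those two letters, so the next
        // search starts at the very next character.
        // Assumption: [a-zA-Z] is used instead of \w to exclude digits and '_'.
        Matcher m = Pattern.compile("[a-zA-Z](?=[a-zA-Z]{2})").matcher(input);
        List<String> result = new ArrayList<>();
        while (m.find()) {
            int start = m.start();
            // Step 2: take three characters beginning at the match position.
            result.add(input.substring(start, start + 3));
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints [exa, xam, amp, mpl, ple, str, tri, rin, ing]
        System.out.println(threeLetterSubstrings("example string"));
    }
}
```

Because the lookahead guarantees two more characters after each match start, `substring(start, start + 3)` cannot run past the end of the input.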

Roney Michael