0

I have some text, for example: Test numbers test count gggg aaaaaa

I need to replace every word that is 4 characters long (or some other given length) with "SUPER". What is the easiest way to do it?

I tried something like this, but it does not work properly:

String pattern = "[aA-zZ]+";
Pattern p = Pattern.compile(pattern);
Matcher m = p.matcher(myText);
while (m.find()) {
    String word = myText.substring(m.start(), m.end());
    System.out.println("one word |" + word + "|");
    if (m.end() - m.start() == myWord.length) {
        m.replaceAll("SUPER");
    }
}
azro
Tim

4 Answers

3

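str = str.replaceAll("\\b\\w{4}\\b", "SUPER"); should work. \\w matches a word character (a letter, digit, or underscore) and \\b matches a word boundary. Note that replaceAll returns a new string, so the result has to be assigned.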
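A minimal sketch of what this does in practice, assuming plain ASCII input. The class and method names (WordCharDemo, replaceFourCharWords) are made up for illustration. It shows that \w also counts digits and underscores as word characters, and that the original string is left untouched:

```java
class WordCharDemo {
    // Hypothetical helper: replaces every run of exactly four \w characters.
    public static String replaceFourCharWords(String text) {
        // replaceAll returns a new String; the argument itself is not modified.
        return text.replaceAll("\\b\\w{4}\\b", "SUPER");
    }

    public static void main(String[] args) {
        String str = "Test numbers 1234 ab_1 aaaaaa";
        String result = replaceFourCharWords(str);
        System.out.println(str);     // Test numbers 1234 ab_1 aaaaaa
        System.out.println(result);  // SUPER numbers SUPER SUPER aaaaaa
    }
}
```

If "1234" or "ab_1" should not count as words, a letters-only character class is probably a better fit than \w.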

Procrastinator
Evgeniy Dorofeev
2

You can use this pattern: \b\w{4}\b, a group of four word characters with a word boundary at the start and at the end.

public static String rplcWordWithSize(int size, String sentence) {
    return sentence.replaceAll("\\b\\w{" + size + "}\\b", "SUPER");
}

Example of use:

public static void main(String argv[]) {
    String str = "Test numbers test count gggg aaaaaa";
    System.out.println(rplcWordWithSize(3, str));  //Test numbers test count gggg aaaaaa
    System.out.println(rplcWordWithSize(4, str));  //SUPER numbers SUPER count SUPER aaaaaa
    System.out.println(rplcWordWithSize(5, str));  //Test numbers test SUPER gggg aaaaaa
}
azro
1

Note that [aA-zZ]+ matches more than just letters, as the A-z range also matches [, \, ], ^, _ and ` besides the English letters.
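A quick way to check this, as a small sketch (CharRangeDemo, looseMatch and strictMatch are illustrative names only): the six ASCII characters between 'Z' and 'a' are accepted by the A-z range but rejected by a-zA-Z.

```java
import java.util.regex.Pattern;

class CharRangeDemo {
    // The class from the question: 'a', the range 'A'..'z', and 'Z'.
    public static boolean looseMatch(String s) {
        return Pattern.matches("[aA-zZ]+", s);
    }

    // The letters-only class: 'a'..'z' and 'A'..'Z'.
    public static boolean strictMatch(String s) {
        return Pattern.matches("[a-zA-Z]+", s);
    }

    public static void main(String[] args) {
        // The six ASCII characters that sit between 'Z' (90) and 'a' (97).
        String punctuation = "[\\]^_`";
        System.out.println(looseMatch(punctuation));   // true
        System.out.println(strictMatch(punctuation));  // false
    }
}
```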

If you do not expect to replace "words" like 1234 or wrd5, and just want to replace natural language non-compound words, use either of the two solutions below.

This one is Unicode-aware: \p{L} matches any Unicode letter, and \b (a word boundary) "supports" Unicode word boundaries thanks to the embedded flag (?U), which enables Pattern.UNICODE_CHARACTER_CLASS:

s = s.replaceAll("(?U)\\b\\p{L}{4}\\b", "SUPER");

Or, if you only plan to work with ASCII:

s = s.replaceAll("\\b[a-zA-Z]{4}\\b", "SUPER");

See the online Java demo:

System.out.println("Test numbers test count gggg aaaaaa".replaceAll("\\b[a-zA-Z]{4}\\b", "SUPER"));
// => SUPER numbers SUPER count SUPER aaaaaa
System.out.println("Маша ела кашу".replaceAll("(?U)\\b\\p{L}{4}\\b", "SUPER")); 
// => SUPER ела SUPER
Wiktor Stribiżew
0

Try this:

// Assumes myText and myWord are Strings, as in the question.
String[] words = myText.split(" ");
String newText = "";
for (String w : words) {
    if (w.length() == myWord.length()) {
        newText += "SUPER ";
    } else {
        newText += w + " ";
    }
}
// trim() drops the trailing space added after the last word.
System.out.println(newText.trim());

I wrote this straight in a text editor, so there could be some minor errors.
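If the Matcher loop from the question is preferred over splitting, it can probably be made to work with appendReplacement and appendTail. Calling m.replaceAll inside the loop does not help there, because it resets the matcher and returns a new string that is then thrown away. The following is only a sketch under those assumptions; MatcherLoopDemo and replaceWordsOfLength are hypothetical names, and the pattern is limited to ASCII letters:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatcherLoopDemo {
    // Hypothetical helper: replaces each ASCII-letter word of the given length.
    public static String replaceWordsOfLength(String text, int length, String replacement) {
        Matcher m = Pattern.compile("[a-zA-Z]+").matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String word = m.group();
            // Keep the word unless its length matches; quoteReplacement stops
            // '$' and '\' in the text from being read as group references.
            String piece = word.length() == length ? replacement : word;
            m.appendReplacement(out, Matcher.quoteReplacement(piece));
        }
        // Copy whatever follows the last match.
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String str = "Test numbers test count gggg aaaaaa";
        System.out.println(replaceWordsOfLength(str, 4, "SUPER"));
        // SUPER numbers SUPER count SUPER aaaaaa
    }
}
```

Unlike the split approach, this keeps the original spacing and punctuation between words unchanged.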