Problem: Turn

"My Testtext TARGETSTRING My Testtext" 

into

"My Testtext targetstring My Testtext"

Perl supports the `\L` operation, which can be used in the replacement string.

The Java `Pattern` class does not support this operation:

"Perl constructs not supported by this class: [...] The preprocessing operations \l \u, \L, and \U." (https://docs.oracle.com/javase/10/docs/api/java/util/regex/Pattern.html)

user85421
Andreas
  • I don't get this. What's wrong with `"my testtext TARGETSTRING my testtext".toLowerCase();` ? – WVrock Jun 09 '15 at 14:22
  • Sorry, the example was bad. toLowerCase does not work for "My Testtext TARGETSTRING My Testtext" – Andreas Jun 10 '15 at 17:48

5 Answers


You can't do this with a Java regex replacement string alone. You have to post-process the matches yourself using String.toUpperCase() and String.toLowerCase() instead.

Here's an example of how to use a regex to find and capitalize the words of length at least 3 in a sentence:

    String text = "no way oh my god it cannot be";
    Matcher m = Pattern.compile("\\b\\w{3,}\\b").matcher(text);

    StringBuilder sb = new StringBuilder();
    int last = 0;
    while (m.find()) {
        sb.append(text.substring(last, m.start()));
        sb.append(m.group(0).toUpperCase());
        last = m.end();
    }
    sb.append(text.substring(last));

    System.out.println(sb.toString());
    // prints "no WAY oh my GOD it CANNOT be"

Note on appendReplacement and appendTail

Note that the above solution uses substring and manages a tail index by hand. You can avoid both by using Matcher.appendReplacement and appendTail:

    StringBuffer sb = new StringBuffer();
    while (m.find()) {
        m.appendReplacement(sb, m.group().toUpperCase());
    }
    m.appendTail(sb);

Note that sb is now a StringBuffer instead of a StringBuilder. Before Java 9, Matcher offered only StringBuffer overloads of these methods, so on older versions you're stuck with the slower StringBuffer if you want to use them.

It's up to you whether the gain in readability is worth the loss in efficiency.
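Since Java 9, Matcher also has StringBuilder overloads of appendReplacement and appendTail. The following is a minimal sketch of the same loop using them; the class and method names are made up for illustration. It also wraps the replacement in Matcher.quoteReplacement, because appendReplacement treats `$` and `\` in the replacement string as special characters, which matters as soon as the matched text can contain them.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer. Requires Java 9+.
class AppendReplacementSketch {

    // Upper-cases every match of regex in text.
    static String upperCaseMatches(String text, String regex) {
        Matcher m = Pattern.compile(regex).matcher(text);
        StringBuilder sb = new StringBuilder();
        while (m.find()) {
            String replacement = m.group().toUpperCase(Locale.ROOT);
            // quoteReplacement escapes '$' and '\' so they are appended literally.
            m.appendReplacement(sb, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(upperCaseMatches("no way oh my god it cannot be", "\\b\\w{3,}\\b"));
        // prints "no WAY oh my GOD it CANNOT be"
        System.out.println(upperCaseMatches("cost: $5 usd", "\\$\\d+ \\w+"));
        // prints "cost: $5 USD"
    }
}
```

Without the quoteReplacement call, the second example would fail, because `$5` in a replacement string is read as a reference to capturing group 5.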

Community
polygenelubricants
  • Since Java 9, [Matcher.appendReplacement](https://docs.oracle.com/javase/9/docs/api/java/util/regex/Matcher.html#appendReplacement-java.lang.StringBuilder-java.lang.String-) has an overload for [StringBuilder](https://docs.oracle.com/javase/9/docs/api/java/lang/StringBuilder.html) – MoonFruit Jun 16 '20 at 03:00

Java 9+

Since Java 9 you can use Matcher::replaceAll with a Function<MatchResult, String>. For example, adapting the example from polygenelubricants:

    String text = "this is just a test which upper all short words";
    String regex = "\\b\\w{0,3}\\b";
    Pattern pattern = Pattern.compile(regex);
    Matcher matcher = pattern.matcher(text);
    String result = matcher.replaceAll(match -> match.group().toUpperCase());

    System.out.println(result);

Or just:

    String result = Pattern.compile(regex)
            .matcher(text)
            .replaceAll(match -> match.group().toUpperCase());

Output:

    this IS just A test which upper ALL short words
         ^^      ^                  ^^^
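Applied to the string from the question, the same approach might look like the sketch below. It assumes the target is any word written entirely in capital ASCII letters (the pattern `\b[A-Z]{2,}\b` is my assumption, not something the question specifies), and the class and method names are made up for illustration.

```java
import java.util.Locale;
import java.util.regex.Pattern;

// Hypothetical helper class. Requires Java 9+ for Matcher.replaceAll(Function).
class LowerCaseAllCaps {

    // Assumption: the "target" is any run of two or more capital ASCII letters.
    private static final Pattern ALL_CAPS_WORD = Pattern.compile("\\b[A-Z]{2,}\\b");

    static String lowerAllCapsWords(String text) {
        return ALL_CAPS_WORD.matcher(text)
                .replaceAll(match -> match.group().toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        System.out.println(lowerAllCapsWords("My Testtext TARGETSTRING My Testtext"));
        // prints "My Testtext targetstring My Testtext"
    }
}
```

The string returned by the function is still interpreted as a replacement string, so if the matched text can contain `$` or `\`, wrapping the result in Matcher.quoteReplacement is the safer choice.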
joriki
Youcef LAIDANI

To do this at the regex level, you use \U to switch on uppercase mode and \E to switch it off. Here is an example of using this feature in the IntelliJ IDEA find-and-replace dialog, which transforms a set of class fields into JUnit assertions (the IDE tooltip shows the result of the find-and-replace transformation):

[image: IntelliJ IDEA find-and-replace dialog]

Andriy Kryvtsun
  • This is IntelliJ-specific though; plain Java regular expressions don't support this. – ddekany Mar 24 '17 at 18:32
  • There's also `\L` for lower case mode (which is also ended with `\E`). Of course this is IntelliJ-specific too. – ddekany Mar 24 '17 at 18:34
  • @ddekany technically, you are right: the JDK library doesn't support it (http://docs.oracle.com/javase/8/docs/api/java/util/regex/Pattern.html see 'Comparison to Perl 5'), but I guess IntelliJ IDEA uses some standalone regexp lib. – Andriy Kryvtsun Mar 24 '17 at 19:13

You could use a regex match (if you really need a regex, that is, if "TARGETSTRING" is complex enough and "regular" enough to justify being detected by one).
You would then apply toLowerCase() to the matched text (the whole match in the code below, or group #1 if the pattern defines a capturing group).

    import java.util.regex.*;

    public class TargetToLowerCase {

      public static void main(String[] args) {
        StringBuilder sb = new StringBuilder(
                "my testtext TARGETSTRING my testtext");
        System.out.println(sb);
        String regex = "TARGETSTRING ";
        Pattern p = Pattern.compile(regex); // Create the pattern.
        Matcher matcher = p.matcher(sb); // Create the matcher.
        while (matcher.find()) {
          // In-place replacement: only safe while the lowercased text
          // has the same length as the match.
          String buf = sb.substring(matcher.start(), matcher.end()).toLowerCase();
          sb.replace(matcher.start(), matcher.end(), buf);
        }
        System.out.println(sb);
      }
    }
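If only part of the match should change case, a capturing group can mark that part. The sketch below is one way to do it, under a few assumptions of mine: the pattern `Testtext (\w+) My` and the class and method names are invented for illustration, and group 1 is assumed to take part in every match. It copies into a new StringBuilder rather than editing in place, so it does not rely on the lowercased text having the same length as the original.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not the original answer's code.
class GroupLowerCase {

    // Lower-cases only capturing group 1 of each match; everything else is copied unchanged.
    // Assumes group 1 participates in every match (otherwise m.start(1) is -1).
    static String lowerGroupOne(String input, Pattern pattern) {
        Matcher m = pattern.matcher(input);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            out.append(input, last, m.start(1));            // text before the group
            out.append(m.group(1).toLowerCase(Locale.ROOT)); // the group itself, lowercased
            last = m.end(1);
        }
        out.append(input, last, input.length());            // remaining tail
        return out.toString();
    }

    public static void main(String[] args) {
        Pattern p = Pattern.compile("Testtext (\\w+) My");
        System.out.println(lowerGroupOne("My Testtext TARGETSTRING My Testtext", p));
        // prints "My Testtext targetstring My Testtext"
    }
}
```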
VonC
  • Is this supposed to be pseudo-code? The `"$1".toLowerCase()` obviously evaluates first, so replaceAll just sees `"$1"`, which means it doesn't do anything. – Matthew Flaschen May 05 '10 at 06:18
  • @Matthew: right, the actual regex-based solution is a bit more complex. I have amended the answer to reflect it. – VonC May 05 '10 at 06:22
  • NICE trick using `sb.replace` to take advantage of the fact that the replacement is always(?) the same length as the original string. Otherwise this wouldn't work. Very nice! – polygenelubricants May 05 '10 at 06:45
  • Unfortunately case switching doesn't preserve string length. See: [Does Java's toLowerCase() preserve original string length?](http://stackoverflow.com/q/2357315/44522). – MicSim Jan 13 '15 at 11:43

How about this transformation function in "Java 8"

    import java.util.function.Function;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    /**
     * Searches for the given pattern in the given src string and applies txr to
     * each match.
     *
     * @param src     The string to be converted.
     * @param pattern The pattern whose matches are to be transformed.
     * @param txr     The transformer applied to each matched substring.
     * @return The result after applying the transformation.
     */
    private static String fromTo(String src, String pattern, Function<String, String> txr) {
        Matcher m = Pattern.compile(pattern).matcher(src);

        StringBuilder sb = new StringBuilder();
        int last = 0;

        while (m.find()) {
            sb.append(src.substring(last, m.start()));
            sb.append(txr.apply(m.group(0)));
            last = m.end();
        }
        sb.append(src.substring(last));
        return sb.toString();
    }
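A possible way to call such a function is sketched below. The wrapper class FromToDemo is hypothetical, and the method body is repeated only so that the example compiles on its own; the two calls in main are the point.

```java
import java.util.Locale;
import java.util.function.Function;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical wrapper class used only to make the usage example self-contained.
class FromToDemo {

    // Same logic as fromTo above: copy unmatched text, transform each match.
    static String fromTo(String src, String pattern, Function<String, String> txr) {
        Matcher m = Pattern.compile(pattern).matcher(src);
        StringBuilder sb = new StringBuilder();
        int last = 0;
        while (m.find()) {
            sb.append(src, last, m.start());
            sb.append(txr.apply(m.group()));
            last = m.end();
        }
        sb.append(src, last, src.length());
        return sb.toString();
    }

    public static void main(String[] args) {
        // The question's case: lowercase one target word.
        System.out.println(fromTo("My Testtext TARGETSTRING My Testtext",
                "TARGETSTRING", s -> s.toLowerCase(Locale.ROOT)));
        // prints "My Testtext targetstring My Testtext"

        // Upper-case every word of at least three characters.
        System.out.println(fromTo("no way oh my god",
                "\\b\\w{3,}\\b", s -> s.toUpperCase(Locale.ROOT)));
        // prints "no WAY oh my GOD"
    }
}
```

Because the transformer is an ordinary Function<String, String>, its result is appended literally, so `$` and `\` in the output need no escaping here.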
Kannan Ramamoorthy