Is there a simple way to convert a Java string to title case, including diacritics, without a third-party library?

I found this old question, but I think Java has improved a lot since then:
Is there a method for String conversion to Title Case?

Examples:

  • JEAN-CLAUDE DUSSE
  • sinéad o'connor
  • émile zola
  • O'mALLey

Expected results:

  • Jean-Claude Dusse
  • Sinéad O'Connor
  • Émile Zola
  • O'Malley
Hugo

3 Answers

I use this method with regex:

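import java.util.regex.Matcher;
import java.util.regex.Pattern;

public static void main(String[] args) {

    System.out.println(titleCase("JEAN-CLAUDE DUSSE"));
    System.out.println(titleCase("sinéad o'connor"));
    System.out.println(titleCase("émile zola"));
    System.out.println(titleCase("O'mALLey"));
}

public static String titleCase(String text) {

    if (text == null)
        return null;

    // A word is a letter followed by any word characters.
    // UNICODE_CHARACTER_CLASS makes \b and \w Unicode-aware; by default \w is ASCII-only
    // (and, on recent JDKs, so is \b), which can turn an accented letter in the
    // middle of a word into a false word start.
    Pattern pattern = Pattern.compile("\\b(\\p{L})(\\w*)", Pattern.UNICODE_CHARACTER_CLASS);
    Matcher matcher = pattern.matcher(text.toLowerCase());

    // appendReplacement(StringBuilder, ...) needs Java 9+; use StringBuffer on older versions
    StringBuilder buffer = new StringBuilder();

    // Upper-case the first letter of each word and keep the (already lower-cased) rest
    while (matcher.find())
        matcher.appendReplacement(buffer, matcher.group(1).toUpperCase() + matcher.group(2));

    return matcher.appendTail(buffer).toString();
}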

I tested it with your strings.

Here is the output:

Jean-Claude Dusse
Sinéad O'Connor
Émile Zola
O'Malley
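How a regex like this treats accented letters depends on whether Pattern.UNICODE_CHARACTER_CLASS is set: by default, Java's \w only matches ASCII word characters. The following is a minimal sketch of that difference, not part of the method above; the class name WordClassDemo and the helper isWordChar are hypothetical names chosen for illustration.

```java
import java.util.regex.Pattern;

public class WordClassDemo {

    // Returns true if the whole string s matches a single \w,
    // with or without the UNICODE_CHARACTER_CLASS flag.
    static boolean isWordChar(String s, boolean unicode) {
        Pattern p = unicode
                ? Pattern.compile("\\w", Pattern.UNICODE_CHARACTER_CLASS)
                : Pattern.compile("\\w");
        return p.matcher(s).matches();
    }

    public static void main(String[] args) {
        String eAcute = "\u00e9"; // the letter é, written as an escape to avoid source-encoding issues
        System.out.println(isWordChar(eAcute, false)); // false: default \w is ASCII-only
        System.out.println(isWordChar(eAcute, true));  // true: Unicode-aware \w
    }
}
```

If names with diacritics come out wrong on a particular Java version, checking for this flag is a reasonable first step.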
Stéphane Millien

Two ways to achieve this -

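Using the Apache Commons library (a third-party dependency)

// WordUtils is provided by Apache Commons Text (org.apache.commons.text.WordUtils)
public static String convertToTitleCase(String text) {
    return WordUtils.capitalizeFully(text);
}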

Custom function

import java.util.Arrays;
import java.util.stream.Collectors;

private static final String WORD_SEPARATOR = " ";

public static String changeToTitleCaseCustom(String text) {
    if (text == null || text.isEmpty()) {
        return text;
    }

    return Arrays.stream(text.split(WORD_SEPARATOR))
            .map(word -> word.isEmpty()
                    ? word
                    : Character.toTitleCase(word.charAt(0)) + word.substring(1).toLowerCase()
            )
            .collect(Collectors.joining(WORD_SEPARATOR));
}

Calling the custom function above -

System.out.println(
            changeToTitleCaseCustom("JEAN-CLAUDE DUSSE") + "\n" +
                    changeToTitleCaseCustom("sinéad o'connor") + "\n" +
                    changeToTitleCaseCustom("émile zola") + "\n" +
                    changeToTitleCaseCustom("O'mALLey") + "\n");

Output (the function only splits on spaces, so letters after a hyphen or an apostrophe stay lowercase) -

Jean-claude Dusse
Sinéad O'connor
Émile Zola
O'malley
Stéphane Millien

I'm a big fan of regular expressions, but if anyone wants a different solution, here's mine. As far as I'm aware, the only split characters should be the space, - and '.

import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public static String toTitleCase(String content) {
    if (content == null || content.isEmpty()) {
        return content;
    }

    // Characters after which the next character starts a new word
    Set<Character> splitSet = new HashSet<>(Arrays.asList(' ', '-', '\''));
    char[] titleCased = new char[content.length()];
    for (int i = 0; i < content.length(); i++) {
        // Title-case the first character and any character that follows a separator;
        // lower-case everything else
        boolean applyTitleCase = i == 0 || splitSet.contains(content.charAt(i - 1));
        char targetChar = content.charAt(i);
        titleCased[i] = applyTitleCase ? Character.toTitleCase(targetChar) : Character.toLowerCase(targetChar);
    }
    return new String(titleCased);
}
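The loop above works on char values and a fixed set of separators. As a variation, here is a sketch that iterates over code points instead, so characters outside the Basic Multilingual Plane are handled, and that treats any character that is not a letter, digit, or combining mark as a separator. That separator rule is an assumption of this sketch rather than something from the answer above, and the class name TitleCaseCodePoints is hypothetical.

```java
public class TitleCaseCodePoints {

    // Combining marks (e.g. a separate acute accent after 'e') belong to the current word.
    private static boolean isMark(int cp) {
        int type = Character.getType(cp);
        return type == Character.NON_SPACING_MARK
                || type == Character.COMBINING_SPACING_MARK
                || type == Character.ENCLOSING_MARK;
    }

    public static String toTitleCase(String content) {
        if (content == null || content.isEmpty()) {
            return content;
        }
        StringBuilder out = new StringBuilder(content.length());
        boolean startOfWord = true;
        for (int i = 0; i < content.length(); ) {
            int cp = content.codePointAt(i);
            if (Character.isLetterOrDigit(cp) || isMark(cp)) {
                // First character of a word is title-cased, the rest lower-cased
                out.appendCodePoint(startOfWord ? Character.toTitleCase(cp) : Character.toLowerCase(cp));
                startOfWord = false;
            } else {
                // Anything else (space, hyphen, apostrophe, ...) is copied and ends the word
                out.appendCodePoint(cp);
                startOfWord = true;
            }
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(toTitleCase("JEAN-CLAUDE DUSSE")); // Jean-Claude Dusse
        System.out.println(toTitleCase("O'mALLey"));          // O'Malley
    }
}
```

Like the other approaches here, this capitalizes after every apostrophe, so an input such as "don't" becomes "Don'T"; names and general text may need different rules.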
maxp