
I have a List of strings like "Taxi or bus driver". I need to capitalize the first letter of each word except the word "or". Is there an easy way to achieve this using Java streams? I tried the Pattern.splitAsStream technique, but I could not concatenate the split tokens back into a single string. Any help will be appreciated. If anybody needs it, I can post my code here.

Lino
Philip Puthenvila

3 Answers


You need the right pattern to identify the locations where a change has to be made; when you want to use splitAsStream, it has to be a zero-width pattern. Match locations which are

  • a word start
  • looking at a lower case character
  • not looking at the word “or”

Declare it like 

static final Pattern WORD_START_BUT_NOT_OR = Pattern.compile("\\b(?=\\p{Ll})(?!or\\b)");
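
To see where this pattern matches, the following minimal sketch lists the zero-width match positions in the sample phrase. The class name PatternPositionsDemo and the helper matchPositions are made up for illustration; only the pattern itself comes from the answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PatternPositionsDemo {
    // Pattern from the answer: word start, next char is lower case, word is not "or"
    static final Pattern WORD_START_BUT_NOT_OR =
        Pattern.compile("\\b(?=\\p{Ll})(?!or\\b)");

    // Hypothetical helper: collects the start index of every zero-width match
    static List<Integer> matchPositions(String s) {
        List<Integer> positions = new ArrayList<>();
        Matcher m = WORD_START_BUT_NOT_OR.matcher(s);
        while (m.find()) {
            positions.add(m.start());
        }
        return positions;
    }

    public static void main(String[] args) {
        // "bus" starts at index 8 and "driver" at index 12;
        // "Taxi" is skipped (already upper case) and so is "or"
        System.out.println(matchPositions("Taxi or bus driver")); // [8, 12]
    }
}
```

These positions are exactly where splitAsStream cuts the string, so every token after the first starts with a letter that needs capitalizing.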

Then, using it to process the tokens is straight-forward with a stream and map. Getting a string back works via .collect(Collectors.joining()):

List<String> input  = Arrays.asList("Taxi or bus driver", "apples or oranges");
List<String> result = input.stream()
    .map(s -> WORD_START_BUT_NOT_OR.splitAsStream(s)
        .map(w -> Character.toUpperCase(w.charAt(0))+w.substring(1))
        .collect(Collectors.joining()))
    .collect(Collectors.toList());
result.forEach(System.out::println);
Taxi or Bus Driver
Apples or Oranges

Note that when splitting, there will always be a first token, regardless of whether it matched the criteria. Since the word “or” rarely appears at the beginning of a phrase and the transformation leaves characters other than lowercase letters unchanged, this should not be a problem here. Otherwise, treating the first element specially within a stream would make the code too complicated. If that’s an issue, a loop would be preferable.
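
The caveat about the first token can be reproduced with a small sketch. It reuses the pattern and the stream pipeline from above; the class name SplitCaveatDemo and the method name capitalize are assumptions made for this illustration.

```java
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class SplitCaveatDemo {
    static final Pattern WORD_START_BUT_NOT_OR =
        Pattern.compile("\\b(?=\\p{Ll})(?!or\\b)");

    // Same stream pipeline as in the answer, applied to a single phrase
    static String capitalize(String phrase) {
        return WORD_START_BUT_NOT_OR.splitAsStream(phrase)
            .map(w -> Character.toUpperCase(w.charAt(0)) + w.substring(1))
            .collect(Collectors.joining());
    }

    public static void main(String[] args) {
        // "or" in the middle stays inside the first token and is left alone
        System.out.println(capitalize("bus or taxi")); // Bus or Taxi
        // a leading "or" is the start of the first token, so it is capitalized anyway
        System.out.println(capitalize("or else"));     // Or Else
    }
}
```

The sketch does not guard against an empty input string, for which charAt(0) would fail if the stream delivers an empty token.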

A loop based solution could look like

private static final Pattern FIRST_WORD_CHAR_BUT_NOT_OR
                           = Pattern.compile("\\b(?!or\\b)\\p{Ll}");

(now using a pattern that matches the character rather than looking at it)

public static String capitalizeWords(String phrase) {
    Matcher m = FIRST_WORD_CHAR_BUT_NOT_OR.matcher(phrase);
    if(!m.find()) return phrase;
    StringBuffer sb = new StringBuffer();
    do m.appendReplacement(sb, m.group().toUpperCase()); while(m.find());
    return m.appendTail(sb).toString();
}

which, as a bonus, is also capable of handling characters which span multiple char units. Starting with Java 9, the StringBuffer can be replaced with StringBuilder to increase the efficiency. This method can be used like

List<String> result = input.stream()
    .map(s -> capitalizeWords(s))
    .collect(Collectors.toList());

Replacing the lambda expression s -> capitalizeWords(s) with a method reference of the form ContainingClass::capitalizeWords is also possible.
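
For Java 9 or newer, the loop can be written with a StringBuilder, or replaced by the function-accepting Matcher.replaceAll overload. The following is a sketch under that version assumption; the class name CapitalizeJava9Demo and both method names are invented for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CapitalizeJava9Demo {
    private static final Pattern FIRST_WORD_CHAR_BUT_NOT_OR =
        Pattern.compile("\\b(?!or\\b)\\p{Ll}");

    // Loop variant using the StringBuilder overloads added in Java 9
    static String capitalizeWithBuilder(String phrase) {
        Matcher m = FIRST_WORD_CHAR_BUT_NOT_OR.matcher(phrase);
        StringBuilder sb = new StringBuilder();
        while (m.find()) {
            m.appendReplacement(sb, m.group().toUpperCase());
        }
        return m.appendTail(sb).toString();
    }

    // Functional variant using Matcher.replaceAll(Function), also Java 9+
    static String capitalizeWithReplaceAll(String phrase) {
        return FIRST_WORD_CHAR_BUT_NOT_OR.matcher(phrase)
            .replaceAll(mr -> mr.group().toUpperCase());
    }

    public static void main(String[] args) {
        System.out.println(capitalizeWithBuilder("Taxi or bus driver"));    // Taxi or Bus Driver
        // unlike the split-based version, a leading "or" is left untouched
        System.out.println(capitalizeWithReplaceAll("or else"));            // or Else
    }
}
```

Both variants replace only the matched character, so a leading “or” stays lowercase, which is the case the split-based approach cannot handle.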

Holger
  • oh damn, was about to hit submit with a pattern close to that :| – Eugene May 31 '18 at 10:33
  • Hi #Holger the first part of the solution is not working as expected, it is not converting to upper case. – Philip Puthenvila Jun 01 '18 at 00:07
  • Or, with Java 9's [`Matcher.replaceAll`](https://docs.oracle.com/javase/9/docs/api/java/util/regex/Matcher.html#replaceAll-java.util.function.Function-), `FIRST_WORD_CHAR_BUT_NOT_OR.matcher(phrase).replaceAll(mr->mr.group().toUpperCase())` – Misha Jun 01 '18 at 05:20
  • @PhilipPuthenvila there was a typo in the regex, it looked for upper case letters instead of lowercase. I've fixed it. Thanks. – Holger Jun 01 '18 at 05:46

Here is my code:

import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class ConvertToCapitalUsingStreams {
    // collection holds all the words that are not to be capitalized
    private static final List<String> EXCLUSION_LIST = Arrays.asList("or");

    public String convertToInitCase(final String data) {
        String[] words = data.split("\\s+");
        List<String> initUpperWords = Arrays.stream(words).map(word -> {
            //first make it lowercase
            return word.toLowerCase();
        }).map(word -> {
            //if the word is empty (e.g. input starts with whitespace) or is
            //present in EXCLUSION_LIST, return the word as is
            if (word.isEmpty() || EXCLUSION_LIST.contains(word)) {
                return word;
            }

            //if the word not present in EXCLUSION_LIST, Change the case of
            //first letter of the word and return
            return Character.toUpperCase(word.charAt(0)) + word.substring(1);
        }).collect(Collectors.toList());

        // convert back the list of words into a single string
        String finalWord = String.join(" ", initUpperWords);

        return finalWord;
    }

    public static void main(String[] a) {
        System.out.println(new ConvertToCapitalUsingStreams().convertToInitCase("Taxi or bus driver"));

    }
}

Note: You may also want to look at this SO post about using apache commons-text library to do this job.

Naveen Kumar

Split your string into words, convert the first character of each word to uppercase, then join the words back together to form the resulting String:

String input = "Taxi or bus driver";
String output = Stream.of(input.split(" "))
                .map(w -> {
                     if (w.equals("or") || w.length() == 0) {
                         return w;
                     }
                     return Character.toUpperCase(w.charAt(0)) + w.substring(1);
                })
                .collect(Collectors.joining(" "));
Mạnh Quyết Nguyễn
  • And your reason is? – Mạnh Quyết Nguyễn May 31 '18 at 07:55
  • you're doing an unnecessary call to `Stream#of` that internally will call `Arrays#stream` – Eugene May 31 '18 at 08:07
  • @Eugene I would prefer `Arrays.stream` for semantic reasons. `String.split` returns an array and `Arrays.stream` is the right idiom to stream over an array. In contrast, `Stream.of(…)` is a *varargs* method which happens to be able to accept arrays due to the way varargs have been implemented (and for compatibility with pre-Java 5 code). – Holger May 31 '18 at 10:22
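
The variant discussed in these comments, streaming over the array with Arrays.stream instead of Stream.of, could look like the sketch below. The class name ArraysStreamDemo and the method name capitalize are assumptions made for this illustration and are not part of the answer.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

class ArraysStreamDemo {
    static String capitalize(String input) {
        // String.split returns an array, so Arrays.stream is the direct way to stream it
        return Arrays.stream(input.split(" "))
            // leave empty tokens (from repeated spaces) and the word "or" unchanged
            .map(w -> w.isEmpty() || w.equals("or")
                ? w
                : Character.toUpperCase(w.charAt(0)) + w.substring(1))
            .collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        System.out.println(capitalize("Taxi or bus driver")); // Taxi or Bus Driver
    }
}
```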