2

I have a String str, which is comprised of several words separated by single spaces. If I want to create a set or list of strings, I can simply call str.split(" ") and I get what I want.

Now, assume that str is a little more complicated, for example it is something like:

    str = "hello bonjour \"good morning\" buongiorno";

In this case I want to keep whatever is between the double quotes together as a single token, so that my list of strings is:

    hello
    bonjour
    good morning
    buongiorno

Clearly, split(" ") won't work in this case, because I'd get:

    hello
    bonjour
    "good
    morning"
    buongiorno

So, how do I get what I want?

Stephen C
user

3 Answers

3

You can write a regex that matches either a single word or a run of words enclosed in double quotes, like:

\w+|(\"\w+(\s\w+)*\")

and then search for the matches with the Pattern and Matcher classes.

For example:

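import java.util.*;
import java.util.regex.*;

String searchedStr = "hello bonjour \"good morning\" buongiorno";
Pattern pattern = Pattern.compile("\\w+|(\\\"\\w+(\\s\\w+)*\\\")");
Matcher matcher = pattern.matcher(searchedStr);
List<String> words = new ArrayList<String>();
while (matcher.find()) {
    // group() keeps the surrounding quotes of a quoted match, so strip them.
    words.add(matcher.group().replace("\"", ""));
}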

Edit: this now works for any number of words within the quotes; I had forgotten that case.
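
As a variation on the same idea, here is a minimal self-contained sketch that uses capture groups so the quotes are dropped during matching. The class name QuotedTokenizer, the method name tokenize, and the pattern itself are assumptions made for illustration, not part of the answer above. Unlike \w+, this pattern treats any run of non-whitespace, non-quote characters as a word.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, named here only for illustration.
class QuotedTokenizer {
    // Alternative 1: a quoted run; group 1 holds the text without the quotes.
    // Alternative 2: a run of non-whitespace, non-quote characters; group 2.
    private static final Pattern TOKEN =
            Pattern.compile("\"([^\"]*)\"|([^\\s\"]+)");

    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            // Exactly one of the two groups participates in each match.
            tokens.add(m.group(1) != null ? m.group(1) : m.group(2));
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [hello, bonjour, good morning, buongiorno]
        System.out.println(tokenize("hello bonjour \"good morning\" buongiorno"));
    }
}
```

Reading the token from the capture groups avoids a separate step to strip the quotes afterwards.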

Ed Morales
2

You can do something like the code below. First split the String on the quote character "\"", then split the parts that were outside the quotes on a space " ". Counting from 1, the even-numbered parts are the ones that were between quotes.

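import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Place this method inside any class.
public static void main(String args[]) {

    String str = "hello bonjour \"good morning\" buongiorno";
    System.out.println(str);
    // Splitting on the quote character alternates between unquoted and quoted parts.
    String[] parts = str.split("\"");
    List<String> myList = new ArrayList<String>();
    int i = 1;
    for (String partStr : parts) {
        if (i % 2 == 0) {
            // Even parts (counting from 1) were inside quotes: keep them whole.
            myList.add(partStr);
        } else if (!partStr.trim().isEmpty()) {
            // Odd parts were outside quotes: split them on spaces,
            // skipping blank parts such as the one before a leading quote.
            myList.addAll(Arrays.asList(partStr.trim().split(" ")));
        }
        i++;
    }

    System.out.println("MyList : " + myList);
}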

and the output is

hello bonjour "good morning" buongiorno
MyList : [hello, bonjour, good morning, buongiorno]
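
To see why the even-numbered parts are the quoted ones, it may help to print what split("\"") returns before any further processing. The sketch below is only an illustration; the class name SplitPartsDemo and the helper quoteParts are made up for this example.

```java
// Hypothetical demo class, named here only for illustration.
class SplitPartsDemo {
    // Splits on the double-quote character, exactly as in the answer above.
    static String[] quoteParts(String s) {
        return s.split("\"");
    }

    public static void main(String[] args) {
        String[] parts = quoteParts("hello bonjour \"good morning\" buongiorno");
        for (int n = 1; n <= parts.length; n++) {
            // Counting from 1, even-numbered parts were between quotes.
            String where = (n % 2 == 0) ? "inside quotes" : "outside quotes";
            System.out.println(n + ": [" + parts[n - 1] + "] " + where);
        }
    }
}
```

This prints:

```
1: [hello bonjour ] outside quotes
2: [good morning] inside quotes
3: [ buongiorno] outside quotes
```

The alternation relies on the quotes being balanced; with an unmatched quote, everything after it would be treated as quoted.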
Aniket Thakur
1

You may be able to find a solution using regular expressions, but I would simply write a small string splitter by hand that tracks whether it is currently inside quotes.

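import java.util.ArrayList;
import java.util.List;

List<String> splitButKeepQuotes(String s, char splitter) {
    List<String> list = new ArrayList<String>();
    StringBuilder word = new StringBuilder();
    boolean inQuotes = false;

    for (int i = 0; i < s.length(); i++) {
        char c = s.charAt(i);
        if (c == '"') {
            // Toggle quote mode; the quote character itself is dropped.
            inQuotes = !inQuotes;
        } else if (c == splitter && !inQuotes) {
            // A splitter outside quotes ends the current word (empty words are skipped).
            if (word.length() > 0) {
                list.add(word.toString());
                word.setLength(0);
            }
        } else {
            word.append(c);
        }
    }
    // Add the final word, which is not followed by a splitter.
    if (word.length() > 0) {
        list.add(word.toString());
    }

    return list;
}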
tophyr