4

I need to split a string into words on spaces in Java, so I used the split method as shown below:

String keyword = "apple mango ";
String[] keywords = keyword.split(" ");

The above code works fine, but sometimes my input contains multi-word keywords such as "jack fruit" or "ice cream" wrapped in double quotes, as shown below:

String keyword = "apple mango \"jack fruit\" \"ice cream\"";

In this case I need to get 4 words in the keywords array: apple, mango, jack fruit, ice cream.

Can anyone suggest a solution for this?

Nikaido
Alex Man

5 Answers

4
List<String> parts = new ArrayList<>();
String keyword = "apple mango \"jack fruit\" \"ice cream\"";

// first use a matcher to grab the quoted terms
Pattern p = Pattern.compile("\"(.*?)\"");      
Matcher m = p.matcher(keyword);
while (m.find()) {
    parts.add(m.group(1));
}

// then remove all quoted terms (quotes included)
keyword = keyword.replaceAll("\".*?\"", "")
                 .trim();

// finally split the remaining keywords on whitespace
if (keyword.replaceAll("\\s", "").length() > 0) {
    Collections.addAll(parts, keyword.split("\\s+"));
}

for (String part : parts) {
    System.out.println(part);
}

Output:

jack fruit
ice cream
apple
mango
Tim Biegeleisen
3

I'd do it with a regex and two capturing groups, one for each pattern. I'm not aware of any other way.

    String keyword = "apple mango \"jack fruit\" \"ice cream\"";
    Pattern p = Pattern.compile("\"?(\\w+\\W+\\w+)\"|(\\w+)");      
    Matcher m = p.matcher(keyword);
    while (m.find()) {
        String word = m.group(1) == null ? m.group(2) : m.group(1);
        System.out.println(word);
    }
mprivat
  • I removed my solution after seeing this one. I don't know why I didn't think of using two groups ... – AxelH Dec 01 '16 at 14:28
  • While the OP has probably abandoned the thread, this solution is the most elegant one. – Murat Karagöz Dec 01 '16 at 14:44
  • What if somebody mistakenly puts a single double quote at the end, like `apple mango"`? – Alex Man Dec 01 '16 at 15:57
  • I'd think that if you actually need more serious conditional parsing, you will have to pull out the heavy guns, like JavaCC for example, if you want to take this to the extreme and be able to detect this kind of error. The regex doesn't really have logic like that. If you throw in a wrench, it'll act funny. – mprivat Dec 02 '16 at 01:27
  • @mprivat If the keyword is `String keyword = "\"ice cream\" 192.168.214.125";` it does not give the expected result, `ice cream` and `192.168.214.125` – Alex Man Dec 12 '16 at 06:50
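
The two inputs raised in the comments above (a token with non-word characters such as an IP address, and a stray trailing quote) can be handled with a looser pair of alternatives. The following is only a sketch, not the answer author's code: the class name `QuotedKeywordSplitter` and the method `split` are made up for illustration, and it assumes that an unmatched double quote should simply be dropped rather than reported as an error.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, named here for illustration only.
class QuotedKeywordSplitter {
    // Alternative 1: a quoted phrase; group 1 holds the text between the quotes.
    // Alternative 2: a run of characters that are neither whitespace nor quotes.
    private static final Pattern TOKEN = Pattern.compile("\"([^\"]*)\"|([^\\s\"]+)");

    static List<String> split(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            // Exactly one of the two groups is non-null for each match.
            tokens.add(m.group(1) != null ? m.group(1) : m.group(2));
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(split("apple mango \"jack fruit\" \"ice cream\""));
        System.out.println(split("\"ice cream\" 192.168.214.125"));
        // Assumption: the unmatched trailing quote is silently ignored.
        System.out.println(split("apple mango\""));
    }
}
```

Because neither alternative can match a lone quote, the matcher just skips over it. If an unbalanced quote has to be detected and reported, a real parser is still the better tool, as mprivat suggests.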
0

This solution works, but I am sure it is not the best for performance or resource usage. It also works for quoted keywords of more than two words. Feel free to edit or optimize my code.

    public static void main(String[] args) {
        String keyword = "apple mango \"jack fruit\" \"ice cream\" \"one two three\"";
        String[] split = custom_split(keyword);
        for (String s : split) {
            System.out.println(s);
        }
    }

    private static String[] custom_split(String keyword) {
        String[] split = keyword.split(" ");
        ArrayList<String> list = new ArrayList<>();
        StringBuilder temp = new StringBuilder();
        boolean multiple = false;
        for (String s : split) {
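            if (s.length() > 1 && s.startsWith("\"") && s.endsWith("\"")) {
                // a single quoted word such as "apple" opens and closes in one token
                list.add(s.replaceAll("\"", ""));
                continue;
            }
            if (s.startsWith("\"")) {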
                multiple = true;
                s = s.replaceAll("\"", "");
                temp.append(s);
                continue;
            }
            if (s.endsWith("\"")) {
                multiple = false;
                s = s.replaceAll("\"", "");
                temp.append(" ").append(s);
                list.add(temp.toString());
                temp = new StringBuilder();
                continue;
            }
            if (multiple) {
                temp.append(" ").append(s);
            } else {
                list.add(s);
            }
        }
        return list.toArray(new String[0]);
    }
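
Since this answer raises performance, here is a sketch of a different technique: a single pass over the characters with one flag for "inside quotes", with no regex and no intermediate array. The class name `KeywordScanner` and the methods `scan` and `flush` are hypothetical names chosen for this illustration. It assumes tokens are separated by plain spaces and that a quote character is never part of a token.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, named here for illustration only.
class KeywordScanner {
    static List<String> scan(String input) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == '"') {
                // A quote toggles quoted mode and is never copied into a token.
                inQuotes = !inQuotes;
            } else if (c == ' ' && !inQuotes) {
                // An unquoted space ends the current token, if there is one.
                flush(current, tokens);
            } else {
                current.append(c);
            }
        }
        // Emit whatever is left at the end of the input.
        flush(current, tokens);
        return tokens;
    }

    private static void flush(StringBuilder current, List<String> tokens) {
        if (current.length() > 0) {
            tokens.add(current.toString());
            current.setLength(0);
        }
    }

    public static void main(String[] args) {
        System.out.println(scan("apple mango \"jack fruit\" \"ice cream\" \"one two three\""));
    }
}
```

One consequence of this design: with an unterminated quote, everything after the quote ends up in a single token, because the scanner never leaves quoted mode.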
AdrianES
0

You can't do that with String.split(). You need to come up with a regular expression for the target tokens, and collect them through a matcher, like this:

    final Pattern token = Pattern.compile( "[^\"\\s]+|\"[^\"]*\"" );

    List<String> tokens = new ArrayList<>();
    Matcher m = token.matcher( "apple mango \"jack fruit\" \"ice cream\"" );
    while( m.find() )
        tokens.add( m.group() );
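
Note that `m.group()` returns the whole match, so quoted tokens collected this way still carry their surrounding double quotes. A minimal sketch of stripping them, using the same pattern as above; the class name `QuoteStrippingTokenizer` and the method `tokenize` are illustrative names, not part of the original answer:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical wrapper, named here for illustration only.
class QuoteStrippingTokenizer {
    // Same pattern as in the answer: a bare word, or a quoted phrase.
    private static final Pattern TOKEN = Pattern.compile("[^\"\\s]+|\"[^\"]*\"");

    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            String token = m.group();
            // Only the second alternative can start with a quote, and it
            // always ends with one too, so dropping both ends is safe.
            if (token.startsWith("\"")) {
                token = token.substring(1, token.length() - 1);
            }
            tokens.add(token);
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("apple mango \"jack fruit\" \"ice cream\""));
    }
}
```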
shinobi
0

This splits the string on double quotes, then additionally splits the even-indexed pieces (the text outside the quotes) on spaces.

    String keyword = "apple mango \"jack fruit\" \"ice cream\"";
    String[] splitQuotes = keyword.split("\"");

    List<String> keywords = new ArrayList<>();

    for (int i = 0; i < splitQuotes.length; i++) {
        if (i % 2 == 0) {
            Collections.addAll(keywords, splitQuotes[i].split(" "));
        } else {
            keywords.add(splitQuotes[i]);
        }
    }
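
One case worth checking is an input that begins with a quoted phrase, such as `"jack fruit" apple`. In Java, `"".split(" ")` returns an array holding one empty string, and `" apple".split(" ")` keeps a leading empty string, so the loop above appears to add empty entries for that input. The variant below is a sketch that trims each unquoted piece, splits on runs of whitespace, and skips empty words; the class name `QuoteSplitDemo` and the method `split` are hypothetical names used for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, named here for illustration only.
class QuoteSplitDemo {
    static List<String> split(String keyword) {
        // Odd-indexed pieces were inside quotes; even-indexed pieces were outside.
        String[] pieces = keyword.split("\"");
        List<String> keywords = new ArrayList<>();
        for (int i = 0; i < pieces.length; i++) {
            if (i % 2 == 0) {
                for (String word : pieces[i].trim().split("\\s+")) {
                    // Skip the empty string produced by an empty or blank piece.
                    if (!word.isEmpty()) {
                        keywords.add(word);
                    }
                }
            } else {
                keywords.add(pieces[i]);
            }
        }
        return keywords;
    }

    public static void main(String[] args) {
        System.out.println(split("\"jack fruit\" apple  mango"));
    }
}
```

Like the original, this relies on the quotes being balanced: with an odd number of quotes, the even/odd bookkeeping treats the text after the last quote as quoted.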
Crepi
  • An empty entry can end up in keywords only if the input contains an empty pair of quotes (""). If you meant that it will add them when two quoted phrases come one after another (like for \"jack fruit\" \"ice cream\"), it will just call addAll with an empty String array, so it won't affect the result. Or is there some other scenario I'm not seeing? – Crepi Dec 01 '16 at 14:46