5

How can I split the following string into an array?

That's the code

into

array
0 That
1 s
2 the
3 code

I tried something like this

String str = "That's the code";

        String[] strs = str.split("\\'");
        for (String sstr : strs) {
            System.out.println(sstr);
        }

But the output is

That
s the code
user2095165

8 Answers

17

To specifically split on white space and the apostrophe:

public class Split {
    public static void main(String[] args) {
        String [] tokens = "That's the code".split("[\\s']");
        for(String s:tokens){
            System.out.println(s);
        }
    }
}

Or, to split on any non-word character:

public class Split {
    public static void main(String[] args) {
        String [] tokens = "That's the code".split("[\\W]");
        for(String s:tokens){
            System.out.println(s);
        }
    }
}
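
The two patterns only start to differ once the input contains punctuation other than the apostrophe. The following is a minimal sketch of that difference; the class name `SplitCompare`, the helper `tokens`, and the sample string with parentheses are made up for illustration and are not part of the question. It also runs a `\\W+` variant to show how adjacent separators behave.

```java
import java.util.Arrays;

public class SplitCompare {
    // Hypothetical helper: splits the input on the given regex.
    static String[] tokens(String input, String regex) {
        return input.split(regex);
    }

    public static void main(String[] args) {
        String input = "That's (the) code"; // sample input, not from the question

        // Whitespace or apostrophe only: the parentheses stay inside the token.
        System.out.println(Arrays.toString(tokens(input, "[\\s']")));
        // prints [That, s, (the), code]

        // Any single non-word character: adjacent separators yield empty tokens.
        System.out.println(Arrays.toString(tokens(input, "\\W")));
        // prints [That, s, , the, , code]

        // One or more non-word characters: runs of separators collapse into one.
        System.out.println(Arrays.toString(tokens(input, "\\W+")));
        // prints [That, s, the, code]
    }
}
```

If empty tokens are unwanted, the `+` quantifier is usually the simpler fix compared with filtering the array afterwards.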
Kevin Bowersox
  • 2
    What is the difference between [\\W] and [\\s']? – user2095165 Dec 22 '13 at 09:49
  • 2
    `\\W` represents a non-word character, that is, any character other than a-z, A-Z, 0-9, and _ (underscore). `\\s` represents a whitespace character, so tabs, spaces, line breaks, etc. If I were to add something in parens `()` to the String, `\\W` would split on each paren, whereas the `\\s` version would not. – Kevin Bowersox Dec 22 '13 at 09:52
  • Do you know the runtime complexity of this method? – Koogle Feb 08 '15 at 01:34
5

The best solution I've found for splitting into words when the string contains accented letters is:

String[] listeMots = phrase.split("\\P{L}+");

For instance, if your String is

String phrase = "Salut mon homme, comment ça va aujourd'hui? Ce sera Noël puis Pâques bientôt.";

Then you will get the following words (enclosed in quotes and comma-separated for clarity):

"Salut", "mon", "homme", "comment", "ça", "va", "aujourd", "hui", "Ce", 
"sera", "Noël", "puis", "Pâques", "bientôt".

Hope this helps!
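
For comparison, here is a small sketch of why `\\P{L}+` suits such text better than `\\W+`. By default, Java's `\\W` treats only ASCII letters, digits, and the underscore as word characters, so accented letters end up acting as separators. The class name `AccentSplit`, the two helper methods, and the shortened phrase ("Noël puis Pâques", written with Unicode escapes so the source file encoding does not matter) are assumptions for illustration.

```java
import java.util.Arrays;

public class AccentSplit {
    // Splits on runs of characters that are not Unicode letters.
    static String[] byNonLetters(String s) {
        return s.split("\\P{L}+");
    }

    // Splits on runs of non-word characters (ASCII-only by default in Java).
    static String[] byNonWordChars(String s) {
        return s.split("\\W+");
    }

    public static void main(String[] args) {
        // \u00eb is e-diaeresis, \u00e2 is a-circumflex.
        String phrase = "No\u00ebl puis P\u00e2ques";

        // Three tokens: each word stays intact, accents included.
        System.out.println(Arrays.toString(byNonLetters(phrase)));

        // Five tokens: No, l, puis, P, ques (the accented letters are treated as separators).
        System.out.println(Arrays.toString(byNonWordChars(phrase)));
    }
}
```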

Pierre C
4

You can split on non-word characters:

String str = "That's the code";
String[] splitted = str.split("[\\W]");

For your input, output will be:

That
s
the
code
Maroun
1

You can split by a regex that would be one of the two characters - quote or space:

String[] strs = str.split("['\\s]");
Szymon
1

If you want to split on non-alphabetic characters:

String str = "That's the code";
String[] strs = str.split("\\P{Alpha}+");
for (String sstr : strs) {
        System.out.println(sstr);
}

\P{Alpha} matches any non-alphabetic character. Alpha is one of the POSIX character classes, which are very useful; you can read more about them in the java.util.regex.Pattern documentation. The + indicates that we should split on any contiguous run of such characters.

The output will be:

That
s
the
code
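
To illustrate what the `+` contributes, the sketch below splits a string in which two separators sit next to each other, once without and once with the quantifier. The class name `PlusQuantifier`, the helper names, and the sample input are hypothetical.

```java
import java.util.Arrays;

public class PlusQuantifier {
    // Splits on every single non-alphabetic character.
    static String[] splitEach(String s) {
        return s.split("\\P{Alpha}");
    }

    // Splits on runs of one or more non-alphabetic characters.
    static String[] splitRuns(String s) {
        return s.split("\\P{Alpha}+");
    }

    public static void main(String[] args) {
        String input = "code, that's it"; // sample input: comma followed by a space

        // Without +, the comma and the space are two separators with an empty token between them.
        System.out.println(Arrays.toString(splitEach(input)));
        // prints [code, , that, s, it]

        // With +, the comma and the space count as one separator.
        System.out.println(Arrays.toString(splitRuns(input)));
        // prints [code, that, s, it]
    }
}
```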
Tareq Salah
  • 1
    +1 for the Unicode version, but this code may not be very clear to someone new to regex, so you should probably expand your answer a little. – Pshemo Dec 22 '13 at 09:47
1

You can first replace each ' with " " (blank space) using str.replaceAll("'", " "), and then split the resulting string on the blank space separator using split(" "). Alternatively, you could use a regular expression to split on ' OR space.
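
A minimal sketch of the replace-then-split approach follows; the class name `ReplaceThenSplit` and the helper `tokens` are made up. Since Java strings are immutable, the result of the replacement has to be captured and used rather than discarded. `String.replace(char, char)` is swapped in here because a literal apostrophe needs no regular expression; `replaceAll("'", " ")` should behave the same for this input.

```java
import java.util.Arrays;

public class ReplaceThenSplit {
    // Hypothetical helper: turns apostrophes into spaces, then splits on single spaces.
    static String[] tokens(String s) {
        // replace returns a new String; the original is left unchanged.
        String normalized = s.replace('\'', ' ');
        return normalized.split(" ");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(tokens("That's the code")));
        // prints [That, s, the, code]
    }
}
```

Note that splitting on a single space produces empty tokens when two spaces are adjacent, which is one reason the single-regex variants may be preferable.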

umanganiello
0

split takes a regex, and in regex ' is not a special character, so you don't need to escape it with \. To represent whitespace you can use \s (which in a String literal needs to be written as "\\s"). To match any one of a set of characters you can use the "OR" operator |, as in a|b|c|d, or a character class [abcd], which matches exactly the same as (a|b|c|d).

To keep things simple you can use

String[] strs = str.split("'| ");

or

String[] strs = str.split("'|\\s");//to include all whitespaces

or

String[] strs = str.split("['\\s]");//equivalent of "'|\\s"
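
As a quick check of the equivalences described above, the following sketch compares the arrays produced by pairs of patterns on the question's input. The class name `EquivalentPatterns` and the helper `sameTokens` are hypothetical names introduced for illustration.

```java
import java.util.Arrays;

public class EquivalentPatterns {
    // Hypothetical helper: true if both regexes split the input into the same tokens.
    static boolean sameTokens(String input, String regexA, String regexB) {
        return Arrays.equals(input.split(regexA), input.split(regexB));
    }

    public static void main(String[] args) {
        String str = "That's the code";

        // Escaping the apostrophe is allowed but unnecessary.
        System.out.println(sameTokens(str, "\\'", "'"));         // prints true

        // Alternation and a character class select the same separators.
        System.out.println(sameTokens(str, "'|\\s", "['\\s]")); // prints true

        // A literal space and \s agree here only because the input has no tabs or newlines.
        System.out.println(sameTokens(str, "'| ", "'|\\s"));    // prints true
    }
}
```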
Pshemo
0

You can use OR (|) in a regular expression:

public static void main(String[] args) {
    String str = "That's the code";
    String[] strs = str.split("'|\\s");
    for (String sstr : strs) {
        System.out.println(sstr);
    }
}

The string will be split by single quote (') or space. The single quote doesn't need to be escaped. The output would be

run:
That
s
the
code
BUILD SUCCESSFUL (total time: 0 seconds)
Keerthivasan