0

I am trying to split a sentence using a string as a delimiter.

String sentence = "Java and Python are programming language. Unix and Windows are operating systems.";
StringTokenizer tokens = new StringTokenizer(sentence, "and");

The expected output is:

Java
Python are programming language. Unix
Windows are operating systems.

But the split occurs at every character in the delimiter string. Is there any way to use the whole string as the delimiter? Also, is there any way to use multiple strings as delimiters?

Shriram

5 Answers

2

The documentation for StringTokenizer says that:

Constructs a string tokenizer for the specified string. The characters in the delim argument are the delimiters for separating tokens. Delimiter characters themselves will not be treated as tokens.

So basically, you can't use multi-character delimiters.

An alternative is to use String.split or Scanner, both of which take a regular expression as a delimiter. This gives you a lot more flexibility.
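To make the per-character behaviour concrete, here is a minimal sketch. The class name `TokenizerDemo`, the helper `tokens`, and the shortened input are made up for illustration; only `StringTokenizer` itself comes from the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

public class TokenizerDemo {
    // Hypothetical helper: collects every token StringTokenizer produces
    // for the given set of delimiter characters.
    static List<String> tokens(String text, String delimiterChars) {
        List<String> result = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(text, delimiterChars);
        while (tokenizer.hasMoreTokens()) {
            result.add(tokenizer.nextToken());
        }
        return result;
    }

    public static void main(String[] args) {
        // 'a', 'n' and 'd' are each treated as a separate delimiter,
        // so the tokens are "J", "v", " " and " Pytho".
        for (String token : tokens("Java and Python", "and")) {
            System.out.println("\"" + token + "\"");
        }
    }
}
```

Note how the trailing "n" of "Python" is swallowed as well, which is why StringTokenizer is a poor fit for word delimiters.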

The closest to StringTokenizer would be Scanner. Here is an example usage:

import java.util.Scanner;

Scanner scanner = new Scanner("Your String to Tokenize");
scanner.useDelimiter("and"); // the delimiter is a regular expression, not a set of characters
scanner.next(); // "next" is basically StringTokenizer's "nextToken"

Because the delimiter is a regular expression, you can use multiple strings as delimiters by separating them with |, e.g.:

"and|or"
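As a rough sketch of how such an alternation behaves with Scanner: the class `ScannerDelimiterDemo`, the helper `tokens`, and the sample sentence are assumptions for illustration. The pattern also requires whitespace around the words, which is my addition so that an "or" inside a longer word is not treated as a delimiter.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class ScannerDelimiterDemo {
    // Hypothetical helper: reads all tokens separated by the given regex.
    static List<String> tokens(String text, String delimiterRegex) {
        List<String> result = new ArrayList<>();
        try (Scanner scanner = new Scanner(text)) {
            scanner.useDelimiter(delimiterRegex);
            while (scanner.hasNext()) {
                result.add(scanner.next());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // "and" or "or", each surrounded by whitespace, acts as the delimiter.
        System.out.println(tokens("tea and coffee or juice", "\\s+(and|or)\\s+"));
        // prints [tea, coffee, juice]
    }
}
```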

As I've said, another way is to use String.split. It returns an array of strings:

String[] result = "Your String to Tokenize".split("and");
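Applied to the sentence from the question, a sketch along these lines should produce the expected three pieces. The class and method names are made up, and the `\b` word boundaries plus `\s*` are my additions, so that only the standalone word "and" matches and the surrounding spaces are dropped.

```java
public class SplitOnAnd {
    // Hypothetical helper: splits on the whole word "and",
    // discarding the whitespace on either side of it.
    static String[] splitOnAnd(String sentence) {
        return sentence.split("\\s*\\band\\b\\s*");
    }

    public static void main(String[] args) {
        String sentence = "Java and Python are programming language. "
                + "Unix and Windows are operating systems.";
        for (String part : splitOnAnd(sentence)) {
            System.out.println(part);
        }
        // prints:
        // Java
        // Python are programming language. Unix
        // Windows are operating systems.
    }
}
```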
Sweeper
  • 2
    Whoever downvoted, can you kindly explain your reason for doing so, so that I can improve my answer? – Sweeper Aug 19 '17 at 16:43
0

That is the correct behavior. If you read the documentation for StringTokenizer, you will see that the delimiter argument is a list of characters to split the input on, so each character in the string is treated as a separate delimiter.

To split a string using words as a delimiter you should use String.split() with a regex as the delimiter. See here for examples.
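One thing to watch with a regex delimiter is that characters such as + or . have special meaning. A possible sketch for splitting on several literal strings uses Pattern.quote to escape each one; the class `LiteralDelimiters`, the helper `splitOnAny`, and the sample input are hypothetical.

```java
import java.util.Arrays;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class LiteralDelimiters {
    // Hypothetical helper: treats every delimiter as literal text
    // and joins them into one alternation such as \Q+\E|\Q::\E.
    static String[] splitOnAny(String text, String... delimiters) {
        String regex = Arrays.stream(delimiters)
                .map(Pattern::quote)
                .collect(Collectors.joining("|"));
        return text.split(regex);
    }

    public static void main(String[] args) {
        // "+" would be a regex metacharacter if it were not quoted.
        System.out.println(Arrays.toString(splitOnAny("a+b::c", "+", "::")));
        // prints [a, b, c]
    }
}
```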

SoroushA
0
String sentence = "Java and Python are programming language. Unix and Windows are operating systems.";

// "^and$" would only match a string that is exactly "and"; \b matches the word anywhere
String removedAnd = sentence.replaceAll("\\band\\b", "");

 System.out.println(removedAnd);
//Java  Python are programming language. Unix  Windows are operating systems.
Lena Kaplan
0
String s = "Java and Python are programming language. Unix and Windows are operating systems.";
String tmp = s.replace("and", "\n");
System.out.println(tmp);

Or, if you don't want to create a new variable, reassign s instead:

String s = "Java and Python are programming language. Unix and Windows are operating systems.";
s = s.replace("and", "\n");
System.out.println(s);

And the output is:

Java 
 Python are programming language. Unix 
 Windows are operating systems.
Chirag
-1

Try this; it should work the way you want.

String sentence = "Java and Python are programming language. Unix and Windows are operating systems.";
String[] s = sentence.split("\\s*and\\s*"); // "and" plus any surrounding whitespace
for(int i=0;i<s.length;i++)
    System.out.println(s[i]);

Hope this helps.

Debanik Dawn