54

The Java documentation doesn't seem to mention anything about deprecation for StringTokenizer, yet I keep hearing about how it was deprecated long ago. Was it deprecated because it had bugs/errors, or is String.split() simply better to use overall?

I have some code that uses StringTokenizer and I am wondering if I should seriously be concerned about refactoring it to use String.split(), or whether the deprecation is purely a matter of convenience and my code is safe.

ChrisWue
donnyton
  • 34
    `StringTokenizer` is a legacy class (i.e. there is a better replacement out there), but it's **not** deprecated. Deprecation only happens when the class/method has some *serious* drawbacks. A similar situation happens with `Vector`: you can *almost always* replace it with an `ArrayList`, but it's not "bad" or "broken", therefore it's not deprecated. – Joachim Sauer Aug 08 '11 at 14:49
  • 5
    @Joachim if comments could be accepted I would have – donnyton Aug 09 '11 at 14:54
  • 3
    StringTokenizer has a serious common-sense problem: it treats consecutive delimiters as one delimiter. That is not the common or traditional expectation. For example, in CSV 'a,,b' means 3 fields, the 2nd of which is empty. But StringTokenizer by default sees this as only 2 fields, because ',,' is treated like ','. This has already confused many programmers and cost them unnecessary debugging effort. *** JUST DON'T USE IT ANYMORE *** – Scott Chu Sep 21 '16 at 09:02
  • @Scott Chu: Agreed it's not as scrupulous as other methods. However if you create a new ST such that you include the separators in the token values, you can get back empty values, in a sense, by checking to see if each token merely holds a single delimiter as its value. Of course the burden is on the programmer to chop off the last character of each token because it will be a delimiter... unless it's the last token in the collection of tokens... ugh, such a pain. So I agree, using a diff. approach when you want to count empty token values as returned values is a better idea than using ST. – Matt Campbell Oct 20 '16 at 17:16
  • Yes it seems to me when StringTokenizer started to take a back seat to split() the JDK should have included some version of split that behaves like the split() that exists now but does not include the overhead or complexities of regex. – Chuck Mar 26 '17 at 18:00
  • When parsing a configuration value containing a comma-separated list, it may be desirable to skip consecutive delimiters: `Collections.list(new StringTokenizer(",X, ,,Y , Z,", ", "))` produces `[X, Y, Z]` – Eron Wright May 17 '17 at 17:46
  • If you're planning to parse text where the delimiters are repeated/interchangeable whitespace (spaces, tabs) then actually the `StringTokenizer` is a great solution. If you need more specific comma (etc.) delimiting then use `String.split`. – MasterHD Feb 01 '22 at 16:00
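
The behavior debated in these comments is easy to check directly. Below is a minimal sketch that runs the same inputs through both APIs; the class name `ConsecutiveDelimiters` and the helper `tokenize` are made up for illustration and do not come from the thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical demo class; the names are illustrative only.
class ConsecutiveDelimiters {

    // Collects every token a StringTokenizer yields for the given delimiter characters.
    static List<String> tokenize(String input, String delimiters) {
        List<String> tokens = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(input, delimiters);
        while (st.hasMoreTokens()) {
            tokens.add(st.nextToken());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // StringTokenizer collapses the run of commas.
        System.out.println(tokenize("a,,b", ","));            // [a, b]
        // split keeps the empty middle field.
        System.out.println(Arrays.asList("a,,b".split(","))); // [a, , b]
        // Comma and space are both delimiter characters, so blank entries vanish.
        System.out.println(tokenize(",X, ,,Y , Z,", ", "));   // [X, Y, Z]
    }
}
```

Which result is "right" depends on the input format: positional data such as CSV needs the empty field, while a loosely formatted list usually does not.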

8 Answers

84
  1. Java 10 String Tokenizer -- not deprecated
  2. Java 9 String Tokenizer -- not deprecated
  3. Java 8 String Tokenizer -- not deprecated
  4. Java 7 String Tokenizer -- not deprecated
  5. Java 6 String Tokenizer -- not deprecated
  6. Java 5 String Tokenizer -- not deprecated

If it is not marked as deprecated, it is not going away.

nicholas.hauschild
  • 8
    "StringTokenizer is a legacy class that is retained for compatibility reasons although its use is discouraged in new code. It is recommended that anyone seeking this functionality use the split method of String or the java.util.regex package instead." – Miuler Aug 09 '12 at 22:53
  • 7
    Legacy is different than deprecated. – StackOverflowed Aug 17 '12 at 17:32
  • 2
    Personally, I find that code using StringTokenizer looks simpler and cleaner. – Audrius Meškauskas Jan 13 '13 at 16:56
  • Yeah, I'm with @h22, often StringTokenizer code is more elegant/readable than using split. I save split for when I really need the newer behavior. – Brian Knoblauch Dec 19 '16 at 21:02
68

From the javadoc for StringTokenizer:

StringTokenizer is a legacy class that is retained for compatibility reasons although its use is discouraged in new code. It is recommended that anyone seeking this functionality use the split method of String or the java.util.regex package instead.

If you look at String.split() and compare it to StringTokenizer, the relevant difference is that String.split() uses a regular expression, whereas StringTokenizer just uses verbatim split characters. So if I wanted to tokenize a string with more complex logic than single characters (e.g. split on \r\n), I can't use StringTokenizer but I can use String.split().
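
To illustrate the \r\n case, here is a small sketch (the class and method names are hypothetical). `split` treats the two-character sequence as a single separator, while `StringTokenizer` treats `\r` and `\n` as two independent delimiter characters:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical demo class; the names are illustrative only.
class LineSplitDemo {

    // Splits only where the exact sequence CR LF occurs.
    static String[] splitOnCrLf(String text) {
        return text.split("\r\n");
    }

    // Treats CR and LF as separate delimiter characters.
    static List<String> tokenizeOnCrOrLf(String text) {
        List<String> tokens = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(text, "\r\n");
        while (st.hasMoreTokens()) {
            tokens.add(st.nextToken());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // A lone CR is not a separator for split, but it is for StringTokenizer.
        System.out.println(splitOnCrLf("x\ry").length);             // 1
        System.out.println(tokenizeOnCrOrLf("x\ry").size());        // 2
        // A blank line survives split as an empty string; StringTokenizer drops it.
        System.out.println(splitOnCrLf("a\r\n\r\nb").length);       // 3
        System.out.println(tokenizeOnCrOrLf("a\r\n\r\nb").size());  // 2
    }
}
```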

Jason S
17

StringTokenizer is not deprecated. In fact, according to the source below, StringTokenizer is about 4x faster than String.split(), and many developers use it in competitive programming.

Source: Faster Input for Java

Nishant Thapliyal
1

There is an issue with StringTokenizer ...

split has to use a regex, whereas StringTokenizer takes a plain String of delimiter characters,

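but

"a.b..".split("\\.", -1) will return {"a","b","",""} (a plain split("\\.") drops the trailing empty strings and returns {"a","b"})

and a StringTokenizer over "a.b.." with "." as the delimiter will return only {"a","b"}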

And this is very tricky!!! Be careful!!!
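
How many empty strings `split` reports for "a.b.." depends on its optional limit argument. A minimal sketch, with a hypothetical class name and helper methods chosen only for illustration:

```java
import java.util.Arrays;

// Hypothetical demo class; the names are illustrative only.
class TrailingEmptyDemo {

    // Default behavior (limit 0): trailing empty strings are discarded.
    static String[] splitDefault(String s) {
        return s.split("\\.");
    }

    // A negative limit keeps every field, including trailing empty ones.
    static String[] splitKeepAll(String s) {
        return s.split("\\.", -1);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitDefault("a.b.."))); // [a, b]
        System.out.println(Arrays.toString(splitKeepAll("a.b.."))); // [a, b, , ]
    }
}
```

Note that the dot is escaped in both calls, because `split` interprets its argument as a regular expression.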

Better and safer alternatives to StringTokenizer are:

A much better option is StrTokenizer in org.apache.commons.lang3.text ... it has much more flexibility, or

com.google.common.base.Splitter

Aleksandr M
  • Exactly. This issue cost our programmers tremendous effort to track down a 'butterfly effect' bug. Why did the Java team invent such terrible stuff in the first place?... Sigh! – Scott Chu Sep 21 '16 at 09:05
0
  1. StringTokenizer is not deprecated

  2. Its behavior and output are slightly different ...

For example, if you have "aaa.aa.aa" and want to split it into the parts "aaa", "aa" and "aa", you can just write:

new StringTokenizer("aaa.aa.aa", ".")

If you just use:

"aaa.aa.aa".split(".")

It returns an empty array, because split takes a regular expression, in which . is a special character that matches any character. So you have to escape it:

"aaa.aa.aa".split("\\.")

So basically, split lets you use a regex, which can be very useful.

But StringTokenizer parses text using literal delimiter characters, so a delimiter can even be a regex special character.
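
The three variants can be compared side by side. The following is a sketch only; the class `DotSplitDemo` and the helper `splitOnLiteral` are hypothetical. It uses `Pattern.quote`, a standard way to escape an arbitrary delimiter so that `split` treats it literally:

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
class DotSplitDemo {

    // Escapes the delimiter so split sees it as plain text rather than a regex.
    static String[] splitOnLiteral(String input, String delimiter) {
        return input.split(Pattern.quote(delimiter));
    }

    public static void main(String[] args) {
        // Unescaped "." matches every character, so nothing is left over.
        System.out.println("aaa.aa.aa".split(".").length);                    // 0
        // Escaping by hand splits on the literal dot.
        System.out.println(Arrays.toString("aaa.aa.aa".split("\\.")));        // [aaa, aa, aa]
        // Pattern.quote does the escaping for any delimiter string.
        System.out.println(Arrays.toString(splitOnLiteral("aaa.aa.aa", "."))); // [aaa, aa, aa]
    }
}
```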

Cryptjar
0

StringTokenizer is a legacy class that is retained for compatibility reasons although its use is discouraged in new code. It is recommended that anyone seeking this functionality use the split method of String or the java.util.regex package instead.

The following example shows how the String.split() method can be used to break up a string into its basic tokens:

 String[] result = "this is a test".split("\\s");
MSA
0

I don't think the String.split method is the reason, because split is a slow way to parse a string: it typically compiles a pattern internally.

StringTokenizer can simply be replaced with more capable classes like java.util.Scanner, or you can use a pattern matcher to extract the groups with a regexp.
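
As a rough sketch of those two alternatives (the class name, method names, and the comma-or-whitespace delimiter are assumptions chosen for illustration), the `Pattern` is compiled once and reused, which avoids recompiling it on every call:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the names and the delimiter choice are illustrative only.
class TokenizerAlternatives {

    // Compiled once: a token is any run of characters that are neither comma nor whitespace.
    private static final Pattern TOKEN = Pattern.compile("[^,\\s]+");

    // Scanner with a regex delimiter: one or more commas or whitespace characters.
    static List<String> withScanner(String input) {
        List<String> tokens = new ArrayList<>();
        try (Scanner scanner = new Scanner(input).useDelimiter("[,\\s]+")) {
            while (scanner.hasNext()) {
                tokens.add(scanner.next());
            }
        }
        return tokens;
    }

    // Matcher.find() walks the input and returns each token match in turn.
    static List<String> withMatcher(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher matcher = TOKEN.matcher(input);
        while (matcher.find()) {
            tokens.add(matcher.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(withScanner("X, Y,,Z")); // [X, Y, Z]
        System.out.println(withMatcher("X, Y,,Z")); // [X, Y, Z]
    }
}
```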

-3

Personally, I feel StringTokenizer was deprecated because it was simply a complex way of doing something pretty simple. StringTokenizer, as the name implies, only applies to Strings, so why not just make it a method in String? Further, StringTokenizer does not support regular expressions, which became extremely common in the late '90s and early '00s, hence rendering it practically useless.

Ali
  • 1
    You're right, it's not, it's just recommended that you don't use it. The difference being that there is no guarantee that deprecated classes will continue to be provided in future releases. – Ali Aug 08 '11 at 14:58
  • 5
    That's not correct either. There is an absolute guarantee of backwards binary compatibility. – user207421 Aug 09 '11 at 00:14