I have a big string of at most 100,000 characters. Instead of using `string.charAt(index)` to read a character from the string, I converted the string into a char array with the `string.toCharArray()` method, and I am now working with `charArray[index]`, which takes less time than `string.charAt(index)`. However, I want to know: is there any other way that is faster than the `string.toCharArray()` method?
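
For reference, here is a minimal timing sketch of the two access patterns described above (the class name and setup are assumptions for illustration). A serious measurement would use a harness such as JMH, since JIT warm-up and dead-code elimination can easily distort a naive loop like this:

```java
public class CharAccessTiming {
    public static void main(String[] args) {
        // Build a 100,000-character string of lower-case letters, as in the question.
        StringBuilder sb = new StringBuilder(100000);
        for (int i = 0; i < 100000; i++) {
            sb.append((char) ('a' + (i % 26)));
        }
        String s = sb.toString();

        long sum = 0;
        long t0 = System.nanoTime();
        for (int i = 0; i < s.length(); i++) {
            sum += s.charAt(i);              // bounds-checked access on every call
        }
        long t1 = System.nanoTime();

        char[] chars = s.toCharArray();      // one O(n) copy of the backing array
        for (int i = 0; i < chars.length; i++) {
            sum += chars[i];                 // plain array indexing afterwards
        }
        long t2 = System.nanoTime();

        System.out.println("charAt:      " + (t1 - t0) + " ns");
        System.out.println("toCharArray: " + (t2 - t1) + " ns (including the copy)");
        System.out.println("checksum: " + sum); // keeps the loops from being optimized away
    }
}
```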
Asked by ravi
- How did you determine that `string.charAt(index)` is slower? I wouldn't think it would be. – Louis Wasserman Mar 24 '12 at 11:35
- For your convenience, may I suggest using [StringReader](http://docs.oracle.com/javase/6/docs/api/java/io/StringReader.html) (a short sketch follows these comments). – Jakub Zaverka Mar 24 '12 at 11:37
- I measured with `System.currentTimeMillis()`; `string.charAt(index)` also uses array indexing internally, so it is better to have the array directly. – ravi Mar 24 '12 at 13:22
- @Ravi Joshi: *"using string.charAt(index) to read a character from the string"*... String's *charAt* does *not* read a character from the String. It reads a Java *char*, which is inadequate for holding every Unicode character: since Java 1.4, a character may need more than one Java *char* to be represented. A website like Stack Overflow, for example, fully supports Unicode and all the Unicode code points; Java's *char* primitive does not. – TacticalCoder Mar 24 '12 at 14:33
- @TacticalCoder: Thank you for this information; I was not aware of that fact. However, in my case the string is composed only of lower-case letters, i.e. a-z. – ravi Mar 24 '12 at 20:30
- @TacticalCoder: what you say is wrong. A char primitive IS a Unicode character. Maybe you are confusing it with the byte primitive? From the official doc: "The char data type is a single 16-bit Unicode character." Source: http://docs.oracle.com/javase/tutorial/java/nutsandbolts/datatypes.html Example: `char rr = '華';` – Pierre Henry Nov 01 '13 at 14:34
- @Pierre Henry: no, I am not confusing anything ;) Many Unicode code points need two Java chars to be encoded. By using the .charAt(...) method on such a code point, you would be reading only part of it. That is why, in this day and age, methods like *charAt* and *length* are mostly broken; you want to use *codePointAt* instead (see the sketch after these comments). Example: how do you put the character U+1040B inside a Java *char*? You simply can't. See the answer from a 100K+ SO user here: http://stackoverflow.com/questions/12280801 (*"...a Java char holds a UTF-16 code unit instead of a Unicode character..."*) – TacticalCoder Nov 03 '13 at 17:45
- Yes, you are right, sorry about that; I was convinced that Unicode used at most 16 bits. Thanks for pointing this out. I am not looking forward to having to work with those "astral" planes ;) – Pierre Henry Nov 07 '13 at 15:42
- This problem has already been discussed on Stack Overflow: http://stackoverflow.com/questions/8894258/fastest-way-to-iterate-over-all-the-chars-in-a-string – it's me Sep 04 '14 at 16:38
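
Sketching Jakub Zaverka's `StringReader` suggestion from the comments (class name assumed): it is convenient for sequential reads, though it still hands back one `char` at a time and is unlikely to beat raw array indexing.

```java
import java.io.IOException;
import java.io.StringReader;

public class StringReaderLoop {
    public static void main(String[] args) throws IOException {
        StringReader reader = new StringReader("abcxyz");
        int c;
        while ((c = reader.read()) != -1) { // read() returns -1 at end of stream
            System.out.print((char) c);
        }
        reader.close();
    }
}
```

And a sketch of the code-point iteration TacticalCoder describes, using `codePointAt` and `Character.charCount` (both available since Java 5). The sample string is assumed for illustration and includes U+1040B, which needs a surrogate pair of two `char`s:

```java
public class CodePointLoop {
    public static void main(String[] args) {
        // 'a', '華' (U+83EF), U+1040B (encoded as a surrogate pair), 'b'
        String s = "a\u83EF\uD801\uDC0Bb";
        System.out.println("chars:       " + s.length());                      // 5
        System.out.println("code points: " + s.codePointCount(0, s.length())); // 4

        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);    // full code point, even across a pair
            System.out.printf("U+%04X%n", cp);
            i += Character.charCount(cp); // advance by 1 or 2 chars
        }
    }
}
```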
1 Answer
I do not think there is a faster way. But please correct me!
A String instance is backed by a char array. `charAt()` performs index checks, which may be why it is slower than working with the array returned by `toCharArray()`; `toCharArray()` itself simply does a `System.arraycopy()` of the backing array.
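
A quick way to see the copy semantics described above (class name assumed): `toCharArray()` returns a fresh array, so mutating it never affects the String itself.

```java
public class CopySemantics {
    public static void main(String[] args) {
        String s = "abc";
        char[] chars = s.toCharArray(); // a copy of the backing array, not the array itself
        chars[0] = 'z';                 // modify only the copy
        System.out.println(s);                 // still "abc"
        System.out.println(new String(chars)); // "zbc"
    }
}
```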

Answered by nansen
- While using `string.charAt(index)`, is a `char[string.length()]` created every time? If so, that may be the reason for its lower performance. – ravi Mar 24 '12 at 13:26
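
For what it's worth, `charAt` does not allocate a new array per call. Paraphrasing the JDK 6-era `java.lang.String` source (which still had the `offset` and `count` fields, removed later in Java 7u6), it is just a bounds check plus a single array read:

```java
// Paraphrase of the pre-Java-7u6 implementation; no allocation happens here.
public char charAt(int index) {
    if (index < 0 || index >= count) {
        throw new StringIndexOutOfBoundsException(index);
    }
    return value[index + offset];
}
```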