Using a profiler, I seem to be seeing the following with Apple's 1.6 Java:
I start with a moderately long Java string and split it into tokens using String.split("\\W+"). The code then holds references to some of the split-up pieces.
If I believe what I see in YourKit, Java has helpfully not copied these strings, so each piece I hold in fact keeps the lengthy original alive. In my case this wastes a rather large amount of space.
Does this seem plausible? It's easy enough to add a loop making copies of these guys.
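For reference, the copying loop I have in mind would look roughly like the sketch below. The names TokenCopy and splitAndCopy are made up for illustration. It also assumes that new String(String) allocates storage sized to the token rather than sharing the original's character array, which I have not confirmed on this JVM.

```java
import java.util.ArrayList;
import java.util.List;

public class TokenCopy {

    // Hypothetical helper: splits on runs of non-word characters and
    // returns a fresh copy of every token.
    static List<String> splitAndCopy(String text) {
        String[] pieces = text.split("\\W+");
        List<String> copies = new ArrayList<String>(pieces.length);
        for (String piece : pieces) {
            // Assumption: the new String owns only its own characters and
            // does not keep the long original reachable.
            copies.add(new String(piece));
        }
        return copies;
    }

    public static void main(String[] args) {
        List<String> tokens = splitAndCopy("a moderately long string, split into tokens");
        System.out.println(tokens);
    }
}
```

If only a few of the pieces are retained, copying just those at the point where they are stored would avoid allocating copies of tokens that are thrown away immediately.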