50

It's common knowledge that Java Strings are immutable. Immutable Strings have been a great part of Java since its inception. Immutability allows fast access and a lot of optimizations, is significantly less error-prone than C-style strings, and helps enforce the security model.

It's possible to create a mutable one without using any of the following hacks:

  • java.lang.reflect
  • sun.misc.Unsafe
  • Classes in bootstrap classloader
  • JNI (or JNA as it requires JNI)

But is it possible in just plain Java, so that the string can be modified at any time? The question is: how?

Jakub Vrána
bestsss
  • java has no resizable arrays. all arrays `length` is final and immutable once instantiated. (`length` is not a field, though) – bestsss Jun 21 '12 at 20:33
  • You mean something different from `StringBuilder`, which is the recommended way of simulating mutability? – Gene Jun 21 '12 at 20:36
  • 4
    You have asserted that there exists a method to do this. Do you know that for a fact? Is this some kind of puzzle? – Greg Hewgill Jun 21 '12 at 20:36
  • @GregHewgill, sure it is – bestsss Jun 21 '12 at 20:37
  • What do you mean by "mutable string"? Any object of class java.lang.String will be immutable. So that can't be it. Writing a class that behaves mostly like String but is mutable is trivial. However, addition and literals in the source files won't work. So where are you going with this? – Jochen Jun 21 '12 at 20:49
  • @Jochen, the question clearly states, `java.lang.String` and mutable. I don't know how it can be stated differently. – bestsss Jun 21 '12 at 20:52
  • @bestsss Looking forward to your solution ;-) – Jochen Jun 21 '12 at 20:58
  • Do you allow byte code manipulation and/or serialization tricks ? – Emmanuel Bourg Jun 21 '12 at 21:01
  • Could I put an un-bounty on this question? Anything that can do this is evil. – Louis Wasserman Jun 21 '12 at 21:01
  • @EmmanuelBourg, absolutely, as long as you do not put classes in java.lang via the bootstrap classloader. And you have to modify the instance of the string, not the serialized one (that is useless) - the idea is just to be able to print differently `final String s=createModifiableString(); System.out.println(s); modify(s); System.out.println(s)` The second line has to differ. – bestsss Jun 21 '12 at 21:05
  • @LouisWasserman, the question is about raising alertness I'd say – bestsss Jun 21 '12 at 21:06
  • @Jochen, I'd wait for the answers first; I'd put my solution at the end w/ even more twists... – bestsss Jun 21 '12 at 21:07
  • 5
    This might have been a fit for http://codegolf.stackexchange.com/faq but I feel it's off topic here. Too bad one [cannot close while the bounty is active](http://meta.stackexchange.com/questions/121448/allow-users-to-vote-to-close-bountied-questions). – Arjan Jun 24 '12 at 11:41
  • 2
    @Arjan, you can always flag the question or edit. Close is rarely a good option – bestsss Jun 25 '12 at 07:09
  • Why can't we use something like this? myString = myString.replaceAll("", ""); – Sri Harsha Chilakapati Jun 30 '12 at 01:15
  • @SriHarshaChilakapati, that results in a new instance - you are confusing references with the real object they point to. – bestsss Jul 01 '12 at 11:01
  • @bestsss I am creating a new instance of the String class and changing the reference of old instance to point new one. Hence there will be no references for the old string and it is garbage collected. No memory problems arise in this case. – Sri Harsha Chilakapati Jul 02 '12 at 00:40
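
To make the reference-versus-object point from the last few comments concrete, here is a minimal sketch (the class and method names are made up for illustration). Reassigning a variable after replaceAll leaves the original String object untouched, which is visible through any other reference to it:

```java
class ReferenceVsObject {

    // Returns what a second reference sees after the first variable is reassigned.
    static String reassign() {
        String original = "abc";
        String alias = original;                  // second reference to the same object
        original = original.replaceAll("a", "x"); // builds a NEW String; the old object is unchanged
        return alias + "/" + original;
    }

    public static void main(String[] args) {
        System.out.println(reassign()); // prints "abc/xbc"
    }
}
```

The question asks for the opposite behaviour: the text seen through alias should change without alias ever being reassigned.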

6 Answers

85

By creating a java.lang.String with the Charset constructor, you can inject your own Charset, which brings its own CharsetDecoder. The CharsetDecoder gets a reference to a CharBuffer object in the decodeLoop method. The CharBuffer wraps the char[] of the String being created. Since the CharsetDecoder keeps a reference to it, you can change the underlying char[] through the CharBuffer, and thus you have a mutable String.

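import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.util.concurrent.atomic.AtomicReference;

import org.junit.Test;

public class MutableStringTest {

    // http://stackoverflow.com/questions/11146255/how-to-create-mutable-java-lang-string#11146288
    @Test
    public void testMutableString() throws Exception {
        final String s = createModifiableString();
        System.out.println(s);
        modify(s);
        System.out.println(s);
    }

    // Holds the CharBuffer that the decoder was handed while the String was being built
    private final AtomicReference<CharBuffer> cbRef = new AtomicReference<CharBuffer>();

    private String createModifiableString() {
        Charset charset = new Charset("foo", null) {
            @Override
            public boolean contains(Charset cs) {
                return false;
            }

            @Override
            public CharsetDecoder newDecoder() {
                CharsetDecoder cd = new CharsetDecoder(this, 1.0f, 1.0f) {
                    @Override
                    protected CoderResult decodeLoop(ByteBuffer in, CharBuffer out) {
                        // Keep the output buffer; it wraps the char[] the String will use
                        cbRef.set(out);
                        while (in.remaining() > 0) {
                            out.append((char) in.get());
                        }
                        return CoderResult.UNDERFLOW;
                    }
                };
                return cd;
            }

            @Override
            public CharsetEncoder newEncoder() {
                return null; // encoding is never needed for this trick
            }
        };
        return new String("abc".getBytes(), charset);
    }

    private void modify(String s) {
        // Overwrite the captured buffer from the start; s itself is never reassigned
        CharBuffer charBuffer = cbRef.get();
        charBuffer.position(0);
        charBuffer.put("xyz");
    }

}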

Running the code prints

abc
xyz

I don't know how to correctly implement decodeLoop(), but I don't care right now :)

Francisco Spaeth
mhaller
  • lovely, this is the correct answer! Due to this 'feature' using new String(byte[], offset, len, Charset) totally blows also b/c the byte[] is copied entirely - i.e. using 1MB buffer and creating small string kills any performance. – bestsss Jun 21 '12 at 22:18
  • 7
    The good news it's not security vulnerability if `System.getSecurityManager()` is present as the returned `char[]` is copied. – bestsss Jun 21 '12 at 22:20
  • @Spaeth, it is very very mutable, the object itself DOES change its state – bestsss Jun 26 '12 at 08:26
  • Maybe there is a way to use outer list of char instead of inner char array? – alaster Jun 27 '12 at 15:32
  • 1
    Why has this answer a downvote? Did someone not like the idea of a mutable String? ;-) – DerMike Dec 18 '13 at 17:23
  • I have tested this code and it works in java 8 but in java 11 it does not mutate the string – Alex Shavlovsky Dec 09 '21 at 07:35
9

The question received a good answer by @mhaller. I'd say the so-called-puzzle was pretty easy and by just looking at the available c-tors of String one should be able to find out the how part, a

Walkthrough

The c-tor of interest is below. If you are trying to break in, crack something, or look for a security vulnerability, always look for arbitrary non-final classes that the code accepts. The case here is java.nio.charset.Charset.


//String
public String(byte bytes[], int offset, int length, Charset charset) {
    if (charset == null)
        throw new NullPointerException("charset");
    checkBounds(bytes, offset, length);
    char[] v = StringCoding.decode(charset, bytes, offset, length);
    this.offset = 0;
    this.count = v.length;
    this.value = v;
}

The c-tor offers a supposedly fast way to convert a byte[] to a String by passing the Charset itself rather than the charset name, which avoids the charsetName->charset lookup. It also allows passing an arbitrary Charset object to create a String. The main routine of a Charset converts the content of a java.nio.ByteBuffer to a CharBuffer. The CharBuffer may hold a reference to a char[], available via array(), and the CharBuffer is fully modifiable.

    //StringCoding
    static char[] decode(Charset cs, byte[] ba, int off, int len) {
        StringDecoder sd = new StringDecoder(cs, cs.name());
        byte[] b = Arrays.copyOf(ba, ba.length);
        return sd.decode(b, off, len);
    }

    //StringDecoder
    char[] decode(byte[] ba, int off, int len) {
        int en = scale(len, cd.maxCharsPerByte());
        char[] ca = new char[en];
        if (len == 0)
            return ca;
        cd.reset();
        ByteBuffer bb = ByteBuffer.wrap(ba, off, len);
        CharBuffer cb = CharBuffer.wrap(ca);
        try {
            CoderResult cr = cd.decode(bb, cb, true);
            if (!cr.isUnderflow())
                cr.throwException();
            cr = cd.flush(cb);
            if (!cr.isUnderflow())
                cr.throwException();
        } catch (CharacterCodingException x) {
            // Substitution is always enabled,
            // so this shouldn't happen
            throw new Error(x);
        }
        return safeTrim(ca, cb.position(), cs);
    }
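
The CharBuffer.wrap(ca) call in StringDecoder.decode creates a view over ca rather than a copy, which is what a custom decoder can exploit. A small sketch of that sharing, independent of String (class and method names are illustrative only):

```java
import java.nio.CharBuffer;

class WrappedCharBuffer {

    // Writes through the buffer and through array(), then reads the original array back.
    static String writeThroughBuffer() {
        char[] backing = {'a', 'b', 'c'};
        CharBuffer buffer = CharBuffer.wrap(backing); // no copy: the buffer is a view of 'backing'
        buffer.put(0, 'x');                            // absolute put, position is not moved
        buffer.array()[2] = 'z';                       // array() hands back the very same char[]
        return new String(backing);
    }

    // True when the wrapped buffer exposes the identical array instance.
    static boolean sharesArray() {
        char[] backing = new char[4];
        return CharBuffer.wrap(backing).array() == backing;
    }

    public static void main(String[] args) {
        System.out.println(writeThroughBuffer()); // prints "xbz"
        System.out.println(sharesArray());        // prints "true"
    }
}
```

Anyone who keeps the CharBuffer keeps write access to the array, and that array may end up inside the String.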

In order to prevent altering the char[], the Java developers copy the array, much like in any other String construction (for instance public String(char value[])). However, there is an exception: if no SecurityManager is installed, the char[] is not copied.
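
For comparison, a quick sketch of the defensive copy made by the ordinary char[] constructor (illustrative names, not JDK code). Changing the source array afterwards has no effect on the String:

```java
class DefensiveCopy {

    // Builds a String from a char[] and then tampers with the source array.
    static String copyOnConstruct() {
        char[] source = {'a', 'b', 'c'};
        String s = new String(source); // the constructor copies the array
        source[0] = 'x';               // only the caller's array changes
        return s;
    }

    public static void main(String[] args) {
        System.out.println(copyOnConstruct()); // prints "abc"
    }
}
```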

    //Trim the given char array to the given length
    //
    private static char[] safeTrim(char[] ca, int len, Charset cs) {
        if (len == ca.length 
                && (System.getSecurityManager() == null
                || cs.getClass().getClassLoader0() == null))
            return ca;
        else
            return Arrays.copyOf(ca, len);
    }

So if there is no SecurityManager it's absolutely possible to have a modifiable CharBuffer/char[] that's being referenced by a String.

Everything looks fine so far, except that the byte[] is also copied (the Arrays.copyOf(ba, ba.length) call in StringCoding.decode above). This is where the Java developers went lazy and massively wrong.

The copy is necessary to prevent a rogue Charset (example above) from being able to alter the source byte[]. However, imagine having a byte[] buffer of around 512KB that contains a few Strings. Attempting to create a single small String of a few chars, new String(buf, position, 32, charset), results in a massive 512KB byte[] copy. If the buffer were 1KB or so, the impact would never be truly noticed. With large buffers, though, the performance hit is really huge. The simple fix would be to copy only the relevant part.
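
The "copy only the relevant part" fix could look roughly like the following. This is a caller-side sketch under my own assumptions, not the JDK's code: decodeSlice is a hypothetical helper, and the String constructor may still make its own internal copy, but then only of the small slice.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class SliceDecode {

    // Hypothetical helper: copy just [offset, offset + length) before decoding,
    // so a possibly untrusted Charset never sees (or forces a copy of) the big buffer.
    static String decodeSlice(byte[] buf, int offset, int length, Charset charset) {
        byte[] slice = Arrays.copyOfRange(buf, offset, offset + length);
        return new String(slice, 0, slice.length, charset);
    }

    // Places a short text inside a 512KB buffer and decodes only that part.
    static String demo() {
        byte[] buf = new byte[512 * 1024];
        byte[] text = "hello".getBytes(StandardCharsets.US_ASCII);
        System.arraycopy(text, 0, buf, 100, text.length);
        return decodeSlice(buf, 100, text.length, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "hello"
    }
}
```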

...or, well, the designers of java.nio thought about this by introducing read-only Buffers. Simply calling ByteBuffer.asReadOnlyBuffer() would have been enough (if charset.getClass().getClassLoader() != null)*. Sometimes even the guys working on java.lang can get it totally wrong.

*Class.getClassLoader() returns null for bootstrap classes, i.e. the ones coming with the JVM itself.
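
A short sketch of what a read-only view gives you (illustrative names): writes through the view are rejected and the backing array is not exposed, while reads still work without copying anything.

```java
import java.nio.ByteBuffer;
import java.nio.ReadOnlyBufferException;

class ReadOnlyView {

    // True when a write through the read-only view is refused.
    static boolean writeRejected() {
        byte[] data = {1, 2, 3};
        ByteBuffer view = ByteBuffer.wrap(data).asReadOnlyBuffer();
        try {
            view.put(0, (byte) 9);
            return false;
        } catch (ReadOnlyBufferException expected) {
            return data[0] == 1; // the source array is untouched
        }
    }

    // True when the view refuses to hand out its backing array.
    static boolean arrayHidden() {
        ByteBuffer view = ByteBuffer.wrap(new byte[3]).asReadOnlyBuffer();
        return !view.hasArray();
    }

    // Reads still go straight to the shared content.
    static byte firstByte() {
        return ByteBuffer.wrap(new byte[] {7, 8}).asReadOnlyBuffer().get(0);
    }

    public static void main(String[] args) {
        System.out.println(writeRejected()); // prints "true"
        System.out.println(arrayHidden());   // prints "true"
        System.out.println(firstByte());     // prints "7"
    }
}
```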

Mechanical snail
5

I would say StringBuilder (or StringBuffer for multithreaded use). Yes, at the end you get an immutable String. But that's the way to go.

For example the best way to append Strings in a loop is to use StringBuilder. Java itself uses StringBuilder when you use "fu " + variable + " ba".

http://docs.oracle.com/javase/6/docs/api/java/lang/StringBuilder.html

new StringBuilder().append(blub).append(5).append("dfgdfg").toString();
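
A slightly fuller sketch of the same idea (the class and method names are only for illustration): the builder is appended to in a loop and changed in place, and an immutable String is produced only at the end.

```java
class BuilderDemo {

    // Joins the parts with spaces, then upper-cases the first character in place.
    static String join(String[] parts) {
        StringBuilder sb = new StringBuilder();
        for (String part : parts) {
            sb.append(part).append(' ');
        }
        if (sb.length() > 0) {
            sb.setCharAt(0, Character.toUpperCase(sb.charAt(0))); // mutates the builder itself
        }
        return sb.toString().trim(); // the resulting String is immutable again
    }

    public static void main(String[] args) {
        System.out.println(join(new String[] {"fu", "ba"})); // prints "Fu ba"
    }
}
```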

keiki
  • 1
    that's not String at any rate, CharSequence at best. – bestsss Jun 21 '12 at 20:36
  • a String is a CharSequence (thats why String implements Charsequence^^). – keiki Jun 21 '12 at 20:44
  • 2
    No, String is a **final** class. CharSequence is an **interface**. On similar grounds both extend (indirectly for StringBuilder/Buffer) java.lang.Object. The question is about `java.lang.String` precisely. – bestsss Jun 21 '12 at 20:46
  • This still generates a String, however StringBuilder implements CharSequence. So you can often use the StringBuilder in place of a string, giving you a Mutable CharSequence that can avoid GC and such (I like to print a lot of strings very quickly sometimes and don't want GC to be a performance issue) – HaMMeReD Feb 20 '17 at 20:56
2
// How to achieve String mutability via reflection.
// Note: the "count" field only exists in older JDKs (it was removed during JDK 7),
// so getDeclaredField("count") fails with NoSuchFieldException on later versions.

import java.lang.reflect.Field;

public class MutableString {

    public static void main(String[] args) {
        String s = "Hello";

        mutate(s);
        System.out.println(s);
    }

    public static void mutate(String s) {
        try {
            String t = "Hello world";
            Field val = String.class.getDeclaredField("value");
            Field count = String.class.getDeclaredField("count");
            val.setAccessible(true);
            count.setAccessible(true);

            // Make s adopt t's length and t's backing char[]
            count.setInt(s, t.length());
            val.set(s, val.get(t));
        }
        catch (Exception e) { e.printStackTrace(); }
    }

}
jitendrak
  • 2
    I guess the part about java.lang.reflect in the question has escaped you. The code will fail on JDK 7+ also – bestsss Mar 19 '15 at 20:24
0

Don't reinvent the wheel. Apache Commons Lang provides just that.

MutableObject<String> mutableString = new MutableObject<>();
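
To show what such a wrapper does without pulling in the library, here is a minimal sketch. Holder is a hypothetical stand-in written for this example, not the Apache class. Note that only the holder changes; each String it points to stays immutable, so this is a mutable reference rather than a mutable java.lang.String.

```java
// Hypothetical stand-in for a mutable wrapper; not the Apache Commons implementation.
class Holder<T> {
    private T value;

    Holder(T value) { this.value = value; }

    T getValue() { return value; }

    void setValue(T value) { this.value = value; }
}

class HolderDemo {

    // Swaps the wrapped String and reports the old and new contents.
    static String demo() {
        Holder<String> holder = new Holder<>("abc");
        String before = holder.getValue(); // plain reference to the first String
        holder.setValue("xyz");            // the holder now points at a different String
        return before + "/" + holder.getValue();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "abc/xyz"
    }
}
```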
Roland Ettinger
-2

A simpler way is to swap the bootstrap class path of java and javac

1) Go to the JDK installation and copy rt.jar and src.zip to a separate folder

2) Unpack String.java from the sources zip and change the private field value (the inner char array) to public

public final class String
    implements java.io.Serializable, Comparable<String>, CharSequence {
    /** The value is used for character storage. */
    public final char value[];

3) Compile modified String.java with help of javac:

javac String.java

4) Put the compiled String.class and the other compiled classes into the rt.jar in this directory

5) Create a test class that uses the formerly private String field

package exp;

    class MutableStringExp { 

        public static void main(String[] args) {
            String letter = "A";
            System.out.println(letter);
            letter.value[0] = 'X';
            System.out.println(letter);
        }
    }

6) Create an empty directory named target and compile the test class

javac -Xbootclasspath:rt.jar -d target MutableStringExp.java

7) Run it

java -Xbootclasspath:rt.jar -cp "target" exp.MutableStringExp

output is:

A
X

P.S. This will only work with the modified rt.jar, and using this option to override rt.jar is a violation of the JRE licence.

fxrbfg