10

While studying Java, I learned that the proper way to compare two Strings is to use equals and not "==". This code

static String s1 = "a";
static String s2 = "a";
System.out.println(s1 == s2);  

will output true, apparently because the JVM has optimized this code so that both variables actually point to the same address. I tried to prove this using a great post I found here

http://javapapers.com/core-java/address-of-a-java-object/

but the addresses don't seem to be the same. What am I missing?

import sun.misc.Unsafe;
import java.lang.reflect.Field;
public class SomeClass {
    static String s1 = "a";
    static String s2 = "a";
    public static void main (String args[]) throws Exception {
        System.out.println(s1 == s2); //true

        Unsafe unsafe = getUnsafeInstance();
        Field s1Field = SomeClass.class.getDeclaredField("s1");
        System.out.println(unsafe.staticFieldOffset(s1Field)); //600

        Field s2Field = SomeClass.class.getDeclaredField("s2");
        System.out.println(unsafe.staticFieldOffset(s2Field)); //604

    }

    private static Unsafe getUnsafeInstance() throws SecurityException, 
        NoSuchFieldException, IllegalArgumentException, IllegalAccessException {
        Field theUnsafeInstance = Unsafe.class.getDeclaredField("theUnsafe");
        theUnsafeInstance.setAccessible(true);
        return (Unsafe) theUnsafeInstance.get(Unsafe.class);
    }
}

user584583

4 Answers

11

I think you're confused about what staticFieldOffset is returning. It's returning the offset of the pointer to the String instance, not the address of the String itself. Because there are two fields, they have different offsets: i.e., two pointers, which happen to have the same value.

A close reading of the Unsafe javadoc shows this:

Report the location of a given field in the storage allocation of its class. Do not expect to perform any sort of arithmetic on this offset; it is just a cookie which is passed to the unsafe heap memory accessors.

In other words, if you know where the actual Class instance is in memory, then you could add the offset returned by this method to that base address, and the result would be the location in memory where you could find the value of the pointer to the String.
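To check that the two fields hold the same pointer, one can read each field's value through Unsafe instead of printing its offset. The following is a minimal sketch, assuming a JDK where sun.misc.Unsafe still exposes staticFieldBase, staticFieldOffset and getObject (they are deprecated in recent releases and may emit warnings). The class name FieldValueDemo and the helper names are hypothetical, chosen only for illustration.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

public class FieldValueDemo {
    static String s1 = "a";
    static String s2 = "a";

    // Obtain the Unsafe singleton reflectively (same trick as in the question).
    static Unsafe unsafe() {
        try {
            Field theUnsafe = Unsafe.class.getDeclaredField("theUnsafe");
            theUnsafe.setAccessible(true);
            return (Unsafe) theUnsafe.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Read the reference stored in a static field: base object + offset cookie.
    static Object readStatic(Unsafe unsafe, String fieldName) {
        try {
            Field f = FieldValueDemo.class.getDeclaredField(fieldName);
            Object base = unsafe.staticFieldBase(f);   // where the class keeps its statics
            long offset = unsafe.staticFieldOffset(f); // differs per field
            return unsafe.getObject(base, offset);     // the value held in that slot
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // True when both field slots contain a reference to the same object.
    static boolean sameTarget() {
        Unsafe u = unsafe();
        return readStatic(u, "s1") == readStatic(u, "s2");
    }

    public static void main(String[] args) {
        System.out.println(sameTarget());
    }
}
```

The offsets of s1 and s2 differ because they are separate slots, but the values read from those slots are expected to be the same reference, since both fields were initialized from the same string literal.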

sharakan
  • @AndrewW at the very least, before you play with Unsafe! – sharakan May 31 '13 at 15:01
  • @AndrewW Knowledge of C is indeed very essential because most current programming languages derive a lot of their semantics and mechanics from C. But that's C, not C++. Not very accurate to mix them like that. On the contrary, C++ "teaches" way too many C++-specific stuff. – Theodoros Chatzigiannakis May 31 '13 at 15:23
  • @Theodoros I didn't mean for semantics. (Though I agree it's useful for that.) I meant for learning how pointers work, and C and C++ would both work for that. – Andrew W May 31 '13 at 15:28
  • @AndrewW Oh, yeah, for understanding how offsets work, I definitely agree. Everyone should understand pointer arithmetic and its applications (even if it's done "implicitly"), regardless of language. – Theodoros Chatzigiannakis May 31 '13 at 15:30
3

You aren't missing anything. The Unsafe library is reporting what is actually happening.

Bytecode:

static {};
  Code:
   0:   ldc #11; //String a
   2:   putstatic   #13; //Field s1:Ljava/lang/String;
   5:   ldc #11; //String a
   7:   putstatic   #15; //Field s2:Ljava/lang/String;
   10:  return

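Notice that both ldc instructions load the same constant (#11, the String "a"), while the two putstatic instructions store it into two different fields. The numbers #13 and #15 are constant-pool indices identifying the fields s1 and s2, not memory addresses.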

There is a difference between where the variables are stored in memory, which requires a separate address for each, and whether a new object is put on the heap. In this case, two separate addresses are assigned to the two variables, but no new String object needs to be created because both use the same String literal. So both variables reference the same String at this point.

If you want to get the address, you can use the answer found in this question: How can I get the memory location of a object in java?. Make sure you read the caveats before using it, but I did a quick test and it seems to work.
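If a raw address is not strictly needed, a lighter way to show that both variables reference one object is System.identityHashCode, which returns the same value whenever it is called on the same object. This is only a sketch: the identity hash is not a memory address, and two different objects can in principle share a hash, so equal hashes are supporting evidence rather than proof. The class name IdentityDemo is hypothetical.

```java
public class IdentityDemo {
    static String s1 = "a";
    static String s2 = "a";

    // True when both variables report the same identity hash.
    // Guaranteed to hold when they reference the same object.
    static boolean sameIdentity() {
        return System.identityHashCode(s1) == System.identityHashCode(s2);
    }

    public static void main(String[] args) {
        // A deliberately separate object with the same contents.
        String copy = new String("a");

        System.out.println(s1 == s2);          // same literal, same object
        System.out.println(sameIdentity());    // identity hashes agree
        System.out.println(s1 == copy);        // different object, so false
        System.out.println(s1.equals(copy));   // same contents, so true
    }
}
```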

greedybuddha
  • But if they're at different locations, why is `==` returning true? – aardvarkk May 31 '13 at 14:57
  • @aardvarkk because staticFieldOffset gives you the location of the `s1` and `s2` within `SomeClass`. These are different references, they even have different names, `s1` and `s2`. However both these references will point to the same string object, and that's the fact `==` will evaluate. – nos May 31 '13 at 15:04
  • @greedybuddha You wrote "However both these references will point to the same string object". I know this is accurate, but is there a way to prove it by outputting some address (without looking at the bytecode). – user584583 May 31 '13 at 15:13
  • See my updated answer, I linked an answer with code to get the address – greedybuddha May 31 '13 at 15:53
3

In the code above you are not comparing the addresses of the strings, but their "location of a given field in the storage allocation", i.e. the locations of the variables holding a reference to (the same) string.

rli
-4

String literals in Java source code are automatically interned.

So the result is the same as if you had called String.intern() manually.

    String a = "aa";
    String b = new String(a);
    System.out.println("aa" == "aa");
    System.out.println(a == b);
    System.out.println(a.equals(b));
    System.out.println(a.intern() == b.intern());

output:

true
false
true
true

Danubian Sailor