60

It's well known that GCs will sometimes move objects around in memory. It's my understanding that as long as all references are updated when the object is moved (before any user code is called), this should be perfectly safe.

However, I saw someone mention that reference comparison could be unsafe because the GC might move the object in the middle of a reference comparison, so that the comparison could fail even though both references refer to the same object.

I.e., is there any situation in which the following code would not print "true"?

Foo foo = new Foo();
Foo bar = foo;
if(foo == bar) {
    System.out.println("true");
}

I tried googling this and the lack of reliable results leads me to believe that the person who stated this was wrong, but I did find an assortment of forum posts (like this one) that seemed to indicate that he was correct. But that thread also has people saying that it shouldn't be the case.

Kat
  • Yes, good question, because in some unreasonable way java compiler doesn't optimize this comparison – Andremoniy Jan 12 '16 at 09:15
  • 2
    Related question: [Is Object == Object safe in C#](http://stackoverflow.com/questions/22747768/is-object-object-safe-in-c-sharp) (by the one guy who claims that it isn't safe in java) – CodesInChaos Jan 12 '16 at 11:39
  • 6
    @CodesInChaos [That linked article](http://windward.net/blogs/java-programmers-never-use-object-object/#.VpVboOgrLuq) is so ridiculous it's almost funny. It would be absolutely funny if it wouldn't mislead other people into actually believing any word that guy says. – Voo Jan 12 '16 at 20:06
  • That linked article fortunately also contains a good refutation of what its OP says. – PJTraill Jan 13 '16 at 11:43

8 Answers

47

Java Bytecode instructions are always atomic in relation to the GC (i.e. no cycle can happen while a single instruction is being executed).

The only time the GC will run is between two Bytecode instructions.

Looking at the bytecode that javac generates for the if instruction in your code we can simply check to see if a GC would have any effect:

// a GC here wouldn't change anything
ALOAD 1
// a GC cycle here would update all references accordingly, even the one on the stack
ALOAD 2
// same here. A GC cycle will update all references to the object on the stack
IF_ACMPNE L3
// this is the comparison of the two references. no cycle can happen while this comparison
// "is running" so there won't be any problems with this either

Additionally, even if the GC were able to run during the execution of a bytecode instruction, the references to the object would not change. It's still the same object before and after the cycle.

So, in short, the answer to your question is no: it will always print "true".

EOF
mhlz
  • 2
    An interesting argument. Let's extend the question to `if(foo == bar && foo == bar)` ;-) – Marco13 Jan 12 '16 at 11:11
  • Well, as I said in the last paragraph a well behaving GC shouldn't change the outcome of a reference comparison regardless of when it was executed ; ) I just thought I'd provide a bit more background about the garbage collection process. – mhlz Jan 12 '16 at 11:13
  • Sure, I think the question is answered, I was just playing devils advocate on the bytecode instruction argument ;-) No offense – Marco13 Jan 12 '16 at 11:15
  • 12
    This answer would be better if you quoted the section of the specification that guarantees atomicity w.r.t. the GC. – CodesInChaos Jan 12 '16 at 11:37
  • 6
    Modern JVMs don't execute bytecode so this is rather besides the point. Also "the references of the object would not change" is wrong for every single JVM I've ever seen (I don't think there is any JVM out there where the reference of an object isn't its address). – Voo Jan 12 '16 at 20:00
  • 2
    `The only time the GC will run is between two Bytecode instructions.` This is not the case because a) this is not guaranteed by any spec and b) bytecodes do not map to hardware instructions in practice. You are arguing from implementation details. They do not matter for this question and they are not temporally stable. What matters is what the Java language spec says. – usr Jan 12 '16 at 22:20
  • @Voo : There exists at least one JVM where references are doubly indirect. A reference is an index into a table of addresses. Certainly makes the GC's job easier since there can only ever be one copy of the address of an object. It's been a few years since I touched it and I don't recall enough details (right now) to identify it. – Eric Towers Jan 13 '16 at 05:36
  • @Eric Objective-C used to do (still does?) similar things, because it's rather necessary there, but not so much in Java. The performance overhead (two indirections for every object access!) is horrendous, though. I assume it was some research JVM that just wanted the easiest possible approach so it could focus on other problems? – Voo Jan 13 '16 at 07:39
  • 3
    @Voo: Even if modern JVM's don't execute bytecode, they have to behave _as if_ they executed byte code. – MSalters Jan 13 '16 at 08:37
  • 1
    @MSalters True, but then you just have moved the goalpost to showing that "A GC cycle will update all references to the object on the stack" is true. Instead of that roundabout argument you can just show that the JLS requires the comparison to be true (and optionally also link to the rule that forbids optimisations from changing observable program behavior). – Voo Jan 13 '16 at 09:19
  • 1
    Although this answer arrives at the right conclusion, it's not correct that the GC can only execute inbetween bytecode instructions. The obvious counterexample is [`new`](https://docs.oracle.com/javase/specs/jvms/se8/html/jvms-6.html#jvms-6.5.new), which may have to invoke the GC to free memory in order to allocate the new object. – Boann Jan 13 '16 at 21:41
  • So the reference in local variables will change after gc ? – shaoyihe Nov 03 '18 at 07:45
37

Source:

https://docs.oracle.com/javase/specs/jls/se8/html/jls-15.html#jls-15.21.3

The short answer, looking at the Java 8 specification, is: No.

The == operator will always perform an object equality check (given that neither reference is null). Even if the object is moved, the object is still the same object.

If you see such an effect, you have just found a JVM bug. Go submit it.

It could, of course, be that some obscure implementation of the JVM does not enforce this for whatever strange performance reason. If that is the case, it would be wise to simply move on from that JVM...
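A minimal sketch of the question's scenario with a GC request in between, assuming a trivial `Foo` class and hypothetical class and method names. `System.gc()` is only a hint to the JVM, but whether or not a collection (and any relocation) takes place, the comparison is required to be true.

```java
public class IdentityAcrossGc {

    static final class Foo {}

    /** Compares two references to one object after requesting a GC. */
    static boolean sameAfterGc() {
        Foo foo = new Foo();
        Foo bar = foo;

        // Create some short-lived garbage so a collector has work to do.
        for (int i = 0; i < 100_000; i++) {
            byte[] junk = new byte[1024];
        }

        // Only a request; the JVM may ignore it.
        System.gc();

        // Reference equality: both variables refer to the same object.
        return foo == bar;
    }

    public static void main(String[] args) {
        System.out.println(sameAfterGc()); // prints "true"
    }
}
```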

Erik Nyström
  • 6
    I don't see how §5.1.5 is relevant. The other reference is bang-on. Fundamentally, `==` is not comparing memory addresses, it's comparing object equality. As you say, a JVM that ever fails an `==` test because the address of the object changed is broken. :-) – T.J. Crowder Jan 12 '16 at 11:04
  • @T.J.Crowder Indeed. I also included 5.1.5 as it describes that the JVM is not supposed to do voodoo magic when the types of the objects may differ (i.e. an implicit cast is made). If others agree with you, I will edit my answer. – Erik Nyström Jan 12 '16 at 11:09
  • It seems you have misread the specification part you have linked. There is no exception for `null`, it’s still an object equality check if either or both operands are `null`. The specification mandates *either*, reference type or the null type for the operands. Basically, that excludes primitive types. – Holger Jan 14 '16 at 09:55
  • @Holger Indeed, although that is taken care of by the previous two clauses and https://docs.oracle.com/javase/specs/jls/se8/html/jls-5.html#jls-5.1.8. Perhaps it would be best to link to the entire 15.21, although the link does go straight for what the OP is asking about (two references). – Erik Nyström Jan 14 '16 at 18:48
  • I think, the direct link is perfect, the behavior for primitive types doesn’t need to be discussed for this question. Only your insertion “(given that neither reference is null)” is misleading. – Holger Jan 14 '16 at 19:22
12

TL;DR

You should not think about that kind of stuff whatsoever; it's a dark place. Java has clearly stated its specifications and you should not doubt them, ever.

2.7. Representation of Objects

The Java Virtual Machine does not mandate any particular internal structure for objects.

Source: JVMS SE8.

I doubt it! If you doubt this very basic operator, you may find yourself doubting everything else. Becoming frustrated and paranoid, with trust issues, is not where you want to be.

What if it happens to me? Such a bug should not exist. The Oracle discussion you linked reports a bug that happened years ago, and for some reason that discussion's OP decided to bring it up again, without any reliable documentation that such a bug still exists nowadays. However, if such a bug, or any other, does happen to you, please submit it here.

To put your worries to rest: Java has adapted the pointer-to-pointer approach into the JVM pointer table; you can read more about its efficiency here.

Community
homerun
11

GCs only happen at points in the program where the state is well-defined and the JVM has exact knowledge where everything is in registers/the stack/on the heap so all references can be fixed up when an object gets moved.

I.e. they cannot occur between arbitrary assembly instructions. Conceptually, you can think of them as occurring between bytecode instructions of the JVM, with the GC adjusting all references that have been generated by previous instructions.

the8472
5

You are asking a question with a wrong premise. Since the == operator does not compare memory locations, it isn’t sensitive to changes of memory location per se. The == operator, applied to references, compares the identity of the referred objects, regardless of how the JVM implements it.

To name an example that counteracts the usual understanding, a distributed JVM may have objects held in the RAM of different computers, including the possibility of local copies. So simply comparing addresses won’t work. Of course, it’s up to the JVM implementation to ensure that the semantics, as defined in the Java Language Specification, do not change.

If a particular JVM implementation implements a reference comparison by directly comparing memory locations of objects and has a garbage collector that can change memory locations, of course, it’s up to the JVM to ensure that these two features can’t interfere with each other in an incompatible way.
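As a small illustration of identity being independent of location, the following sketch checks two identity-based facilities around a GC request. The class and method names are made up for this example, and `System.gc()` is only a hint, so the collector may or may not actually move anything; the point is that the result must not depend on that.

```java
import java.util.IdentityHashMap;
import java.util.Map;

public class IdentityNotAddress {

    /**
     * Checks that identity-based lookups behave the same before and
     * after a GC request, whether or not the object was relocated.
     */
    static boolean identityIsStable() {
        Object key = new Object();

        // The identity hash code must stay the same for the object's lifetime.
        int hashBefore = System.identityHashCode(key);

        // IdentityHashMap compares keys with ==, not equals().
        Map<Object, String> map = new IdentityHashMap<>();
        map.put(key, "value");

        // Only a request; the JVM decides whether and how to collect.
        System.gc();

        int hashAfter = System.identityHashCode(key);
        return hashBefore == hashAfter && "value".equals(map.get(key));
    }

    public static void main(String[] args) {
        System.out.println(identityIsStable());
    }
}
```

If identity were tied to the current memory address, neither the stable hash code nor the `IdentityHashMap` lookup could be relied upon after a moving collection.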

If you are curious about how this can work, e.g. inside optimized, JIT-compiled code: the granularity isn’t as fine as you might think. Any straight-line code, including forward branches, can be considered to run fast enough that garbage collection can be delayed until that code has completed. So garbage collection can’t happen at arbitrary times inside optimized code, but must be allowed at certain points, e.g.

  • backward branches (note that due to loop unrolling, not every loop iteration implies a backward branch)
  • memory allocations
  • thread synchronization actions
  • invoking a method that hasn’t been inlined/analyzed
  • maybe something special, I forgot

So the JVM emits code containing certain “safe points” at which it is known which references are currently held and how to replace them if necessary, so that changing locations has no impact on correctness. Between these points, the code can run without having to care about the possibility of changing memory locations, whereas the garbage collector will wait for the code to reach a safe point when necessary, which is guaranteed to happen within a finite, rather short time.

But, as said, these are implementation details. On the formal level, things like changing memory locations do not exist, so there is no need to explicitly specify that they are not allowed to change the semantics of Java code. No implementation detail is allowed to do that.

Holger
1

I understand you are asking this question after someone says it behaves that way, but really asking if it does behave that way isn't the right approach to evaluating what they said.

What you should really be asking (primarily yourself, others only if you can't decide on an answer) is whether it makes sense for the GC to be allowed to cause a comparison to fail that logically should succeed (basically any comparison that doesn't include a weak reference).

The answer to that is obviously "no", as it would break pretty much anything beyond "hello, world" and probably even that.

So, if allowed, it is a bug -- either in the spec or the implementation. Now since both the spec and the implementation were written by humans, it is possible such a bug exists. If so, it will be reported and almost certainly fixed.
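The weak-reference caveat mentioned above can be sketched as follows. The class and method names are hypothetical, and `System.gc()` is only a hint. While a strong reference is held, the weak reference must still yield the same object; once only the weak reference remains, `get()` may return null, which is the one case where a comparison can legitimately change its result.

```java
import java.lang.ref.WeakReference;

public class WeakReferenceComparison {

    /** While 'strong' is in use, the referent cannot be cleared. */
    static boolean strongHolderSeesSameObject() {
        Object strong = new Object();
        WeakReference<Object> weak = new WeakReference<>(strong);

        System.gc(); // a request only

        // 'strong' is used here, so the object is strongly reachable
        // throughout and weak.get() must return that same object.
        return weak.get() == strong;
    }

    public static void main(String[] args) {
        System.out.println(strongHolderSeesSameObject());

        // No strong reference is kept: the referent may be collected.
        WeakReference<Object> orphan = new WeakReference<>(new Object());
        System.gc();
        // Whether this is null depends on what the collector did.
        System.out.println(orphan.get());
    }
}
```

The second print is deliberately left without an expected value, since clearing the referent is at the collector's discretion.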

jmoreno
0

No, because that would be flagrantly ridiculous and a patent bug.

The GC takes a great deal of care behind the scenes to avoid catastrophically breaking everything. In particular, it will only move objects when threads are paused at safepoints, which are specific places in the running code generated by the JVM for threads to be paused at. A thread at a safepoint is in a known state, where the positions of all the possible object references in registers and memory are known, so the GC can update them to point to the object's new address. Garbage collection won't break your comparison operations.

Boann
-9

Java variables hold a reference to the "object", not to the memory space where the object is stored.

Java does this because it allows the JVM to manage memory on its own (e.g. via the garbage collector) and to improve overall usage without directly impacting the client program.

As an example of such an improvement, the first X ints (I don't remember how many) are always allocated in memory so that for loops execute faster (e.g. for (int i = 0; i < 10; i++)).

And as an example of an object reference, just try to create an array and print it:

int[] i = {1,2,3};
System.out.println(i);

You will see that Java prints something starting with [I@. It says that this is an "array of int at", followed by the reference to the object. Not the memory zone!
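For reference, the text after the `@` is documented in `Object.toString()` as the hexadecimal form of the object's hash code, which for arrays is the identity hash code. The sketch below rebuilds that string by hand; the class and the `describe` helper are made up for this illustration.

```java
public class ArrayToStringDemo {

    /** Rebuilds what Object.toString() produces for an int array. */
    static String describe(int[] array) {
        // "[I" is the runtime class name of int[].
        return array.getClass().getName()
                + "@"
                + Integer.toHexString(System.identityHashCode(array));
    }

    public static void main(String[] args) {
        int[] i = {1, 2, 3};
        System.out.println(i);            // "[I@" followed by hex digits
        System.out.println(describe(i));  // same text as the line above
    }
}
```

The hex digits vary from run to run, so no concrete value is shown here.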

Kraiss
  • 6
    Your first two paragraphs are correct and I was having trouble seeing why you'd got three downvotes, but then you veer off-point sharply starting with the third paragraph. – T.J. Crowder Jan 12 '16 at 11:06