
While writing some code, I observed some unusual behaviour. I have an object obj1 that holds an ArrayList (list1) of objects of another class, Obj2. See the code for reference:

// comparator is any Comparator<Obj2>
PriorityQueue<Obj2> pq = new PriorityQueue<>(comparator);
pq.addAll(new ArrayList<>(obj1.getList()));
Obj2 obj2 = pq.poll();
obj2.setField("any value");
System.out.println(obj2);
System.out.println(obj1.getList().get(0));

Both println statements above print the new value.
Why is this happening? I changed the field through the obj2 reference polled from pq, not through obj1 itself.

If I added the list to pq directly, without new ArrayList<>(), it would be understandable for both references to point to the same object. But I created a new ArrayList to add to pq, and this still happens.

Mark Rotteveel
java user
  • You might create a second list, but all that does is contain copies of the references, effectively containing the same list elements. Creating a copy of the list does not create copies of each element in that list. – f1sh Aug 19 '21 at 12:28
  • @f1sh Okay so I assume to overcome this problem, we should create a new object and copy the values of obj1 to it? – java user Aug 19 '21 at 12:38

2 Answers


The JavaDoc for ArrayList(Collection<? extends E> c) says:

Constructs a list containing the elements of the specified collection, in the order they are returned by the collection's iterator.

It doesn't say "copy".

There's also no built-in "deep copy" mechanism in Java. Even if you use Object.clone (which in general you shouldn't), you only get a shallow copy: the references inside the object are copied, but the references themselves still point to the original contents.

For example:

class Obj {
  String a;
  int b;
  OtherObj c;
}

In memory this will look like this:

[Reference To String a] [Value of int b] [Reference to OtherObj c]

(Only primitive types are stored directly inside an object; everything else is an object type and is stored as a reference. Even the primitive wrappers like Integer are classes and are stored as references, but those wrappers are immutable, so you can't change their inner values.)

So if you create a copy of this object, you will get a new memory location for the copy, but that memory location will contain the same data: [Reference To String a] [Value of int b] [Reference to OtherObj c].
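As a minimal sketch of that layout: the classes below reuse the Obj and OtherObj names from above, but the constructor, the name field, and the shallowCopy() helper are assumptions made up for this illustration.

```java
class ShallowObjectCopyDemo {
    // Hypothetical nested type with one mutable field.
    static class OtherObj {
        String name = "shared";
    }

    static class Obj {
        String a;
        int b;
        OtherObj c;

        Obj(String a, int b, OtherObj c) {
            this.a = a;
            this.b = b;
            this.c = c;
        }

        // Hypothetical helper: a field-by-field (shallow) copy.
        // The int value is copied; for a and c only the references are copied.
        Obj shallowCopy() {
            return new Obj(a, b, c);
        }
    }

    // Returns true if the copy has its own int but shares the OtherObj.
    static boolean copySharesNestedObject() {
        Obj original = new Obj("text", 1, new OtherObj());
        Obj copy = original.shallowCopy();

        copy.b = 2;              // stored inside the copy, original.b stays 1
        copy.c.name = "changed"; // follows the shared reference

        return original.b == 1
                && original.c == copy.c
                && original.c.name.equals("changed");
    }

    public static void main(String[] args) {
        System.out.println(copySharesNestedObject()); // prints true
    }
}
```

The primitive b is independent in the copy, while c is one object seen through two references.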

The same happens with ArrayList. In memory it looks like this:

[Reference to Element 1] [Reference to Element 2] [Reference to Element 3] ...

And if you copy that list, you'll get a copy of that part in memory. But all the references will still point to the very same objects.
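A small sketch of the list case, assuming a made-up Element class with a single mutable field (not part of the question's code):

```java
import java.util.ArrayList;
import java.util.List;

class ShallowListCopyDemo {
    // Hypothetical element type, used only for this illustration.
    static class Element {
        String field;

        Element(String field) {
            this.field = field;
        }
    }

    // Mutates an element through the copied list and returns
    // what the ORIGINAL list sees afterwards.
    static String mutateThroughCopy() {
        List<Element> original = new ArrayList<>();
        original.add(new Element("old"));

        // New list object, but it holds the same element references.
        List<Element> copy = new ArrayList<>(original);

        copy.get(0).field = "new";
        return original.get(0).field;
    }

    // The two lists themselves are independent containers.
    static int originalSizeAfterAddingToCopy() {
        List<Element> original = new ArrayList<>();
        original.add(new Element("old"));
        List<Element> copy = new ArrayList<>(original);
        copy.add(new Element("extra"));
        return original.size();
    }

    public static void main(String[] args) {
        System.out.println(mutateThroughCopy());             // prints new
        System.out.println(originalSizeAfterAddingToCopy()); // prints 1
    }
}
```

Adding to or removing from the copy leaves the original list alone; only changes to the shared elements show up in both.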

This all may change with the introduction of Project Valhalla and Value types. But that may still take months or years.

Benjamin M
  • Thanks for detailed explanation, so how do we overcome this behaviour? By creating a whole new copy of object using deep copy ? – java user Aug 19 '21 at 12:48
  • If you really want a deep copy, you have basically 2 options: `1.` DIY: Create some methods that will do the job. Or `2.` use an Object Mapper: Those tools are meant to map between different data representations, but most of them can also create a deep copy. `...` Or you can use immutable data structures, which force you to create new objects for every data modification. If you're using Java 16 or later (records were a preview feature in Java 14 and 15), you can use `record` instead of `class` for your data, which makes the fields final (shallowly immutable). But again: `record`s have no built-in feature for copying. – Benjamin M Aug 19 '21 at 12:56

How does java handle object references while dealing with ArrayLists?

It's all references. Everything. All the way down.

Except primitives. The primitive types are int, long, double, float, byte, short, boolean, and char. That's it. The list is hardcoded and you can't make new primitive types.

So, aside from those, it is all pointers. When you write:

MyFoo foo = new MyFoo();

That's just syntax sugar for 2 separate statements:

MyFoo foo;
foo = new MyFoo();

Imagine the heap is a gigantic whiteboard.

So what's happening here is:

  1. MyFoo foo; This makes a little postit for yourself. This postit is named 'foo', it is yours, you can't hand it to anybody else ('local variable' - hence the name 'local'), and it has just enough room to write a coordinate for that gigantic whiteboard, that's all it can hold. It is blank, for now.
  2. new MyFoo() this goes to the whiteboard, finds some blank space on it anywhere, and writes a box, and then in that box, room for all the fields of your MyFoo class. If any fields are non-primitive, it's just enough room to write coordinates. (Each and every object is its own little box on this whiteboard and could be anywhere on it).
  3. The expression new MyFoo() resolves to the coordinates of where you made that box. You then assign this to foo, so, copy down the location of that box on your little postit.

If you then do:

someMethod(foo);

What that does is: Grabs a new postit, copies those coordinates over to the new postit, and then hands the postit off to someMethod. Specifically:

  • Even if someMethod changes foo directly (foo =), that is: "They scratch out what was on the postit you gave them and write something new on it", which obviously has no effect whatsoever on your postit.
  • Once that method is done, they burn the postits. You never get them back. Which is fine, you gave them a copy.
  • If they FOLLOW the coordinates on that postit and take out their pen and edit the whiteboard, and then later on you follow YOUR postit, you will observe what they changed! . and [] are the dereference operators: That's java-ese for: "Take those coordinates, go over to the whiteboard and find the box, and now we do something to the stuff in the box", whereas = is "edit the postit, scratching out what was there and writing something else in".
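The bullets above can be sketched roughly like this. MyFoo's label field and the two method names are assumptions invented for the example:

```java
class PassByValueDemo {
    // Hypothetical class with one mutable field.
    static class MyFoo {
        String label = "original";
    }

    // Scratches out the callee's own postit; the caller's postit is untouched.
    static void reassign(MyFoo foo) {
        foo = new MyFoo();
        foo.label = "reassigned";
    }

    // Follows the coordinates (the dot) and edits the box on the whiteboard.
    static void mutate(MyFoo foo) {
        foo.label = "mutated";
    }

    static String labelAfterReassign() {
        MyFoo foo = new MyFoo();
        reassign(foo);
        return foo.label;
    }

    static String labelAfterMutate() {
        MyFoo foo = new MyFoo();
        mutate(foo);
        return foo.label;
    }

    public static void main(String[] args) {
        System.out.println(labelAfterReassign()); // prints original
        System.out.println(labelAfterMutate());   // prints mutated
    }
}
```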

With all that context:

  • obj1.getList() gets you the coordinates to the list object. This list object is simply a big sack of coordinates - of postits. NOT a list of Obj2s! A list of Obj2 references - of coordinates.
  • new ArrayList<>(that) makes a new arraylist (new box on the whiteboard), that constructor will dutifully copy each and every COORDINATE over. It does not copy each object. It can't, java has no idea how to copy arbitrary objects, and Lists can hold anything, so it doesn't know how.
  • pq.addAll(...) copies those coordinates once more, into the priority queue. You then 'poll' the head coordinate from the queue, which is the same coordinate as one of those in obj1.getList().
  • You then go to the whiteboard, following this coordinate (obj2.setField - I see a dot, so, that's 'follow the coords and get out your whiteboard pen'), and modify what is there.
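Here is a runnable reconstruction of the question's scenario. The bodies of Obj1 and Obj2 and the comparator are guesses (the question doesn't show them), so treat this as a sketch under those assumptions:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

class QuestionScenarioDemo {
    // Assumed shape of Obj2: one mutable String field.
    static class Obj2 {
        private String field;

        Obj2(String field) {
            this.field = field;
        }

        String getField() {
            return field;
        }

        void setField(String field) {
            this.field = field;
        }
    }

    // Assumed shape of Obj1: it just holds the list.
    static class Obj1 {
        private final List<Obj2> list = new ArrayList<>();

        List<Obj2> getList() {
            return list;
        }
    }

    // Returns true if the polled object is the very same object as list element 0.
    static boolean polledIsSameObject() {
        Obj1 obj1 = new Obj1();
        obj1.getList().add(new Obj2("a"));
        obj1.getList().add(new Obj2("b"));

        // Assumed comparator: order by field, so "a" is the head of the queue.
        Comparator<Obj2> byField = Comparator.comparing(Obj2::getField);
        PriorityQueue<Obj2> pq = new PriorityQueue<>(byField);
        pq.addAll(new ArrayList<>(obj1.getList())); // copies references twice over

        Obj2 obj2 = pq.poll();
        obj2.setField("any value");

        Obj2 first = obj1.getList().get(0);
        return obj2 == first && first.getField().equals("any value");
    }

    public static void main(String[] args) {
        System.out.println(polledIsSameObject()); // prints true
    }
}
```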

Hopefully that clears up how it works. Keep thinking of that whiteboard when reasoning about this stuff.

Solutions

The simplest solution is to adopt immutables as much as is reasonable. An immutable object is, effectively, the notion of writing the object in permanent marker. A String is immutable. It has no set or add methods at all. For example, str.toLowerCase() does not lowercase the string that str is pointing at. That method makes a new string instead. It's the equivalent of going to the whiteboard which has "hEllo!" written on it someplace, and then instead of wiping out the E and writing an e in its place (that'd be mutating, and no method in String lets you do this), toLowerCase() just draws a new little box on the whiteboard somewhere and copies the characters over, lowercasing them on the fly. The toLowerCase() call then returns the location of this new box.
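A tiny sketch of the String case; the class and method names are made up for the example:

```java
class StringImmutabilityDemo {
    // Returns the original string and the result of toLowerCase(), in that order.
    static String[] lowerCaseLeavesOriginalAlone() {
        String str = "hEllo!";
        String lower = str.toLowerCase(); // a new box on the whiteboard
        return new String[] { str, lower };
    }

    public static void main(String[] args) {
        String[] result = lowerCaseLeavesOriginalAlone();
        System.out.println(result[0]); // prints hEllo!
        System.out.println(result[1]); // prints hello!
    }
}
```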

If you apply the same ideas to your own class (Obj2 here, since that is what the list holds), this problem goes away. So, don't call .setField, call .withField (which makes a clone but with that one field changed) or some such.

If that's not an option, you'd have to deep-clone the list, yes. This is incredibly annoying, because how deep does deep-clone mean? ArrayList itself can't deep-clone its elements, so you'd have to write it yourself. Something like:

List<Obj2> clone = new ArrayList<>();
for (Obj2 o : original) clone.add(new Obj2(o));

And you'd have to write the Obj2(Obj2 original) {} copy constructor yourself, copying each field. And, of course, for each non-primitive field pointing at a mutable object, you'd have to clone that too.
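Both approaches, sketched under assumptions: the Obj2 body, ImmutableObj2, withField, and deepCopy are names invented for illustration, not anything from the question's real code.

```java
import java.util.ArrayList;
import java.util.List;

class DeepCopyDemo {
    // Mutable variant with a hand-written copy constructor.
    static class Obj2 {
        private String field;

        Obj2(String field) {
            this.field = field;
        }

        // Copy constructor: String is immutable, so sharing its reference is safe.
        // A field pointing at a mutable object would need its own copy here.
        Obj2(Obj2 original) {
            this.field = original.field;
        }

        String getField() {
            return field;
        }

        void setField(String field) {
            this.field = field;
        }
    }

    // Immutable variant: no setter, "changes" produce a new object.
    static final class ImmutableObj2 {
        private final String field;

        ImmutableObj2(String field) {
            this.field = field;
        }

        String getField() {
            return field;
        }

        ImmutableObj2 withField(String newValue) {
            return new ImmutableObj2(newValue);
        }
    }

    // Copies the list AND each element in it.
    static List<Obj2> deepCopy(List<Obj2> original) {
        List<Obj2> clone = new ArrayList<>();
        for (Obj2 o : original) {
            clone.add(new Obj2(o));
        }
        return clone;
    }

    public static void main(String[] args) {
        List<Obj2> original = new ArrayList<>();
        original.add(new Obj2("old"));

        List<Obj2> clone = deepCopy(original);
        clone.get(0).setField("new");
        System.out.println(original.get(0).getField()); // prints old

        ImmutableObj2 a = new ImmutableObj2("old");
        ImmutableObj2 b = a.withField("new");
        System.out.println(a.getField() + " " + b.getField()); // prints old new
    }
}
```

With the immutable variant, copying the list of references is harmless, because nobody can edit the shared objects.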

rzwitserloot