Weirdness in Equated Java Arrays: References vs. Pointers

Question

I'm having a problem understanding what's going on in the code below. The behavior of arrays c and d is what I would expect. But what's going on with a and b? (I also tried this with normal, scalar variables, and nothing surprising happens in either case.)

The output is copied into the right-hand comments.

import java.util.Arrays;

public class ArraysParadox {

    public static void main(String[] args) {

        int[] c = {1, 2, 3};
        int[] d = {6, 5, 4, 3};

        System.out.print("c:       ");
        System.out.println(Arrays.toString(c)); // c:       [1, 2, 3]

        System.out.print("d:       ");
        System.out.println(Arrays.toString(d)); // d:       [6, 5, 4, 3]

        System.out.println("--- swap ---");
        int[] tmp = c;
        c = d;
        d = tmp;    // <----- Magic?

        System.out.print("c' (=d): ");
        System.out.println(Arrays.toString(c)); // c' (=d): [6, 5, 4, 3]

        System.out.print("d' (=c): ");
        System.out.println(Arrays.toString(d)); // d' (=c): [1, 2, 3]

        System.out.println("--- c = 0 ---");
        Arrays.fill(c, 0);
        System.out.print("c (=0):  ");
        System.out.println(Arrays.toString(c)); // c (=0):  [0, 0, 0, 0]

        System.out.print("d (=c):  ");
        System.out.println(Arrays.toString(d)); // d (=c):  [1, 2, 3]

        System.out.println("--- d = 1 ---");
        Arrays.fill(d, 1);
        System.out.print("c (=d):  ");
        System.out.println(Arrays.toString(c)); // c (=d):  [0, 0, 0, 0]

        System.out.print("d (=1):  ");
        System.out.println(Arrays.toString(d)); // d (=1):  [1, 1, 1]

        System.out.println("================");

        int[] a = {1, 2, 3};
        int[] b = {6, 5, 4, 3};

        System.out.print("a:       ");
        System.out.println(Arrays.toString(a)); // a:       [1, 2, 3]

        System.out.print("b:       ");
        System.out.println(Arrays.toString(b)); // b:       [6, 5, 4, 3]

        a = b;
        System.out.print("a (=b):  ");
        System.out.println(Arrays.toString(a)); // a (=b):  [6, 5, 4, 3]

        System.out.println("--- a = 0 ---");
        Arrays.fill(a, 0);
        System.out.print("a (=0):  ");
        System.out.println(Arrays.toString(a)); // a (=0):  [0, 0, 0, 0]
        System.out.print("b (=a?): ");
        System.out.println(Arrays.toString(b)); // b (=a?): [0, 0, 0, 0]    ???

        System.out.println("--- b = 1 ---");
        Arrays.fill(b, 1);
        System.out.print("b (=1):  ");
        System.out.println(Arrays.toString(b)); // b (=1):  [1, 1, 1, 1]
        System.out.print("a (=b?): ");
        System.out.println(Arrays.toString(a)); // a (=b?): [1, 1, 1, 1]
    }
}

The swapability of c and d indicates pass-by-value according to this post: Java is Pass-by-Value, Dammit!. (I also looked at java array pass by reference does not work?, but I can't understand the asker's English, and the method call obscures the example.)

Notice that with the line d = tmp; commented out, c and d exhibit the same odd behavior as a and b. Still I don't know what to make of it.

Can anyone explain a and b's behavior in terms of pass-by-value?

Edit: Addendum

It turns out the main issue in my post is not pass-by-value, but aliasing. To be clear about the distinction between pass-by-value and pointers, I added the following method to my code and used it to (try to) swap c and d (as suggested by an article linked from the JavaDude article above).

static <T> void swap (T c, T d) {
    T tmp = c;
    c = d;
    d = tmp;
}

The result is that c and d come back unchanged. This would have worked if Java (like C) passed along pointers to c and d to the method, but instead it simply passes their values, leaving the original variables unchanged.

Changing a = b to a = b.clone(); or to a = Arrays.copyOf(b, b.length); gives the behavior I was expecting. This code also works:

    int[] tmp = new int[b.length];
    System.arraycopy( b, 0, tmp, 0, b.length );
    a = tmp;

Relative timing described here.
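For anyone comparing the three copying approaches with plain assignment, here is a minimal sketch. The class name `CopyVersusAlias` and the helper `isIndependent` are made up for illustration; the helper overwrites its second argument and assumes the original holds at least one non-zero element.

```java
import java.util.Arrays;

public class CopyVersusAlias {

    // Hypothetical helper: zeroes 'copy' and reports whether 'original'
    // kept its contents. Assumes 'original' has a non-zero element.
    static boolean isIndependent(int[] original, int[] copy) {
        int[] snapshot = Arrays.copyOf(original, original.length);
        Arrays.fill(copy, 0);
        return Arrays.equals(original, snapshot);
    }

    public static void main(String[] args) {
        int[] b = {6, 5, 4, 3};

        int[] cloned = b.clone();                     // new array, same contents
        int[] copied = Arrays.copyOf(b, b.length);    // new array, same contents
        int[] manual = new int[b.length];
        System.arraycopy(b, 0, manual, 0, b.length);  // element-by-element copy
        int[] alias = b;                              // same array, second name

        System.out.println(isIndependent(b, cloned)); // true
        System.out.println(isIndependent(b, copied)); // true
        System.out.println(isIndependent(b, manual)); // true
        System.out.println(isIndependent(b, alias));  // false: b was zeroed too
    }
}
```

Only the plain assignment shares storage with b; each of the other three allocates a separate array.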

See my answer here: http://stackoverflow.com/questions/9404625/java-pass-by-reference/9404727#9404727 — Eng.Fouad, Jun 07 '12 at 00:03
@Eng.Fouad, thanks, that is helpful. Could you point me to an explanation of `setAttribute`? — JohnK, Jun 07 '12 at 13:33

Sergey Kalinichenko · Accepted Answer · 2012-06-07T13:48:25.577

11

There is nothing "weird" going on here: array variables are references to the actual arrays (known as pointers in other languages). When you manipulate array variables, all you are doing is manipulating pointers.

When you assign an array variable to another one, you create an alias to the array pointed to by the variable you assign, and make the array previously pointed to by the variable being assigned eligible for garbage collection (provided nothing else still references it). Because the assignment a = b makes a an alias of b, filling b with data acts exactly the same as filling a with data: once the assignment is complete, a and b are merely two different names for the same thing.
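A small sketch of the aliasing described above, using the asker's a and b. The class name `AliasDemo` and the helper `readThroughAlias` are invented for this illustration.

```java
import java.util.Arrays;

public class AliasDemo {

    // Hypothetical helper: makes 'a' an alias of 'b', writes through 'b',
    // and returns 'a' so the caller can see what the alias observes.
    static int[] readThroughAlias(int[] b) {
        int[] a = b;        // copies the reference, not the elements
        Arrays.fill(b, 1);  // mutates the one shared array
        return a;
    }

    public static void main(String[] args) {
        int[] b = {6, 5, 4, 3};
        int[] a = readThroughAlias(b);

        System.out.println(a == b);             // true: one array, two names
        System.out.println(Arrays.toString(a)); // [1, 1, 1, 1]

        b = new int[] {9, 9};                   // reassigning b breaks the alias
        System.out.println(a == b);             // false
        System.out.println(Arrays.toString(a)); // [1, 1, 1, 1]
    }
}
```

Writing through either name changes the shared array, while pointing one name at a different array leaves the other name where it was.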

As far as pass by value is concerned, none of it is going on in your example: the concept of passing by value applies only when you pass objects as parameters to the methods that you call. In your example, variables a, b, c, and d are not method parameters, they are local variables. You do pass them by reference to methods toString and fill (or more precisely, you pass by value the references to your objects to toString and fill, because in Java everything is passed by value), that is why modifications to your arrays done by fill are visible upon the return from the method.
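To make the "reference passed by value" point concrete, here is a rough sketch with two made-up methods, `zeroOut` and `replace` (neither appears in the question). One mutates the array it receives; the other only reassigns its own parameter.

```java
import java.util.Arrays;

public class PassReferenceByValue {

    // The parameter holds a copy of the caller's reference, so this
    // writes into the very same array object the caller sees.
    static void zeroOut(int[] arr) {
        Arrays.fill(arr, 0);
    }

    // Reassigns only the local parameter; the caller's variable still
    // refers to its original array.
    static void replace(int[] arr) {
        arr = new int[] {7, 7, 7};
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 3};

        replace(data);
        System.out.println(Arrays.toString(data)); // [1, 2, 3]

        zeroOut(data);
        System.out.println(Arrays.toString(data)); // [0, 0, 0]
    }
}
```

This is the same reason the generic swap method in the question's addendum has no visible effect: it only reassigns its own parameters.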

edited Jun 07 '12 at 13:48

answered Jun 07 '12 at 00:02

Sergey Kalinichenko


2

To be clear, Java is entirely pass-by-value, but you always pass _references_ by value, not the objects themselves. (That's different from pass-by-reference.) – Louis Wasserman Jun 07 '12 at 01:23
@LouisWasserman You are right, this whole business of "pass reference by value" has a great potential of confusing things. Thanks for the note! – Sergey Kalinichenko Jun 07 '12 at 01:52
1

@JohnK The confusion comes from the fact that Java adopted the word *reference* to mean what's often meant by a *pointer*. There is a subtle difference: while you can say that "any manipulation done to the reference variable directly changes the original variable", you cannot say the same thing about pointers. They add an important limitation - manipulations change the original variable *except for reassignments*. In other words, *mutating* an object (e.g. filling an array with new values) will show through both variables, while reassigning one of them will break the alias instead. – Sergey Kalinichenko Jun 07 '12 at 12:56
Thank you, dasblinkenlight. I think my underlying problem is not understanding what we mean by "reference." What you say about it being an alias is making sense. The JavaDude doc I linked puts it this way, "A 'reference' is an alias to another variable. So any manipulation done to the reference variable directly changes the original variable." So it sounds like a **reference** works like a **pointer**, but without being a physical address in the machine (and so safe: cannot point outside the sandbox). Is that right? – JohnK Jun 07 '12 at 12:59
@JohnK Java references are like pointers to objects, not like pointers to pointers. They are implemented as addresses (physical or virtual, depending on the underlying architecture) but you can reassign them to point to another address. `a=b` copies the address from `b` into `a`, making an alias for the object pointed to by variable `b`, but not to the variable `b` itself. If I change the object pointed to by `b`, `a` will see the change, but if I change `b` itself (but not the object that it points to) then `a` will not see the change. – Sergey Kalinichenko Jun 07 '12 at 13:07
Thanks again, dasblinkenlight. This is making sense. If I may beg your further indulgence, a couple of remaining issues. (1) I'm still unclear how `Arrays.fill(a, 0);` can change `a` if `a` is just passed by value. (2) Why does the swap of `c` and `d` (viz., `int[] tmp = c; c = d; d = tmp;`) **not** work like `a = b`? It would seem that `tmp` becomes an alias of `c`, then `c` becomes an alias of `d`, then `d` becomes an alias of `tmp`, so that all three are aliases of the same object. – JohnK Jun 07 '12 at 13:40
1

@JohnK `fill` can modify array elements because it gets a pointer to a mutable object. It cannot modify the array itself (e.g. it cannot make `a` point to an array with fewer elements or an array with more elements), but elements inside the existing array are fair game. Your swap sequence goes like this: `tmp` becomes an alias of `c`, then `c` stops being an alias of `tmp` and becomes an alias of `d` instead, then `d` stops being an alias of `c` and becomes an alias of `tmp`. – Sergey Kalinichenko Jun 07 '12 at 13:53
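The swap sequence in the comment above can be traced with identity checks. This is only a sketch; `SwapTrace` and `swapLocally` are hypothetical names, and the method returns its two local references so the caller can inspect where they ended up.

```java
public class SwapTrace {

    // Hypothetical helper: swaps two local references and returns them
    // as {c, d}. The caller's own variables are not affected.
    static int[][] swapLocally(int[] c, int[] d) {
        int[] tmp = c; // tmp -> first array,  c -> first,  d -> second
        c = d;         // tmp -> first array,  c -> second, d -> second
        d = tmp;       // tmp -> first array,  c -> second, d -> first
        return new int[][] {c, d};
    }

    public static void main(String[] args) {
        int[] c = {1, 2, 3};
        int[] d = {6, 5, 4, 3};

        int[][] result = swapLocally(c, d);
        System.out.println(result[0] == d);         // true
        System.out.println(result[1] == c);         // true
        System.out.println(result[0] == result[1]); // false: still two arrays
    }
}
```

At no point do all three names refer to one array: `tmp` keeps the first array reachable while `c` moves to the second, so `d` can pick the first one up again.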

cheeken · Answer 2 · 2012-06-07T00:11:05.443

2

When you make an assignment like a = b;, if a and b are not primitives, then they now reference the same object. In your case, they now reference the same array. Any change you make to either one will also affect the other (because you're only updating one thing, and both a and b are pointing at it).

Note that this behavior is unrelated to how parameters are passed in Java.

edited Jun 07 '12 at 00:11

answered Jun 07 '12 at 00:04

cheeken


Thanks. Fwiw I tried making a and b Integer scalars, but it works just as if they were int scalars. – JohnK Jun 07 '12 at 12:56
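A likely reason the Integer experiment looks the same as the int one: Integer objects are immutable, so the only way to "change" one is to reassign the variable, and reassignment never shows through an alias. A minimal sketch, with invented names `IntegerVersusArray`, `aliasThenReassign` and `aliasThenMutate`:

```java
public class IntegerVersusArray {

    // Alias two Integer variables, then reassign one of them.
    static Integer aliasThenReassign() {
        Integer x = 1;
        Integer y = 6;
        x = y;     // x and y now refer to the same Integer object
        y = 0;     // y is pointed at a different Integer; x is untouched
        return x;
    }

    // Alias two array variables, then mutate the shared array.
    static int aliasThenMutate() {
        int[] a = {1};
        int[] b = {6};
        a = b;     // a and b now refer to the same array
        b[0] = 0;  // in-place change, visible through both names
        return a[0];
    }

    public static void main(String[] args) {
        System.out.println(aliasThenReassign()); // 6
        System.out.println(aliasThenMutate());   // 0
    }
}
```

The assignment x = y behaves identically in both cases; the difference is that an array offers in-place mutation and an Integer does not.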

score 1 · Answer 3 · answered Jun 07 '12 at 00:00

1

The array initializer is calling new under the hood (so in this case it's syntactic saccharine). Your swap nearer the top is just swapping references, the one below is performing exactly how you'd expect a reference to perform.

The linked article refers to parameters... In Java all parameters are by value, it's just that references themselves get passed by value i.e. changes to the REFERENCE (not its dereferenced content) won't be reflected outside the scope of the subroutine.

answered Jun 07 '12 at 00:00

Jeff Watkins


Hi Jeff, thanks for your reply. So when you say "The array initializer is calling new under the hood," you're saying `int[] tmp = c;` is first declaring `tmp` as a whole new array, before equating it to `c`? Viz. `int[] tmp = new int[dims of c]; tmp = c`. Still, how does that explain the odd behavior of `a` and `b`? Could you expand your answer? – JohnK Jun 07 '12 at 00:08
The "int[] a = { blah };" is actually doing "int[] a = new int[4]" and then adding the items in the initializer. Thus all assignments are at the reference level. – Jeff Watkins Jun 07 '12 at 00:18

ytoamn · Answer 4 · 2017-02-16T08:00:39.267

There is an important point about arrays that is often missed or not taught in Java classes. When an array is passed to a method, a second pointer to the same array is created (the caller's own pointer is never passed). You can manipulate the array through either pointer, but if you assign the second pointer to a new array inside the called method and that method returns void, the original pointer in the calling method remains unchanged.

You can run the code directly here: https://www.compilejava.net/

import java.util.Arrays;

public class HelloWorld
{
    public static void main(String[] args)
    {
        int Main_Array[] = {20,19,18,4,16,15,14,4,12,11,9};
        Demo1.Demo1(Main_Array);
        // THE POINTER Main_Array IS NOT PASSED TO Demo1
        // A DIFFERENT POINTER TO THE SAME LOCATION OF Main_Array IS PASSED TO Demo1

        System.out.println("Main_Array = "+Arrays.toString(Main_Array));
        // outputs : Main_Array = [20, 19, 18, 4, 16, 15, 14, 4, 12, 11, 9]
        // Since Main_Array points to the original location,
        // I cannot access the results of Demo1 , Demo2 when they are void.
        // I can use array clone method in Demo1 to get the required result,
        // but it would be faster if Demo1 returned the result to main
    }
}

// Demo1 and Demo2 are package-private so that all three classes can share
// one source file; javac accepts only one public top-level class per file.
class Demo1
{
    public static void Demo1(int A[])
    {
        int B[] = new int[A.length];
        System.out.println("B = "+Arrays.toString(B)); // output : B = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
        Demo2.Demo2(A,B);
        System.out.println("B = "+Arrays.toString(B)); // output : B = [9999, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
        System.out.println("A = "+Arrays.toString(A)); // output : A = [20, 19, 18, 4, 16, 15, 14, 4, 12, 11, 9]

        A = B;
        // A was pointing to location of Main_Array, now it points to location of B
        // Main_Array pointer still keeps pointing to the original location in void main

        System.out.println("A = "+Arrays.toString(A)); // output : A = [9999, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
        // Hence to access this result from main, I have to return it to main
    }
}
class Demo2
{
    public static void Demo2(int AAA[],int BBB[])
    {
        BBB[0] = 9999;
        // BBB points to the same location as B in Demo1, so whatever I do
        // with BBB, I am manipulating the location. Since B points to the
        // same location, I can access the results from B
    }
}
