25

I have read some frameworks source code recently and noticed that they written the clear() method of a list-like data structure like this. Delete the elements one by one.

while (_arr.length > 0 )
{

    remove(_arr[0]);

}       

(maybe the above look like a little confusing,but it's because the native array type in this language itself was a dynamic array) or

 for (int i = 0; i < size; i++)
  { elementData[i] = null;}
 size = 0;

But i remembered i have written some code like this. The list decorated native array type,and I have written clear() method like this.

 _arr=new Array();
 _size=0;

instantiate a new native array type directly.

and this code are written in the language that has the garbage collection. so I think all elements finally will be collected,so why there need a loop?Is a new will be fast?

BlackCat
  • 449
  • 5
  • 16
  • sorry i don't writed the code clearly first time,all way set the size to 0 – BlackCat Aug 22 '17 at 10:40
  • out of curiosity - in which implementation did you find the `remove`-loop? – Hulk Aug 22 '17 at 10:41
  • a actionscript3 libray called flint – BlackCat Aug 22 '17 at 10:42
  • 1
    Related: [Why Was java.util.Arraylist#clear implemented the way it was in OpenJDK?](https://stackoverflow.com/questions/18232601/why-was-java-util-arraylistclear-implemented-the-way-it-was-in-openjdk) – jmehrens Aug 22 '17 at 15:58

6 Answers6

20

I guess the motivation is re-using the existing backing array instead of allocating a new one. This is important especially when the backing array is very large (which in some rare cases can even mean the new array cannot be allocated before the old one is garbage collected).

Allocating a new array (and garbage collecting the old one) is probably more time consuming than iterating over the existing array and setting the references of all the elements to null.

EDIT: As mentioned in the comments, setting the references to null in an array based List is not enough. You also have to indicate that the List is empty. In java.util.ArrayList this is done by setting the size property to 0.

Eran
  • 387,369
  • 54
  • 702
  • 768
  • Setting all the elements to null doesn't clear the collection. It only changes the element values. You have to change the size. Setting the elements to null is nice for garbage collection purposes, but it is neither necessary nor sufficient. – user207421 Aug 22 '17 at 10:24
  • @EJP Well, you also have to mark the collection as empty. For example, for ArrayList you set the size property to 0. – Eran Aug 22 '17 at 10:26
  • Exactly so. I don't see that in your answer. – user207421 Aug 22 '17 at 10:28
  • 1
    @EJP The question was about why the loop that sets the references to null was necessary, so I didn't feel it was necessary to mention the size property. – Eran Aug 22 '17 at 10:30
  • 1
    @Eran btw *for primitives* you are also paying the price for zeroing the array when allocating a new one; this adds to the total price – Eugene Aug 22 '17 at 11:55
  • 1
    @Eugene Do you mean for collections having primitive elements? As far as I know there are no such collections in `java.util`. – Eran Aug 22 '17 at 11:59
  • 1
    @Eran no, I only meant *in general*; not directly related to this answer – Eugene Aug 22 '17 at 12:00
  • @Eugene Oh, I see. Thanks for the comment. – Eran Aug 22 '17 at 12:01
  • @Eran the `EDIT` is spot-on btw! `ArrayList#clear` does this too `for (int to = size, i = size = 0; i < to; i++) {es[i] = null;}` one plus – Eugene Aug 22 '17 at 12:09
  • "Allocating a new array (and garbage collecting the old one) is probably more time consuming than iterating over the existing array" how? – CodeYogi Aug 30 '17 at 20:19
7

In some contexts, it is more efficient to clear and reuse the backing array. (Obviously, you need a size field and you need to set that to zero when you clear the list!)

  • Clearing reduces the garbage generated when you empty the list.
  • If you grow the list incrementally (i.e. without a capacity hint), clearing also reduces garbage generated by reallocations.

But the flipside is that it isn't always a good idea to clear. For example;

  • If you call clear on a ArrayList, the size of the backing array remains the same size as it was when the list was full. If the list was very long, the array can be very big big. And if you never need the list to be that long again, the large array can waste a lot of space.

  • The GC has to check every cell in an ArrayList's backing array anyway, irrespective of the current value of the size field. And it needs to copy the entire array to the "to" space.

  • That also means that you need to assign null to the "cleared" cells to avoid memory leaks.

  • There can be secondary performance issues due to generational and/or locality factors if you are continually putting "new" objects into a large "old" array.

In short, if an array-backed list is likely to survive multiple GC cycles, then clearing (i.e. the procedure of clearing out and reusing the backing array) is a dubious proposition.

Stephen C
  • 698,415
  • 94
  • 811
  • 1,216
  • Your last sentence confused me. You must clear the array cells in order to prevent memory leaks (we surely agree). The problem is with clearing and re-using the array instead of simply forgetting it and creating a new one. – maaartinus Aug 22 '17 at 18:11
  • I may have been skimming through the text a bit too fast, but yes: the clearing is rather mandatory when you want to reuse the array, so I'd use either lone "reusing" or "clearing and reusing". IMHO now it's better. – maaartinus Aug 23 '17 at 00:41
  • OK ... lets call it "addressed". Thanks for your feedback. – Stephen C Aug 23 '17 at 01:22
4

Should probably be a comment, but it will not fit as such I assume.

There is also the fact that besides allocating an array might be too expensive and clearing the previous array would be cheaper, there is also the fact that allocating an array in the form new byte[100] or whatever has to also do the zeroing the contents, which might be expensive also.

As such in java-9 String concatenation uses the UNSAFE.allocateUninitializedArray that specifically says Allocates an array of a given type, but does not do zeroing. and of course also adds:

This method should only be used in the very rare cases where a high-performance code overwrites the destination array completely, and compilers cannot assist in zeroing elimination. In an overwhelming majority of cases, a normal Java allocation should be used instead.

And this is how the actual method looks like:

    @ForceInline
    private static byte[] newArray(int length, byte coder) {
        return (byte[]) UNSAFE.allocateUninitializedArray(byte.class, length << coder);
    }

This is just to prove that there are cases where it is in fact too expensive to create arrays and it's cheaper not to. Especially since arrays might require contiguous memory allocation - but this is not mandated by the spec.

Eugene
  • 117,005
  • 15
  • 201
  • 306
2

Agree with @Eran response on creating a new array or re-using the existing one!

I want to add one more info here.

The source code for clear() is as below:

public void clear() {
    modCount++;
    for (int i = 0; i < size; i++)
        elementData[i] = null;

    size = 0;
}

The source code for removeAll()(defined in AbstractCollection):

public boolean removeAll(Collection<?> c) {
    boolean modified = false;
    Iterator<?> e = iterator();
    while (e.hasNext()) {
        if (c.contains(e.next())) {
            e.remove();
            modified = true;
        }
    }
    return modified;
}

clear() is of course faster since it doesn't have to deal with all those extra calls. So It's better to set the element to null instead of removing it.

nagendra547
  • 5,672
  • 3
  • 29
  • 43
  • 2
    Setting the elements to null is semantically incorrect without setting the size to zero. – user207421 Aug 22 '17 at 10:26
  • 2
    `removeAll` does *not* clear the collection (in general), it only removes the elements in the given `Collection> c`, but leaves everything else as it was. – Hulk Aug 22 '17 at 10:34
  • 2
    Indeed. The `removeAll` method is not relevant to the question asked. – Stephen C Aug 22 '17 at 10:37
  • Question is clearly indicating removing all element one by one. That's why I mentioned removeAll – nagendra547 Aug 22 '17 at 12:06
  • The question asks which is better, using a new array or setting everything in the original to `null`. You don't answer that question, and don't explain in the answer itself why you use `removeAll`. And I agree that `removeAll` is not relevant here. – Ivo Aug 22 '17 at 21:41
2

In my opinion When creating a new Array and instantiating it with zeros, it should be slower. In this case the initialization and the iteration is done to set default value. In the case of using an existing array, only the iteration bit is done. Therefore using an existing array and iterate over it should be faster.

In our project we do frequently create array object during some calculation in a long running batch process.After create a pool from where you can get and return after use show a significant improvement.

gati sahu
  • 2,576
  • 2
  • 10
  • 16
-1

Neither. Just set _arr to an empty array. You should certainly not call remove() in a loop, for efficiency reasons, and setting all the elements to null is semantically incorrect.

user207421
  • 305,947
  • 44
  • 307
  • 483
  • I'm not the downvoter, but: Neither `ArrayList` (nulls the elements) nor `LinkedList` (nulls all links between Nodes) currently chooses to stick to the absolute minimum that would strictly be required. So it seems there are non-trivial performance considerations to take into account, and the runtime of `clear()` is only one of the aspects (and probably not the most important one). – Hulk Aug 22 '17 at 10:55
  • 1
    Another problem. Setting elements to null is not semantically incorrect when combined with other actions. It is not necessary to rub the OP's nose in it .... for an obvious oversight. – Stephen C Aug 22 '17 at 12:04
  • @StephenC It is semantically incorrect by itself, which is how it was presented in the OP's OP. It is neither necessary nor sufficient. Your choice of terminology is merely bizarre. – user207421 Aug 22 '17 at 12:35
  • What? Don't you know what "rubbing someone's nose in it" means? Look it up in a dictionary. And your choice of insults is merely unimaginative! – Stephen C Aug 22 '17 at 13:09