311

Assuming that arraylist is defined as ArrayList<String> arraylist, is arraylist.removeAll(arraylist) equivalent to arraylist.clear()?

If so, can I assume that the clear() method is more efficient for emptying the array list?

Are there any caveats in using arraylist.removeAll(arraylist) instead of arraylist.clear()?

CaptJak
  • 3,592
  • 1
  • 29
  • 50
ateiob
  • 9,016
  • 10
  • 44
  • 55
  • A possible corollary to this question: When might one be used instead of the other? – Corey Ogburn Aug 11 '11 at 20:04
  • 3
    @Corey: when might one every want to use `arraylist.removeAll(arraylist)`? I see absolutely no reason to do that. – Joachim Sauer Aug 11 '11 at 20:07
  • @Joachim Sauer That's exactly what I wanted to verify. Thanks +2. But is the difference between `elementData[i] = null` and `e.remove()` significant? – ateiob Aug 11 '11 at 20:12
  • There's no sane reason to do `arrList.removeAll(arrList)` instead of `arrList.clear()`. `arrList1.removeAll(arrList2)` is a different matter. – Vlad Aug 11 '11 at 20:15
  • 3
    If only removeAll()'s implementation started with this line, then this whole discussion could have been so much more entertaining!!! `if (c == this && !isEmpty()) { clear(); return true; }`. I'll have to submit this to OpenJDK as a patch! ;-) – Julius Musseau Aug 11 '11 at 20:31
  • the reason you want to use ArrayList.removeAll() is if you have some Collection parameters: removeAll(Collection> c) witch by the way return an boolean – Java bee Oct 07 '20 at 14:57
  • @Java Can you please elaborate? Maybe even write an answer to this question, as the other answers haven't address the last question above, "Are there any caveats in using `arraylist.removeAll(arraylist)` instead of `arraylist.clear()`"? – auspicious99 May 13 '21 at 10:51
  • Array list. clear() is always better then removeAll() as the performance of clear is O(n) and the performance of removeAll is O(n^2). – Fakhar Ullah Aug 17 '21 at 07:29

9 Answers9

426

The source code for clear():

public void clear() {
    modCount++;

    // Let gc do its work
    for (int i = 0; i < size; i++)
        elementData[i] = null;

    size = 0;
}

The source code for removeAll()(As defined in AbstractCollection):

public boolean removeAll(Collection<?> c) {
    boolean modified = false;
    Iterator<?> e = iterator();
    while (e.hasNext()) {
        if (c.contains(e.next())) {
            e.remove();
            modified = true;
        }
    }
    return modified;
}

clear() is much faster since it doesn't have to deal with all those extra method calls.

And as Atrey points out, c.contains(..) increases the time complexity of removeAll to O(n2) as opposed to clear's O(n).

Anatolii
  • 14,139
  • 4
  • 35
  • 65
Jeffrey
  • 44,417
  • 8
  • 90
  • 141
  • 34
    A note that `c.contains(...)` squares the time complexity of the operation would make this answer complete. – Atreys Aug 11 '11 at 20:09
  • @Jeffrey What is the difference between `elementData[i] = null` and `e.remove()`? – ateiob Aug 11 '11 at 20:11
  • 8
    The source is strong in this one. (To all other answers: Use the source, Luke.) Notice how clear() could be implemented as just the one line, size=0; but garbage-collection wouldn't know to collect the elements in the unreachable portions of the array. – Julius Musseau Aug 11 '11 at 20:14
  • @ateiob `elementData[i]=null` clears the internal array and lets the garbage collector do its work. `e.remove()` (as stated in the [javadoc](http://download.oracle.com/javase/6/docs/api/java/util/Iterator.html)) removes from the underlying collection the last element returned by the iterator. (It probably calls `ArrayList.remove(Object)`.) – Jeffrey Aug 11 '11 at 20:17
  • 2
    e.remove() is way more complex! e.remove() also squares the complexity, just as c.contains(...) does. On an ArrayList, e.remove() calls ArrayList.remove(int index), which has to shift the remainder of the array over by one. – Julius Musseau Aug 11 '11 at 20:18
  • 1
    @ateiob e.remove() is two extra method calls, a range check, and an object return (internal to `AbstractList.Itr.remove()` and `ArrayList.remove(int)`), too – Atreys Aug 11 '11 at 20:19
  • @Julius Davies @ Atreys That's exactly what I wanted to find out. +1 to both of you. Now everything is clear. – ateiob Aug 11 '11 at 20:20
  • 2
    @julius If it did this: `size = 0; elementData = new Object[10];` all the rest would be garbage collected, since the backing array has no outside references. – corsiKa Aug 11 '11 at 20:25
  • @Totalfrickinrockstarfrommars, neat point, but what if the consumer initially used `new ArrayList(int initialCapacity)` instead of `new ArrayList()`? Unfortunately ArrayList does not remember non-default initial capacity. – Julius Musseau Aug 11 '11 at 20:38
  • @julius That's true, and I had checked to see if that exists too. In any event, they could keep the behavior exactly the same, but do it in `O(1)` with `elementData = new Object[elementData.length];` - I'm guessing they don't because it's very possible the structure could be taking up half or more of memory on small devices, which would cause this to incur an out of memory error. – corsiKa Aug 11 '11 at 20:40
  • 1
    @Totalfrickinrockstarfrommars, elementData = new Object[elementData.length] brings us back to O(n), because the JVM always makes sure to set every element of a new array to null. Still, I thought your original idea was excellent, and I don't mean to detract from it. We need a clear(newInitialCapacity) method!!! – Julius Musseau Aug 11 '11 at 21:13
63

The time complexity of ArrayList.clear() is O(n) and of removeAll is O(n^2).

So yes, ArrayList.clear is much faster.

Vasily Kabunov
  • 6,511
  • 13
  • 49
  • 53
Geoff
  • 3,129
  • 23
  • 11
20

The clear() method removes all the elements of a single ArrayList. It's a fast operation, as it just sets the array elements to null.

The removeAll(Collection) method, which is inherited from AbstractCollection, removes all the elements that are in the argument collection from the collection you call the method on. It's a relatively slow operation, as it has to search through one of the collections involved.

Ernest Friedman-Hill
  • 80,601
  • 10
  • 150
  • 186
  • I thought it just sets all, not some elements to null. If it's not the case, how it decided on which elements should be set to null? – Farid Oct 05 '19 at 14:10
  • 2
    @Farid sorry, my English is just too informal here. I indeed meant that it sets all of the elements to null. I’ll fix it! – Ernest Friedman-Hill Oct 05 '19 at 14:12
10

Unless there is a specific optimization that checks if the argument passed to removeAll() is the collection itself (and I highly doubt that such an optimization is there) it will be significantly slower than a simple .clear().

Apart from that (and at least equally important): arraylist.removeAll(arraylist) is just obtuse, confusing code. It is a very backwards way of saying "clear this collection". What advantage would it have over the very understandable arraylist.clear()?

Joachim Sauer
  • 302,674
  • 57
  • 556
  • 614
7

They serve different purposes. clear() clears an instance of the class, removeAll() removes all the given objects and returns the state of the operation.

lucapette
  • 20,564
  • 6
  • 65
  • 59
5

clear() will go through the underlying Array and set each entry to null;

removeAll(collection) will go through the ArrayList checking for collection and remove(Object) it if it exists.

I would imagine that clear() is way faster then removeAll because it's not comparing, etc.

Nicholas
  • 7,403
  • 10
  • 48
  • 76
2

Clear is faster because it does not loop over elements to delete. This method can assume that ALL elements can be deleted.

Remove all does not necessarily mean delete all elements in the list, only those provided as parameters SHOULD be delete. Hence, more effort is required to keep those which should not be deleted.

CLARIFICATION

By 'loop', I mean it does not have to check whether the element should be kept or not. It can set the reference to null without searching through the provided lists of elements to delete.

Clear IS faster than deleteall.

Jérôme Verstrynge
  • 57,710
  • 92
  • 283
  • 453
  • 1
    I'm pretty sure that `ArrayList.clear()` has to loop as well. – Joachim Sauer Aug 11 '11 at 20:04
  • @JVerstry Do you mean that clear() **doesn't delete** the elements it removes from the ArrayList? – ateiob Aug 11 '11 at 20:04
  • 1
    Wrong, clear does loop over the internal array and sets all references to null in order to let the garbage collector do its work. – devconsole Aug 11 '11 at 20:05
  • @ateiob It removes them from the list. If there is not other references to them in the code, the garbage collector will eventually collect them. Else, they will remain in memory. Clear deletes references to the items, not the items themselves. – Jérôme Verstrynge Aug 11 '11 at 20:07
  • 1
    @Joachim, @devconsole: I think he meant it won't have to loop/iterate over the list given as parameter. `target.removeAll(param)` will iterate over `param` and then call `target.contains(...)` which iterates over `target`. – Vlad Aug 11 '11 at 20:12
  • I am surprized to be downvoted on this one when I am saying nothing different than other answers. – Jérôme Verstrynge Aug 11 '11 at 20:20
  • 2
    -3 is a bit harsh. If JVerstry wanted, he could write his own Java implementation from scratch that didn't loop. clear() *can* feasibly be implemented in O(1), without a loop, whereas removeAll() MUST have some kind of O(n) algorithm, there is no way to satisfy the removeAll() API's contract without examining all the elements. – Julius Musseau Aug 11 '11 at 20:26
1

clear() will be much more efficient. It will simply remove each and every item. Using removeAll(arraylist) will take a lot more work because it will check every item in arraylist to see if it exists in arraylist before removing it.

CDelaney
  • 1,238
  • 4
  • 19
  • 36
-8

Array => once the space is allocated for an Array variable at the run time, the allocated space can not be extended or removed.

ArrayList => This is not the case in arraylist. ArrayList can grow and shrink at the run time. The allocated space can be minimized or maximized at the run time.

  • 1
    This doesn't answer the question which is the difference between ArrayList.clear() and ArrayList.removeAll(), not the difference between an Array and an ArrayList. – Pierre Jul 09 '14 at 14:06
  • 1
    This answer is unnecessary. That's not what the question is about. – Serafim Costa Jun 10 '16 at 19:01