9

During a code review using Sonar , the following code has been detected as a bad one:

ArrayList<String> ops = new ArrayList<String>();
ops.add("test");
ops.removeAll(ops);

Sonar is complaining about the removeAll on called by the collection on itself.

I agree that it's ugly but can this introduce bugs?

NB: This is not my code , I'm reviewing it .

Raedwald
  • 46,613
  • 43
  • 151
  • 237
isoman
  • 742
  • 2
  • 9
  • 34
  • 2
    What possible reason would you have for doing that rather than `ops.clear()`? – T.J. Crowder Feb 09 '16 at 14:20
  • 4
    Yes it can; you iterate over the same collection which you remove from, this can lead to a `ConcurrentModificationException`. – fge Feb 09 '16 at 14:21
  • Please see my modification. – isoman Feb 09 '16 at 14:22
  • The question still applies: Regardless of whether it will cause a problem at runtime, it's just a silly thing to do. It obfuscates its intent, has to work harder to reach the same result, etc. – T.J. Crowder Feb 09 '16 at 14:23
  • It's unclear and *can* lead to bugs. Just use different collections. If you're reviewing it, reject it. – Olivier Grégoire Feb 09 '16 at 14:24
  • 1
    silly sonar. defeat it by `ops2=ops; ops=removeAll(ops2);` ha! – ZhongYu Feb 09 '16 at 14:38
  • 1
    @bayou.io it is just silly work around that simply hided a bad code. In my opinion, reason why this should be changed is simple: result is equivalent if `clear()` method, however it is much more complex and time consuming process. An inefficient example is concatenation of many strings int one instead using `StringBuilder`. Bad way works, but it should be done better. – T.G Feb 09 '16 at 14:55
  • @T.G - you are right. maybe we can clone ops as ops2, then `removeall(ops2)` will be safe --- I was just joking:) – ZhongYu Feb 09 '16 at 14:57

5 Answers5

8

Yes, this will introduce bugs. The default removeAll works on the principle of an Iterator, if you modify the collection without using the iterator it will give a ConcurrentModificationException. If it gives this exception or not depends on the internal design of the Collection you are using, and cannot be relied on.

Even though the current version doesn't use an iterator(), this isn't documented, and Oracle MAY change this without notice.

To clear a collection, you can use .clear().

Ferrybig
  • 18,194
  • 6
  • 57
  • 79
  • 1
    Whether or not it throws an exception depends on both the collection's implementation of removeAll and whether its iterator correctly detects the concurrent modification. Regardless, it's never a good idea to do this when `.clear()` is guaranteed to always work –  Feb 09 '16 at 14:25
  • 1
    And just looked at `ArrayList`'s `removeAll`. No iterators. – T.J. Crowder Feb 09 '16 at 14:25
  • 1
    @T.J.Crowder It depends on the implementation, ArrayList and AbstractList doesn't have this bug for example. But this isn't documentated behaviour – Ferrybig Feb 09 '16 at 14:25
  • 1
    @Ferrybig: Similarly, `ConcurrentModificationException` isn't listed as a possibility for `removeAll`. But it **is** a runtime exception, so it doesn't have to be. To be clear: I'm not saying the code in the question is anything but a Very Bad Idea. – T.J. Crowder Feb 09 '16 at 14:28
8

The issue is whether a ConcurrentModificationException, or list corruption, or endless looping, or failure to remove entries, or similar may result.

ArrayList, specifically, in Oracle's JDK8, seems to be written such that those issues won't occur.

Does that mean that that code is okay, then?

No, it's not okay.

That code:

  • Relies on the implementation of the list's removeAll to be smart enough to handle a very strange use-case

  • Is unnecessarily complex to read and understand, thus creating maintenance issues

  • Is doing unnecessary work, thus taking longer to do its job than it needs to (not that this is likely to be a big deal)

You said this in in the context of code review. I'd flag it and talk with the author about why they used it, and explain why ops.clear(); or ops = new ArrayList<String>(); (depending on context) would almost certainly be a better choice, from a reliability, maintenance, and (very minor) performance perspective.

T.J. Crowder
  • 1,031,962
  • 187
  • 1,923
  • 1,875
  • and another reason to stick with clear is time complexity - http://stackoverflow.com/questions/7032070/what-is-the-difference-between-arraylist-clear-and-arraylist-removeall – Balaji Krishnan Feb 09 '16 at 14:32
  • In addition, this strange usage could indicate you are passing the wrong list, or calling `removeAll` on the wrong list. – Mark Rotteveel Feb 18 '16 at 15:48
4

This could absolutely introduce bugs. Specifically, depending on the implementation of a collection this could throw ConcurrentModificationException.

To understand how this could happen, consider this pseudo-implementation:

void removeAll(Collection<?> collection) {
    for (Object o : collection) {
         for (int i = 0 ; i != this.length() ; i++) {
             Object item = this.get(i);
             if (item.equals(o)) {
                 this.remove(i);
                 break;
             }
         }
    }
}

The moment we remove an item from this list, the iterator of collection becomes invalid. The outer for-each loop is unable to continue, causing an exception.

Of course the actual implementation is different, preventing this kind of bugs. However, the is no guarantee anywhere in the documentation to say that all implementations must be "defensive" in this way; hence the warning.

Sergey Kalinichenko
  • 714,442
  • 84
  • 1,110
  • 1,523
  • but the OP does not have a single for-loop – cy3er Feb 09 '16 at 14:26
  • 1
    @cy3er this does not mean that the _implementation_ of .removeAll() does not contain such a loop – fge Feb 09 '16 at 14:27
  • @cy3er I did mention that this is not a real implementation of `removeAll`, only a possibility that *would* have a bug. The problem is not that the bug is there, but that the bug-free behavior is not guaranteed by documentation. – Sergey Kalinichenko Feb 09 '16 at 14:28
1

Actually, it depends on the version of jdk you are using. In jdk6, most of the collection class inherit the removeAll method from the AbstractCollection class:

public boolean removeAll(Collection<?> c) {
    boolean modified = false;
    Iterator<?> e = iterator();
    while (e.hasNext()) {
        if (c.contains(e.next())) {
        e.remove();
        modified = true;
        }
   }
   return modified;
}

So, in some scenario, this will cause a CME(ConcurrentModificationException), for example:

ArrayList<Integer> it = new ArrayList<Integer>();
it.add(1);
it.add(2);
it.add(3);
List<Integer> sub = it.subList(0,5);
it.removeAll(sub);

however, in jdk8, most collections has its own removeAll implementation, such as ArrayList. The ArrayList class has its own removeAll method,which call a private method batchRemove:

private boolean batchRemove(Collection<?> c, boolean complement) {
        final Object[] elementData = this.elementData;
        int r = 0, w = 0;
        boolean modified = false;
        try {
            for (; r < size; r++)
                if (c.contains(elementData[r]) == complement)
                    elementData[w++] = elementData[r];
        } finally {
            // Preserve behavioral compatibility with AbstractCollection,
            // even if c.contains() throws.
            if (r != size) {
                System.arraycopy(elementData, r,
                                 elementData, w,
                                 size - r);
                w += size - r;
            }
            if (w != size) {
                // clear to let GC do its work
                for (int i = w; i < size; i++)
                    elementData[i] = null;
                modCount += size - w;
                size = w;
                modified = true;
            }
        }
        return modified;
    }

This using for loop iterate over the underlying array, and changing the variable modCount only once after the end of this method, during the iteration of the sublist( invoking by c.contains(elementData[r])), the modCount is not changing, so no CME will be thrown.

Jade Tang
  • 321
  • 4
  • 23
0

More than bugs,it looks like the wrong usage of removeAll() on arrays. Java doc says "removeAll()" removes all of this collection's elements that are also contained in the specified collection. After this call returns, this collection will contain no elements in common with the specified collection.

So, if the intention is to clear all the elements, clear() is the method to call rather than removeAll. The latter should be used if the intention is to remove the elements that are also contained in the specified collection.

Hope this helps.

Mincy George
  • 101
  • 1