Why does Iterator define the remove() operation?

Question

In C#, the IEnumerator interface defines a way to traverse a collection and look at the elements. I think this is tremendously useful because if you pass IEnumerable<T> to a method, it's not going to modify the original source.

However, in Java, Iterator defines the remove operation to (optionally!) allow deleting elements. There's no advantage in passing Iterable<T> to a method because that method can still modify the original collection.

That remove is optional is an example of the refused bequest smell, but ignoring that (already discussed here), I'd be interested in the design decisions that prompted a remove operation to be included on the interface.

What are the design decisions that led to remove being added to Iterator?

To put it another way, what design decision led C# to deliberately leave remove off IEnumerator?

score 8 · Accepted Answer · answered Jul 25 '12 at 11:18

Iterator is able to remove elements during iteration. You cannot iterate over a collection with an iterator and remove elements from that collection using the collection's own remove() method. With the fail-fast collections you will typically get a ConcurrentModificationException on the next call to Iterator.next(), because the iterator cannot know exactly how the collection was changed and so cannot know how to continue iterating.

When you use the iterator's remove(), it knows how the collection was changed. Moreover, you cannot remove an arbitrary element of the collection, only the current one. This simplifies continuing the iteration.
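A minimal sketch of the difference, assuming an ArrayList of integers (the class and method names are made up for illustration). Removing through the list while a for-each loop is running trips the fail-fast check; removing through the iterator does not.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

// Hypothetical demo class, not part of any library.
class IteratorRemoveDemo {

    // Removes even numbers through the list itself while a for-each loop
    // (which uses an iterator internally) is running.
    // Returns true if the iterator detected the modification.
    static boolean removeViaCollection(List<Integer> numbers) {
        try {
            for (Integer n : numbers) {
                if (n % 2 == 0) {
                    numbers.remove(n); // structural change the iterator was not told about
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Removes even numbers through the iterator, which keeps its own
    // position consistent with the list.
    static List<Integer> removeViaIterator(List<Integer> numbers) {
        Iterator<Integer> it = numbers.iterator();
        while (it.hasNext()) {
            if (it.next() % 2 == 0) {
                it.remove(); // removes the element just returned by next()
            }
        }
        return numbers;
    }

    public static void main(String[] args) {
        System.out.println(removeViaCollection(new ArrayList<>(Arrays.asList(1, 2, 3, 4)))); // true
        System.out.println(removeViaIterator(new ArrayList<>(Arrays.asList(1, 2, 3, 4))));   // [1, 3]
    }
}
```

Note that the fail-fast check is best-effort; the documentation does not promise that every concurrent modification is detected.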

As for the advantages of passing an Iterator or Iterable: you can always use Collections.unmodifiableSet() or Collections.unmodifiableList() to prevent modification of your collection.
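As a rough illustration, the sketch below wraps a list with Collections.unmodifiableList() and then tries to remove through the view's iterator. The class and method names are invented for the example; the rejection happens at run time, not at compile time.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

// Hypothetical demo class, not part of any library.
class UnmodifiableViewDemo {

    // Returns true if removing through the read-only view's iterator is rejected.
    // Assumes the source list has at least one element.
    static boolean iteratorRemoveRejected(List<String> source) {
        List<String> view = Collections.unmodifiableList(source);
        Iterator<String> it = view.iterator();
        it.next();
        try {
            it.remove(); // the view's iterator does not support removal
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> names = new ArrayList<>(Arrays.asList("a", "b", "c"));
        System.out.println(iteratorRemoveRejected(names)); // true
        System.out.println(names);                         // [a, b, c]
    }
}
```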

answered Jul 25 '12 at 11:18

AlexR


`unmodifiableXYZ` is just a runtime thing though, I like static invariants. I was more interested in the design decision that led to `remove` being added and perhaps why C# chose not to. – Jeff Foster Jul 25 '12 at 11:21
The worst thing about using `unmodifiableXYZ` is that if someone passes an `XYZ` that is already wrapped, you may wrap it twice, and you have no way to avoid it unless you use reflection to work it out. Although I believed the implementation would optimize this, it still looks bad to me. – Earth Engine Mar 15 '13 at 04:42
I looked at the OpenJDK implementation, and it does not seem to avoid multiple wrapping. That means in a complicated application you may end up wrapping a simple `ArrayList` 100 times or more. – Earth Engine Mar 15 '13 at 04:51

score 2 · Answer 2 · answered Jul 25 '12 at 11:21

It is probably due to the fact that removing items from a collection while iterating over it has always been a cause of bugs and strange behaviour. The documentation suggests that Java enforces at runtime that remove() is called at most once per call to next(), which makes me think it was added to stop people from getting removal wrong when iterating over a list.
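A small sketch of that runtime rule, assuming an ArrayList (the class and method names are invented for illustration). Calling remove() before any next(), or twice after a single next(), is rejected with an IllegalStateException.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical demo class, not part of any library.
class RemoveOncePerNextDemo {

    // Returns true if remove() before the first next() is rejected.
    static boolean removeBeforeNextRejected(List<String> items) {
        Iterator<String> it = items.iterator();
        try {
            it.remove(); // no current element yet
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    // Returns true if a second remove() after a single next() is rejected.
    // Assumes the list has at least one element; the first one is removed.
    static boolean secondRemoveRejected(List<String> items) {
        Iterator<String> it = items.iterator();
        it.next();
        it.remove(); // allowed: removes the element just returned
        try {
            it.remove(); // not allowed: no new call to next() in between
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(removeBeforeNextRejected(new ArrayList<>(Arrays.asList("a", "b")))); // true
        System.out.println(secondRemoveRejected(new ArrayList<>(Arrays.asList("a", "b"))));     // true
    }
}
```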

Stephen C · Answer 3 · 2012-07-25T11:41:30.180

1

There are situations where you want to be able to remove elements using the iterator because it is the most efficient way to do it. For example, when traversing a linked data structure (e.g. a linked list), removing using the iterator is an O(1) operation ... compared to O(N) via the List.remove() operations.
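To make the comparison concrete, here is a hedged sketch with java.util.LinkedList; the class and method names are invented for illustration. Both methods produce the same result, but the iterator version unlinks the node it is already standing on, while the index version has to walk the list for every get(i) and remove(i).

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedList;

// Hypothetical demo class, not part of any library.
class LinkedRemovalDemo {

    // Single pass: each it.remove() unlinks the node the iterator just visited.
    static void removeNegativesViaIterator(LinkedList<Integer> list) {
        Iterator<Integer> it = list.iterator();
        while (it.hasNext()) {
            if (it.next() < 0) {
                it.remove();
            }
        }
    }

    // Index-based: get(i) and remove(i) each traverse the list to reach
    // position i, so the whole loop is quadratic in the list size.
    // Iterating from the end keeps the remaining indices valid after a removal.
    static void removeNegativesByIndex(LinkedList<Integer> list) {
        for (int i = list.size() - 1; i >= 0; i--) {
            if (list.get(i) < 0) {
                list.remove(i);
            }
        }
    }

    public static void main(String[] args) {
        LinkedList<Integer> a = new LinkedList<>(Arrays.asList(1, -2, 3, -4));
        LinkedList<Integer> b = new LinkedList<>(a);
        removeNegativesViaIterator(a);
        removeNegativesByIndex(b);
        System.out.println(a); // [1, 3]
        System.out.println(b); // [1, 3]
    }
}
```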

And of course, many collections are designed so that modifying the collection during an iteration by any means other than Iterator.remove() will result in a ConcurrentModificationException.

If you have a situation where you don't want to allow modification via a collection iterator, wrapping the collection using Collections.unmodifiableXxxx and using its iterator will have the desired effect. Alternatively, I think that Apache Commons provides a simple unmodifiable iterator wrapper.
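If you only have an Iterator in hand, a wrapper of that kind is easy to write yourself. The sketch below is a hand-rolled illustration; ReadOnlyIterator is a made-up name and is not the Apache Commons class.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical wrapper: forwards traversal to the delegate and rejects removal.
final class ReadOnlyIterator<T> implements Iterator<T> {
    private final Iterator<? extends T> delegate;

    ReadOnlyIterator(Iterator<? extends T> delegate) {
        this.delegate = delegate;
    }

    @Override
    public boolean hasNext() {
        return delegate.hasNext();
    }

    @Override
    public T next() {
        return delegate.next();
    }

    @Override
    public void remove() {
        // Never reaches the underlying collection.
        throw new UnsupportedOperationException("read-only iterator");
    }
}

// Hypothetical demo class showing the wrapper in use.
class ReadOnlyIteratorDemo {

    // Returns true if removal through the wrapper is rejected.
    // Assumes the source list has at least one element.
    static boolean removeRejected(List<Integer> source) {
        Iterator<Integer> it = new ReadOnlyIterator<>(source.iterator());
        it.next();
        try {
            it.remove();
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<Integer> numbers = new ArrayList<>(Arrays.asList(1, 2, 3));
        System.out.println(removeRejected(numbers)); // true
        System.out.println(numbers);                 // [1, 2, 3]
    }
}
```

As with the unmodifiable collection views, this is a runtime guarantee only; the static type is still Iterator, so the compiler will not stop a caller from trying remove().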

By the way, IEnumerator suffers from the same "smell" as Iterator. Take a look at the Reset() method. I was also curious as to how the C# LinkedList class deals with the O(N) remove problem. It appears that it does this by exposing the internals of the list ... in the form of the First and Last properties whose values are LinkedListNode references. That violates another design principle ... and is (IMO) far more dangerous than Iterator.remove().

edited Jul 25 '12 at 11:41

answered Jul 25 '12 at 11:19

Stephen C


I realize that `reset` is the same smell, but (as I tried to state in the question), I'm not really interested in that. `reset` doesn't allow anyone to delete the contents of the container. My main point of contention is that C#'s iterable will leave the collection in the same way no matter what the method does with it (disregarding reflection). – Jeff Foster Jul 25 '12 at 11:37
@JeffFoster - as I said, Java iterators don't provide that guarantee ... but if you want it, there are simple and inexpensive ways to achieve it. – Stephen C Jul 25 '12 at 11:44
yup, I just want to get to why they don't provide that guarantee. What set of compromises did Java / C# make at the library design level in order to support (or not support) `remove`. – Jeff Foster Jul 25 '12 at 11:45
I think my answer covers the design compromise. It is to enable efficient removal during a collection traversal without exposing the innards of the collection data structure to clients. On top of that, it is clear that the Java designers (obviously) *did not believe it was important* to provide an unmodifiable iterator. I kind of agree with that. – Stephen C Jul 25 '12 at 11:51
I don't think it does (at least not the other side of the coin). If it allows efficient removal, why did C# not need to add it? – Jeff Foster Jul 26 '12 at 13:33
@JeffFoster - isn't it obvious? Having exposed the innards of linked lists for clients to modify directly, they can use *that* approach to do efficient removal, insertion, splicing and so on. – Stephen C Jul 26 '12 at 13:49

score -1 · Answer 4 · answered Jul 25 '12 at 11:27

This is actually an awesome feature of Java. As you may well know, when iterating through a list in .NET to remove elements (for which there are a number of use cases), you only have two options.

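// Collect the items to delete first; the list cannot be modified inside foreach.
var listToRemove = new List<T>();
foreach (var item in originalList)
{
    ...
    if (...)
    {
        listToRemove.Add(item);
    }
    ...
}

// Second pass: remove the collected items from the original list.
foreach (var item in listToRemove)
{
    originalList.Remove(item);
}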

or

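// Walk a snapshot from the end so that RemoveAt(i) never shifts an index still to be visited.
var iterationList = new List<T>(originalList);
for (int i = iterationList.Count - 1; i >= 0; i--)
{
    ...
    if (...)
    {
        originalList.RemoveAt(i);
    }
    ...
}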

Now, I prefer the second, but with Java I don't need all of that because while I'm on an item I can remove it and yet the iteration will continue! Honestly, though it may seem out of place, it's really an optimization in a lot of ways.

answered Jul 25 '12 at 11:27

Mike Perrenoud


On the contrary, I see it as abhorrent. It stops me being able to pass a set of elements to a method without knowing whether the method will toast my container :) I'm interested in the design decision that led to it being added, and particularly why it isn't in C#. – Jeff Foster Jul 25 '12 at 11:30
Follow @StephenC's post to keep it from `toasting your container` - but I think you're asking this question in the wrong place if you're trying to find out why the Java designers implemented it and why Microsoft did not. Know what I mean? But consider the above and think about how much more efficient that is. Also, is this a method you control or is it one from a framework you're consuming? – Mike Perrenoud Jul 25 '12 at 11:34
Your example is completely wrong. Both C# and Java provide a `RemoveAt`-style method for `List` (`remove(int)` in Java), so your Java style of code works for C# as well. If your C# code uses the `foreach` construct, you should use `for(:)` in Java as well. And you will see that you cannot actually call the `remove` method in such a loop (as you do not have an explicit reference to the `Iterator`), so your "benefit" of Java is gone. – Earth Engine Mar 15 '13 at 04:34
