10

I would think that Iterator.copy() would be quite a handy function. You could implement iterator filters in a much better way.

For example, the only reason in Googles Java Collection for the filter (and similar) functions to use UnmodifiableIterator (which is just an Iterator without remove) is because you cannot implement such a filter Iterator otherwise without being able to copy it at some point. (Really, that is not possible with the current interface; try yourself.)

Another advantage would be that you could use an iterator in a for-each-loop: Because a copy-able iterator would automatically also be iterable. See also this question. Right now, the main design reason to not allow this is because an Iterator which implements Iterable and Iterator<T> iterator() { return this; } would invalidate the iterator. By having a copy function, it is as simple as Iterator<T> iterator() { return copy(); } and it would not invalidate the original iterator. Thus there is no reason anymore to not allow this.

Is there any reason? Just to make it less complicated to implement it?

Community
  • 1
  • 1
Albert
  • 65,406
  • 61
  • 242
  • 386
  • 1
    So with your suggestion, all existing implementors of Iterator would now have to implement a new method? That will break a lot of code... – Kirk Woll Sep 30 '10 at 15:12
  • 1
    ... especially for those folks who already implemented custom iterators with an additional "copy" method – Andreas Dolk Sep 30 '10 at 15:17
  • @Kirk: It is not a suggestion, it is the question why it has not been like this in the first place. – Albert Sep 30 '10 at 15:39
  • 3
    fair enough, but the same point still applies. If it was like that from the beginning, *every* time you implement an Iterator you would now have to implement *another* method. Having to stub out `remove()` is already annoying enough. In other languages (such as C#) it's possible for third-parties to enhance the behavior of pre-existing interfaces by exposing new methods on them that weren't put there by the author. (i.e. all LINQ operators) "copy" would be a fine candidate if such a facility were available in Java. Sadly there is not. – Kirk Woll Sep 30 '10 at 15:45

11 Answers11

10

Although they usually are, Iterators do not theoretically have to be linked to a collection. The copy method over an input stream, for instance, would be difficult to implement, and would very easily cause obscure memory problems.

ILMTitan
  • 10,751
  • 3
  • 30
  • 46
  • When copying an iterator, I don't meant to copy the underlying collection. Naturally, they would link of course to the same collection. (You have that also in other languages. That is just the most natural behavior.) – Albert Sep 30 '10 at 16:00
  • 2
    But streams don't hold very much data, so to copy a stream based iterator, one would have to store that data so it could be re-iterated. – ILMTitan Sep 30 '10 at 18:00
  • But you can already create multiple separate iterators of the same stream. Copying an iterator would naturally behave in the same way as another iterator on the same stream which was created in a somewhat other way. – Albert Sep 30 '10 at 18:18
  • 2
    I would think that an iterator and its copy would always return the same sequence of objects. Two different iterators from the same stream, however, I would expect to always pull the next element from the stream (leaving aside that two living iterators of the same stream would have all sorts of concurrency issues). – ILMTitan Sep 30 '10 at 20:13
  • 1
    You don't create an `Iterator` on a stream. An `Iterator` *is* the stream. If there is functionality to get another stream from the same `Iterable`, then it belongs on the `Iterable`, not on the `Iterator`. The `Iterator` has no business knowing anything about the implementation of the `Iterable` or about other streams. When you have a `Collection` with a simple cursor it seems easy to have copyable `Iterators`, not so for less predictable sources. – Christoffer Hammarström Feb 20 '12 at 23:07
  • @ILMTitan but then the type of all the elements of the stream would need to copyable as well. So your proposal destroys orthogonality. – Shelby Moore III Apr 22 '16 at 15:45
  • @ShelbyMooreIII 1. I was envisioning returning references to the same object. 2. It was a counter example to explain why the proposed copy method is problematic. If it is not clear how it should work in this instance .... good! – ILMTitan Apr 22 '16 at 17:47
  • @ILMTitan, yeah :) – Shelby Moore III Apr 23 '16 at 17:39
5

An Iterator represents a position in a stream from a source (Iterable in java speak), and there is no guarantee that it is possible to copy or even access the source of the stream.

For example, you could be iterating over bytes as they are streamed from a webserver, in which case it would be impossible to tell the webserver mid-stream to "From this position on, i want you to send me the same bytes twice, but asynchronously as i request them."

There is only the one stream, and it can't be copied.

The fact that most of the Iterators you usually see are over a Collection, is incidental.

Christoffer Hammarström
  • 27,242
  • 4
  • 49
  • 58
  • Thanks for the explanation!. Then `CollectionIterator: public Iterator` should be present in the Classes Hierarchy. Just Java unlike C++ has some missed aspects in the API design, which prevents doing things optimally for the particular case... – luart Aug 21 '17 at 15:36
2

The only reason why Google have UnmodifiableIterator is to basically guarantee immutability in their collections. They're making sure that there's no way that you can change the internal state of a collection.

Don't forget that the original idea for an iterator is that it's a pointer to the current element during transveral, and it manages to next/previous transversal (for reverse for doubly-linked iterators) to the element next/previous to it.

There is no actual reason why iterators aren't Cloneable, quite simply, cloning an iterator will still mean having an iterator pointing to the same collection elements (except it now lives in 2 different address space). Unless you want the cloned iterator to point to another collections, there is no point.

Buhake Sindi
  • 87,898
  • 29
  • 167
  • 228
  • No, that is not the reason why they use `UnmodifiableIterator` for functions like `filter`. There is just no reason to not have `filter` or similar functions also for normal iterators. By doing that, it would also work for `Unm..Iterator` and it would be more general. The point is, this is not possible because `Iterator` is not copyable. Check out their implementation. Or try to implement `filter` yourself on a normal `Iterator`. – Albert Sep 30 '10 at 15:25
  • @Albert, Don't forget that Google strictly follow Joshua Bloch Collections API (which is standard in Java) and thus is compatible with the java JDK. `ImmutableMap`, `ImmutableList`, `ImmutableSet` all are immutable collections and according to the Collections API, you need to return an iterator. If you can provide a non-modifiable iterator, I can simple do `iterator.remove()` and the state of the collection has changed. That's not what Google wanted. Download Guava and read the source code. – Buhake Sindi Sep 30 '10 at 15:30
  • Elite: I have. That was also not my point. My point is, it is not possible to implement `filter` otherwise. If it would be, i.e. if there would be a `copy`, they would have made it more general. And by using `Unm..Iterator` together with that more general `filter`, it would still be unmodifiable. – Albert Sep 30 '10 at 15:41
1

You can always implement your own CopyableIterator that implements Iterator. And then you can do

new CopyableItereator(collection);

The class would be like this

class CopyableIterator implements Iterator{
Iterator iterator;
Collection collection;
int index=0;

public CopyableIterator(Collection collection){
super();
this.collection = collection;
this.iterator = collection.iterator();
}

public CopyableIterator(Collection collection, int index){
super();
this.collection =collection;
this.iterator = collection.iterator();
this.advanceToIndex(iterator,index); //This function just moves the iterator till the index.
this.index=index;
}

//Override the functions of Iterator here returning iterator.function()

@Override
public Object next(){
index++;
return this.iterator.next();
}

public CopyableIterator copy(){
return new CopyableIterator(this.collection,this.index)

}

}

Disclaimer: This is roughly the class. It has not been tested.

Buhake Sindi
  • 87,898
  • 29
  • 167
  • 228
pakore
  • 11,395
  • 12
  • 43
  • 62
1

I wanted something like this, here is what I've done (based on some work done on Lambdaj).
The main flaw is that this Iterator will basically fill a List with all the supposed content of the Iterator which could be really heavy in memory.

Why did I used a List, because sometimes an Iterator iterates in a specific order, so the "sub-Iterators" must do the same (and the ListIterator really helps me here).

public class IterableIterator<T> implements Iterable<T>, Iterator<T> {
    //The content of the given iterator. Will be filled by its iterators.
    private final List<T> iteratorContent = new ArrayList<T>();
    private final Iterator<T> originalIterator;
    private final Iterator<T> innerIterator;

    public IterableIterator(Iterator<T> originalIterator) {
        this(originalIterator, false);
    }

    public IterableIterator(Iterator<T> originalIterator, boolean cache) {
        if (originalIterator == null) {
            throw new IllegalArgumentException("Parameter can't be null");
        }

        this.originalIterator = originalIterator;
        if (cache) {
            while (originalIterator.hasNext()) {
                iteratorContent.add(originalIterator.next());
            }
        }

        innerIterator = iterator();
    }

    @Override
    public Iterator<T> iterator() {
        return new IteratorIterator();
    }

    @Override
    public boolean hasNext() {
        return innerIterator.hasNext();
    }

    @Override
    public T next() {
        return innerIterator.next();
    }

    @Override
    public void remove() {
        innerIterator.remove();
    }

    private class IteratorIterator implements Iterator<T> {
        private ListIterator<T> innerIterator = iteratorContent.listIterator();

        @Override
        public boolean hasNext() {
            return innerIterator.hasNext() || originalIterator.hasNext();
        }

        @Override
        public T next() {
            if (!innerIterator.hasNext() && originalIterator.hasNext()) {
                T item;
                synchronized (originalIterator) {
                    item = originalIterator.next();
                    iteratorContent.add(item);
                }
                innerIterator = iteratorContent.listIterator(innerIterator.nextIndex());
            }
            if (innerIterator.hasNext()) {
                try {
                    return innerIterator.next();
                } catch (ConcurrentModificationException e) {
                    //Quick and dirty solution if you have a concurrent modification.
                    //It can't happen from the outside, so you can easily suppose that another originalIterator
                    //from this class has been called and had added elements to the list.
                    //Best thing to do, reset the originalIterator to the current position.
                    innerIterator = iteratorContent.listIterator(innerIterator.nextIndex());
                    return innerIterator.next();
                }
            }

            throw new NoSuchElementException();
        }

        @Override
        public void remove() {
            throw new UnsupportedOperationException();
        }
    }
}
Colin Hebert
  • 91,525
  • 15
  • 160
  • 151
1

As a simplistic example of why you would want to copy an iterator, consider the following code which finds the first pair of matching values in a single array.

for(int i=0;i<size;i++)
{
  x = array[i];

  for(int j=i+1;j<size;j++)
  {
    y = array[j];
    if(x == y)
    {
      doSomething();
      break;
    }
}

Note the "j=i+1". That's where you run into problems with an iterator. Oh well, workarounds are fairly common in Java it seems...

Jin
  • 320
  • 2
  • 7
  • (Below) Bad design decision? Not necessarily. Sounds like it's too big a pain to be worth the benefit. It doesn't matter anyway, just copy the items you care about into a flat array and manage it yourself. If int is your iterator, you're fine. – Jin Oct 22 '11 at 09:38
0

ILMTitan's and Christoffer Hammarström's imply but don't state concretely that copying the stream may not be possible because it requires that the stream elements have an implementation of a copy function, in order to save the state that a copyable iterator would require. Realize that elements could be mutable (or reference dynamic values) and they may reference other structures and semantics which require a customized copy function.

Thus copyable iterators is not orthogonal to copyable stream elements, so this is why copyable iterators are not possible in general.

Another more obscure reason is that the act of copying has side-effects on memory allocation and deallocation. Even the stream elements' copy function(s) might have other side effects.

Another reason is that some low-level optimizations might not be possible when compiling to assembly language.

Shelby Moore III
  • 6,063
  • 1
  • 33
  • 36
  • 1
    @YvetteColomb I posted [a comment there](http://meta.stackoverflow.com/questions/342784/can-we-keep-the-good-results-to-an-off-topic-question-on-the-site#comment449600_342784). Thanks. – Shelby Moore III Feb 19 '17 at 07:31
0

Is there any reason? Just to make it less complicated to implement it?

It would be simple to design and implement an Iterator wrapper class that supported a copy operation. I'm not sure it would be generally useful though, not least because in the general case it would be an expensive operation. This alone would be sufficient reason for the Java designers to not want to add copy() to the Iterator interface.

FOLLOWUP

This is the kind of thing I'm thinking of:

public class CopyableIterator<T> implements Iterator<T> {
    private Iterator<T> it;
    private List<T> copy = new ArrayList<T>();
    private int pos;
    public CopyableIterator(Iterator<T> it) {
        while (it.hasNext()) {
            copy.append(it.next());
        }
        this.it = copy.iterator();
    }
    public T next() {
        T res = next();
        pos++;
        return res;
    }
    public boolean hasNext() {
        return it.hasNext();
    }
    public Iterator<T> copy() {
        return copy.sublist(pos, copy.size()).iterator();
    }
    public void remove() {
        throw new UnsupportedOperationException();
    }
}

The reasoning is this:

  • If I'm wrapping an opaque Iterator, then the only way I can copy it is by reading it using next() and hasNext() and constructing the copy Iterator from that.

  • But I have to do that before I start using the original iterator.

  • The simple way to do that is to make that copy of the iterator's content before I start using it. (It could possibly be done with a lazy incremental copying, but the implementation could get very complicated ... especially when you consider copying copied iterators.)

The method proposed in the other answer is limited to plain collection iterators. If you have a wrapped iterator, or an iterator from some other source that (for instance) doesn't implement Iterable, then you are toasted.

And even with that precondition, the method above doesn't return a true copy of the iterator. Rather, it returns a new iterator for the underlying collection. That is an important distinction. Unless you actually copy the iterated element references, there's no guarantee that the iterators will return the same sequence. Look at the documented behaviour of the iterators for the Concurrent... collection types.

Stephen C
  • 698,415
  • 94
  • 811
  • 1,216
  • `Iterator` wrapper around what? And why would it be expensive? I cannot think of a single case where copying an iterator would be an expensive operation (as it is not much more than a pointer). – Albert Sep 30 '10 at 16:02
  • An Iterator wrapper around an Iterator. And it would be expensive because it has to keep its own internal copy of the iterated objects. The only cases where you can implement `copy` efficiently are for iterators that directly iterate an in-memory collection or a random access (seekable) stream. – Stephen C Sep 30 '10 at 16:23
  • Why do they need a copy of the iterated objects? – Albert Sep 30 '10 at 16:43
0

What exactly would it mean to copy an Iterator? Do you mean than it should be able to create a new Iterator just like itself except starting at the beginning? That's the responsibility of an Iterable... it doesn't make sense to duplicate that functionality, especially given the stateful nature of iterators... it would just confuse things.

What would you expect to happen if you wrote:

Iterator<Foo> iter = someIterable.iterator();
iter.next();
iter.next();
for (Foo foo : iter) {
  ...
}

Would you expect the for loop to iterate over every item the iterator would return, or every one except the first two? Would you expect the iterator to be empty after the for loop completes?

ColinD
  • 108,630
  • 30
  • 201
  • 202
  • A copy of an iterator would just mean the most natural thing: having an independent copy of it (like in most other languages which have an iterator concept). I.e. advancing the first one would not advance the second one. – Albert Sep 30 '10 at 15:27
  • And what I would expect is quite obvious and natural: The for-loop would iterate from third element to the end. – Albert Sep 30 '10 at 15:42
  • @Albert: While it seems obvious and natural to you, I don't think it can be objectively said that it IS natural. The expectation with an iterator is that as you iterate through it, its cursor moves forward. After you've iterated to the end, you're done. It would be unnatural to iterate through the elements in a for loop and have the iterator still at the beginning after it completes. That said, I think the main issue is the extra burden on implementations of `Iterator`. – ColinD Sep 30 '10 at 15:57
  • @CollinD I think Albert means a bit different thing than you do. The iterator copying would be handy if you for example want to go through a list of numbers and mark those that are divisible by 3 by storing their iterator to some other list for further processing. You cannot really implement that without copying the current iterator in the for loop. – Karel Petranek Sep 30 '10 at 17:51
  • @dark_charlie: I'm not sure what you mean by "storing their iterator to some other list". If what you mean is that you want to get a list of numbers that are divisible by 3 from the iterator, you could do that just fine without copying the iterator, unless you're saying you want the iterator to remain unchanged. In that case, the object you're working with should probably be an `Iterable` and not an `Iterator` to begin with since that's exactly the function of an `Iterable`! – ColinD Sep 30 '10 at 19:41
  • The numbers probably weren't a good example. Consider a case when you loop over a list and need to make "bookmarks" when you find an interesting object. You do not want to save references to the interesting objects themselves because you want to be able to take the bookmark and find an item in the list that follows/precedes it (for example). The most elegant way to achieve this is to store a copy of the iterator as the bookmark. – Karel Petranek Sep 30 '10 at 21:00
  • @dark_charlie: Even if you could make a copy of an iterator, that wouldn't work well for your example because iterators are for iterating, not marking specific locations in a list. If you found an item and made a copy of the iterator there, there wouldn't be any convenient way to get the item you're "bookmarking" because an iterator only has `next()` (`prev()` too for list iterators) and no `current()`. Iterator cursors are always BETWEEN items, not on them. That only differs for `remove()`, which uses the most recent item the iterator returned. – ColinD Sep 30 '10 at 21:21
  • @ColinD You can always dereference an iterator (what would be the point of it if that weren't possible), therefore it can be used for marking positions. As you say, in Java this is done using the next() method which is another weird design in Java iterators - the method actually has two responsibilities. This, of course, makes "bookmarking" more difficult. What do you suggest to use for the "bookmarking" if not iterators? What objects should one store instead? – Karel Petranek Sep 30 '10 at 21:57
  • @dark_charlie: There's nothing weird about `next()` if you consider iterators as objects that are solely intended for _iteration_ use, which they are. I'd say that if you have some concept of a specific location in the sequence, you should probably be using a `List` and storing indices in that list as bookmarks. If you're dealing with something other than a list, you may need to come up with some other solution. Iterators just aren't intended for such a thing. – ColinD Sep 30 '10 at 22:06
  • @ColinD The Java iterators apparently are not. In other languages the iterator concept is used as a more general term, more like pointers to the sequence that can be dereferenced, incremented (and possibly decremented). Is there any generic interface in Java that serves this generalized "bookmarking" purpose? I suppose not. And that's what Albert's question is basically about - why not? Storing indices is not really a solution, just a workaround. – Karel Petranek Sep 30 '10 at 22:29
-1

You can't possibly copy an iterator - it's basically meaningless. For some, this is obvious from the Iterator interface, but let's demonstrate it with a concrete example. In fact, let's demonstrate with an example about concrete.

bar with pieces of concrete

This is a picture of a concrete iterator over a bar of concrete. Iteration in our case means applying the crow bar to break a piece off of the bar. Now, note that:

  • The bar is not a collection of pieces (though some of it has faults): We're creating pieces as we iterate.
  • The result of an iteration via the iterator (of next()) can never be the result of another iteration of the bar. The result has been removed from it.
  • The iteration may produce a different piece depending on the weather, on the amount of force you apply, or maybe on some kind of thermal noise (think: randomness).
  • The result of an iteration via the iterator (of next()) can never be the result of another iteration of the bar - as the probability space of exact iteration result is continuous, and no specific resulting piece has a non-zero probability measure.

Any of the above should convince you not to try to "copy the iterator", that's silly...

einpoklum
  • 118,144
  • 57
  • 340
  • 684
  • I would argue that an Iterator should not have side effects - such as removing entries from a list. – Albert Jan 26 '15 at 08:04
  • @Albert: Well, this doesn't follow from the interface. What if your iterator is of the outputs from a Casino slot machine? Iterating requires inserting a coin... – einpoklum Jan 26 '15 at 08:44
  • The public defined Java functions are not the only thing which define an Iterator - it's just the only thing the compiler currently can check and ensure. There is also certain behavior what you would expect. Of course you can do it different - however, I would not expect from an Iterator to modify things and would consider it bad practice. – Albert Jan 26 '15 at 09:16
-1

Iterators were created to step through all the objects in a collection one at a time, using data backed by said collection.

Iterator<T> is almost always implemented using a private inner class that may use state that is part of the outer class. Thus, you can't really modify an Iterator's behavior without writing your own Collection (or whatever) as well.

Copying the Iterator could cause a host of new problems, such as falling out of sync with the backing collection.

Powerlord
  • 87,612
  • 17
  • 125
  • 175
  • But you already have the same situation already if you call `iterator()` several times on an `Iterable` object. – Albert Sep 30 '10 at 18:15