1

I know that code like

for ( Object o: collection){
    if (condition(i)){
        collection.remove(i);
    }
}

will throw a ConcurrentModificationException, and I understand why: modifying the collection directly could interfere with the Iterator's ability to keep track of its place, by, for instance, leaving it with a reference to an element that's no longer a part of the collection, or causing it to skip over one that's just been added. For code like the above, that's a reasonable concern, however, I would like to write something like

for (Object o: set){// set is an instance of java.util.LinkedHashSet
    if (condition(o)){
        set.remove(other(o));
    }
}

Where other(o) is guaranteed to be "far" from o in the ordering of set. In my particular implementation it will never be less than 47 "steps" away from o. Additionally, if if condition(o) is true, the loop in question will be guaranteed to short-circuit well before it reaches the place where other(o) was. Thus the entire portion of the set accessed by the iterator is thoroughly decoupled from the portion that is modified. Furthermore, the particular strengths of LinkedHashSet (fast random-access insertion and removal, guaranteed iteration order) seem particularly well-suited to this exact sort of operation.

I suppose my question is twofold: First of all, is such an operation still dangerous given the above constraints? The only way that I can think that it might be is that the Iterator values are preloaded far in advance and cached, which I suppose would improve performance in many applications, but seems like it would also reduce it in many others, and therefore be a strange choice for a general-purpose class from java.util. But perhaps I'm wrong about that. When it comes to things like caching, my intuition about efficiency is often suspect. Secondly, assuming this sort of thing is, at least in theory, safe, is there a way, short of completely re-implementing LinkedHashSet, or sacrificing efficiency, to achieve this operation? Can I tell Collections to ignore the fact that I'm modifying a different part of the Set, and just go about its business as usual? My current work-around is to add elements to an intermediate collection first, then add them to the main set once the loop is complete, but this is inefficient, since it has to add the values twice.

Gabriel Burns
  • 345
  • 1
  • 2
  • 9
  • If you want unspecified behavior, ConcurrentModificationException counts as that. – user2357112 Feb 02 '17 at 06:09
  • Anyway, no, you can't tell the collection to skip concurrent modification checks. – user2357112 Feb 02 '17 at 06:15
  • Fair enough, but I think my question is clear enough as to what I'm looking for. The documentation for ConcurrentModificationException reads "... Iterators that do this are known as fail-fast iterators, as they fail quickly and cleanly, rather that risking arbitrary, non-deterministic behavior at an undetermined time in the future...". I want it to take the risk, because in this particular circumstance, such non-deterministic behavior seems unlikely. Alternatively, if there's a reasonably simple and comparably efficient way to rewrite the code, I'm just as happy to do that. – Gabriel Burns Feb 02 '17 at 06:16
  • The closest would be to use concurrent data structures and `CopyOnWriteArrayList` that a fail-safe. – Aimee Borda Feb 02 '17 at 06:20
  • Ooh, that seems a lot slower. I think I'd be better off just writing my own unholy union of HashMap and LinkedList that decouples the two a little more. I'd like to avoid that, though. – Gabriel Burns Feb 02 '17 at 06:32
  • See the answer in http://stackoverflow.com/questions/223918/iterating-through-a-collection-avoiding-concurrentmodificationexception-when-re – D M Feb 02 '17 at 06:52
  • @DM you're missing the point. I want to modify a completely different part of the list than the part I'm iterating over. Doing this with iterator.remove() is terribly inefficient. Whatever solution I settle on, it definitely won't be that. – Gabriel Burns Feb 02 '17 at 07:02

1 Answers1

2

The ConcurrentModificationException is thrown because your collection may not be able to handle the removal (or addition) at all times. For example, what if the removal you performed meant that your LinkedHashSet had to reduce/increase the space the underlying HashMap takes under the hood? It would have to make a lot of changes, which would possibly render the iterator useless.

You have two options:

Use Iterator to iterate elements and remove them, e.g. calling Iterator iter = linkedHashSet.iterator() to get the iterator and then remove elements by iter.remove()

Use one of the concurrent collections available under the java.util.concurrent package, which are designed to allow concurrent modifications

This question contains nice details on using Iterator

UPDATE after comments:

You can use the following pattern in order to remove the elements you wish without causing a ConcurrentModificationException: gather the elements you wish to remove in a List while looping through the LinkedHashSet elements. Afterwards, loop through each toBeDeleted element in the list and remove it from the LinkedHashSet.

Community
  • 1
  • 1
JChrist
  • 1,734
  • 17
  • 23
  • The thing about LinkedHashSet is that the iteration is handled by a linked list. Even if the HashSet had to be rebuilt, the change to the linked list would still be very localized, and far away from my modification, so the iterator should be unaffected. – Gabriel Burns Feb 02 '17 at 06:23
  • 1
    Yes, but what you are saying means that the `LinkedHashSet` would have to keep track all active iterators and check each modification whether it is _close_ or _far_ from _all_ currently active iterators. Since these modifications may be happening in other threads, it would mean it would also have to do some kind of synchronization to perform this check. – JChrist Feb 02 '17 at 06:33
  • You're _guessing_ that it's "far away from your modification." If `LinkedHashSet` could support this operation behavior it wouldn't throw `ConcurrentModificationException`. It's an option `LinkedHashSet` _chose_ because its designers knew it couldn't support it. – Louis Wasserman Feb 02 '17 at 06:45
  • I understand that it's not possible to make this work in a general situation. I don't want LinkedHashSet to perform any such checks. I just want it to trust me that they are unnecessary. Or, more likely, I want a way to get the same behavior through different means. Incidentally, there's no multithreading involved in this case. – Gabriel Burns Feb 02 '17 at 06:46
  • For your own scenario, you can always roll out your own `LinkedHashSet` and provide your own `Iterator`, which does not make any modification checks – JChrist Feb 02 '17 at 06:52
  • @LouisWasserman I'm not guessing. It is of course possible that my code has bugs, but the underlying algorithmic idea is sound. I can prove that the modification will always be "far" from the iterator location. The only question is whether or not "far" is far enough, however even that shouldn't be an issue, because the loop will always short-circuit before the modified portions can be reached. The modification and iteration happen in the same thread, so I can say this with confidence. – Gabriel Burns Feb 02 '17 at 06:52
  • @JChrist you're right, and that's what I figured I'd have to do, but I was hoping there was some clever way to avoid it. That adds a significant amount of complexity to my code, and a lot of extra debugging to what was supposed to be a quick hobby project. That's why my question specified "short of re-implementing LinkedHashSet". – Gabriel Burns Feb 02 '17 at 06:54
  • 1
    The easiest solution to your issue I can think of is keep a hold of the elements you wish to move in a `List` in your first loop and make a second loop over your _toBeDeleted_ list and remove them from your `LinkedHashSet` – JChrist Feb 02 '17 at 07:14
  • 1
    I took your advice about recording values to be removed, and improved it by using a `Set` instead of a `List` and changing the time the loop is performed. It is now run before the new elements are generated, and serves as a *banned* set rather than a *toRemove* list. This saves me from adding elements, only to remove them later. – Gabriel Burns Feb 02 '17 at 19:30