0

I want to remove values between a certain range in a TreeSet as follows:

TreeSet<Integer> ts = new TreeSet<Integer>();

for(int i = 0; i < (int)1e7; i++){
    ts.add((int)(Math.random() * 1e9));
}

System.out.println("ts: " + ts.size());
SortedSet<Integer> s = ts.subSet(500, (int)8e8);
s.clear();
System.out.println("ts: " + ts.size());

What is the runtime of this operation?

Is it O(log(n)) or O(n)? I found some references stating that the subSet() operation takes log(n). I've also read that the clear() operation is O(1), but wouldn't the individual nodes to be cleared, making it potentially O(n)?

Abundance
  • 1,963
  • 3
  • 24
  • 46
  • 1
    Creating a `subSet` should be O(1), since it does not copy any elements, but simply returns a view of the backing set. I don't know how your linked answer came to the conclusion that it would be O(log n) – knittl Dec 18 '22 at 19:00
  • @OldDogProgrammer, yes I've realized that and updated the example to reflect such concerns. My question still remains in the general case when the Treeset / subset both have O(n) elements. Also aware of the autoboxing, but more interested in the Treeset internals. – Abundance Dec 18 '22 at 19:12

1 Answers1

3

Calling TreSet#subSet returns a view of the original collection. Creating this view is O(1). No elements are copied when creating the subset view. Most operations on the view are delegated to the underlying, original collection.

clear() however is an exception. Calling clear on a TreeSet (or TreeMap, really) is O(1), since it only sets the size to 0 and clears the root node.

This is not possible for a subset view of the TreeSet, which needs to remove all elements in the view one-by-one, delegating to the AbstractCollection#clear method, which is implemented as follows:

public void clear() {
    Iterator<E> it = iterator();
    while (it.hasNext()) {
        it.next();
        it.remove();
    }
}

Rmeoving a single item from the red-black-tree implementation of TreeMap (of which TreeSet is only a wrapper) is O(log(n)). Removing N items, which needs to be done for a subset view, is therefore O(n * log(n)).

knittl
  • 246,190
  • 53
  • 318
  • 364
  • Why is this not true for a subset? – Abundance Dec 18 '22 at 19:49
  • I did some digging (https://github.com/AdoptOpenJDK/openjdk-jdk11/blob/master/src/java.base/share/classes/java/util/TreeSet.java#L323), and it looks like the subset is implemented using a Treeset, which wouldn't go to the AbstractCollection#clear method. – Abundance Dec 18 '22 at 19:50
  • @Abundance it's not true because the subset cannot simply say "size = 0", because it does not have a size. It's a view on the original set. Changes to the original set become visible in the subset and vice-versa. To clear a subset, you have to remove all elements of the subset from the original set. This is only possible by calling `remove(element)` in a loop. The subset is implement by wrapping a `subMap` (there's a package-private special constructor). The submap delegates to clearing AbstractCollection. I confirmed with a debugger, feel free to do the same :) – knittl Dec 18 '22 at 20:04