28

Is the computational complexity of TreeSet methods in Java, same as that of an AVLTree?

Specifically, I want to know the computational complexity of the following methods: 1.add 2.remove 3.first 4.last 5. floor 6. higher

Java Doc for method description: http://docs.oracle.com/javase/6/docs/api/java/util/TreeSet.html

For an AVL Tree, there are all O(logn)? Whats the complexity of the above TreeSet Methods?

templatetypedef
  • 362,284
  • 104
  • 897
  • 1,065
user1628340
  • 901
  • 4
  • 14
  • 27

3 Answers3

46

EDIT: It should be clarified that the time order usually refers to the number of comparisons. Some operations have no comparisons, so the time order could be taken from the number of sub-tasks

The code below prints the following in Java 8

O(1) comparisons, however the number of indirections could be O(ln N)
Performing 10,000 first operations -> 0.0 comparisons/operation
Performing 10,000 last operations -> 0.0 comparisons/operation
Performing 10,000 size operations -> 0.0 comparisons/operation
Performing 10,000 isEmpty operations -> 0.0 comparisons/operation
Performing 10,000 comparator operations -> 0.0 comparisons/operation
Performing 10,000 pollFirst operations -> 0.0 comparisons/operation
Performing 10,000 pollLast operations -> 0.0 comparisons/operation
Performing 10,000 headSet operations -> 1.0 comparisons/operation
O(ln N) comparisons
Performing 10,000 add operations -> 12.9 comparisons/operation
Performing 10,000 ceiling operations -> 12.9 comparisons/operation
Performing 10,000 contains operations -> 12.9 comparisons/operation
Performing 10,000 floor operations -> 12.9 comparisons/operation
Performing 10,000 higher operations -> 13.9 comparisons/operation
Performing 10,000 lower operations -> 13.9 comparisons/operation
Performing 10,000 remove operations -> 10.9 comparisons/operation
O(N ln N) comparisons
Performing 10,000 equals operations -> 128853.0 comparisons/operation

the code

public class TreeSetOrderMain {
    public static void main(String[] args) {
        System.out.println("O(1) comparisons, however the number of indirections could be O(ln N)");
        testOrder("first", (s, i) -> s.first());
        testOrder("last", (s, i) -> s.last());
        testOrder("size", (s, i) -> s.size());
        testOrder("isEmpty", (s, i) -> s.isEmpty());
        testOrder("comparator", (s, i) -> s.comparator());
        testOrder("pollFirst", (s, i) -> s.pollFirst());
        testOrder("pollLast", (s, i) -> s.pollLast());
        testOrder("headSet", TreeSet::headSet);

        System.out.println("O(ln N) comparisons");
        testOrder("add", TreeSet::add);
        testOrder("ceiling", TreeSet::ceiling);
        testOrder("contains", TreeSet::contains);
        testOrder("floor", TreeSet::floor);
        testOrder("higher", TreeSet::higher);
        testOrder("lower", TreeSet::lower);
        testOrder("remove", TreeSet::remove);

        System.out.println("O(N ln N) comparisons");
        Set<Long> set = LongStream.range(0, 10_000).mapToObj(i -> i).collect(Collectors.toSet());
        testOrder("equals", (s, i) -> s.equals(set));
    }

    static void testOrder(String desc, BiConsumer<TreeSet<Long>, Long> op) {
        final LongComparator comparator = new LongComparator();
        TreeSet<Long> longs = new TreeSet<>(comparator);
        final int count = 10_000;
        for (long i = 0; i < count; i++)
            longs.add(i);
        comparator.count = 0;
        for (long i = 0; i < count; i++)
            op.accept(longs, i);
        System.out.printf("Performing %,d %s operations -> %.1f comparisons/operation%n",
                count, desc, (double) comparator.count / count);
    }

    static class LongComparator implements Comparator<Long> {
        long count = 0;

        @Override
        public int compare(Long o1, Long o2) {
            count++;
            return Long.compare(o1, o2);
        }
    }
}

Operations which work on a single element are all O(ln n) comparisons except first and last which are O(1) comparisons or O(ln N) node search time.

comparator(), iterator(), clear(), first(), isEMpty(), size(), last(), pollFirst(), pollLast(), headSet() are O(1)

add(), ceiling(), contains(), floor(), higher(), lower(), remove(), subSet(), tailSet() are O(ln N)

clone(), hashCode(), toArray() and toString() are O(n) for operations, but no comaprisons

Peter Lawrey
  • 525,659
  • 79
  • 751
  • 1,130
  • @PeterSmith - are they all O(ln n) in terms of node search time? (n= number of nodes)? – user1628340 Jan 17 '13 at 13:07
  • Correct. As comparisons can be *much* more expensive than skipping from one node to another this can more useful to consider. e.g. a first/last() on a TreeSet with 2 billion elements (the largest) can be faster than one comparison e.g. in a Set with one element. – Peter Lawrey Jan 17 '13 at 13:09
  • @PeterLawrey How about the time complexity of removing element from iterator of TreeSet? Still O(ln n) or O(1)? – xudifsd Dec 07 '17 at 02:35
  • I think not all are o(logn) the implementation provides guaranteed log(n) time cost for the basic operations (add, remove and contains). – Ahmed Gamal Oct 17 '18 at 14:22
  • @AhmedGamal Thank you for the correction. As a percentage, most methods are not O(ln N), the operations which work on one element are. – Peter Lawrey Oct 17 '18 at 17:12
  • Why are `add()`, `remove()`, etc. O(ln n) instead of the usual O(log_2 n)? Is there some special optimization that allows it to be executed base e instead of base 2? – Kevin Sheng Nov 21 '20 at 05:35
  • @KevinSheng Scale doesn't matter so O(ln N) and O(log2 N) are the same, the first is simpler. – Peter Lawrey Nov 23 '20 at 13:52
  • 2
    This is wrong. first(), last() can not be O(1) as they are implemented as red black balanced BST. It's O(log n) – voidMainReturn Jul 15 '22 at 22:01
  • 1
    I agree with @voidMainReturn. In the actual implementation, first() and last() have logarithmic runtime. They could have cached the min and max values, but they didn’t. – Andrew Puglionesi Nov 06 '22 at 11:16
  • @AndrewPuglionesi Generally, the time complexity is the number of comparisons required to find an entry. This is O(1). In terms of the number of indirections, I agree it would be O(log N) – Peter Lawrey Nov 07 '22 at 13:05
6

first(), last(), pollFirst(), pollLast(): O(lgn), not O(1).

Ast15
  • 71
  • 1
  • 3
1

At least in the current version of JDK, no special handles are applied on first or last (see source code in openjdk/jdk. The code looks like:

    final Entry<K,V> getLastEntry() {
        Entry<K,V> p = root;
        if (p != null)
            while (p.right != null)
                p = p.right;
        return p;
    }

It should be O(logN), not an exception.

Xinyi Li
  • 852
  • 8
  • 9