59

I am familiar with Java Collection Framework which contains basic interfaces: Collection and Map. I am wondering why the Framework doesn't contain structures as Tree and Graph which are basic collections. Both can be regarded as sub types of Collection.

By the way, I know TreeSet is implemented by Red-Black Tree underlying. However, the TreeSet is not a Tree but a Set, so there's no real Tree in the framework.

卢声远 Shengyuan Lu
  • 31,208
  • 22
  • 85
  • 130

6 Answers6

33

I am wondering why the Framework doesn't contain structures as Tree and Graph which are basic collections. Both can be regarded as sub types of Collection.

This is a good question. I think it simply boils down to scoping. The core features that Collections API provides classes for are:

  • iteration order: Lists and sorted maps have specified iteration order, most sets don't.

  • duplicates: Lists allow duplicates, sets do not

  • index: List values are indexed by integers, map values are indexed by other objects.

This gets us very far and I assume Joshua Bloch et al argued that more feature rich collections (graphs and trees which require internal relationship between elements, sets with multiplicity, bi-directional maps, ...) can be implemented on top of these three core features, and are thus better off in libraries.

aioobe
  • 413,195
  • 112
  • 811
  • 826
17

The java.util package contains data structures to organize data of any kind. It deals basically with abstract data structures (like List, Set, Map) which are defined via their methods and behavior (e.g. a Set does contain no elements twice, a List maintains order and allows duplicates, etc.).

You as a developer are free to choose which implementation of these data structures are best suited for the kind of data you deal with (HashSet vs. TreeSet / LinkedList vs. ArrayList / etc.). For example for Sets and Maps you may choose between hash-based implementations and tree-based implementations, which may or may not be suited for what you want to do (in most cases a hash-based implementation will be the best choice, while sometimes, when order is important, a tree may be better suited for your needs - See also HashSet vs TreeSet (here at Stackoverflow)).

If you are thinking of a Tree as a special kind of Graph (which it is), then you’re interested in specific properties which apply to graphs, not to collections in general (which, essentially, are lists, and are in turn used to implement things like graphs).

As mentioned in this thread, if you’re interested in modeling Graphs, there are plenty of choices for Graph libraries. Personally I can recommend JGraphT.

I don’t know why there is no graph library within the JDK (and I don’t know whether that’s a good thing to ask?), but I guess Sun decided to leave this to developers, since most applications that require graphs also require very unique implementations.

Community
  • 1
  • 1
scravy
  • 11,904
  • 14
  • 72
  • 127
  • This is a very good answer. Yet, in java even HashMap is a tree-based impl. of a hashtable. So are Linked/Concurrent versions. IdentityHashMap is linear probe, though. I can't really imagine a standard graph structure and its uses. – bestsss Feb 13 '11 at 02:23
  • I'm baffled as to why this post has 9 upvotes. The title of the question is *"Why Java Collection Framework doesn't contain Tree and Graph"* and this post avoids answering the question completely. The concluding paragraph even says *"I don't know why there is no graph library within the JDK"*. – aioobe Nov 30 '14 at 19:53
  • @aioobe I do answer the question by "but I guess Sun decided to...". There is no generic answer to why X is not included anywhere and I don't want to pretend that I knew the policy for including a certain feature in the standard library or not. There's things in there which in my opinion should never have been included, there are things which have to be provided via SPI, etc. Yet I give you an idea about why more fundamental classes are part of it and that some of them even implement the thing (TreeMap, TreeSet are trees) which was asked for. Why is no in-memory database included? But CORBA? – scravy Oct 09 '18 at 06:39
  • Technically any heap is your in memory database. – Dragas Mar 12 '20 at 09:15
7

I suspect that the answer is that it is a combination of two things:

  • A generic tree or graph interface would be "feature poor".
  • It is easier and more efficient to implement a tree or graph using fields to represent child and (if you need them) parent pointers.

Note that neither Apache commons or Google commons have generic graph or tree support. However, I did come across a couple of generic tree/graph hierarchies:

  • The jsfcompounds project on JavaNet has the "com.truchsess.util" graph and tree framework.
  • The OpenJGraph project (inactive) on SourceForge includes tree and graph libraries.
Stephen C
  • 698,415
  • 94
  • 811
  • 1,216
4

Original answer

The main problem with graphs (and more concretely trees, as you know, a special case where there's a one only root, there can be no loops and all nodes are connected) is that each item in that supposed collection should be hold in another structure: the node. It's tough to make standard that kind of treatment as it requires a double layer of "containers".

If anyone ever happens to work with the TreeModel for JTree would notice that is almost impossible to avoid the fact that there are TreeNodes behind (inside?) and you have to manipulate both, nodes and trees.

Anyway, I agree that the structure would be useful but it's very hard to make it standard, just notice that, for instance, neither Apache Commons Collections nor Google Guava, two big collection extensions for the API, don't have them either.

Update

According to the overview of the API:

The main design goal was to produce an API that was reasonably small, both in size, and, more importantly, in "conceptual weight." It was critical that the new functionality not seem alien to current Java programmers; it had to augment current facilities, rather than replacing them. At the same time, the new API had to be powerful enough to provide all the advantages described above.

So while graphs are proper collections and it'd be maybe possible to make it standard, the size and complexity of the implementation would be too big to fit this design goal, because of the previously explained need of a double layer of containers.

Also, Google Guava has added support for graphs. Apache had a sandbox (development in progress) for graphs but seems dormant since 2011.

2

I am afraid the answer is: it's too troublesome to design and maintain a generic tree structure and interfaces (see answer to this post). Most users will need the performance of tree-based implementations of lists and sets, but will not be concerned with the internals, so most of them are hidden.

Community
  • 1
  • 1
rodion
  • 14,729
  • 3
  • 53
  • 55
  • Hmm I've recently made my own generic tree structure and it is really ease to use. You can make like: Tree and specify comparator for tree or let tree use default since its T extends Comparable super T> – Xorty Feb 12 '11 at 15:31
  • 1
    I guess guys at Sun decided that tree was not a top priority feature. Just like lots of other "useful" feature that are omitted, e.g. Bag type, map/reduce/collect methods for collections, etc. I personally, have used `TreeSet` and `TreeMap` plenty of times but never needed a tree interface, a `Set` and `Map` interfaces always sufficed. If you need the features, 3rd party library is the way to go. – rodion Feb 12 '11 at 16:21
  • A big user and fan of the Map and MultiMap in C++ STL, I'm quite surprised it's absent from vanilla Java. Wow. –  May 05 '14 at 23:36
1

There is an interface for trees - javax.swing.tree.TreeModel, which in fact works for arbitrary directed graphs (with a distinguished "root" node). It does not implement the Collection interface, though (since Swing is a bit older than the collection framework, and it is not really clear that being a collection would really be appropriate here). (One implementation of TreeModel is DefaultTreeModel, which uses a tree build from TreeNode objects, which is itself an interface implementable by users.)

Another type of trees are given by the XML-DOM frameworks. Some specific use trees (or mostly "tree nodes") are defined by java.io.File, java.awt.Component (and subclasses), the Compiler tree API (com.sun.source.tree.*), the Doclet API (com.sun.javadoc.*), the Reflection API (java.lang.Class), the language model API (javax.lang.**).

If you compare these API

The problem is, there is no clear general-purpose useful interface for trees - what should such a tree be able to do? Is a tree simply a collection of other trees (those one level lower), or simply a pointer to the root node, where each node itself contains more nodes? Or is a tree a collection of all the nodes? Or a collection of all the contents of all the nodes? Do all nodes have to have "content", or only the leaf nodes? If a tree were a collection of some content elements (and not the nodes itself), how should an iterator behave? What would be the size of a tree?

Here is a proposal for a tree node (or in fact it could be a general directed graph node, if not for the parent-pointer) interface:

/**
 * A general interface for a tree node.
 * @param <N> the concrete node type.
 */
public interface Node<N extends Node<N>> {

   /**
    * equivalent to {@link #children children()}.{@link Collection#isEmpty isEmpty()}.
    * @returns true, if this is a leaf node, else false.
    */
   public boolean isLeaf();

   /**
    * returns a collection of all children of this node.
    * This collection can be changed, if this node is mutable.
    */
   public Collection<N> children();
   /**
    * returns the parent of this node.
    * @return null, if this is the root node, or the node does not know its parent, or has multiple parents.
    */
   public N parent();
}

/**
 * a tree node with content objects.
 * @param <N> the concrete node type
 * @param <C> the type of the content of this node.
 */
public interface ContentNode<C,N extends ContentNode<C,N>>
          extends Node<N>
{
   /**
    * returns the content of this node, if any.
    */
   public C content();
}

Would this be enough for all types of trees, for example the ones listed above? I don't think so.

Paŭlo Ebermann
  • 73,284
  • 20
  • 146
  • 210
  • I have added slight modifications of these interfaces to my [github repository](https://github.com/ePaul/stackoverflow-examples/tree/master/src/de/fencing_game/paul/examples). – Paŭlo Ebermann Mar 04 '11 at 21:36