35

By persistent collections I mean collections like those in clojure.

For example, I have a list with the elements (a,b,c). With a normal list, if I add d, my original list will have (a,b,c,d) as its elements. With a persistent list, when I call list.add(d), I get back a new list, holding (a,b,c,d). However, the implementation attempts to share elements between the list wherever possible, so it's much more memory efficient than simply returning a copy of the original list. It also has the advantage of being immutable (if I hold a reference to the original list, then it will always return the original 3 elements).

This is all explained much better elsewhere (e.g. http://en.wikipedia.org/wiki/Persistent_data_structure).

Anyway, my question is... what's the best library for providing this functionality for use in java? Can I use the clojure collections somehow (other that by directly using clojure)?

bm212
  • 1,429
  • 11
  • 12
  • 2
    you mean something like LinkedList for java? I see what you mean now, check out http://functionaljava.org/, it may help you. – Matt Dec 20 '11 at 12:56
  • Do you know off-hand in functionaljava.org implements the sort of structure I mention above? (A bit cheeky I know, I'll go and look at the source code otherwise) – bm212 Dec 20 '11 at 13:40
  • re my previous comment - yes it does, but I'd rather use the Clojure ones if possible (as I know they are battle-hardened) – bm212 Dec 20 '11 at 20:22

13 Answers13

17

Just use the ones in Clojure directly. While obviously you might not want to use the language it's self, you can still use the persistent collections directly as they are all just Java classes.

import clojure.lang.PersistentHashMap;
import clojure.lang.IPersistentMap;

IPersistentMap map = PersistentHashMap.create("key1", "value1");

assert map.get("key1").equals("value1");
IPersistentMap map2 = map.assoc("key1", "value1");

assert map2 != map;
assert map2.get("key1").equals("value1");

(disclaimer: I haven't actually compiled that code :)

the down side is that the collections aren't typed, i.e. there are no generics with them.

Gareth Davis
  • 27,701
  • 12
  • 73
  • 106
  • 2
    lack of generics will be a problem alas, I'll look into that. Otherwise a very good suggestion, especially as these are widely used. – bm212 Dec 20 '11 at 13:33
  • The assertion `map2 != map` actually fails. There seems to be some internal optimization that reuses an existing map if it is possible. – vitaly Jul 17 '17 at 23:43
  • Also, it is `.valAt()` and not `.get()` – vitaly Jul 17 '17 at 23:43
  • This worked for me: IPersistentMap map = PersistentHashMap.create("key1", "value1"); assert map.valAt("key1").equals("value1"); IPersistentMap map2 = map.assoc("key1", "value2"); assert map2 != map; assert map2.valAt("key1").equals("value2"); – vitaly Jul 17 '17 at 23:46
12

What about pcollections?

You can also check out Clojure's implementation of persistent collections (PersistentHashMap, for instance).

seanf
  • 6,504
  • 3
  • 42
  • 52
lsoliveira
  • 4,560
  • 3
  • 20
  • 31
  • Yes, I stumbled across pcollections. Does anyone know if they have been used much in production systems (this is for a work project)? Perhaps that should be a separate question. – bm212 Dec 20 '11 at 13:23
  • Not sure why I didn't think of just using Clojure's collections directly, that might be a better idea overall. – bm212 Dec 20 '11 at 13:23
  • 1
    The only problem with using Clojure is bringing the entire jar into your project (apart from generics, but you can hide that with some wrapper classes). That might or might not be a problem for you. I never used PCollections myself and if you must absolutely rely on a rock solid implementation, go with Clojure (check if the license and/or talk to the developers as you might be able to just extract the implementations you need). – lsoliveira Dec 20 '11 at 13:53
  • 1
    [totallylazy](https://code.google.com/p/totallylazy/) looks much better than Clojure, because it not only has the implementations of persistent collections, but also has a huge number of convenient methods to work with these collections. In addition to the fact that Clojure collections are only usable within Clojure which is a dynamic language (and not usable from Java). – ZhekaKozlov Mar 15 '14 at 10:44
6

I was looking for a slim, Java "friendly" persistent collection framework and took TotallyLazy and PCollections mentioned in this thread for a testdrive, because they sounded most promising to me.

Both provide reasonable simple interfaces to manipulate persistent lists:

// TotallyLazy
PersistentList<String> original = PersistentList.constructors.empty(String.class);
PersistentList<String> modified = original.append("Mars").append("Raider").delete("Raider");

// PCollections
PVector<String> original = TreePVector.<String>empty();
PVector<String> modified = original.plus("Mars").plus("Raider").minus("Raider");

Both PersistentList and PVector extend java.util.List, so both libraries should integrate well into an existing environment.

It turns out, however, that TotallyLazy runs into performance problems when dealing with larger lists (as already mentioned in a comment above by @levantpied). On my MacBook Pro (Late 2013) inserting 100.000 elements and returning the immutable list took TotallyLazy ~2000ms, whereas PCollections finished within ~120ms.

My (simple) test cases are available on Bitbucket, if someone wants to take a more thorough look.

[UPDATE]: I recently had a look at Cyclops X, which is a high performing and more complete lib targeted for functional programming. Cyclops also contains a module for persistent collections.

Stefan Haberl
  • 9,812
  • 7
  • 72
  • 81
5

https://github.com/andrewoma/dexx is a port of Scala's persistent collections to Java. It includes:

  • Set, SortedSet, Map, SortedMap and Vector
  • Adapters to view the persistent collections as java.util equivalents
  • Helpers for easy construction
Andrew
  • 51
  • 1
  • 1
5

Paguro provides type-safe versions of the actual Clojure collections for use in Java 8+. It includes: List (Vector), HashMap, TreeMap, HashSet, and TreeSet. They behave exactly the way you specify in your question and have been painstakingly fit into the existing java.util collections interfaces for maximum type-safe Java compatibility. They are also a little faster than PCollections.

Coding your example in Paguro looks like this:

// List with the elements (a,b,c)
ImList<T> list = vec(a,b,c);

// With a persistent list, when I call list.add(d),
// I get back a new list, holding (a,b,c,d)
ImList<T> newList = list.append(d);

list.size(); // still returns 3

newList.size(); // returns 4

You said,

The implementation attempts to share elements between the list wherever possible, so it's much more memory efficient and fast than simply returning a copy of the original list. It also has the advantage of being immutable (if I hold a reference to the original list, then it will always return the original 3 elements).

Yes, that's exactly how it behaves. Daniel Spiewak explains the speed and efficiency of these collections much better than I could.

GlenPeterson
  • 4,866
  • 5
  • 41
  • 49
3

Functional Java implements a persistent List, lazy List, Set, Map, and Tree. There may be others, but I'm just going by the information on the front page of the site.

I am also interested to know what the best persistent data structure library for Java is. My attention was directed to Functional Java because it is mentioned in the book, Functional Programming for Java Developers.

Jesse Hallett
  • 1,857
  • 17
  • 26
3

There's pcollections (Persistent Collections) library you can use:

http://code.google.com/p/pcollections/

Shantanu Kumar
  • 1,240
  • 1
  • 12
  • 14
3

The top voted answer suggest to directly use the clojure collections which I think is a very good idea. Unfortunately the fact that clojure is a dynamically typed language and Java is not makes the clojure libraries very uncomfortable to use in Java.

Because of this and the lack of light-weight, easy-to-use wrappers for the clojure collections types I have written my own library of Java wrappers using generics for the clojure collection types with a focus on ease of use and clarity when it comes to interfaces.

https://github.com/cornim/ClojureCollections

Maybe this will be of use to somebody.

P.S.: At the moment only PersistentVector, PersistentMap and PersistentList have been implemented.

Cornelius
  • 640
  • 8
  • 22
3

May want to check out clj-ds. I haven't used it, but it seems promising. Based off of the projects readme it extracted the data structures out of Clojure 1.2.0.

Jake McCrary
  • 1,180
  • 9
  • 8
  • It appears to solve all of the problems that arise from using the Clojure collections directly. I'll give it a try. Thanks. – bm212 Dec 20 '11 at 21:11
2

In the same vein as Cornelius Mund, Pure4J ports the Clojure collections into Java and adds Generics support.

However, Pure4J is aimed at introducing pure programming semantics to the JVM through compile time code checking, so it goes further to introduce immutability constraints to your classes, so that the elements of the collection cannot be mutated while the collection exists.

This may or may not be what you want to achieve: if you are just after using the Clojure collections on the JVM I would go with Cornelius' approach, otherwise, if you are interested in pursuing a pure programming approach within Java then you could give Pure4J a try.

Disclosure: I am the developer of this

Rob Moffat
  • 465
  • 5
  • 11
1

totallylazy is a very good FP library which has implementations of:

  • PersistentList<T>: the concrete implementations are LinkedList<T> and TreeList<T> (for random access)
  • PersistentMap<K, V>: the concrete implementations are HashTreeMap<K, V> and ListMap<K, V>
  • PersistentSortedMap<K, V>
  • PersistentSet<T>: the concrete implementation is TreeSet<T>

Example of usage:

import static com.googlecode.totallylazy.collections.PersistentList.constructors.*;
import com.googlecode.totallylazy.collections.PersistentList;
import com.googlecode.totallylazy.numbers.Numbers;

...

PersistentList<Integer> list = list(1, 2, 3);

// Create a new list with 0 prepended
list = list.cons(0);

// Prints 0::1::2::3
System.out.println(list);

// Do some actions on this list (e.g. remove all even numbers)
list = list.filter(Numbers.odd);
// Prints 1::3
System.out.println(list);

totallylazy is constantly being maintained. The main disadvantage is the total absence of Javadoc.

levant pied
  • 3,886
  • 5
  • 37
  • 56
ZhekaKozlov
  • 36,558
  • 20
  • 126
  • 155
  • 1
    The other problem is that it is quite slow. For example, [this](https://gist.github.com/anonymous/751a036ba7cc2d703b82) takes ~700-800ms to run. Similar thing using [clj-ds](https://github.com/krukow/clj-ds), i.e. [this](https://gist.github.com/anonymous/fb95d20f0e34341b16ce), takes ~9x less (around 85ms). For comparison, Java's HashMap takes ~25ms. – levant pied May 22 '15 at 22:21
  • @levantpied How big is the `num` in your [example](https://gist.github.com/anonymous/751a036ba7cc2d703b82)? – Stefan Haberl Oct 22 '15 at 14:20
  • @StefanHaberl `final int num = 100000;` – levant pied Oct 22 '15 at 19:21
1

I'm surprised nobody mentioned vavr. I use it for a long time now.

http://www.vavr.io

Description from their site:

Vavr core is a functional library for Java. It helps to reduce the amount of code and to increase the robustness. A first step towards functional programming is to start thinking in immutable values. Vavr provides immutable collections and the necessary functions and control structures to operate on these values. The results are beautiful and just work.

Ruslan Pilin
  • 133
  • 2
  • 4
0

https://github.com/arnohaase/a-foundation is another port of Scala's libraries.

It is also available from Maven Central: com.ajjpj.a-foundation:a-foundation

mkdev
  • 972
  • 9
  • 12