2

I'm looking for a solution like this one proposed by Eric Lippert. It is a great implementation, as it is immutable plus the append time is O(1), but it downside is the O(i) random access time.

On the other side there is a great implementation of collection with O(1) on both append and random access. The only problem is that it strongly relies on mutability.

My questions is how to implement a collection which combines the benefits of both solutions? That is:

  1. immutability
  2. O(1) append time
  3. O(1) random access time

Memory complexity is not that big issue for me.

Community
  • 1
  • 1
Maciej Sz
  • 11,151
  • 7
  • 40
  • 56
  • Try the C5 Generics: http://www.itu.dk/research/c5/ – Darek Jul 08 '14 at 12:43
  • 1
    Which one in particular? – Maciej Sz Jul 08 '14 at 12:45
  • I believe, HashedSet, HashedBag, HashedArrayList, HashedLinkedList fit your speed need, while GuardedCollection provides read-only access. Not exactly immutability, but should suffice. – Darek Jul 08 '14 at 12:50
  • 3
    "Should suffice"? I don't think so [from docs]: "Structural modifcations such as adding, inserting, updating or removing items will throw ReadOnlyCollectionException.". It does not provide appending. –  Jul 08 '14 at 13:06
  • How does one define immutability? I.e. Strings are not mutable. You can't append a new character to the end of the string. Perhaps I misunderstood the question. The original HashedSet or HashedBag remain mutable, you can add/remove elements to it. But if one is concerned about immutability, one can use GuardedList. – Darek Jul 08 '14 at 13:11
  • I'm not sure how you would get `O(1)` append *and* `O(1)` random access. If you want to be able to append elements, you can either copy the source collection, which keeps `O(1)` random access but gives you `O(n)` append; or you can do what Eric did, and retain the old list segment(s), which gives you `O(1)` append time but `O(n)` random access. You could wrap a map for index-based lookups, but I don't know of any *immutable* map implementations that offer `O(1)` inserts and `O(1)` lookups, so you run into the same problem there. – Mike Strobel Jul 08 '14 at 13:27
  • "It is not possible" is also a valid answer (if it is correct). I would then search for a next-best solution which for my case is `O(log n)` lookup. – Maciej Sz Jul 08 '14 at 13:30
  • 1
    @MaciejSz The [Scala documentation](http://docs.scala-lang.org/overviews/collections/performance-characteristics.html) claims "effectively constant" add/lookup complexity for its immutable HashMap. It may be worth looking at how it works. Though it doesn't offer random access, you could wrap such a map in a list implementation like Eric's and use it as an index-based lookup. – Mike Strobel Jul 08 '14 at 13:44
  • @MikeStrobel That sounds reasonably close to my needs. However my knowledge of Scala is very limited at this point, that is why I asked this question, hoping somebody who already did this would explain in any language that I know. – Maciej Sz Jul 08 '14 at 13:50
  • OK, I promised I will shut up, but I can't ... Can Eric Lippert's implementation truly be called immutable? I mean, when a string is appended to and assigned back to the original, every characters is copied to a new location. So if a collection was deemed immutable, wouldn't each of its elements had to be copied to a new location as well, when a new element is added to it. Does Eric's implementation provide that? Not to mention that the constructor private List(int i, List list) is missing in the example. – Darek Jul 08 '14 at 14:09
  • @Darek A simplistic example of Eric's implementation at work is if you had a list `[1, 2, 3]` and appended the element `4`: the resulting list would have two pointers: one to the first segment of the combined list, containing the original list (or the array wrapped by the original list), `[1, 2, 3]`, and a second segment containing the newly appended element(s), `[4]`. The original list is not copied element-wise; a direct reference to the original list is incorporated to avoid the copy. It is immutable because the original is never *modified*. – Mike Strobel Jul 08 '14 at 14:16
  • @MikeStrobel Thanks for your explanation. I do agree, that the original is never modified. but it is never copied either, which is a different behavior than string immutability. I'd call it more of a read-only, but my purist days are way over. – Darek Jul 08 '14 at 14:22
  • @Darek immutability does not require copying; it would be perfectly valid for strings to be implemented with a tree of immutable segments. It is enough that the data cannot be mutated. – Mike Strobel Jul 08 '14 at 14:34
  • If you need O(1) update and access, the simplest thing to do is to drop the immutability requirement. Writing software with mutable collections shouldn't be that hard to do (IMHO I have never had a problem with them), and they are usually more performant. – Peter Lawrey Jul 09 '14 at 06:05
  • 1
    @PeterLawrey this becomes a problem when you aggregate the collection in another immutable object and then want to do some operation on that object. If this operation yields new instance of that object with modified contents of the aggregated collection then there is no choice, but to have the collection immutable as well, or you break the entire chain of immutability. Another option would be to clone the entire collection, which (in Eric's words) of course would be dumb. – Maciej Sz Jul 09 '14 at 08:23

3 Answers3

5

I do not know of a way to implement a list which has all your requirements -- immutability, persistence, O(1) insertion, O(1) removal, O(1) random access.

My suggestion to you is that (1) if you are interested in this topic, read Chris Okasaki's book. (Or, get a copy of his thesis, which was the basis of the book.) And (2) Chris Okasaki suggests the data structure described here for your purposes:

http://www.codeproject.com/Articles/9680/Persistent-Data-Structures#RandomAccessLists

This list is O(1) insert and O(1) removal to the head and O(lg) for random access.

Eric Lippert
  • 647,829
  • 179
  • 1,238
  • 2,067
1

I'm not sure how you would get O(1) append and O(1) random access unless you include another data structure.

Typically, if you want to be able to append elements, you can either copy the source collection, which keeps O(1) random access but gives you O(n) append; or you can do what Eric did, and retain the old list segment(s), which gives you O(1) append time but O(n) random access. Assuming the constant append time is critical, that leaves you with the option of incorporating a second data structure to provide constant-time random access.

The Scala documentation claims "effectively constant" add and lookup times for its immutable HashMap. If true, I suggest looking at their implementation. You could take a solution like Eric's and add an efficient immutable map of indexes to the elements themselves. This would add a bit of memory overhead, though, and while append operations would be efficient, insertions would not.

I am, however, a bit skeptical of the Scala HashMap performance claims. Other immutable map implementations claim log32(n) complexity, which presumably applies to both add and lookup operations. My gut tells me that you're not going to get better than logarithmic complexity, though log32(n) is pretty reasonable.

Mike Strobel
  • 25,075
  • 57
  • 69
  • _take a solution like Eric's and add an efficient immutable of list indexes_: that is what I've been looking into before asking this question. The problem is that _efficient immutable of list indexes_ is actually the same data structure that I need. Can you see the infinite recursion here? :) – Maciej Sz Jul 08 '14 at 13:56
  • @MaciejSz Ah, that should have read "*map* of indexes". And, yes, the problem is pretty much the same, but you might have better luck looking for immutable *map* structures than immutable *list* structures, given your performance requirements. – Mike Strobel Jul 08 '14 at 13:58
-1
        var bag = new HashBag<int>
                  {
                      1,
                      2,
                      3
                  };
        var g = new GuardedCollection<int>(bag);
        bag.Add(4);
        g.Add(5);

The HashBag remains mutable, but you can still pass it to another consumer as an immutable GuardedCollection.

Darek
  • 4,687
  • 31
  • 47
  • 1) In real immutable collection (in my point of view), you do not need to rember underlying collection) - but ok, we could create own wrapper. 2) Is adding elements to HashBag thread safe? –  Jul 08 '14 at 13:17
  • Ok, but what will happen now if the consumer wishes to append another item? He would have to extract the underlying elements from the `g:GuardedCollection` which is `O(n)`. The linked list implementation I've mentioned is free of this cost. – Maciej Sz Jul 08 '14 at 13:19
  • @MaciejSz I am not sure if I get your requirements. You want a collection immutable and you want to be able to append to it? That seems to be a contradiction. – Darek Jul 08 '14 at 13:22
  • @pwas None of the four are thread-safe, but that was not a requirement stated by Maciej. For thread-safe collections I rely explicitly on ConcurrentBag and ConcurrentDictionary from TPL. – Darek Jul 08 '14 at 13:24
  • The important thing: modyfing HashBag affects the previous collection. What Maciej is want to get i think: `var coll1 = new ImmutableCollection() { "a", "b" }` and `var coll2 = coll1.Add("c")`. In this case, 'coll2' is new instance of collection with 3 elements, but `coll1` remains with two first elements. –  Jul 08 '14 at 13:24
  • @Darek Look at the first link I've provided. Yes, this list is both immutable and you can append to it, as each new added item creates a new list object while reusing the existing ones. – Maciej Sz Jul 08 '14 at 13:24
  • 1
    @Darek - immutable collections are thread safe, so if such a collection is desired, I think the other solution should also meet that requirement. –  Jul 08 '14 at 13:24
  • I see, @MaciejSz, so you want your collection to remain mutable, but its elements to stay immutable. Interesting challenge ... – Darek Jul 08 '14 at 13:28
  • 1
    @Darek The collection is immutable; appending yields a new collection, much like appending two strings yields a new string. In both cases, the original was immutable and remains untouched. – Mike Strobel Jul 08 '14 at 13:31
  • @MikeStrobel In that scenario, can Add of O(1) be achieved? Every time you add a new element you have to copy the entire collection. – Darek Jul 08 '14 at 13:33
  • @Darek read [the first link](http://stackoverflow.com/a/10045677/1697320). Really. – Maciej Sz Jul 08 '14 at 13:34
  • It might be a tad over my head @MaciejSz .. I will shut up now :) – Darek Jul 08 '14 at 13:36