Immutable Data Structures in C#

Question

I was reading some entries in Eric Lippert's blog about immutable data structures and I got to thinking, why doesn't C# have this built into the standard libraries? It seems strange for something with obvious reuse to not be already implemented out of the box.

EDIT: I feel I might be misunderstood on my question. I'm not asking how to implement one in C#, I'm asking why some of the basic data structures (Stack, Queue, etc.) aren't already available as immutable variants.

because sometimes, you need side effects, well may be most of the time. — DarthVader, Dec 14 '11 at 18:43
@DarthVader - if you needed side effects most of the time then functional languages wouldn't exist. — TrueWill, Dec 14 '11 at 18:49
What would you do with an immutable Stack or Queue? There is a ReadOnlyCollection class that provides an immutable view of mutable collection types. — Dan Bryant, Dec 14 '11 at 19:05
@DanBryant Take a look at some of the blog entries in the link I gave, and also this: http://blogs.msdn.com/b/ericlippert/archive/2007/12/04/immutability-in-c-part-two-a-simple-immutable-stack.aspx for the question on `Stack`s. ReadOnlyCollection isn't quite the same thing as what I'm talking about. — CassOnMars, Dec 14 '11 at 19:12
@d_r_w, I see, interesting. I'd never heard of this style of collection before, so I'm guessing the BCL developers either a) never even thought of including it or b) thought it would be used too infrequently to warrant the effort of developing/maintaining the feature. — Dan Bryant, Dec 14 '11 at 19:20
@TrueWill yes you need side effects most of the time, that s why functional langs are not as popular as the non-functional ones. — DarthVader, Dec 14 '11 at 19:54
@DarthVader, see the answer to a question on side effects: http://stackoverflow.com/a/766736/374750 — CassOnMars, Dec 14 '11 at 19:56
Could this question be more appropriate in https://softwareengineering.stackexchange.com/ ? — Giulio Caccin, Nov 19 '19 at 08:13

score 8 · Answer 1 · answered Dec 24 '12 at 16:41

8

It does now.

.NET just shipped their first immutable collections, which I suggest you try out.

answered Dec 24 '12 at 16:41

Andrew Arnott

80,040
26
132
171

score 3 · Answer 2 · answered Dec 16 '11 at 18:18

Any framework, language, or combination thereof that is not a purely experimental exercise has a market. Some purely experimental ones go on to develop a market.

In this sense, "market" does not necessarily refer to market economics, it's as true whether the producers of the framework/language/both are commercially or non-commercially oriented and distributing the framework/language/both (I'm just going to say "framework" for now on) at a cost or for free. Indeed, free-as-in-both-beer-and-speech projects can be even more heavily dependent on their markets than commercial projects in this way, because their producers are a subset of their market. The market is anyone and everyone who uses it.

The nature of this market will affect the framework in several ways both by organic processes (some parts prove more popular than others and get a bigger share of the mindspace within the community that educates its own members about them) and by fiat (someone decides a feature will serve the market and therefore adds it).

When .NET was developed, it was developed to serve its future market. Ideas about what would serve them therefore influenced decisions as to what should and should not be included in the FCL, the runtime, and the languages that work with it.

Someone decided that we'd quite likely need System.Collections.ArrayList. Someone decided we'd quite likely need System.IO.DirectoryInfo to have a Delete() method. Nobody decided we'd be likely to need a System.Collections.ImmutableStack.

Maybe nobody thought of it at all. Maybe someone did and even implemented it and then it was decided not to be of enough general use. Maybe it was debated at length within Microsoft. I've never worked for MS, so I don't have a clue.

What I can consider though, is the question as to what the people who were using the .NET framework in 2002 using in 2001.

Well, COM, ActiveX, ("Classic") ASP, and VB6 & VBScript is now much less used than it was, so it can be said to have been replaced by .NET. Indeed, that could be said to have been an intention.

As well as VB6 & VBScript, a considerable number who were writing in C++ and Java with Windows as a sole or primary target platform are now at least partly using .NET instead. Again, I think that could be said to be an intention, or at the very least I don't think MS were surprised that it went that way.

In COM we had an enumerator-object based foreach approach to iteration that had direct support in some languages (the VB family*), and .NET we have an enumerator-object based foreach approach to iteration that has direct support in some languages (C#, VB.NET, and others)†.

In C++ we had a rich set of collection types from the STL, and in .NET we have a rich set of collection types from the FCL (and typesafe generic types from .NET2.0 on).

In Java we had a strong everything-is-an-object style of OOP with a small set of methods provided by a common base-type and a boxing mechanism to allow for simple efficient primitives while keeping to this style. In .NET we have a strong everything-is-an-object style of OOP with a small set of methods provided by a common base-type and a (different) boxing mechanism to allow for simple efficient primitives while keeping to this style.

These cases show choices that are unsurprising considering who was likely to end up being the market for .NET (though such broad statements above shouldn't be read in a way that underestimates the amount of work and subtlety of issues within each of them). Another thing that relates to this is when .NET differs from COM or classic VB or C++ or Java, there may well be a bit of an explanation given in the documentation. When .NET differs from Haskell or Lisp, nobody feels the need to point it out!

Now, of course there are things done differently in .NET than to any of the above (or there'd have been no point and we could have stayed with COM etc.)

However, my point is that out of the near-infinite range of possible things that could end up in a framework like .NET, there are some complete no-brainers ("they might need some sort of string type..."), some close-to-obvious ("this is really easy to do in COM, so it must be easy to do in .NET"), some harder calls ("this will be more complicated than in VB6, but the benefits are worth it"), some improvements ("locale support could really be made a lot easier for developers, so lets build a new approach to the old issue") and some that were less related to the above.

At the other extreme, we can probably all imagine something that would be so out there as to be bizarre ("hey, all coders like Conway's Life - let's put a Conway's Life right into the framework") and hence there's no surprise at not finding it supported.

So far I've quickly brushed over a lot of hard work and difficult design balances in a way that makes the design they came up with seem simpler than it no doubt was. Most likely, the more "obvious" it seems to an outsider after the fact, the more difficult it was for the designers.

Immutable collection types falls into the large range of possible components to the FCL that while not as bizarre as a built-in-conway-support idea, was not as strongly called for by examining the market as a mutable list or a way to encapsulate locale information nicely. It would have been novel to much of the initial market, and therefore at risk of ending up not being used. In an alternate universe there's a .NET1.0 with immutable collections, but it's not very surprising that there wasn't one here.

*At least for consuming. Producing IEnumVARIANT implementations in VB6 wasn't simple, and could involve writing pointer values straight into v-tables in a rather nasty way that it suddenly occurs to me, is possibly not even allowed by today's DEP.

†With a sometimes impossible to implement .Reset() method. Is there any reason for this other than it was in IEnumVARIANT? Was it even ever much used in IEnumVARIANT?

score 2 · Answer 3 · answered Dec 14 '11 at 18:41

2

Why can't you make an immutable struct? Consider the following:

public struct Coordinate
{
    public int X
    {
        get { return _x; }
    }
    private int _x;

    public int Y
    {
        get { return _y; }
    }
    private int _y;

    public Coordinate(int x, int y)
    {
        _x = x;
        _y = y;
    }
}

It's an immutable value type.

answered Dec 14 '11 at 18:41

Trevor Elliott

11,292
11
63
102

2

Unfortunately it's not immutable. The value can still change given the right circumstances. In particular if the struct is passed by ref it's values can be over written by direct assignment. – JaredPar Dec 14 '11 at 18:44
2

@JaredPar could you please post some sample code? It would be interesting to see how this can be done. – oleksii Dec 14 '11 at 18:47
1

Maybe you can show us an example. I can't reproduce a way of doing that without using unsafe code. – Trevor Elliott Dec 14 '11 at 18:47
3

Try the following. `void Example(ref Coordinate c) { c = new Coordinate(); }`. Now any code which passed a field / local of type `Coordinate` to `Example` would see the values mutate. Hence it's not immutable. Note: this is different than with a `class`. With a class you'd be changing what object a reference points to, with a `struct` you are mutating the contents of an existing object / value. – JaredPar Dec 14 '11 at 18:48
@JaredPar Are you implying that we cannot do any immutable object in C++, where class is just extended version of struct? – tia Dec 14 '11 at 19:09
@tia, C++ is quite a bit different, as objects are typically 'const' by usage rather than by definition. C++ also lets you bypass its static suggestions fairly easily (by casting away const, for instance). – Dan Bryant Dec 14 '11 at 19:11
@tia C++ is nearly impossible to get 100% immutable objects. Any attempt you make to protect an object via API's I can subvert by grabbing a raw `byte*` to the object and freely mutate it's contents. There are some tricks you can play with making memory readonly but almost all of them can be subverted by the determined programmer. But you can get a reasonable expectation of immutability in C++ because you can control assignment (just make it private an undefined). So for the normal cases it can be done. – JaredPar Dec 14 '11 at 19:21
You can get a reasonable expectation of immutability in C#, too; just make it clear via access modifiers, API design, and documentation that consumers expect your type to be immutable. Programming is not about making it *impossible* for someone else to screw up what you did, just hard for them to accidentally screw it up. – mqp Dec 14 '11 at 19:29
@mquander For me the line is "can the consumer violate immutability with normal usage (i.e. not reflection)?". If they can violate it with normal usage then i won't call it immutable. This is why I don't consider `struct` values to ever be truly immutable as simple assignment can defeat them. – JaredPar Dec 14 '11 at 20:22
@JaredPar: If Coordinate was a class, a caller that passed in a variable "foo" of type Coordinate would see foo.X and foo.Y change, just as with a struct (foo would no longer point to the same instance). The place where the semantics of immutable classes and "immutable" structs differ is with "this". Structure methods are passed "this" by reference, while class methods are passed "this" by value. I wish it were possible to mark struct or non-virtual class methods to indicate whether they should take "this" by reference or value, since there are situations where each... – supercat Dec 15 '11 at 15:47
...could benefit from the other's calling convention. For example, an immutable class type could behave as a mutable value type if properties could replace "this". For example, if one has a string and one wants a string which identical except that the fifth character is replaced with 'E', one could do that if an indexed property could take "this" by reference. `MyString[5]='E';` would replace MyString with a new string whose fifth character was 'E'. – supercat Dec 15 '11 at 15:53
@supercat yes but if you were holding onto a reference to a field within `Coordinate` then the semantics change between a `class` and `struct`. The key here for me is the ability to mutate the underlying object. Once you can do that with normal programming constructs then you've lost immutability. Note: My definition is far from definitive. But I believe the distinction is important – JaredPar Dec 15 '11 at 17:15
@JaredPar: The fields of `Coordinate` are private. How would anyone get a reference to them? Structures with exposed fields have useful mutable-value semantics; they don't really fit the "everything is an Object" model of .net--not because they're evil, but because of limitations in .net's model. Immutable structures behave like immutable reference types except that (1) the default value of a structure type is a structure in which all fields have default values, whereas the default for a class type is "null"; (2) structures are never ReferenceEquals; and... – supercat Dec 15 '11 at 18:03
...(3) the "this" parameter for structures is passed by reference rather than by value. Suppose a routine takes two 'ref' parameters, p1 and p2, of type Coordinate, and sets p1 to a new Coordinate. Regardless of whether Coordinate is a class or struct, the change will affect p2 only if the same variable or field was passed to both p1 and p2. On the other hand, suppose Coordinate had a method which would take a delegate and call that delegate before returning _x. If the call to delegate modifies the field or variable upon which that method was called... – supercat Dec 15 '11 at 18:07
...the method would see the change if Coordinate was a struct (since "this" is passed by reference), but not if it's a class (since "this" is passed by value). – supercat Dec 15 '11 at 18:08
@supercat even if the fields are private the `Coordinate` structure can still use them by reference. It's possible for the `Coordinate` structure as defined to observe itself mutating. That should never be true for an immutable object. – JaredPar Dec 15 '11 at 18:15
@JaredPar: As noted, struct methods receive "this" by reference; this is semantically significant if a struct uses "this" after calling outside code. The same behavior would occur with classes if `void instanceMethod()...` were replaced by `static void theType.staticMethod(ref it)...` and the invocation was changed from `someInstance.instanceMethod();` to `staticMethod(ref someInstance);`. – supercat Dec 15 '11 at 18:33
@JaredPar Objects of Coordinate are immutable. If you change a reference to a new object that doesn't really change this fact. – tymtam Jan 05 '15 at 02:37

mqp · Answer 4 · 2011-12-14T19:13:22.590

It's hard to work with immutable data structures unless you have some functional programming constructs. Suppose you wanted to create an immutable vector containing every other capital letter. How would you do it unless you

A) had functions that did things like range(65, 91), filter(only even) and map(int -> char) to create the sequence in one shot and then turn it into an array

B) created the vector as mutable, added the characters in a loop, then then "froze" it, making it immutable?

By the way, C# does have the B option to some extent -- ReadOnlyCollection can wrap a mutable collection and prevent people from mutating it. However, it's a pain in the ass to do that all the time (and obviously it's hard to support sharing structure between copies when you don't know if something is going to become immutable or not.) A is a better option.

Remember, when C# 1.0 existed, it didn't have anonymous functions, it didn't have language support for generators or other laziness, it didn't have any functional APIs like LINQ -- not even map or filter -- it didn't have concise array initialization syntax (you couldn't write new int[] { 1, 2, 5 }) and it didn't have generic types; just putting stuff into and getting stuff out of collections normally was a pain. So I don't think it would have been a great choice to spend time on making robust immutable collections with such poor language support for using them.

One difficulty with something like "ReadOnlyCollection" is that it makes no distinction between collections which are are immutable and those which can be changed by "someone else". IMHO, .net would have benefited greatly from an ImmutableArray class, of which String would simply be a special case (ImmutableArray), and with both ImmutableArray and Array inheriting from abstract class ReadableArray. — supercat, Dec 15 '11 at 15:39

score 2 · Accepted Answer · answered Dec 14 '11 at 19:17

2

I'll quote from that Eric Lippert blog that you've been reading:

because no one ever designed, specified, implemented, tested, documented and shipped that feature.

In other words, there is no reason other than it hasn't been high enough value or priority to get done ahead of all the other things they're working on.

answered Dec 14 '11 at 19:17

Scott Pedersen

1,311
1
10
19

2

OK, but someone designed, specified, implemented, tested, documented, and shipped lots of mutable data structures instead, so it's a perfectly sensible question to ask why they made that decision. – mqp Dec 14 '11 at 19:25

score 0 · Answer 6 · answered Dec 15 '11 at 16:25

It would be nice if .net had some really solid support for immutable data holders (classes and structures). One difficulty with adding really good support for such things, though, is that taking maximum advantage of mutable and immutable data structures would require some fundamental changes to the way inheritance works. While I would like to see such support in the next major object-oriented framework, I don't know that it can be efficiently worked into existing frameworks like .net or Java.

To see the problem, imagine that there are two basic data types: basicItem and deluxeItem (which is a basicItem with a few extra fields added). Each can exist in two concrete forms: mutable and immutable. Each can also be described in an abstract form: readable. Thus, there should be six data types; all but ReadableBasicItem should be substitutable for at least one other:

ReadableBasicItem: Not substitutable for anything
MutableBasicItem: ReadableBasicItem
ImmutableBasicItem: ReadableBasicItem
ReadableDeluxeItem: ReadableBasicItem
MutableDeluxeItem: ReadableDeluxeItem, MutableBasicItem (also ReadableBasicItem)
ImmutableDeluxeItem: ReadableDeluxeItem, ImmutableBasicItem (also ReadableBasicItem)

Even thought the underlying data type has just one base and one derived type, there inheritance graph has two "diamonds" since both "MutableDeluxeItem" and "ImmutableDeluxeItem" have two parents (MutableBasicItem and ReadableDeluxeItem), both of which inherit from ReadableBasicItem. Existing class architectures cannot effectively deal with that. Note that it wouldn't be necessary to support generalized multiple inheritance; merely to allow some specific variants such as those above (which, despite having "diamonds" in the inheritance graph, have an internal structure such that both ReadableDeluxeItem and MutableBasicItem would inherit from "the same" ReadableBasicItem).

Also, while support for that style of inheritance of mutable and immutable types might be nice, the biggest payoff wouldn't happen unless the system had a means of distinguishing heap-stored objects that should expose value semantics from those which should expose reference semantics, could distinguish mutable objects from immutable ones, and could allow objects to start out in an "uncommitted" state (neither mutable nor guaranteed immutable). Copying a reference to a heap object with mutable value semantics should perform a memberwise clone on that object and any nested objects with mutable value semantics, except in cases where the original reference would be guaranteed destroyed; the clones should start life as uncommitted, but be CompareExchange'd to mutable or immutable as needed.

Adding framework support for such features would allow copy-on-write value semantics to be implemented much more efficiently than would be possible without framework support, but such support would really have to be built into the framework from the ground up. I don't think it could very well be overlaid onto an existing framework.

Don't you just hate downvotes with no explanations? Have an upvote, some good points here! — Andriy Drozdyuk, May 29 '12 at 13:05
Thanks. My answer was perhaps a bit long-winded, but I'm glad someone took the time to read it. There are ways of achieving effects similar to what I described if one doesn't mind an extra layer of indirection, but in the absence of language support they end up being rather clunky. One thing that would help a lot and might be feasible with some CLR changes, would be the ability to define an class `Foo:T via X where T:interface`, where the class would have an field of type `T`, called `X`, and where all members of that interface's vtable would be mapped to an auto-generated... — supercat, May 29 '12 at 15:26
...wrapper which would invoke the corresponding members of `(T)X`. If one could do that sort of thing, one could produce a generic moderately-aggressive copy-on-write wrapper class which would have much lower multi-threading overhead than a "perfect" copy-on-write class, but still retain must of the storage and run-time efficiencies such things can offer (using copy-on-write with a tree not only makes deep-copy operations more time and storage efficient--it also helps with things like comparison, intersection, difference, and union operations). — supercat, May 29 '12 at 15:35

Immutable Data Structures in C#

6 Answers6