35

Eric Lippert told me I should "try to always make value types immutable", so I figured I should try to always make value types immutable.

But, I just found this internal mutable struct, System.Web.Util.SimpleBitVector32, in the System.Web assembly, which makes me think that there must be a good reason for having a mutable struct. I'm guessing the reason that they did it this way is because it performed better under testing, and they kept it internal to discourage its misuse. However, that's speculation.

I've C&P'd the source of this struct. What is it that justifies the design decision to use a mutable struct? In general, what sort of benefits can be gained by the approach and when are these benefits significant enough to justify the potential detriments?

[Serializable, StructLayout(LayoutKind.Sequential)]
internal struct SimpleBitVector32
{
    private int data;
    internal SimpleBitVector32(int data)
    {
        this.data = data;
    }

    internal int IntegerValue
    {
        get { return this.data; }
        set { this.data = value; }
    }

    internal bool this[int bit]
    {
        get { 
            return ((this.data & bit) == bit); 
        }
        set {
            int data = this.data;
            if (value) this.data = data | bit;
            else this.data = data & ~bit;
        }
    }

    internal int this[int mask, int offset]
    {
        get { return ((this.data & mask) >> offset); }
        set { this.data = (this.data & ~mask) | (value << offset); }
    }

    internal void Set(int bit)
    {
        this.data |= bit;
    }

    internal void Clear(int bit)
    {
        this.data &= ~bit;
    }
}
smartcaveman
  • 41,281
  • 29
  • 127
  • 212
  • 5
    Look at Eric's comment on this, page 2 : unfortunately no direct link to it, search for "Eric Lippert 16 May 2008 11:16 AM" – BlackICE Nov 13 '11 at 02:04
  • @david, good eye - that's definitely helpful . I wonder what he'd have to say about this particular use – smartcaveman Nov 13 '11 at 02:06
  • @smartcaveman: what Eric says fits in with my hunch that performance *does* play a part in deciding whether the struct is mutable or not. – IAbstract Nov 13 '11 at 14:30
  • @IAbstract, it looks that way. But I'd like to know more about the specific point at which the relative performance of using a struct becomes significant enough to justify its mutability, as well as if there are any circumstances when a public mutable struct would be justified – smartcaveman Nov 13 '11 at 15:29
  • And remember, just because its in the .NET Framework doesn't mean that it is good practice. A good example is the System.LOGIC class in mscorlib v2.0. It has a complete reimplementation of boolean equality. – Kendall Frey Feb 03 '12 at 17:51

5 Answers5

19

Given that the payload is a 32-bit integer, I'd say this could easily have been written as an immutable struct, probably with no impact on performance. Whether you're calling a mutator method that changes the value of a 32-bit field, or replacing a 32-bit struct with a new 32-bit struct, you're still doing the exact same memory operations.

Probably somebody wanted something that acted kind of like an array (while really just being bits in a 32-bit integer), so they decided they wanted to use indexer syntax with it, instead of a less-obvious .WithTheseBitsChanged() method that returns a new struct. Since it wasn't going to be used directly by anyone outside MS's web team, and probably not by very many people even within the web team, I imagine they had quite a bit more leeway in design decisions than the people building the public APIs.

So, no, probably not that way for performance -- it was probably just some programmer's personal preference in coding style, and there was never any compelling reason to change it.

If you're looking for design guidelines, I wouldn't spend too much time looking at code that hasn't been polished for public consumption.

Joe White
  • 94,807
  • 60
  • 220
  • 330
  • +1 array behavior is probably the right answer here if I had to guess – BrokenGlass Nov 13 '11 at 01:56
  • 1
    In general, I'd agree about internals...but this did make it into `System.Web`, and it's used pervasively in the most common classes of that assembly ... That's got to count for something – smartcaveman Nov 13 '11 at 01:59
  • @brokenglass, It's generally used to check and set flags. – smartcaveman Nov 13 '11 at 02:01
  • @smartcaveman: performance only explains why it's a struct, not why its mutable as I had posted - this post is much better in that regard. – BrokenGlass Nov 13 '11 at 02:07
  • There are already a lot of SO questions from mystified .NET programmers that wonder why String.Replace() or DateTime.AddDays() do not work. *Occasional* immutability is not a solution. – Hans Passant Nov 13 '11 at 17:38
  • 1
    As written, it doesn't seem to do anything an immutable struct could not. Suppose, however, it had been written using Interlocked.CompareExchange for the set/reset operations. That would retain perfect storage efficiency while allowing multiple threads to use different bits in the structure without interference. Trying to use an immutable value type would break thread safety. – supercat Nov 16 '11 at 17:25
15

SimpleBitVector32 is mutable, I suspect, for the same reasons that BitVector32 is mutable. In my opinion, the immutable guideline is just that, a guideline; however, one should have a really good reason for doing so.

Consider, also, the Dictionary<TKey, TValue> - I go into some extended details here. The dictionary's Entry struct is mutable - you can change TValue at any time. But, Entry logically represents a value.

Mutability must make sense. I agree with the @JoeWhite: somebody wanted something that acted kind of like an array (while really just being bits in a 32-bit integer); also that both BitVector structs could easily have been ... immutable.

But, as a blanket statement, I disagree with it was probably just some programmer's personal preference in coding style and lean more toward there was never [nor is there] any compelling reason to change it. Simply know and understand the responsibility of using a mutable struct.

Edit
For the record, I do heartily agree that you should always try to make a struct immutable. If you find that requirements dictate member mutability, revisit the design decision and get peers involved.

Update
I was not initially confident in my assessment of performance when considering a mutable value type v. immutable. However, as @David points out, Eric Lippert writes this:

There are times when you need to wring every last bit of performance out of a system. And in those scenarios, you sometimes have to make a tradeoff between code that is clean, pure, robust , understandable, predictable, modifiable and code that is none of the above but blazingly fast.

I bolded pure because a mutable struct does not fit the pure ideal that a struct should be immutable. There are side-affect of writing a mutable struct: understability and predictability are compromised, as Eric goes on to explain:

Mutable value types ... behave in a manner that many people find deeply counterintuitive, and thereby make it easy to write buggy code (or correct code that is easily turned into buggy code by accident.) But yes, they are real fast.

The point Eric is making is that you, as the designer and/or developer need to make a conscious and informed decision. How do you become informed? Eric explains that also:

I would consider coding up two benchmark solutions -- one using mutable structs, one using immutable structs -- and run some realistic user-scenario-focused benchmarks. But here's the thing: do not pick the faster one. Instead, decide BEFORE you run the benchmark how slow is unacceptably slow.

We know that altering a value type is faster than creating a new value type; but considering correctness:

If both solutions are acceptable, choose the one that is clean, correct and fast enough.

The key is being fast enough to offset side affects of choosing mutable over immutable. Only you can determine that.

Community
  • 1
  • 1
IAbstract
  • 19,551
  • 15
  • 98
  • 146
  • That's fair. Of course, everyone who *uses* your code will also need to understand that responsibility, which is probably why `Entry` and `SimpleBitVector32` are internal -- that way only a small group at MS truly needs to understand the implementation. Know your audience: if your only audience is the people you work with, and they're all smart, then you have more freedom to do stuff that only smart people will understand. – Joe White Nov 13 '11 at 03:07
  • I completely agree. For that reason, I would not expose `BitVector32` for public mutability - e.g., **I** would control usage. – IAbstract Nov 13 '11 at 03:10
  • +1, updates definitely added value to the answer. However, I still think the "Only you can determine" is a little bit of a cop out. There's got to be a best practice or standard for us to go by – smartcaveman Nov 13 '11 at 17:25
  • @smartcaveman: sometimes that is the last answer. Once you have the facts, you still have to make the design decisions. The best practice is *immutable value types* except in such rare circumstances when performance becomes the overriding factor. :) – IAbstract Nov 13 '11 at 18:58
  • The solution for the fact that some people don't understand mutable structs is for people to be taught that .net is not Java. Back in the days when C# allowed doing silly things with mutable struct values, making structs immutable was a good way to discourage people from doing those silly things. If the compiler properly forbids attempts to modify fields in read-only contexts, however, instead of allowing such modifications but having them not work, nothing is gained by forbidding updates of fields in non-read-only contexts. Indeed, I'd suggest that so-called "immutable structs" have... – supercat Feb 15 '12 at 23:54
  • ...more pitfalls for a knowledgeable programmer than do "plain old data" structs with exposed fields. Among other things, "mutable" structs in read-only contexts are immutable, while so-called "immutable" structs in non-read-only contexts, are mutable. For example, the `ToString()` method for `KeyValuePair` works by calling `ToString()` on `Key` and `Value` in turn. If, between the time a `KeyValuePair` calls `Key.ToString()` and `Value.ToString()` it gets replaced by a new instance, the resulting string will combine the `Key` from the first with the `Value` of the second. – supercat Feb 15 '12 at 23:58
15

Actually, if you search for all classes containing BitVector in the .NET framework, you'll find a bunch of these beasts :-)

  • System.Collections.Specialized.BitVector32 (the sole public one...)
  • System.Web.Util.SafeBitVector32 (thread safe)
  • System.Web.Util.SimpleBitVector32
  • System.Runtime.Caching.SafeBitVector32 (thread safe)
  • System.Configuration.SafeBitVector32 (thread safe)
  • System.Configuration.SimpleBitVector32

And if you look here were resides the SSCLI (Microsoft Shared Source CLI, aka ROTOR) source of System.Configuration.SimpleBitVector32, you'll find this comment:

//
// This is a cut down copy of System.Collections.Specialized.BitVector32. The
// reason this is here is because it is used rather intensively by Control and
// WebControl. As a result, being able to inline this operations results in a
// measurable performance gain, at the expense of some maintainability.
//
[Serializable()]
internal struct SimpleBitVector32

I believe this says it all. I think the System.Web.Util one is more elaborate but built on the same grounds.

Simon Mourier
  • 132,049
  • 21
  • 248
  • 298
2

Using a struct for a 32- or 64-bit vector as shown here is reasonable, with a few caveats:

  1. I would recommend using an Interlocked.CompareExchange loop when performing any updates to the structure, rather than just using the ordinary Boolean operators directly. If one thread tries to write bit 3 while another tries to write bit 8, neither operation should interfere with the other beyond delaying it a little bit. Use of an Interlocked.CompareExchange loop will avoid the possibility of errant behavior (thread 1 reads value, thread 2 reads old value, thread 1 writes new value, thread 2 writes value computed based on old value and undoes thread 1's change) without needing any other type of locking.
  2. Structure members, other than property setters, which modify "this" should be avoided. It's better to use a static method which accepts the structure as a reference parameter. Invoking a structure member which modifies "this" is generally identical to calling a static method which accepts the member as a reference parameter, both from a semantic and performance standpoint, but there's one key difference: If one tries to pass a read-only structure by reference to a static method, one will get a compiler error. By contrast, if one invokes on a read-only structure a method which modifies "this", there won't be any compiler error but the intended modification won't happen. Since even mutable structures can get treated as read-only in certain contexts, it's far better to get a compiler error when this happens than to have code which will compile but won't work.

Eric Lippert likes to rail on about how mutable structures are evil, but it's important to recognize his relation to them: he's one of the people on the C# team who is charged with making the language support features like closures and iterators. Because of some design decisions early in the creation of .net, properly supporting value-type semantics is difficult in some contexts; his job would be a lot easier if there weren't any mutable value types. I don't begrudge Eric for his point of view, but it's important to note that some principles which may be important in framework and language design are not so applicable to application design.

supercat
  • 77,689
  • 9
  • 166
  • 211
0

If I understand correctly, you cannot make a serializable immutable struct simply by using SerializableAttribute. This is because during deserialization, the serializer instantiates a default instance of the struct, then sets all the fields following instantiation. If they are readonly, deserialization will fail.

Thus, the struct had to be mutable, else a complex serialization system would have been necessary.

Zenexer
  • 18,788
  • 9
  • 71
  • 77