The following are the only ways classes are different from structs in C# (please correct me if I'm wrong):

  • Class variables are references, while struct variables are values, so the entire value of a struct is copied in assignments and when it is passed as a parameter.
  • Class variables are pointers stored on the stack that point to memory on the heap, while struct variables are stored on the heap as values.

Suppose I have an immutable struct, that is, a struct with fields that cannot be modified once initialized. Each time I pass this struct as a parameter or use it in an assignment, the value would be copied and stored on the stack.

Then suppose I turn this immutable struct into an immutable class. A single instance of this class would be created once, and only the reference to it would be copied in assignments and when it is passed as a parameter.

If the object was mutable, the behavior in these two cases would be different: when one would change the object, in the first case the copy of the struct would be modified, while in the second case the original object would be changed. However, in both cases the object is immutable, therefore there is no difference whether this is actually a class or a struct for the user of this object.
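To make the mutable case concrete, here is a minimal sketch of the difference described above. `PointStruct` and `PointClass` are made-up types used only for illustration.

```csharp
using System;

// Hypothetical types for illustration; neither appears in the question.
public struct PointStruct { public int X; }
public class PointClass { public int X; }

public static class CopySemanticsDemo
{
    // Assigning a struct copies the value, so changing the copy
    // leaves the original untouched.
    public static int OriginalAfterStructCopy()
    {
        var original = new PointStruct { X = 1 };
        var copy = original;   // whole value copied
        copy.X = 2;
        return original.X;     // still 1
    }

    // Assigning a class variable copies only the reference,
    // so both names refer to the same object.
    public static int OriginalAfterClassCopy()
    {
        var original = new PointClass { X = 1 };
        var alias = original;  // reference copied, same object
        alias.X = 2;
        return original.X;     // now 2
    }

    public static void Main()
    {
        Console.WriteLine(OriginalAfterStructCopy()); // 1
        Console.WriteLine(OriginalAfterClassCopy());  // 2
    }
}
```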

Since copying reference is cheaper than copying struct, why would one use an immutable struct?

Also, since mutable structs are evil, it looks like there is no reason to use structs at all.

Where am I wrong?

Georgii Oleinikov
  • Actually, structs are stored on the stack, not the heap which can make them much more efficient if they're sufficiently compact. – itsme86 Jan 03 '13 at 21:18
  • Structs must be defined prior to use and cannot be null. Class instances need not be defined and can be null. – Dour High Arch Jan 03 '13 at 21:20
  • @itsme86 - `struct`s are stored wherever they're contained. A local variable `struct` may be stored on the stack (implementation detail), but a `struct` that is part of a `class` is stored as part of that `class` data, usually on the heap (implementation detail). – Damien_The_Unbeliever Jan 04 '13 at 15:21
  • Class member fields are stored on the heap. Local variables, such as method or property variables or method parameters, are stored on top of the stack when the method or property is invoked, and are popped off the stack when the method or property returns. But keep in mind that for local variables which are reference type (classes) only their reference will be stored on the stack, and their contents will be stored on the heap. (This is simply a rule-of-thumb; optimizations and implementation differences exist which change how these behave in some cases.) – Brandon Bonds Nov 17 '15 at 17:20

4 Answers

Since copying reference is cheaper than copying struct, why would one use an immutable struct?

This isn't always true. Copying a reference means copying 8 bytes in a 64-bit process, which is potentially larger than many structs.

Also note that creation of the class is likely more expensive. Creating a struct is often done completely on the stack (though there are many exceptions), which is very fast. Creating a class requires creating the object handle (for the garbage collector), creating the reference on the stack, and tracking the object's lifetime. This can add GC pressure, which also has a real cost.

That being said, creating a large immutable struct is likely not a good idea, which is part of why the Guidelines for choosing between Classes and Structures recommend always using a class if your struct will be more than 16 bytes, if it will be boxed, and other issues that make the difference smaller.

Performance aside, I often base my decision more on the intended usage and meaning of the type in question. Value types should be used to represent a single value (again, refer to the guidelines), and often have a semantic meaning and expected usage different from classes. This is often just as important as the performance characteristics when choosing between a class and a struct.

Reed Copsey
  • Though they violate those rules left and right in the framework :) – Rotem Jan 03 '13 at 21:20
  • @Rotem There's a reason they're *guidelines* and not *rules*. It means you shouldn't be violating them unless you have a compelling reason for doing so, and understand the implications of your actions. I'm sure the framework designers understood the implications of their decision, while the OP is unlikely to. – Servy Jan 03 '13 at 21:21
  • @Servy Meant as a humorous side point, not saying the guidelines are rubbish. – Rotem Jan 03 '13 at 21:23
  • `Creating a struct is done completely on the stack` may not be true. [in the Microsoft implementation of C# on the desktop CLR, value types are stored on the stack when the value is a local variable or temporary that is not a closed-over local variable of a lambda or anonymous method, and the method body is not an iterator block, and the jitter chooses to not enregister the value](http://blogs.msdn.com/b/ericlippert/archive/2010/09/30/the-truth-about-value-types.aspx) – L.B Jan 03 '13 at 21:24
  • @Rotem You used the word "rules". I'm pointing out that they were very specifically called "guidelines" instead, because they knew that they don't always apply. – Servy Jan 03 '13 at 21:24
  • It's also worth mentioning that structs and classes carry a semantic implication to the user of whether it's really a single value or not, which is worth taking into consideration. You shouldn't *just* consider performance. – Servy Jan 03 '13 at 21:26

Reed's answer is quite good but just to add a few extra points:

please correct me if I'm wrong

You are basically on the right track here. You've made the common error of confusing variables with values. Variables are storage locations; values are stored in variables. And you are flirting with the commonly-stated myth that "value types go on the stack"; rather, variables go on either short-term or long-term storage, because variables are storage locations. Whether a variable goes on short or long term storage depends on its known lifetime, not its type.

But all of that is not particularly relevant to your question, which boils down to asking for a refutation of this syllogism:

  • Mutable structs are evil.
  • Reference copying is cheaper than struct copying, so immutable structs are always worse.
  • Therefore, there is never any use for structs.

We can refute the syllogism in several ways.

First, yes, mutable structs are evil. However, they are sometimes very useful because in some limited scenarios, you can get a performance advantage. I do not recommend this approach unless other reasonable avenues have been exhausted and there is a real performance problem.

Second, reference copying is not necessarily cheaper than struct copying. References are typically implemented as 4 or 8 byte managed pointers (though that is an implementation detail; they could be implemented as opaque handles). Copying a reference-sized struct is neither cheaper nor more expensive than copying a reference-sized reference.
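As a rough illustration of the size comparison, the sketch below prints the size of a reference and of a small struct. `Celsius` is a hypothetical type, `IntPtr.Size` is used on the assumption that references are pointer-sized (true of current runtimes, but an implementation detail), and `Unsafe.SizeOf<T>()` assumes a recent .NET where `System.Runtime.CompilerServices.Unsafe` is available.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical immutable struct wrapping a single int.
public readonly struct Celsius
{
    public readonly int Degrees;
    public Celsius(int degrees) { Degrees = degrees; }
}

public static class SizeDemo
{
    // Pointer size of the current process: 4 on 32-bit, 8 on 64-bit.
    public static int ReferenceSize() => IntPtr.Size;

    // Size of the struct value itself: one int field, so 4 bytes.
    public static int StructSize() => Unsafe.SizeOf<Celsius>();

    public static void Main()
    {
        Console.WriteLine($"reference: {ReferenceSize()} bytes");
        Console.WriteLine($"Celsius:   {StructSize()} bytes");
    }
}
```

In a 64-bit process, copying a `Celsius` moves fewer bytes than copying a reference to an object holding the same `int`.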

Third, even if reference copying is cheaper than struct copying, references must be dereferenced in order to get at their fields. Dereferencing is not zero cost! Not only does it take machine cycles to dereference a reference, doing so might mess up the processor cache, and that can make future dereferences far more expensive!

Fourth, even if reference copying is cheaper than struct copying, who cares? If that is not the bottleneck that is producing an unacceptable performance cost then which one is faster is completely irrelevant.

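Fifth, reference types are far more expensive in memory space than structs are: every heap object carries a header in addition to its fields, and every variable that refers to it holds a pointer-sized reference on top of that.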

Sixth, references add expense because the network of references must be periodically traced by the garbage collector; "blittable" structs may be ignored by the garbage collector entirely. Garbage collection is a large expense.

Seventh, non-nullable value types cannot be null, unlike reference types. You know that every value is a good value. And as Reed pointed out, in order to get a good value of a reference type you have to run both an allocator and a constructor. That's not cheap.
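A small sketch of the nullability point, using two hypothetical types (`Money` and `Account`): a fresh array of structs already holds usable values, while a fresh array of class references holds only nulls until each object is allocated and constructed.

```csharp
using System;

public struct Money { public decimal Amount; }    // hypothetical value type
public class Account { public decimal Balance; }  // hypothetical reference type

public static class DefaultValueDemo
{
    // Every element of a new struct array is a usable, zero-initialized value.
    public static decimal FirstPrice()
    {
        Money[] prices = new Money[3];
        return prices[0].Amount;
    }

    // Every element of a new class array is a null reference; no objects exist yet.
    public static bool FirstAccountIsNull()
    {
        Account[] accounts = new Account[3];
        return accounts[0] == null;
    }

    public static void Main()
    {
        Console.WriteLine(FirstPrice());         // 0
        Console.WriteLine(FirstAccountIsNull()); // True
    }
}
```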

Eighth, value types represent values, and programs are often about the manipulation of values. It makes sense to "bake in" the metaphors of both "value" and "reference" in a language, regardless of which is "cheaper".

Eric Lippert
  • Another point worth mentioning is that reference types are very efficient relative to structs when the average number of references to each instance is large, and inefficient relative to structs if the average number is small. If the number of references to each instance will be 1 because the instances are constructed purely to return multiple values from a function, and the recipients are going to read out their data and abandon them, even a huge struct would outperform a class holding the same information (though unless consumers always use all the data, some other design might be better). – supercat Jan 04 '13 at 17:09

From MSDN:

Classes are reference types and structures are value types. Reference types are allocated on the heap, and memory management is handled by the garbage collector. Value types are allocated on the stack or inline and are deallocated when they go out of scope. In general, value types are cheaper to allocate and deallocate. However, if they are used in scenarios that require a significant amount of boxing and unboxing, they perform poorly as compared to reference types.

Do not define a structure unless the type has all of the following characteristics:

  • It logically represents a single value, similar to primitive types (integer, double, and so on).

  • It has an instance size smaller than 16 bytes.

  • It is immutable.

  • It will not have to be boxed frequently.

So you should always use a class instead of a struct if your struct will be more than 16 bytes. See also http://www.dotnetperls.com/struct

Soner Gönül
  • Whoever (at Microsoft) wrote: _"Value types are allocated on the stack or inline and are deallocated when they go out of scope."_ has done the world a disservice. If I have `public class X { public int A; public int B; public int C; }` then every one of those fields (all of which are structs/value types) is stored on the managed heap for every instance of `X`. That sentence has been confusing folks for decades. It's frustrating that that is still on the Microsoft site, 9 years later. – Flydog57 Sep 28 '22 at 22:21

There are two usage cases for structures. Opaque structures are useful for things which could be implemented using immutable classes, but are sufficiently small that even in the best of circumstances there wouldn't be much--if any--benefit to using a class, especially if the frequency with which they are created and discarded is a significant fraction of the frequency with which they will be simply copied. For example, Decimal is a 16-byte struct, so holding a million Decimal values would take 16 megabytes. If it were a class, each reference to a Decimal instance would take 4 or 8 bytes, but each distinct instance would probably take another 20-32 bytes. If one had many large arrays whose elements were copied from a small number of distinct Decimal instances, the class could win out, but in most scenarios one would be more likely to have an array with a million references to a million distinct instances of Decimal, which would mean the struct would win out.
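The arithmetic above can be sketched as follows. The 24-byte per-object figure is an assumption picked from within the 20-32 byte range mentioned; the real number depends on the runtime, and array headers are ignored.

```csharp
using System;

public static class DecimalFootprint
{
    public const int Count = 1_000_000;

    // Assumption: 24 bytes per heap object, inside the 20-32 byte range quoted above.
    public const int AssumedObjectBytes = 24;

    // Struct case: the values are stored inline in the array.
    public static long StructArrayBytes() => (long)Count * sizeof(decimal);

    // Hypothetical class case: one reference per element plus one distinct object each.
    public static long ClassArrayBytes(int referenceBytes) =>
        (long)Count * (referenceBytes + AssumedObjectBytes);

    public static void Main()
    {
        Console.WriteLine(StructArrayBytes());           // 16000000
        Console.WriteLine(ClassArrayBytes(IntPtr.Size)); // 32000000 with 8-byte references
    }
}
```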

Using structures in this way is generally only good if the guidelines quoted from MSDN apply (though the immutability guideline is mainly a consequence of the fact that there isn't yet any way via which struct methods can indicate that they modify the underlying struct). If any of the last three guidelines don't apply, one is likely better off using an immutable class than a struct. If the first guideline does not apply, however, that means one shouldn't use an opaque struct, but not that one should use a class instead.

In some situations, the purpose of a data type is simply to fasten a group of variables together with duct tape so that their values can be passed around as a unit, but they still remain semantically as distinct variables. For example, a lot of methods may need to pass around groups of three floating-point numbers representing 3d coordinates. If one wants to draw a triangle, it's a lot more convenient to pass three Point3d parameters than nine floating-point numbers. In many cases, the purpose of such types is not to impart any domain-specific behavior, but rather to simply provide a means of passing things around conveniently. In such cases, structures can offer major performance advantages over classes, if one uses them properly. A struct which is supposed to represent three variables of type double fastened together with duct tape should simply have three public fields of type double. Such a struct will allow two common operations to be performed efficiently:

  1. Given an instance, take a snapshot of its state so the instance can be modified without disturbing the snapshot
  2. Given an instance which is no longer needed, somehow come up with an instance which is slightly different

Immutable class types allow the first to be performed at fixed cost regardless of the amount of data held by the class, but they are inefficient at the second. The greater the amount of data the variable is supposed to represent, the greater the advantage of immutable class types versus structs when performing the first operation, and the greater the advantage of exposed-field structs when performing the second.
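The two operations can be sketched like this. `Point3d` follows the description above; `ImmutablePoint3d` and its `WithX` method are hypothetical, added only for comparison.

```csharp
using System;

// Exposed-field struct: three doubles "fastened together".
public struct Point3d
{
    public double X, Y, Z;
}

// Hypothetical immutable class holding the same data, for comparison.
public sealed class ImmutablePoint3d
{
    public double X { get; }
    public double Y { get; }
    public double Z { get; }
    public ImmutablePoint3d(double x, double y, double z) { X = x; Y = y; Z = z; }

    // A "slightly different" instance means building a whole new object.
    public ImmutablePoint3d WithX(double x) => new ImmutablePoint3d(x, Y, Z);
}

public static class SnapshotDemo
{
    public static void Main()
    {
        var p = new Point3d { X = 1, Y = 2, Z = 3 };
        var snapshot = p;  // operation 1: copies all three fields
        p.X = 10;          // operation 2: a single field write, in place
        Console.WriteLine(snapshot.X); // 1
        Console.WriteLine(p.X);        // 10

        var q = new ImmutablePoint3d(1, 2, 3);
        var qSnapshot = q;  // operation 1: copies one reference
        q = q.WithX(10);    // operation 2: allocates a new object
        Console.WriteLine(qSnapshot.X);                   // 1
        Console.WriteLine(ReferenceEquals(q, qSnapshot)); // False
    }
}
```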

Mutable class types can be efficient in scenarios where the second operation dominates, and the first is needed seldom if ever, but it can be difficult for an object to expose the present values in a mutable class object without exposing the object itself to outside modification.

Note that depending upon usage patterns, large exposed-field structures may be much more efficient than either opaque structures or class types. Structures larger than 16 bytes are often less efficient than smaller ones, but they can still be vastly more efficient than classes. Further, the cost of passing a structure as a ref parameter does not depend upon its size. Large structs are inefficient if one accesses them via properties rather than fields, passes them by value needlessly, etc. but if one is careful to avoid redundant "copy" operations, there are usage patterns where there is no break-even point for classes versus structs--structs will simply perform better.
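A minimal sketch of the ref-parameter point, using a hypothetical 64-byte struct named `Transform`. Passing it by value copies the whole struct for the call; passing it by `ref` lets the callee work directly on the caller's variable, however large the struct is.

```csharp
using System;

// Hypothetical 64-byte exposed-field struct (eight doubles).
public struct Transform
{
    public double M0, M1, M2, M3, M4, M5, M6, M7;
}

public static class RefPassingDemo
{
    // By value: the whole struct is copied for the call.
    public static double SumByValue(Transform t) => t.M0 + t.M7;

    // By ref: the callee operates on the caller's variable, so no copy is made
    // and the changes are visible to the caller.
    public static void Scale(ref Transform t, double factor)
    {
        t.M0 *= factor;
        t.M7 *= factor;
    }

    public static void Main()
    {
        var t = new Transform { M0 = 1, M7 = 2 };
        Scale(ref t, 3);
        Console.WriteLine(SumByValue(t)); // 9
    }
}
```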

Some people may recoil in horror at the idea of a type having exposed fields, but I would suggest that a struct such as I describe shouldn't be thought of so much as an entity unto itself, but rather an extension of the things that read or write it. For example:

public struct SlopeAndIntercept
{
   public double Slope,Intercept;
}
public SlopeAndIntercept FindLeastSquaresFit() ...

Code which is going to perform a least-squares-fit of a bunch of points will have to do a significant amount of work to find either the slope or Y intercept of the resulting line; finding both would not cost much more. Code which calls the FindLeastSquaresFit method is likely going to want to have the slope in one variable and the intercept in another. If such code does:

var resultLine = FindLeastSquaresFit();

the result will be to effectively create two variables, resultLine.Slope and resultLine.Intercept, which the calling method can manipulate as it sees fit. The fields of resultLine don't really belong to SlopeAndIntercept, nor to FindLeastSquaresFit; they belong to the code that declares resultLine. The situation is little different from if the method were used as:

double Slope, Intercept;
FindLeastSquaresFit(out Slope, out Intercept);

In that context, it would be clear that immediately following the function call, the two variables have the meaning assigned by the method, but that their meaning at any other time will depend upon what else the method does with them. Likewise for the fields of the aforementioned structure.
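For completeness, here is one way the elided method might look. This is only a sketch: the signature taking two arrays is an assumption (the original takes no arguments and leaves its body out), and it uses the ordinary closed-form least-squares formulas with no handling of degenerate input.

```csharp
using System;

public struct SlopeAndIntercept
{
    public double Slope, Intercept;
}

public static class LeastSquaresDemo
{
    // Hypothetical signature: the original takes no arguments and elides its body.
    public static SlopeAndIntercept FindLeastSquaresFit(double[] xs, double[] ys)
    {
        int n = xs.Length;
        double sumX = 0, sumY = 0, sumXY = 0, sumXX = 0;
        for (int i = 0; i < n; i++)
        {
            sumX += xs[i];
            sumY += ys[i];
            sumXY += xs[i] * ys[i];
            sumXX += xs[i] * xs[i];
        }

        // Both results fall out of the same sums, so returning them together is cheap.
        SlopeAndIntercept result;
        result.Slope = (n * sumXY - sumX * sumY) / (n * sumXX - sumX * sumX);
        result.Intercept = (sumY - result.Slope * sumX) / n;
        return result;
    }

    public static void Main()
    {
        // Points (0,1), (1,3), (2,5) lie exactly on y = 2x + 1.
        var resultLine = FindLeastSquaresFit(new double[] { 0, 1, 2 }, new double[] { 1, 3, 5 });

        // The caller owns the fields and may adjust them freely afterwards.
        resultLine.Intercept += 0.5;
        Console.WriteLine(resultLine.Slope); // 2
        Console.WriteLine(resultLine.Intercept);
    }
}
```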

There are some situations where it may be better to return data using an immutable class rather than a transparent structure. Among other things, using a class will make it easier for future versions of a function that returns a Foo to return something which includes additional information. On the other hand, there are many situations where code is going to expect to deal with a specific set of discrete things, and changing that set of things would fundamentally change what clients have to do with it. For example, if one has a bunch of code that deals with (x,y) points, adding a "z" coordinate is going to require that code to be rewritten, and there's nothing the "point" type can do to mitigate that.

supercat