16

When you need very small objects, say ones that contain two float properties, and you will have millions of them that aren't going to be "destroyed" right away, are structs a better choice than classes?

For example, XNA as a library exposes Point3 and similar types as structs, but if you need to hold onto those values for a long time, would that pose a performance threat?

Joan Venge

6 Answers

34

Contrary to most questions about structs, this actually seems to be a good use of a struct. If the data it contains are value types, and you are going to use a lot of these, a structure would work well.

Some tips:

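:: Keep the struct small. Microsoft's design guidelines suggest an instance size under 16 bytes; beyond that, the cost of copying the value around starts to eat into the performance advantage.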

:: Make the struct immutable. That makes the usage clearer.

Example:

public struct Point3D {

   // Get-only auto-properties (C# 6 and later) can be assigned only in the constructor.
   public float X { get; }
   public float Y { get; }
   public float Z { get; }

   public Point3D(float x, float y, float z) {
      X = x;
      Y = y;
      Z = z;
   }

   // Returns a new value rather than modifying this one.
   public Point3D Invert() {
      return new Point3D(-X, -Y, -Z);
   }

}
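To check the size guideline and see the "return a new value" style in use, here is a minimal sketch. The names `Point3DSketch`, `WithY` and `SizeCheck` are made up for illustration and are not part of the answer; `readonly struct` assumes C# 7.2 or later. `Marshal.SizeOf` reports the marshaled size, which for a struct of three floats is the 12 bytes of field data.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical stand-in for the Point3D above, with a "wither" helper added.
public readonly struct Point3DSketch
{
    public float X { get; }
    public float Y { get; }
    public float Z { get; }

    public Point3DSketch(float x, float y, float z)
    {
        X = x;
        Y = y;
        Z = z;
    }

    // Returns a modified copy instead of mutating this value.
    public Point3DSketch WithY(float y) => new Point3DSketch(X, y, Z);
}

public static class SizeCheck
{
    // Size in bytes of the struct's three float fields.
    public static int SizeOfPoint() => Marshal.SizeOf<Point3DSketch>();
}
```

With three floats the struct stays under the 16-byte guideline; adding a fourth float would land exactly on it.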
Guffa
  • This is probably a good use of the readonly keyword, i.e. instead of declaring X, Y and Z as properties, you could have readonly fields. The advantage would be that you wouldn't break your immutability by mistake. – Ant Mar 04 '09 at 11:47
  • Is `myRect = new Rectangle(myRect.X, myRect.Y+4, myRect.Width, myRect.Height);` really clearer than `myRect.Y += 4;`? Mutating *methods* for structs are problematic, but exposed fields should be the norm in many cases. – supercat Aug 08 '12 at 23:25
  • @supercat: If you have a need to change part of a struct, then it doesn't represent a single entity as a value type is supposed to do. Besides, exposed fields are only convenient when you have the value as a single variable, if it's a property or in an array, then you can't change them like that. You can make methods that returns the new value to handle operations like that: `myRect = myRect.AddY(4);`. – Guffa Aug 08 '12 at 23:59
  • @Guffa: A struct is the best format for storing a collection of related but orthogonal values (e.g. the coordinates of a point); while .net unfortunately didn't provide any particularly nice patterns for modifying structs within a collection, I would suggest that `var tempRect = ListOfRects[index]; tempRect.Y += 4; ListOfRects[index] = tempRect;` is still clearer than `var tempRect = ListOfRects[index]; ListOfRects[index] = new Rectangle(tempRect.X, tempRect.Y+4, tempRect.Width, tempRect.Height);`. Among other things, one can tell at a glance which fields it's modifying.... – supercat Aug 09 '12 at 15:16
  • ...without having to examine the code for a constructor to ensure that it's not (accidentally or intentionally) doing something like transposing X and Y coordinates. The advice to limit structs to read-only properties dates to early C# compilers where `ListOfRects[index].Y += 4` would generate bogus code rather than yielding a compile-time error. PODS (Plain Old Data Structs) have semantics which differ from class-type objects, but they all have the same semantics. Suppose one had a `List<CRect>`, where `CRect` is a mutable class with the same properties as `Rectangle`. How would one... – supercat Aug 09 '12 at 15:25
  • ...increase the `Y` coordinate of `ListOfCRects[index]` without affecting the `Y` coordinate of any of the items in `SomeOtherListOfCRects`? Unless the objects in `ListOfCRects` have never been exposed to the outside world, it may be impossible to know what other references to them may exist. Structs don't have that problem. If a `List<Rectangle>` has a `Count` of 5, it holds five distinct instances--no more, no less--and those are distinct from the instances held in every other storage location of type `Rectangle`. A very useful property. – supercat Aug 09 '12 at 15:35
  • @supercat: Normally you don't have to examine the code to assume that it's doing something reasonable. In that case you would also have to examine the code so that you know that `myRect.Y += 4;` doesn't do anything unexpected. A struct in C# is not at all like for example a struct in C, so PODS in C# should generally be classes, not structs. You can't just use structs as a method for protecting data, just because a struct is a distinct value in its simplest form, you can't blindly trust that it can't be implemented to affect other values. – Guffa Aug 09 '12 at 18:46
  • @Guffa: If `thing` is a variable of some struct type which exposes an `int` field named `Y`, then `thing.Y += 4;` will add four to that field in that variable, no matter what else `thing` contains or does, or where its value came from. Further, if a method wishes to make data that's held in a PODS field available to its caller without allowing the caller to modify the data store, it can simply `return` the field. By contrast, if the data were held in a mutable class, the method would either have to copy all of the data to a new instance, or trust that the caller wouldn't modify it, or... – supercat Aug 09 '12 at 18:59
  • @supercat: If the member `Y` is a field, then it won't do anything unexpected, but if it's a property, it can do anything. – Guffa Aug 09 '12 at 19:06
  • ...create a new instance of a read-only wrapper around the data, receive from the caller an object instance into which the data should be copied, etc. There are many approaches to dealing with the situation, but none of them is suitable for every case; there's no universal approach that a caller can use to end up with a mutable class object which contains the same data in the collection except for some particular change (e.g. a `Y` value). By contrast, with PODS, the approach is simple: get the data and modify the appropriate field. – supercat Aug 09 '12 at 19:06
  • @Guffa: Structs which wrap fields in mutable properties lose the advantages of PODS, which is why I favor PODS rather than structs with mutable properties. As you note, it's impossible to know what a property does without looking at the code. On the other hand, if Intellisense reports that something is a field, one won't have to look at any code to know that it will behave like a field. – supercat Aug 09 '12 at 19:10
  • @supercat: What you are talking about is one specific use of structs that is totally different from how a struct is normally used. A struct is intended for implementing value types that represent a single entity of some kind. You won't normally manipulate part of the value, but rather create a new value. – Guffa Aug 09 '12 at 19:47
  • @Guffa: PODS containing orthogonal fields (e.g. `Point` and `Rectangle`) might not comprise the majority of structs, but they're hardly uncommon. If what one wants to store is a collection of orthogonal values, using an exposed-field PODS and modifying the field directly will often yield code which is faster and clearer than using any other approach. In cases where it does, why should that not be the preferred choice? – supercat Aug 09 '12 at 20:08
  • @supercat: The `Point` and `Rectangle` structures are not PODS. They have properties, not fields. – Guffa Aug 09 '12 at 21:19
  • @Guffa: Hmm... Guess Microsoft probably did that for Reflection/Designer purposes. Semantically, there's no reason `Point` and `Rectangle` shouldn't be PODS (meaning structures whose state is entirely exposed via public fields), since they generally behave like them. BTW, minor peeve about vb.net: although it doesn't allow `With` to be used to piecewise-modify read/write properties of `Struct` type, it allows read/write properties to be passed as `ByRef` method parameters by copying to a temporary first and writing the property after. Another argument in favor of PODS, since... – supercat Aug 09 '12 at 21:38
  • ...if X was a public field, `Threading.Interlocked.Increment(myRect.X)` would behave as it should; with `X` being a property, it will behave almost as it should. – supercat Aug 09 '12 at 21:39
  • @supercat: I think that you will find that PODS isn't used anywhere in the framework. That's simply not what structs are intended for. – Guffa Aug 09 '12 at 21:55
  • @Guffa: Whether or not `Rectangle` is actually a PODS, it holds four integer values, representing `X`, `Y`, `Width`, and `Height`, which may independently be set to any value, in any sequence. In other words, it holds a set of related but orthogonal values. Further, regardless of what struct was "intended" for, what other type of data item allows one to group together related but orthogonal values, with the semantics that the values will be mutable when the group is stored in a mutable storage location, and immutable when it's stored in an immutable one? – supercat Aug 09 '12 at 22:46
  • @supercat: A struct doesn't do that. It doesn't become immutable just because you store it in an "immutable location". – Guffa Aug 09 '12 at 23:37
  • @Guffa: If I create a `new Tuple<Point>(new Point(4,5))`, is there any way that that Tuple's `Item1.X` property will ever hold a value other than 4? In what way would that Tuple's `Item1` property not be immutable? Conversely, if I store a `KeyValuePair` into some static field `myThing`, and the `Foo.ToString()` override stores a different `KeyValuePair` in that static field, the string reported by `myThing.ToString()` will contain the `ToString` result from the old value of `Key` and from the new value of `Value`, demonstrating that the instance was mutated. – supercat Aug 10 '12 at 14:56
  • @Guffa: In the former situation, `Point` was a so-called "mutable" struct, but the particular instance described can never change. In the latter situation, `KeyValuePair` is a so-called "immutable" struct, but it's possible to mutate an instance by copying all fields from another instance (if `myThing` had held a reference to an immutable class like `Tuple`, the `ToString` would have continued to use the unmodified instance even when `myThing` was overwritten; since it was a structure, though, there was no "unmodified instance"). – supercat Aug 10 '12 at 15:04
  • @supercat: Just because you don't have access to change a struct, doesn't make it immutable. That's a property of the implementation of the type, not how you use it. – Guffa Aug 10 '12 at 15:19
  • @Guffa: If the only references that exist to something are held by code which is neither going to change it, nor expose it to any code that might, then the thing is immutable because *there is no possible means by which it could be mutated*. It's possible to have a mutable `private set` property, but I would consider a `private set` property to be immutable if no execution sequence would cause the class to use the setter after the object is exposed to outside code. – supercat Aug 10 '12 at 15:37
  • @supercat: You are confused regarding the concept. A mutable type is still mutable even if you encapsulate it in another type that makes it impossible to change the value. It's not the usage of a type that determines if it's mutable, it's the implementation. – Guffa Aug 10 '12 at 15:47
  • I would consider an *instance* of a type to be mutable if there exists some means by which it could be mutated while a reference exists, and immutable if no such means exists. Arrays are mutable, but if the only extant reference to a particular array instance is held by code which will neither write it nor expose it to code that might do so, that particular array instance will be effectively immutable. Any immutable types which hold data in arrays use this pattern. All instances of immutable types are immutable, but it is often useful to have immutable instances of mutable types. – supercat Aug 10 '12 at 18:14
  • @supercat: No. A type does not magically become immutable just because you don't have access to change it. What you call "effectively immutable" is not immutable. It's called encapsulation, and is one of the base principles of object oriented programming. – Guffa Aug 10 '12 at 20:51
  • @Guffa: We're obviously talking past each other. I didn't say the *TYPE* became immutable because some instances could never mutate. I said the *INSTANCES* that could never mutate were immutable. – supercat Aug 10 '12 at 21:01
  • @supercat: Then you are simply using it wrong. Mutability is a property of the type, not the instance. – Guffa Aug 10 '12 at 22:06
5

The answer depends on where the objects/values will eventually be stored. If they are stored in an untyped collection like ArrayList, you end up boxing them. Boxing creates an object wrapper for the struct, so the footprint is the same as for a class object. On the other hand, if you use a typed array like T[] or a List<T>, the structs are stored inline: each element takes only the space of its actual data, and the per-object overhead is paid once for the whole collection rather than once per element.

So structs are more efficient for use in T[] arrays.
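The difference can be made visible with a small sketch. `Vec2` and `BoxingDemo` are hypothetical names used only for illustration. The first method shows that adding the same struct value to an `ArrayList` twice produces two separate boxed objects; the second shows that a `List<Vec2>` hands back copies of the stored values, so an update is a read, modify and write-back.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical two-float value type.
public struct Vec2
{
    public float X;
    public float Y;
    public Vec2(float x, float y) { X = x; Y = y; }
}

public static class BoxingDemo
{
    // Each Add to an ArrayList boxes the struct into its own heap object.
    public static bool BoxesAreSeparateObjects()
    {
        var v = new Vec2(1f, 2f);
        var untyped = new ArrayList();
        untyped.Add(v); // boxes a copy
        untyped.Add(v); // boxes another copy
        return !object.ReferenceEquals(untyped[0], untyped[1]);
    }

    // A List<Vec2> stores the values inline in its backing array,
    // so reading an element yields a copy, not a reference.
    public static float UpdateInTypedList()
    {
        var typed = new List<Vec2> { new Vec2(1f, 2f) };
        Vec2 copy = typed[0];
        copy.Y += 4f;      // changes only the local copy
        typed[0] = copy;   // write the copy back to update the list
        return typed[0].Y; // 6
    }
}
```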

agsamek
2

The big concern is whether the memory is allocated on the stack or the heap. Structs go in the stack by default, and the stack is generally much more limited in terms of space. So creating a whole bunch of structs just like that can be a problem.

In practice, though, I don't really think it's that big of a deal. If you have that many of them they're likely part of a class instance (on the heap) somewhere.

Joel Coehoorn
  • The stack space is not a concern for structs, as you rarely have that many local variables. If you create an array of structs it will be allocated on the heap, not the stack. – Guffa Mar 03 '09 at 22:17
  • I think that was my point: if you _do_ have that many it's a problem, but you're not likely to have that many. – Joel Coehoorn Mar 03 '09 at 22:18
  • If you do have that many variables it's a problem regardless if it's a struct or a class, as the references would also be allocated on the stack... :) – Guffa Mar 03 '09 at 22:23
  • Thanks Joel. So if I have a list of 1M point3 struct values, they are gonna be stored on the heap? – Joan Venge Mar 03 '09 at 22:43
2

Struct seems right for this application.

Bear in mind that the "need to hold onto those values" implies their storage on the heap somewhere, probably in an array field of a class instance.

One thing to watch out for is that a large array results in an allocation on the large object heap (the runtime places objects of 85,000 bytes or more there). It's not that clear how, if at all, this heap defrags itself; however, for very long-lived objects that perhaps isn't an issue.

Using a class for millions of these data types would likely be expensive, given the sheer volume of dereferencing that operations on this type will involve.
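As a rough sketch of the large object heap point: a million two-float structs carry about 8,000,000 bytes of payload, far above the 85,000-byte threshold. The `Pair` and `LohDemo` names below are hypothetical. The generation check relies on the runtime reporting large-object-heap allocations as the oldest generation, which is how the .NET runtimes I know of behave; treat that as an assumption rather than a guarantee.

```csharp
using System;

// Hypothetical two-float value type (8 bytes of data).
public struct Pair
{
    public float A;
    public float B;
}

public static class LohDemo
{
    // Approximate payload size of a Pair[] in bytes, ignoring the array header.
    public static long PayloadBytes(int count) => (long)count * 8;

    // Reports the GC generation of a freshly allocated array of `count` pairs.
    public static int GenerationOfNewArray(int count)
    {
        var data = new Pair[count];
        int generation = GC.GetGeneration(data);
        GC.KeepAlive(data); // keep the array reachable until after the query
        return generation;
    }
}
```

Calling `LohDemo.GenerationOfNewArray(1_000_000)` should return `GC.MaxGeneration` if the array went to the large object heap.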

AnthonyWJones
2

As a rule, large arrays of non-aliased (i.e. unshared) data of the same type are best stored as structs for performance, since you reduce the number of indirections. (See also when-are-structs-the-answer.) The exact performance difference between class and struct depends on your usage. (E.g., in operations, do you only access parts of the struct? Do you do a lot of temporary copying? If the struct is small, it's probably always the better choice, but if it's large, creating temporary copies may slow you down. If you make it immutable, you will always have to copy the whole thing to change the value.)

When in doubt, measure.
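A crude way to start measuring is sketched below with `Stopwatch`. The names `PointS`, `PointC` and `MeasureDemo` are invented for this example. It times a single pass over each array, so it only gives a first impression; a serious benchmark would warm up the JIT, repeat the runs and use a realistic access pattern.

```csharp
using System;
using System.Diagnostics;

public struct PointS { public float X, Y; }        // hypothetical struct version
public sealed class PointC { public float X, Y; }  // hypothetical class version

public static class MeasureDemo
{
    // Sums contiguous values: one array, no per-element reference to follow.
    public static double SumStructs(PointS[] items)
    {
        double sum = 0;
        for (int i = 0; i < items.Length; i++) sum += items[i].X + items[i].Y;
        return sum;
    }

    // Sums via references: each element is a separate heap object.
    public static double SumClasses(PointC[] items)
    {
        double sum = 0;
        for (int i = 0; i < items.Length; i++) sum += items[i].X + items[i].Y;
        return sum;
    }

    // Times one pass of each and returns both timings plus a checksum.
    public static (TimeSpan StructTime, TimeSpan ClassTime, double Checksum) Run(int count)
    {
        var structs = new PointS[count];
        var classes = new PointC[count];
        for (int i = 0; i < count; i++)
        {
            structs[i].X = i;
            structs[i].Y = 1f;
            classes[i] = new PointC { X = i, Y = 1f };
        }

        var sw = Stopwatch.StartNew();
        double a = SumStructs(structs);
        TimeSpan structTime = sw.Elapsed;

        sw.Restart();
        double b = SumClasses(classes);
        TimeSpan classTime = sw.Elapsed;

        return (structTime, classTime, a + b);
    }
}
```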

Since you are interested in possible long-term effects that may not be apparent from such a measurement, be aware that such arrays are likely stored on the large object heap and should be reused instead of destroyed and reallocated. (See CLR Inside Out: Large Object Heap Uncovered.)

When passing larger structs in calls, you might want to pass them by reference with the ref keyword to avoid copying.
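The sketch below illustrates the passing options, assuming a hypothetical 48-byte `Box3D` struct. Besides `ref`, C# 7.2 and later also offer the `in` modifier, which passes a read-only reference; that modifier is newer than this answer and is shown here only as an addition.

```csharp
using System;

// Hypothetical struct of six doubles (48 bytes), well past the 16-byte guideline.
public struct Box3D
{
    public double MinX, MinY, MinZ, MaxX, MaxY, MaxZ;
}

public static class RefPassing
{
    // By value: the whole struct is copied for the call.
    public static double VolumeByValue(Box3D b) =>
        (b.MaxX - b.MinX) * (b.MaxY - b.MinY) * (b.MaxZ - b.MinZ);

    // `in`: a read-only reference is passed, so no copy of the 48 bytes is made.
    public static double VolumeByIn(in Box3D b) =>
        (b.MaxX - b.MinX) * (b.MaxY - b.MinY) * (b.MaxZ - b.MinZ);

    // `ref`: a writable reference, so the callee changes the caller's variable.
    public static void TranslateX(ref Box3D b, double dx)
    {
        b.MinX += dx;
        b.MaxX += dx;
    }
}
```

Note that `ref` also lets the callee modify the caller's value, so it changes the contract of the method and not just its cost.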

ILoveFortran
1

Value types (structs) are a good fit for types that are rarely allocated on the heap on their own, that is, types that are mostly contained in another reference or value type.

The Vector3 example you gave is a perfect one. You will rarely have a standalone Vector3 on the heap; most of the time it will be contained in a type that is itself on the heap, or used as a local variable, in which case it will be allocated on the stack.

Coincoin
  • Thanks, so structs will be on the heap only if used as a local variable, and nothing else? – Joan Venge Mar 03 '09 at 22:47
  • No, it's the other way around. A value type like a struct is only allocated on the stack if it's a local variable. If it's a member of a class, it's allocated as part of the object's memory area on the heap. – Guffa Mar 04 '09 at 00:37