3

This may just be unclear to me, but I have been reading the MSDN documentation and trying to understand struct behaviour in depth.

From MSDN:

Dealing with Stack :

This will yield performance gains.

and :

Whenever you have a need for a type that will be used often and is mostly just a piece of data, structs might be a good option.

I don't understand this, because I would expect that when I pass a struct as a method parameter, the "copy value" process must be slower than the "copy reference" process?

Christophe Debove

3 Answers

21

The cost of passing a struct is proportional to its size. If the struct is smaller than a reference or the same size as a reference then passing its value will have the same cost as passing a reference.

If not, then you are correct; copying the struct might be more expensive than copying the reference. That's why the design guidelines say to keep a struct small.

(Note that when you call a method on a struct, the "this" is actually passed as a reference to the variable that contains the struct value; that's how you can write a mutable struct.)
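
As a minimal sketch of that note (the `Counter` struct and the `ThisByRefDemo` helper are hypothetical names introduced only for illustration), calling a method on a struct variable mutates that variable in place, whereas passing the struct as an ordinary parameter hands the callee a copy:

```csharp
using System;

struct Counter
{
    public int Value;

    // 'this' refers to the variable that holds the struct,
    // so the increment is visible to the caller.
    public void Increment() { Value++; }
}

static class ThisByRefDemo
{
    // The parameter 'c' is a copy of the caller's struct;
    // mutating it leaves the caller's variable unchanged.
    public static void IncrementCopy(Counter c) { c.Increment(); }

    public static int Run()
    {
        Counter counter = new Counter();
        counter.Increment();      // mutates 'counter' in place
        IncrementCopy(counter);   // mutates only the copy
        return counter.Value;     // 1
    }
}
```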

There are potential performance gains when using structs, but as you correctly point out, there are potential performance losses as well. Structs are cheap (in both memory and time) to allocate and cheap to deallocate (in time), and cheap to copy if they are small. Instances of reference types are slightly more expensive in both memory and time to allocate and more expensive to deallocate, but references to them are cheap to copy. If you have a large number of small structs -- say, a million Point structs -- then it will be cheaper to allocate and deallocate an array with a million structs in it than an array with a million references to a million instances of a Point class.

But if the struct is big, then all that additional copying might be more expensive than the benefit you get from the more efficient allocation and deallocation. You have to look at the whole picture when doing performance analysis; don't make the "struct vs class" decision on the basis of performance without empirical data to back up that decision.
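
The array comparison can be sketched roughly as follows. `PointStruct`, `PointClass` and `ArrayDemo` are hypothetical names used only for this illustration, and the sketch shows the difference in the number of allocations, not a measurement of their cost:

```csharp
using System;

struct PointStruct { public int X, Y; }
class PointClass { public int X, Y; }

static class ArrayDemo
{
    public static PointStruct[] MakeStructs(int n)
    {
        // One allocation: the n struct values are stored inline in the array.
        return new PointStruct[n];
    }

    public static PointClass[] MakeClasses(int n)
    {
        // One allocation for the array of references...
        var points = new PointClass[n];
        for (int i = 0; i < n; i++)
        {
            // ...plus one separate heap allocation per element.
            points[i] = new PointClass();
        }
        return points;
    }
}
```

Whether the struct version actually wins for a given workload is exactly the kind of question that needs the empirical measurement mentioned above.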

There is much misinformation on the internet, in our own documentation, and in many books, about how memory management works behind the scenes in C#. If you are interested in learning what is myth and what is reality, I recommend reading my series of articles on the subject. Start from the bottom:

http://blogs.msdn.com/b/ericlippert/archive/tags/memory+management/

Eric Lippert
  • OK, this probably deserves its own question, but just to be sure: what is the size of a reference (so I can work out when it becomes more worthwhile to use a struct)? – Christophe Debove Mar 16 '12 at 13:56
  • @ChristopheDebove: The size of a reference is 32 bits on a 32-bit system and 64 bits on a 64-bit system, i.e. 4 or 8 bytes. – Guffa Mar 16 '12 at 14:39
  • @Guffa OK, I had guessed that it depended on the type of the reference. – Christophe Debove Mar 16 '12 at 14:43
  • @ChristopheDebove: It is in theory possible to have a memory management system in which the size of a reference differs. (And indeed, in Windows 3.1 it was the case that you could have pointers of different sizes depending on whether they were "near" or "far" pointers, which was quite confusing.) In modern architectures, references are all exactly the same size, and they are the "native word size" of the machine -- 32 bits or 64 bits on current hardware. – Eric Lippert Mar 16 '12 at 15:33
  • Cache locality is another factor in favour of struct arrays. The GC also needs to follow fewer references, potentially making it much faster. And even big structs can be a good idea in high-performance code if you carefully avoid copying them. For example, XNA has 64-byte structs, which are passed around by `ref`, even to functions that don't modify them. Ugly, but fast, especially since the GC of XNA on the Xbox sucks. – CodesInChaos Mar 16 '12 at 19:39
4

Another recommendation for structs is that they should be small; not larger than 16 bytes. That way they can be copied with a single instruction, or just a few instructions.

Copying a reasonably small amount of data will be almost as fast as copying a reference, and then it will be faster for the method to access the data as there is no redirection needed.

If the struct is smaller than a pointer (i.e. 32 or 64 bits), it will even be faster to copy the value than to copy a reference.

Even if a structure is a bit larger than a reference, there is still some overhead involved with creating objects. Each object has some overhead and has to be allocated as a separate memory block. A byte as a value type takes up just a single byte, but if you box the byte as an object, it will take up 16 or 24 bytes on the heap, plus another 4 or 8 bytes for the reference.
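
A small sketch of the boxing case described above (`BoxingDemo` is a hypothetical name used only for illustration): assigning a `byte` to an `object` variable allocates a separate heap object holding a copy of the value, and that copy is independent of the original variable. The exact heap sizes quoted above depend on the runtime and are not measured here.

```csharp
using System;

static class BoxingDemo
{
    public static byte Run()
    {
        byte b = 42;         // a plain value: one byte of data
        object boxed = b;    // boxing: a heap object is allocated to hold a copy
        b = 7;               // changing the original does not affect the boxed copy
        return (byte)boxed;  // unboxing copies the value back out: 42
    }
}
```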


Anyhow, a decision to use a struct or a class should normally be about what kind of data they represent, and not just about performance. A structure works well for data that represent a single entity, so that you can treat it as a single value.

Guffa
-1

What you said about the copy process is true: copying a reference takes less time than copying a struct, as all structs and references are stored on the stack. But the reason MSDN suggests that using a struct gives a performance gain is the difference in the time it takes to access the stack and the heap.

If you need a type that contains mostly static data and is not huge (meaning it does not contain huge arrays, multi-dimensional or otherwise, of value types), then it would be wiser to use a struct rather than a reference type, as the access time for the stack is much lower than for the managed heap.

Along with that, allocation and deallocation, or in short the management of the heap, is somewhat time-consuming compared to the stack.

You can have a better understanding of this topic here and here as this has been explained in detail.

Hope it helps.

Community
Shakti Prakash Singh
  • Stack and heap use the same memory (just different areas of it) on every machine I've ever used, and thus would have about the same access speed. Allocation is cheap, and a big part of the time is spent constructing, which happens for structs as well. I imagine a bigger factor performance-wise is cache -- a struct on the stack is more likely to be in the cache than some random thing out on the heap. – cHao Mar 16 '12 at 14:15
  • All structs and references are *not* stored on the stack. What about an array of int? Are those ints all stored on the stack? – Eric Lippert Mar 16 '12 at 14:40
  • An array of int is a primitive type, so I think it's on the stack. – Christophe Debove Mar 16 '12 at 14:44
  • @ChristopheDebove: The stack is only a million bytes by default. Try allocating an array of a million ints. Did it work? Then where did the memory come from? **Arrays store their elements on the heap.** The idea that "value types go on the stack" is simply *completely wrong*. Lots of people believe that, but that belief doesn't make any sense. The truth is *variables whose lifetimes are known to be short* go on the stack, regardless of whether those variables contain values of value type or references. Variables whose lifetimes are not known to be short go on the heap. – Eric Lippert Mar 16 '12 at 14:53
  • @Eric: You may want to let someone at MS know. MSDN's very own "[Structs Tutorial](http://msdn.microsoft.com/en-us/library/aa288471%28v=vs.71%29.aspx)" says that "When you call the New operator on a class, it will be allocated on the heap. However, when you instantiate a struct, it gets created on the stack." I've just commented on the page, but eh. Maybe someone on the inside can get those guys to listen. – cHao Mar 16 '12 at 15:12
  • @cHao: That is unfortunate in that it is both *accurate* and *misleading* at the same time. When you do "new struct", what happens is **temporary space is allocated on the stack**, and that temporary space is used as the variable that is initialized by the struct's constructor. Then the now-initialized temporary value on the stack is *copied* to the location where the value is going to be used, which might be on the stack or the heap. (And in some cases the compiler generates code which optimizes away the temporary.) – Eric Lippert Mar 16 '12 at 15:17
  • @cHao: Thanks for bringing that to my attention, and thanks for commenting. Unfortunately we very rarely update documentation that is that old. – Eric Lippert Mar 16 '12 at 15:18
  • @christophedebove: You're welcome. If you want to learn the truth about implementation details of memory management in C# and why most everything you read about it elsewhere is wrong, I recommend reading my long series of articles on that topic. Start from the bottom: http://blogs.msdn.com/b/ericlippert/archive/tags/memory+management/ – Eric Lippert Mar 16 '12 at 15:29
  • @Eric - From what I read, no matter how short the lifetime of a reference type instance is, it is allocated on the heap. Such instances are considered Gen 0 objects, and huge objects are automatically moved to Gen 2. I am not sure how correct this is, but I had no reason to doubt it, as it comes from the MSDN documentation on memory management. [Automatic Memory Management](http://msdn.microsoft.com/en-us/library/f144e03t.aspx) – Shakti Prakash Singh Mar 19 '12 at 05:32
  • @ShaktiSingh: Indeed, instances of reference type are in practice always allocated on the heap. If the runtime could prove that the instance was short lived then it could allocate it on the stack, but in practice the jitter does not perform the necessary escape analysis. – Eric Lippert Mar 19 '12 at 05:37
  • Large objects are allocated on the large object heap, which has a different compaction and collection policy than the regular heap. – Eric Lippert Mar 19 '12 at 05:38
  • @Eric - I just went through more documentation. Nowhere does it mention storing reference types on the stack. I guess they need to update the documentation. Also, in the case of arrays or collections, my understanding is that the reference to the array or collection is stored on the stack and the further references to the objects they hold are on the heap. If this is the case, would accessing an individual object through them increase the access time? Pardon me if my understanding is wrong; I am just starting with .NET and still have a lot of C++ in my head. – Shakti Prakash Singh Mar 19 '12 at 05:59
  • *References* are stored on the stack or the heap depending on the lifetime of the variable used for storage. *Instances of reference type* are stored on the heap. It would be *legal* for short-lived instances of reference type to be stored on the stack, but in practice the jitter does not do the necessary escape analysis. In the case of arrays, the variable holding the reference to the array is on the stack or the heap depending on its lifetime; the variables *in* the array are long-lifetime and therefore on the heap. I don't understand your question about access time. – Eric Lippert Mar 19 '12 at 13:53
  • But why are you worried about this at all? Let the runtime manage your memory for you; that's what it's for. – Eric Lippert Mar 19 '12 at 13:53