39

Quick note on the accepted answer: I disagree with a small part of Jeffrey's answer, namely the point that since Delegate had to be a reference type, it follows that all delegates are reference types. (It simply isn't true that a multi-level inheritance chain rules out value types; all enum types, for example, inherit from System.Enum, which in turn inherits from System.ValueType, which inherits from System.Object, all reference types.) However, I think the fact that all delegates inherit not just from Delegate but from MulticastDelegate is the critical realization here. As Raymond points out in a comment to his answer, once you've committed to supporting multiple subscribers, there's really no point in not using a reference type for the delegate itself, given the need for an array somewhere.
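
The inheritance chains described in this note can be checked with reflection. The following is only a small sketch; the class name InheritanceChainDemo and the helper Chain are made up for illustration, while DayOfWeek and Action are ordinary framework types.

```csharp
using System;

static class InheritanceChainDemo
{
    // Builds a string listing a type and all of its base types, up to System.Object.
    public static string Chain(Type type)
    {
        string result = type.Name;
        for (Type t = type.BaseType; t != null; t = t.BaseType)
        {
            result += " -> " + t.Name;
        }
        return result;
    }

    static void Main()
    {
        // An enum is a value type, yet its base types are all classes.
        Console.WriteLine(Chain(typeof(DayOfWeek)));         // DayOfWeek -> Enum -> ValueType -> Object
        Console.WriteLine(typeof(DayOfWeek).IsValueType);    // True
        Console.WriteLine(typeof(Enum).IsValueType);         // False
        Console.WriteLine(typeof(ValueType).IsValueType);    // False

        // Every concrete delegate type derives from MulticastDelegate, not just Delegate.
        Console.WriteLine(Chain(typeof(Action)));            // Action -> MulticastDelegate -> Delegate -> Object
    }
}
```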


See update at bottom.

It has always seemed strange to me that if I do this:

Action foo = obj.Foo;

I am creating a new Action object, every time. I'm sure the cost is minimal, but it involves allocation of memory to later be garbage collected.
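
To make the allocation visible, here is a minimal sketch. Obj, Foo and MethodGroupDemo are placeholder names, and the reference comparison reflects how current C# compilers handle instance method groups as far as I know, so treat it as an observation rather than a guarantee.

```csharp
using System;

class Obj
{
    public void Foo() { }
}

static class MethodGroupDemo
{
    // True if converting the same method group twice yields the very same object.
    public static bool SameInstance()
    {
        var obj = new Obj();
        Action first = obj.Foo;
        Action second = obj.Foo;
        return ReferenceEquals(first, second);
    }

    // True if the two delegates compare equal (same target and same method).
    public static bool EqualByValue()
    {
        var obj = new Obj();
        Action first = obj.Foo;
        Action second = obj.Foo;
        return first.Equals(second);
    }

    static void Main()
    {
        // Expected in practice: two distinct heap objects that still compare equal.
        Console.WriteLine(SameInstance());
        Console.WriteLine(EqualByValue());
    }
}
```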

Given that delegates are inherently immutable, I wonder why they couldn't be value types. Then a line of code like the one above would incur nothing more than a simple assignment to a memory address on the stack*.

Even considering anonymous functions, it seems (to me) this would work. Consider the following simple example.

Action foo = () => { obj.Foo(); };

In this case foo does constitute a closure, yes. And in many cases, I imagine this does require an actual reference type (such as when local variables are closed over and are modified within the closure). But in some cases, it shouldn't. For instance, in the above case, it seems that a type to support the closure could look like the one below.

I take back my original point about this. The type below really does need to be a reference type (or rather: it doesn't need to be, but if it's a struct it's just going to get boxed anyway). So, disregard the code example below. I leave it only to provide context for answers that specifically mention it.

struct CompilerGenerated
{
    Obj obj;

    public CompilerGenerated(Obj obj)
    {
        this.obj = obj;
    }

    public void CallFoo()
    {
        obj.Foo();
    }
}

// ...elsewhere...

// This would not require any long-term memory allocation
// if Action were a value type, since CompilerGenerated
// is also a value type.
Action foo = new CompilerGenerated(obj).CallFoo;

Does this question make sense? As I see it, there are two possible explanations:

  • Implementing delegates properly as value types would have required additional work/complexity, since support for things like closures that do modify values of local variables would have required compiler-generated reference types anyway.
  • There are some other reasons why, under the hood, delegates simply can't be implemented as value types.

In the end, I'm not losing any sleep over this; it's just something I've been curious about for a little while.


Update: In response to Ani's comment, I see why the CompilerGenerated type in my above example might as well be a reference type, since if a delegate is going to comprise a function pointer and an object pointer it'll need a reference type anyway (at least for anonymous functions using closures, since even if you introduced an additional generic type parameter—e.g., Action<TCaller>—this wouldn't cover types that can't be named!). However, all this does is kind of make me regret bringing the question of compiler-generated types for closures into the discussion at all! My main question is about delegates, i.e., the thing with the function pointer and the object pointer. It still seems to me that could be a value type.

In other words, even if this...

Action foo = () => { obj.Foo(); };

...requires the creation of one reference type object (to support the closure, and give the delegate something to reference), why does it require the creation of two (the closure-supporting object plus the Action delegate)?
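
The two objects can be observed directly through the delegate's Target property. This is just a sketch with placeholder names (Obj, ClosureDemo, MakeClosure); the closure class itself is whatever the compiler chooses to generate.

```csharp
using System;

class Obj
{
    public void Foo() { }
}

static class ClosureDemo
{
    // Returns a delegate that closes over a local variable.
    public static Action MakeClosure()
    {
        var obj = new Obj();
        return () => { obj.Foo(); };
    }

    static void Main()
    {
        Action foo = MakeClosure();

        // Target is the compiler-generated object that holds the captured 'obj'.
        object closure = foo.Target;

        Console.WriteLine(closure.GetType().IsClass);       // True: the closure object is a class instance
        Console.WriteLine(ReferenceEquals(foo, closure));   // False: the delegate is a second, separate object
    }
}
```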

*Yes, yes, implementation detail, I know! All I really mean is short-term memory storage.

Dan Tao
  • The first possible explanation sounds more than reason enough to me. – Jon Oct 26 '11 at 16:52
  • Ok, say you want to implement a delegate as a value-type with a function pointer and an object pointer. In your closure example, where would the object pointer point to? You would almost certainly need to box the `CompilerGenerated` struct instance and put it on the heap (with escape analysis, this could be avoided in some situations). – Ani Oct 26 '11 at 16:52
  • @Ani: Ah, I see your point. Maybe you could expand on that comment in the form of an answer? – Dan Tao Oct 26 '11 at 16:56
  • Do you really want to work with Nullable ? – Amy B Oct 26 '11 at 17:01
  • @DavidB: Do you really want to work with `null`? Nullability isn't always desired. –  Oct 26 '11 at 17:03
  • Jon Skeet has sort of answered this : http://stackoverflow.com/questions/2324224/why-system-enum-is-not-a-value-type – Chris S Oct 26 '11 at 19:16
  • 2
    @Ani: If a delegate was a struct that contained a function pointer and object pointer, constructing a closure would only require creating one new heap object rather than two. If delegates were interface types (which is what I think they should be), a closure would only require creating a single heap object to hold both the closure data and its method. – supercat Feb 10 '13 at 18:43

7 Answers

17

The question boils down to this: the CLI (Common Language Infrastructure) specification says that delegates are reference types. Why is this so?

One reason is clearly visible in the .NET Framework today. In the original design, there were two kinds of delegates: normal delegates and "multicast" delegates, which could have more than one target in their invocation list. The MulticastDelegate class inherits from Delegate. Since you can't inherit from a value type, Delegate had to be a reference type.

In the end, all actual delegates ended up being multicast delegates, but at that stage in the process, it was too late to merge the two classes. See this blog post about this exact topic:

We abandoned the distinction between Delegate and MulticastDelegate towards the end of V1. At that time, it would have been a massive change to merge the two classes so we didn’t do so. You should pretend that they are merged and that only MulticastDelegate exists.

In addition, delegates currently have 4-6 fields, all pointers. 16 bytes is usually considered the upper bound where saving memory still wins out over extra copying. A 64-bit MulticastDelegate takes up 48 bytes. This, together with the fact that they were using inheritance, suggests that a class was the natural choice.

Jeffrey Sax
  • +1 This seems like a good technical reason (as opposed to other answers) - so they were stuck with having `MulticastDelegate` and `Delegate` and the former being a subclass of the latter? –  Oct 26 '11 at 17:45
  • 3
    I kind of see where you're going with this, but it isn't strictly true that just because `Delegate` is a reference type, all delegates must be reference types, right? I mean, consider `System.Enum`: it's a reference type, and all actual enum types inherit from it; and yet enums are value types. This is legal in the CLI and clearly possible from the compiler's end. So it still must be that there are further reasons for the decision that all delegate types are reference types. – Dan Tao Oct 26 '11 at 17:58
  • @DanTao `System.Enum` is also a value type. But I see your point: you can inherit from a value type. I don't know if it changes anything, though. They already had one delegate type that inherited from another, so using a class was natural. Then the user delegates inherited from that. You don't have that multi-level inheritance with `Enum`. – Jeffrey Sax Oct 26 '11 at 18:21
  • 2
    `System.Enum` is *not* a value type! It is an abstract class; see for yourself: http://msdn.microsoft.com/en-us/library/system.enum.aspx. – Dan Tao Oct 26 '11 at 18:31
  • 1
    Also, you have multi-level inheritance with all value types, as they all inherit from `System.ValueType` (which is, ironically, a reference type). – Dan Tao Oct 26 '11 at 18:39
  • @tao It's a reference type but presumably boxed? And doesn't leave that boxed state until you call the instance methods – Chris S Oct 26 '11 at 19:12
  • 1
    @DanTao You are right. My mistake. There are differences, though. Enums can have only one (non-constant) instance field, and are used much like the underlying integer type. Delegates are a lot heavier, with at least 4 instance fields. Size matters. (See my amended answer.) – Jeffrey Sax Oct 26 '11 at 19:21
  • 1
    I feel that the point about the base `Delegate` type being a reference type got us a bit off track, but it seems the invocation list used by `MulticastDelegate` is at the heart of the matter here. And I like the historical context you provided in this answer. Thanks! – Dan Tao Oct 27 '11 at 12:53
  • 3
    I don't see any reason a `Delegate` needs more than two fields (to hold the Method and Target). Given two "unicast" delegates, one could form a multicast delegate by having `Target` reference an array containing the other two delegates, and `Method` point to a static method that takes such an array as its first parameter and invokes the delegates therein. Note that building a delegate with the `Method` and `Target` of such a multicast delegate would produce a correctly-formed multicast delegate, unlike the present situation. Note also that...
  • 1
    ...`Delegate.Combine` could check the `Method` of any delegate to see whether it was the multicast-invoker, and produce a combined invocation list if so. – supercat Feb 10 '13 at 18:47
10

There is only one reason that Delegate needs to be a class, but it's a big one: while a delegate could be small enough to allow efficient storage as a value type (8 bytes on 32-bit systems, or 16 bytes on 64-bit systems), there's no way it could be small enough to efficiently guarantee that, if one thread attempts to write a delegate while another thread attempts to execute it, the latter thread wouldn't end up invoking either the old method on the new target or the new method on the old target. Allowing such a thing to occur would be a major security hole. Having delegates be reference types avoids this risk.

Actually, even better than having delegates be structure types would be having them be interfaces. Creating a closure requires creating two heap objects: a compiler-generated object to hold any closed-over variables, and a delegate to invoke the proper method on that object. If delegates were interfaces, the object which held the closed-over variables could itself be used as the delegate, with no other object required.
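
A rough sketch of what that could look like if written by hand. IAction is a hypothetical interface standing in for the Action delegate type, and FooClosure plays the role of the compiler-generated closure object; neither exists in the framework.

```csharp
using System;

// Hypothetical interface standing in for the Action delegate type.
interface IAction
{
    void Invoke();
}

class Obj
{
    public int Calls;
    public void Foo() { Calls++; }
}

// Hand-written stand-in for a compiler-generated closure:
// it holds the captured variable and is itself the callable object.
sealed class FooClosure : IAction
{
    private readonly Obj obj;

    public FooClosure(Obj obj) { this.obj = obj; }

    public void Invoke() { obj.Foo(); }
}

static class InterfaceDelegateDemo
{
    public static int Run()
    {
        var obj = new Obj();

        // One heap allocation serves as both the closure and the "delegate".
        IAction foo = new FooClosure(obj);
        foo.Invoke();
        foo.Invoke();
        return obj.Calls;
    }

    static void Main()
    {
        Console.WriteLine(Run()); // 2
    }
}
```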

supercat
  • 2
    This answer gives the most real reason. Delegates could really be value types when not considering this problem. There are also similar types, like `TypedReference`, `ArgIterator`, and various handles, that also represent a reference to something, they are all value types. – IS4 Nov 29 '14 at 23:11
  • readonly value type delegates. – Gavin Williams Aug 26 '23 at 18:27
  • @GavinWilliams: What do you mean? If delegates were two-field value types, even if they didn't provide any means of accessing fields independently, an assignment would still be likely to set one field and then the other as two separate operations. – supercat Aug 27 '23 at 20:06
  • I was responding to the reason you raised that they couldn't be structs. A readonly struct would be thread safe. And now that we have delegate* (function pointers) I'm leaning toward them being value types. Being able to use value types is really important. And I think it's more down to the will of the developers than any real limitation. – Gavin Williams Aug 28 '23 at 04:06
  • 1
    @GavinWilliams: If a read-only structure type contains multiple fields, a struct assignment is treated as a sequence of individual field assignments, *even if the structure type pretends to be "read-only"*. – supercat Aug 28 '23 at 05:15
  • I'm just thinking of my own scenario, and that's wanting to use delegates as function pointers to invoke at a known time later. In an update-render pattern. Where the structs holding the delegates get handed off in a bulk lot and they can't be invoked while they are being constructed. It's just not possible. The struct holding the delegate is either being constructed, or in a pool waiting to be used, or passed off to another thread to be used. There's no chance of collision between threads. I think I can see what you're saying though, there are more scenarios I should consider. – Gavin Williams Aug 28 '23 at 12:44
  • @GavinWilliams: A fundamental aspect of the .NET design philosophy is that there's no way by which even thread-unsafe code can violate fundamental invariants. As such, it can't offer a delegate structure type with "trust me" semantics. – supercat Aug 28 '23 at 15:06
7

Imagine if delegates were value types.

public delegate void Notify();

void SignalTwice(Notify notify) { notify(); notify(); }

int counter = 0;
Notify handler = () => { counter++; };
SignalTwice(handler);
System.Console.WriteLine(counter); // what should this print?

Per your proposal, this would internally be converted to

struct CompilerGenerated
{
    int counter; // starts at 0, the default value for int
    public void Execute() { ++counter; }
}

Notify handler = new CompilerGenerated();
SignalTwice(handler);
System.Console.WriteLine(counter); // what should this print?

If the delegate were a value type, then SignalTwice would get a copy of handler, which means that a brand new CompilerGenerated would be created (a copy of handler) and passed to SignalTwice. SignalTwice would execute the delegate twice, which increments the counter twice in the copy. And then SignalTwice returns, and the function prints 0, because the original was not modified.
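
The same copy semantics can be reproduced today with an ordinary mutable struct. This sketch uses made-up names (Counter, IncrementTwice, StructCopyDemo) and does not involve delegates at all; it only shows why the increments would be lost.

```csharp
using System;

// A mutable value type, comparable to the struct version of CompilerGenerated above.
struct Counter
{
    public int Count;
    public void Increment() { Count++; }
}

static class StructCopyDemo
{
    // Receives a copy of the struct, so the caller's value is never touched.
    static void IncrementTwice(Counter counter)
    {
        counter.Increment();
        counter.Increment();
    }

    // Receives a reference to the caller's struct, so the increments are visible.
    static void IncrementTwiceByRef(ref Counter counter)
    {
        counter.Increment();
        counter.Increment();
    }

    public static int RunByValue()
    {
        var original = new Counter();
        IncrementTwice(original);
        return original.Count;
    }

    public static int RunByRef()
    {
        var original = new Counter();
        IncrementTwiceByRef(ref original);
        return original.Count;
    }

    static void Main()
    {
        Console.WriteLine(RunByValue()); // 0
        Console.WriteLine(RunByRef());   // 2
    }
}
```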

Raymond Chen
  • 1
    But there are actually two things here. I agree with you that I was mistaken to propose that `CompilerGenerated` in this case could be a value type. But there is still the issue of the delegate *itself*. In your example, rather than `Notify handler = new CompilerGenerated()`, wouldn't the "real" output be more like `Notify handler = new CompilerGenerated().Execute`? And in this case, what I'm trying to get at is that even though `CompilerGenerated` needs to be a reference type, `Notify` does *not*. A `Notify` instance can point to a `CompilerGenerated` and a function (`Execute`) and that's it. – Dan Tao Oct 26 '11 at 18:04
  • 6
    Since a delegate can have multiple subscribers, and a struct must be fixed-size, you would either have to have a hard limit on the number of subscribers you can multicast to (and if you pick a number too large, then your delegate is getting pretty big), or you would have to keep the subscribers in a separate object like an array (in which case you failed to avoid creating a reference type). – Raymond Chen Oct 26 '11 at 20:44
  • 1
    I think your follow-up comment really gets to the crux of the matter. The fact that delegates allow for multiple subscribers seems to be the underlying "root cause" here; at least that's the explanation that makes the most sense to me. By the way, at the risk of sounding pretty corny, let me just say it's great to see you on StackOverflow and I feel pretty honored to have received an answer from you! – Dan Tao Oct 26 '11 at 22:32
  • 3
    There's no reason delegate would have to be a reference type to allow multi-subscription to work. If delegate was a struct which combined a object reference and a pointer to a method which accepts such an object, two Action(integer) delegates could be combined into an Array of Action(integer) along with a pointer to an ExecuteAllActionsInArray method (if either delegate had ExecuteAllActionsInArray as its method, the delegates in the attached array could be copied to the new array). – supercat Nov 29 '11 at 22:32
  • 1
    The purpose of the exercise was to make a delegate a pure value type (no additional reference objects created). Once you say "create an array" you've missed the point of the exercise. – Raymond Chen Nov 30 '11 at 00:03
  • 2
    @Raymond Chen: For thread safety, delegates must be class types. Were thread safety not a concern, a delegate could be an immutable struct with a field of type object and a pointer to a method which can act on that type of object. Constructing two independent delegates would not require the creation of any heap objects. Combining them into a multicast delegate would require the creation of a heap object to hold the originals, and a struct which would hold a reference to that heap object and a pointer to a method which would run the delegates held in that object.
  • 2
    @Raymond Chen: The majority of delegates that are created in practice have exactly one target. Even though it is necessary to make a heap allocation for multicast delegates which have multiple targets, avoiding a heap allocation for every delegate would be a significant win. Actually, I suspect that reducing every single-target delegate to a single heap object comprising one object reference and a function pointer would be better than what actually exists. – supercat Nov 30 '11 at 21:19
4

Here's an uninformed guess:

If delegates were implemented as value types, instances would be very expensive to copy around, since a delegate instance is relatively heavy. Perhaps MS felt it would be safer to design them as immutable reference types: copying a machine-word-sized reference to an instance is relatively cheap.

A delegate instance needs, at the very least:

  • An object reference (the "this" reference for the wrapped method if it is an instance method).
  • A pointer to the wrapped function.
  • A reference to the object containing the multicast invocation list. Note that a delegate-type should support, by design, multicast using the same delegate type.

Let's assume that value-type delegates were implemented in a similar manner to the current reference-type implementation (this is perhaps somewhat unreasonable; a different design may well have been chosen to keep the size down) to illustrate. Using Reflector, here are the fields required in a delegate instance:

System.Delegate: _methodBase, _methodPtr, _methodPtrAux, _target
System.MulticastDelegate: _invocationCount, _invocationList

If implemented as a struct (no object header), these would add up to 24 bytes on x86 and 48 bytes on x64, which is massive for a struct.


On another note, I want to ask how, in your proposed design, making the CompilerGenerated closure-type a struct helps in any way. Where would the created delegate's object pointer point to? Leaving the closure type instance on the stack without proper escape analysis would be extremely risky business.

Ani
  • 1
    I've responded to your comment about making `CompilerGenerated` a value type: you're right, it doesn't help. But the question still stands about delegates themselves. I think your educated guess on the reasoning here makes sense. – Dan Tao Oct 26 '11 at 17:47
  • 2
    Actually, a delegate just needs two fields--the target object and a function upon which to execute it. Two or more delegates can be combined by creating an array holding them and putting that array, along with pointer to an ExecuteAllDelegatesInArray method, into a new delegate. – supercat Nov 29 '11 at 22:35
1

I saw this interesting conversation on the Internet:

Immutable doesn't mean it has to be a value type. And something that is a value type is not required to be immutable. The two often go hand-in-hand, but they are not actually the same thing, and there are in fact counter-examples of each in the .NET Framework (the String class, for example).

And the answer:

The difference being that while immutable reference types are reasonably common and perfectly reasonable, making value types mutable is almost always a bad idea, and can result in some very confusing behaviour!

Taken from here

So, in my opinion, the decision was driven by language usability concerns, not by technical difficulties in the compiler. I love nullable delegates.

nawfal
Daniel Peñalba
  • It isn't important to the question, but: I don't see what could be confusing about mutable value types. Could you give an example? The obvious exceptions are readers who have never (properly) understood value type semantics and cases where it's nonobvious something is a value type. But both can and should be fixed anyway... –  Oct 26 '11 at 17:04
  • 1
    Just search for mutable value types being evil here or on Eric Lippert's blog. There is plenty of discussion on that issue available. Them being confusing usually boils down to the compiler creating a copy of the value type where you don't expect it, and then you only mutate the copy.
  • 3
    The CLR memory manager is optimized to accommodate frequent allocation of small, short-lived objects, making the performance concern of reference types vs. value types mostly illusory. In almost all cases, a reference type is preferable to a value type for representing a given piece of immutable data. The biggest benefit to a value type is when you need compact storage of data in a large collection. – Dan Bryant Oct 26 '11 at 17:12
  • 1
    @CodeInChaos: From what I've found, only the casting-to-interface example described in http://blogs.msdn.com/b/ericlippert/archive/2011/03/14/to-box-or-not-to-box-that-is-the-question.aspx seems particularly strange to me. `using` making a copy, `obj.valueTypeInstance` making a copy, getting a `SpinLock` (value type) from a list making a copy, etc. all seem perfectly in line with value types' semantics as I have them in my head. That leaves the fact that many people haven't fully grokked value semantics.
  • 1
    I don't disagree with anything you've said in this answer. But—and maybe it's just me—I view it all as rather tangential to what I'm asking. That is, I didn't mean to suggest "*because* delegates are immutable, *it follows* they should be value types"; rather, I only mentioned their immutability as a reason they don't *have* to be reference types. The main benefit of choosing value over reference would be (in my mind) the memory payoff. – Dan Tao Oct 26 '11 at 17:34
  • 2
    Also: you may love nullable delegates, but it seems (to me) that a simple `default(Action)` that did nothing (or even threw a `NullReferenceException` as the function pointer would point to nothing) would work just fine. – Dan Tao Oct 26 '11 at 18:43
1

I would say that making delegates reference types was definitely a bad design choice. They could be value types and still support multicast delegates.

Imagine that Delegate is a struct composed of, let's say: object target; pointer to the method

It can be a struct, right? The boxing will only occur if the target is a struct (but the delegate itself will not be boxed).

You may think it will not support MulticastDelegate, but then we can:

  • Create a new object that holds the array of normal delegates.
  • Return a Delegate (as a struct) whose target is that new object and whose Invoke iterates over all the delegates in the array, calling Invoke on each.

So, for normal delegates, that are never going to call two or more handlers, it could work as a struct. Unfortunately, that is not going to change in .Net.
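
Here is a rough sketch of that design, under some loudly stated assumptions: StructAction is a made-up type, and an ordinary static Action<object> stands in for the raw method pointer (so this version still uses real delegates internally and is only meant to show the shape of the idea). Thread safety, discussed in another answer, is ignored entirely.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical two-field "delegate" struct: a target plus something callable on it.
readonly struct StructAction
{
    private readonly object target;
    private readonly Action<object> method; // stand-in for a raw function pointer

    public StructAction(object target, Action<object> method)
    {
        this.target = target;
        this.method = method;
    }

    public void Invoke() { method(target); }

    // The "multicast invoker": its target is an array of StructAction values.
    private static readonly Action<object> InvokeAll = InvokeEach;

    private static void InvokeEach(object state)
    {
        foreach (StructAction action in (StructAction[])state)
        {
            action.Invoke();
        }
    }

    // Only combining allocates: one array holding the flattened invocation list.
    public static StructAction Combine(StructAction first, StructAction second)
    {
        var list = new List<StructAction>();
        Append(list, first);
        Append(list, second);
        return new StructAction(list.ToArray(), InvokeAll);
    }

    private static void Append(List<StructAction> list, StructAction action)
    {
        if (action.method == InvokeAll)
        {
            list.AddRange((StructAction[])action.target); // already multicast: flatten it
        }
        else
        {
            list.Add(action);
        }
    }
}

static class StructActionDemo
{
    public static string Run()
    {
        var log = new List<string>();
        var a = new StructAction(log, state => ((List<string>)state).Add("a"));
        var b = new StructAction(log, state => ((List<string>)state).Add("b"));

        StructAction combined = StructAction.Combine(StructAction.Combine(a, b), a);
        combined.Invoke();
        return string.Join(",", log);
    }

    static void Main()
    {
        Console.WriteLine(Run()); // a,b,a
    }
}
```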


As a side note, variance does not require delegates to be reference types. Only the parameters of the delegate need to be reference types. After all, if you pass a string where an object is required (for input, not ref or out), then no cast is needed, as a string is already an object.

Paulo Zemek
0

I guess one reason is support for multicast delegates. Multicast delegates are more complex than simply a few fields indicating target and method.

Another thing that's only possible in this form is delegate variance. This kind of variance requires a reference conversion between the two types.
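
As a small illustration of what "reference conversion" means here (VarianceDemo and its method are made-up names; Func<T> is covariant in T since .NET 4): the converted delegate is the very same object, just viewed through a different static type.

```csharp
using System;

static class VarianceDemo
{
    // True if the variance conversion preserved object identity.
    public static bool SameObjectAfterConversion()
    {
        Func<string> produceString = () => "hello";

        // Covariant conversion: no new delegate is created.
        Func<object> produceObject = produceString;

        return ReferenceEquals(produceString, produceObject);
    }

    static void Main()
    {
        Console.WriteLine(SameObjectAfterConversion()); // True
    }
}
```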

Interestingly, F# defines its own function type that's similar to delegates, but more lightweight. But I'm not sure whether it's a value or reference type.

CodesInChaos