128

So, I (think I) understand what the in parameter modifier does. But what it does appears to be quite redundant.

Usually, I'd think that the only reason to use a ref would be to modify the calling variable, which is explicitly forbidden by in. So passing by in reference seems logically equivalent to passing by value.

Is there some sort of performance advantage? It was my belief that on the back-end side of things, a ref parameter must at least copy the physical address of the variable, which should be the same size as any typical object reference.

So, then is the advantage just in larger structs, or is there some behind-the-scenes compiler optimization that makes it attractive elsewhere? If the latter, why shouldn't I make every parameter an in?

Magnetron
  • 7,495
  • 1
  • 25
  • 41
Travis Reed
  • 1,572
  • 2
  • 8
  • 11
  • 2
    Yes, there is a performance advantage. `ref` is used to pass *structs* by reference instead of copying them. `in` means the struct shouldn't be modified. – Panagiotis Kanavos Oct 15 '18 at 15:49
  • @dbc no, this has nothing to do with interop – Panagiotis Kanavos Oct 15 '18 at 15:50
  • 6
    Performance for value types.[The New “in” Keyword For C# 7.2](https://dotnetcoretutorials.com/2018/01/08/new-keyword-c-7-2/) – Silvermind Oct 15 '18 at 15:51
  • 3
    Detailed discussion [here](https://blogs.msdn.microsoft.com/seteplia/2018/03/07/the-in-modifier-and-the-readonly-structs-in-c/). Note the last warning: `It means that you should never pass a non-readonly struct as in parameter. ` – Panagiotis Kanavos Oct 15 '18 at 15:55

5 Answers

120

The in modifier was introduced in C# 7.2.

in is actually a ref readonly. Generally speaking, there is only one use case where in can be helpful: high performance apps dealing with lots of large readonly structs.

Assuming you have:

readonly struct VeryLarge
{
    public readonly long Value1;   
    public readonly long Value2;

    public long Compute() => Value1 + Value2;
    // etc
}

and

void Process(in VeryLarge value) { }

In that case, the VeryLarge struct will be passed by reference without creating defensive copies when using this struct in the Process method (e.g. when calling value.Compute()), and the struct's immutability is ensured by the compiler.

Note that passing a non-readonly struct with an in modifier will cause the compiler to create a defensive copy when calling the struct's methods and accessing its properties in the Process method above, which will negatively affect performance!
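To make that pitfall concrete, here is a minimal sketch (the Counter struct and Bump method are invented for illustration). Because Counter is not declared readonly, every method call on the in parameter silently runs against a hidden defensive copy, so the mutations are thrown away:

```csharp
using System;

// A *mutable* struct passed with `in` -- the problematic case.
struct Counter
{
    public int Value;
    // Mutates the struct; on an `in` parameter this runs on a hidden copy.
    public void Increment() => Value++;
}

static class Demo
{
    static void Bump(in Counter c)
    {
        // The compiler cannot prove Increment() won't write to `c`,
        // so each call is made on a defensive copy that is then discarded.
        c.Increment();
        c.Increment();
        Console.WriteLine(c.Value); // prints 0 -- the copies were thrown away
    }

    static void Main()
    {
        var counter = new Counter();
        Bump(in counter);
        Console.WriteLine(counter.Value); // prints 0 -- the caller's struct is untouched
    }
}
```

Declaring the struct readonly (as in the VeryLarge example above) lets the compiler skip those copies entirely.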

There is a really good MSDN blog entry which I recommend reading carefully.

If you would like some more historical background on the introduction of in, you could read this discussion in the C# language's GitHub repository.

In general, most developers agree that the introduction of in could be seen as a mistake. It's a rather exotic language feature and is only useful in high-performance edge cases.

dymanoid
  • 14,771
  • 4
  • 36
  • 64
  • Does it suppress compiler-generated defensive copies, or merely eliminate the need for programmers to manually create defensive copies in contexts that would otherwise use `ref`, e.g. by allowing `someProc(in thing.someProperty);` versus `propType myProp = thing.someProperty; someProc(ref myProp);`? is `in` a C#-only concept like `out`, or has it been added to the .NET Framework? – supercat Oct 15 '18 at 19:13
  • @supercat, you cannot pass a property using `in`, because `in` is effectively `ref` with a special attribute. So your first snippet won't compile. @VisualMelon, that's right, the defensive copying occurs on calling methods or accessing properties of the struct from within the method that gets the struct as argument. – dymanoid Oct 15 '18 at 19:23
  • @dymanoid sorry for spamming you, it took me about 10 attempts to understand what you'd written (and I don't think that's your fault!) – VisualMelon Oct 15 '18 at 19:25
  • @VisualMelon, I didn't explain it precisely because the OP mentions that they already understand the concept. The `in` feature is of course confusing. I merely tried to describe the (only) use case where this feature makes sense. – dymanoid Oct 15 '18 at 19:28
  • 5
    @dymanoid: Both `in` and `out` are concepts which should have been in the Framework from the beginning (along with a means by which methods and properties could indicate whether they modify `this`). A compiler could allow a property to be passed to an `in` argument by passing a reference to a temporary holding it; if a function written in another language modifies that temporary, the semantics would be a bit icky, but that would be the fault of the function's behavior not matching its signature. – supercat Oct 15 '18 at 19:32
  • 2
    @supercat, please don't open a discussion here (I'm not in the .NET Framework concept team anyway). There is a long discussion referenced in the answer, and that GitHub discussion references some others - interesting reading. BTW, you can pass properties by-ref in VB.NET - the compiler creates those temporaries for you (which might lead to obscure issues of course). – dymanoid Oct 15 '18 at 19:36
56

passing by in reference seems logically equivalent to passing by value.

Correct.
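(As the comments below discuss, this equivalence assumes you don't play aliasing games. A hypothetical sketch, with invented names, of where in and by-value become observably different:)

```csharp
using System;

static class AliasDemo
{
    // `x` is a writable reference; `y` is a readonly reference.
    static int ReadAfterWrite(ref int x, in int y)
    {
        x = 42;     // if x and y alias the same variable...
        return y;   // ...this read observes the write
    }

    static void Main()
    {
        int a = 0;
        // Both parameters refer to the same variable `a`.
        Console.WriteLine(ReadAfterWrite(ref a, in a)); // prints 42
        // Had `y` been passed by value, it would be a snapshot taken
        // before the call, and the method would have returned 0.
    }
}
```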

Is there some sort of performance advantage?

Yes.

It was my belief that on the back-end side of things, a ref parameter must at least copy the physical address of the variable, which should be the same size as any typical object reference.

There is not a requirement that a reference to an object and a reference to a variable both be the same size, and there is not a requirement that either is the size of a machine word, but yes, in practice both are 32 bits on 32 bit machines and 64 bits on 64 bit machines.

What you think the "physical address" has to do with it is unclear to me. On Windows we use virtual addresses, not physical addresses in user mode code. Under what possible circumstances would you imagine that a physical address is meaningful in a C# program, I am curious to know.

There is also not a requirement that a reference of any kind be implemented as the virtual address of the storage. References could be opaque handles into GC tables in a conforming implementation of the CLI specification.

is the advantage just in larger structs?

Decreasing the cost of passing larger structs is the motivating scenario for the feature.

Note that there is no guarantee that in makes any program actually faster, and it can make programs slower. All questions about performance must be answered by empirical research. There are very few optimizations that are always wins; this is not an "always win" optimization.
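In that spirit, here is a crude, illustrative measurement harness (all names invented; a real investigation should use a proper tool such as BenchmarkDotNet, and the results will vary by runtime, hardware, and JIT):

```csharp
using System;
using System.Diagnostics;
using System.Runtime.CompilerServices;

// A deliberately large (64-byte) readonly struct.
readonly struct Big
{
    public readonly long A, B, C, D, E, F, G, H;
    public Big(long v) { A = B = C = D = E = F = G = H = v; }
}

static class Bench
{
    // NoInlining so the JIT can't erase the parameter-passing difference.
    [MethodImpl(MethodImplOptions.NoInlining)]
    static long ByValue(Big b) => b.A + b.H;

    [MethodImpl(MethodImplOptions.NoInlining)]
    static long ByIn(in Big b) => b.A + b.H;

    static void Main()
    {
        var big = new Big(1);
        const int N = 100_000_000;
        long sum = 0;

        var sw = Stopwatch.StartNew();
        for (int i = 0; i < N; i++) sum += ByValue(big);
        Console.WriteLine($"by value: {sw.ElapsedMilliseconds} ms");

        sw.Restart();
        for (int i = 0; i < N; i++) sum += ByIn(in big);
        Console.WriteLine($"by in:    {sw.ElapsedMilliseconds} ms");

        Console.WriteLine(sum); // consume the results so the JIT can't elide the loops
    }
}
```

Whether the by-in loop is actually faster is exactly the kind of question only the measurement can answer.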

is there some behind-the-scenes compiler optimization that makes it attractive elsewhere?

The compiler and runtime are permitted to make any optimization they choose if doing so does not violate the rules of the C# specification. There is to my knowledge not such an optimization yet for in parameters, but that does not preclude such optimizations in the future.

why shouldn't I make every parameter an in?

Well, suppose you changed an int parameter into an in int parameter. What costs are imposed?

  • the call site now requires a variable rather than a value
  • the variable cannot be enregistered. The jitter's carefully-tuned register allocation scheme just got a wrench thrown into it.
  • the code at the call site is larger because it must take a ref to the variable and put that on the stack, whereas before it could simply push the value onto the call stack
  • larger code means that some short jump instructions may have now become long jump instructions, so again, the code is now larger. This has knock-on effects on all kinds of things. Caches get filled up sooner, the jitter has more work to do, the jitter may choose to not do certain optimizations on larger code sizes, and so on.
  • at the callee site, we've turned access to a value on the stack (or register) into an indirection into a pointer. Now, that pointer is highly likely to be in the cache, but still, we've now turned a one-instruction access to the value into a two-instruction access.
  • And so on.

Suppose it's a double and you change it to an in double. Again, now the variable cannot be enregistered into a high-performance floating point register. This not only has performance implications, it can also change program behaviour! C# is permitted to do float arithmetic in higher-than-64-bit precision and typically does so only if the floats can be enregistered.

This is not a free optimization. You have to measure its performance against the alternatives. Your best bet is to simply not make large structs in the first place, as the design guidelines suggest.

Eric Lippert
  • 647,829
  • 179
  • 1,238
  • 2,067
  • _"Under what [...] possible circumstances would you imagine that a physical address is meaningful in a C# program, [...]."_. C# is usually used on systems that have memory-mapped I/O, and things like reset vectors, etc. I'm thinking about plain old wintel boxes here. With x86 processors and VGA framebuffers. Okay, usually we don't manipulate these directly, but we could, right? – Lorraine Oct 16 '18 at 07:58
  • 7
    Is passing by `in` really equivalent to passing by value in all cases? If we have `f(ref T x, in T y)` and `f` modifies `x`, it should observe the same change to `y` when invoked as `f(ref a,a)`. The same also applies if `f` takes a `in T y` and a delegate which will modify `y` when called. Without `in`, the semantics would be different, since `y` would never have its value changed, I think, since it'd be a copy. – chi Oct 16 '18 at 09:01
  • 4
    I suspect the OP meant "physical address" in an informal way, as "the bit-pattern describing the memory location", contrasting that with "whatever the backend chooses to use to describe where to find an object". – Sneftel Oct 16 '18 at 10:52
  • @Wilson: I'd love to see an example of what you're talking about; I do not know of anyone who attempts to do those sorts of things in C#. There has been some talk over the years of a "systems C#" whereby a higher-level managed language could be used to write low-level system components, but I have never actually seen such code myself. – Eric Lippert Oct 16 '18 at 16:33
  • 4
    @chi: That's correct. If you play dangerous games with variable aliasing then you can get into trouble. If it hurts when you do that, don't do that. The *intention* of `in` is to represent the notion of "pass by reference, readonly", which is logically equivalent to passing by value *when you behave sensibly*. – Eric Lippert Oct 16 '18 at 16:35
  • Could you elaborate on "There is not a requirement that a reference to an object and a reference to a variable both be the same size?" I would think I.12.1.1 in ECMA-335 clearly covers this under "The native-size types (native int, native unsigned int, O, and &) are a mechanism in the CLI for deferring the choice of a value’s size. These data types exist as CIL types; however, the CLI maps each to the native size for a specific processor." – Tanner Gooding Sep 28 '21 at 03:04
  • @TannerGooding: Even the devil can quote scripture. :) Good find. But let me counter with: there's no requirement that the *native size for a specific processor* need be the same for different kinds of pointers, and in fact I am (just barely) old enough to have worked on compilers for languages that ran on architectures where pointers to local variables and pointers to objects were different sizes. – Eric Lippert Oct 12 '21 at 22:36
  • @TannerGooding: Of course I'm splitting hairs here. In practice, on modern architectures there is no reason I'm aware of why they'd ever be different. – Eric Lippert Oct 12 '21 at 22:38
  • If you pass a struct that is declared readonly is `in` still doing something? Seems redundant if the struct is declared readonly no? – WDUK May 06 '22 at 12:46
8

There is. When passing a struct, the in keyword allows an optimization: the compiler only needs to pass a pointer, without the risk of the method changing the contents. That last part is critical, because it is what avoids the copy operation. On large structs this can make a world of difference.

Joel Coehoorn
  • 399,467
  • 113
  • 570
  • 794
TomTom
  • 61,059
  • 10
  • 88
  • 148
  • 2
    This is just repeating what the question already says, and doesn't answer the actual question it asks. – Servy Oct 15 '18 at 15:56
  • 1
    Only in readonly structs. Otherwise the compiler will still create a defensive copy – Panagiotis Kanavos Oct 15 '18 at 15:56
  • 1
    Actually no. It is confirming that there is a performance benefit. Given that `in` was created in the performance run through the language, yes, that IS the reason. It ALLOWS an optimization (on readonly structs). – TomTom Oct 15 '18 at 15:57
  • 3
    @TomTom Again, which *the question already covered*. What it asks is, "So, then is the advantage just in larger structs, or is there some behind-the-scenes compiler optimization that makes it attractive elsewhere? If the latter, why shouldn't I make every parameter an in?" Note how it's not asking if it's actually beneficial for larger structs, or just when it's beneficial at all. It's *just* asking if it's beneficial for smaller structs (and then a follow up if yes). You haven't answer either question. – Servy Oct 15 '18 at 15:58
4

This is done because of the functional programming approach. One of its major principles is that a function should not have side effects, which means it should not change the values of its parameters and should return some value. In C# there was previously no way to pass structs (and value types in general) by reference without also allowing the value to be changed. In Swift there is a copy-on-write mechanism which copies a struct (their collections are structs, by the way) as soon as a method starts changing its values; not everyone who uses Swift is aware of that copying. in is a nice C# feature since it is memory-efficient and explicit.

If you look at what's new in the language, you will see that more and more is being done around structs and stack-allocated arrays, and the in modifier is simply necessary for these features. There are limitations mentioned in the other answers, but those are not essential for understanding where .NET is heading.

Access Denied
  • 8,723
  • 4
  • 42
  • 72
4

in is a readonly reference, introduced in C# 7.2.

It means you do not pass the entire object on the call stack; as with ref, you pass only a reference to the structure.

Any attempt to change the value of the object, however, produces a compiler error.

And yes, this can improve performance if you use big structures.
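A minimal sketch (the Point3D type is invented for illustration) showing both the readonly guarantee inside the method and the fact that in at the call site is optional:

```csharp
using System;

readonly struct Point3D
{
    public readonly double X, Y, Z;
    public Point3D(double x, double y, double z) => (X, Y, Z) = (x, y, z);
}

static class Geometry
{
    static double LengthSquared(in Point3D p)
    {
        // p = default;    // compile-time error: cannot assign to an `in` parameter
        // p.X = 0;        // compile-time error: cannot assign to its members either
        return p.X * p.X + p.Y * p.Y + p.Z * p.Z;
    }

    static void Main()
    {
        var p = new Point3D(1, 2, 2);
        Console.WriteLine(LengthSquared(p));    // `in` may be omitted at the call site...
        Console.WriteLine(LengthSquared(in p)); // ...or stated explicitly; both print 9
    }
}
```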

Serg Shevchenko
  • 662
  • 1
  • 5
  • 21