Multiple return variables - which has best performance (out, tuple, class)?

Question

A method that returns multiple doubles can be realized in various ways:

Through out parameters:

class MyClass
{
    static double Add3(double x, out double xp1, out double xp2)
    {
        xp1 = x + 1.0;
        xp2 = x + 2.0;
        return x + 3.0;
    }
}

Through tuples:

class MyClass
{
    static Tuple<double, double, double> Add3(double x)
    {
        // Tuple<...> is immutable: its items can only be set through the constructor.
        return new Tuple<double, double, double>(x + 1.0, x + 2.0, x + 3.0);
    }
}

Through a class gathering the results:

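class MyClass
{
    class Result
    {
        // The fields must be public, otherwise Add3 cannot set them.
        public double xp1;
        public double xp2;
        public double xp3;
    }

    static Result Add3(double x)
    {
        // Object initializers separate members with commas and end with a semicolon.
        Result ret = new Result
        {
            xp1 = x + 1.0,
            xp2 = x + 2.0,
            xp3 = x + 3.0
        };
        return ret;
    }
}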

My impression from the comments to this question is that people in general consider the approach with the extra class as the best practice. However, I wonder if there is a rule of thumb about the implications on run time performance for the three variants.

Does the constructor of the Tuple or the class take any extra time as compared to the out parameters?

In particular, does the variant with the out parameter have any performance advantage in the case that only one of the resulting doubles will actually be used, such as in the following snippet?

double zPlus3 = MyClass.Add3(z, out _, out _);

If you have two horses and you want to know which of the two is the faster then **race your horses:** https://ericlippert.com/2012/12/17/performance-rant/ — Marc Gravell, Jul 30 '21 at 09:55
The class will need to be garbage-collected. You can avoid this by using value tuples or by using a struct instead of a class. — SomeBody, Jul 30 '21 at 10:00
Such a question is opinion-based if it is asked in general. For the case provided, see theory and benchmarks. My first reflex would be to say a class, or struct, or parameters if not so many, because in reality it depends on what you do with all of this in and out of the method, and that's the same for tuple. Therefore it is impossible to answer if not for the example given here and limited to the code provided here **...** which is of little interest given the nature of the question itself. This question is in fact related to the CPU CALL STACK usage and optimization in conjunction with HEAP. — , Jul 30 '21 at 10:05
I've voted to reopen the question, because I can't see how it is opinion-based. There is nothing opinion based in the results of a proper benchmark. Also I would like to note that posting the link to [the rant](https://ericlippert.com/2012/12/17/performance-rant/) on performance-related questions is borderline rude. It's an indirect way of saying "don't bother us with your silly questions, we have better things to do than writing benchmarks for you". My point is, if you don't like doing benchmarks for others, it's OK. Just skip the question. There is no need to respond to questions with rants. — Theodor Zoulias, Jul 30 '21 at 10:29
@TheodorZoulias To quote that article "The question presupposes that there actually is a performance problem to be solved" — Charlieface, Jul 30 '21 at 18:12
@Charlieface the article is [a rant](https://ericlippert.com/2012/12/17/performance-rant/), it's not intended to be taken seriously IMHO. It's obviously written by someone who has seen too many performance-related questions, and has had enough. The [code of conduct](https://stackoverflow.com/conduct) has this to say: *"Avoid sarcasm and be careful with jokes. [...] If a situation makes it hard to be friendly, stop participating and move on."* — Theodor Zoulias, Jul 30 '21 at 19:38
@Charlieface The purpose of my question is not only to optimize a particular piece of code, but in general to learn about possible performance implications of certain design decisions. If somebody has a hint for me and other people who have similar questions - please share it. If someone thinks that a valid rule of thumb that I have asked for cannot exist, please share that information, too. Otherwise, just don't answer. Why the need to disqualify the question? — Amos Egel, Aug 02 '21 at 05:47
@OlivierRogier "For the case provided, see theory and benchmarks." - I think what you refer to as "theory" is what I wanted to ask for. — Amos Egel, Aug 02 '21 at 05:51
@AmosEgel Very broad subject. It depends on the design, the number of entities manipulated "at the same time", the number of data members and their types, the number of instances manipulated and the number of proc calls, the modifications made, as well as target x32, x64, arm... You can certainly find various articles online or in books, or create your own by spending many hours (very interesting area). But in general, if I'm not mistaken, little data is optimized with atomic parameters or structs, otherwise use classes because only the pointer is consumed by the cpu stack during calls/rets. — , Aug 02 '21 at 07:00
@AmosEgel Also and not the least of the considerations:: do you ask for speed or memory performance? Or an average balance sheet that complicates points of view? — , Aug 02 '21 at 07:02
@OlivierRogier The intention was to ask about speed. Thanks for your hints. I guess I'll eventually consider a textbook to get some insights in the topics that you mentioned. — Amos Egel, Aug 02 '21 at 08:32
@OlivierRogier "otherwise use classes because only the pointer is consumed by the cpu stack during calls/rets" Am I right that this assumes that the class object is handed to the method (and not constructed within the method itself)? — Amos Egel, Aug 02 '21 at 08:33
Reason to disqualify the question: because it is too broad and is also liable to opinion-based answers. [so] is not meant for general theoretical arguments, it is meant for answers to *specific* problems. Your question is a legitimate question, just not a legitimate [so] question. If you wanted my opinion, I would say that using a class (or `Tuple<>` which is also a class) has performance implications because it requires garbage collection, so I would consider not using it in tight loops. But most of the time it's unlikely to make any difference at all. — Charlieface, Aug 02 '21 at 09:20
Please avoid using `out` parameters. Although, they can make sense in certain situations, e.g. using the `TryXXX` pattern, they are very hard to grasp. See also: https://learn.microsoft.com/en-us/dotnet/fundamentals/code-analysis/quality-rules/ca1021#rule-description and https://stackoverflow.com/questions/4255188/is-using-out-bad-practice. Unless you are writing an extremely resource sensitive software (probably not the case if you use C# over a programming language such as C++), please use a return value. If there is not a very good reason, just don't. — Thomas, Aug 03 '21 at 01:18
Personally, I think this kind of question is valid BUT very dangerous. It makes sense to have a feeling for what may perform better, but there are so many inexperienced developers who will just see that one solution performs better than another one and go with it. I am aware that my comment above may be out of context, but I think it's important that everything is considered (in this case readability and being less error-prone) and not just performance, and it should be mentioned. If we have questions like "what is better", the answer should be balanced. — Thomas, Aug 03 '21 at 02:24

score 2 · Answer 1 · answered Dec 06 '21 at 15:24

To add some hard facts to this question, here is a benchmark project that compares the performance of these alternatives.

Surprisingly, .NET applies heavy optimizations to ValueTuple and KeyValuePair, bringing execution time down by a factor of up to 34 in Release mode compared to Debug.

In Release mode all implementations have similar speed, except for returning a Tuple<int,int>, which is about 10 times slower because of the garbage collection caused by the high volume of allocations.

In Debug mode only the methods using out parameters, and the one returning a single value, are fast. The relative speed factor from the Debug to the Release build is given in the last column ('*').

Method	Release	Debug	*
Return Tuple	510.84 ms	1,515.2 ms	3
Return KeyValuePair	44.56 ms	1,527.1 ms	34
Return ValueTuple	51.28 ms	1,418.6 ms	28
Return NullableValue	48.41 ms	1,527.0 ms	29
2 out Parameters	43.83 ms	560.4 ms	14
1 out Parameter	48.41 ms	586.5 ms	13
Return Single Value	49.72 ms	523.8 ms	11
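
For readers who want to run a comparison of their own, the following is a rough sketch and not the benchmark project referred to above. It assumes a plain console application built in Release mode; the class name ReturnStyleTiming, the helper names and the iteration count are made up for this illustration. Stopwatch timing of this kind is crude (a dedicated tool such as BenchmarkDotNet gives more trustworthy numbers), so treat the output only as an indication.

```csharp
using System;
using System.Diagnostics;

static class ReturnStyleTiming
{
    static double AddOut(double x, out double xp1, out double xp2)
    {
        xp1 = x + 1.0;
        xp2 = x + 2.0;
        return x + 3.0;
    }

    static (double, double, double) AddValueTuple(double x)
    {
        return (x + 1.0, x + 2.0, x + 3.0);
    }

    static Tuple<double, double, double> AddTuple(double x)
    {
        return Tuple.Create(x + 1.0, x + 2.0, x + 3.0);
    }

    // Each loop adds up all three results so that every returned value is used.
    public static double RunOut(int n)
    {
        double sum = 0;
        for (int i = 0; i < n; i++)
        {
            double c = AddOut(i, out double a, out double b);
            sum += a + b + c;
        }
        return sum;
    }

    public static double RunValueTuple(int n)
    {
        double sum = 0;
        for (int i = 0; i < n; i++)
        {
            var (a, b, c) = AddValueTuple(i);
            sum += a + b + c;
        }
        return sum;
    }

    public static double RunTuple(int n)
    {
        double sum = 0;
        for (int i = 0; i < n; i++)
        {
            var t = AddTuple(i); // allocates one heap object per call
            sum += t.Item1 + t.Item2 + t.Item3;
        }
        return sum;
    }

    static void Report(string name, Func<int, double> run, int n)
    {
        run(n / 100); // short warm-up run so that initial JIT compilation is not timed
        var sw = Stopwatch.StartNew();
        double sum = run(n);
        sw.Stop();
        Console.WriteLine($"{name,-12}{sw.Elapsed.TotalMilliseconds,10:F2} ms  (checksum {sum})");
    }

    public static void Main()
    {
        const int n = 10_000_000; // arbitrary iteration count chosen for this sketch
        Report("out", RunOut, n);
        Report("ValueTuple", RunValueTuple, n);
        Report("Tuple", RunTuple, n);
    }
}
```

The checksum printed for each variant should be identical, which is a quick sanity check that the three loops do the same work.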

answered Dec 06 '21 at 15:24

Spoc


I think that the Debug measurements are redundant. Anyone who is conscious about the performance of their app, will not release Debug builds of it! – Theodor Zoulias Dec 06 '21 at 16:24
The Debug Measurements just demonstrate again what a huge Difference it can make and this helps people to do proper benchmarking. – Spoc Mar 14 '22 at 07:01
How does it help people to do proper benchmarking? By learning how to obtain useless metrics? Whatever the difference in performance in Debug mode is, it is irrelevant. All that matters is the benchmarks in Release mode, because that is what the users of your app will experience. – Theodor Zoulias Mar 14 '22 at 07:07

Theodor Zoulias · Accepted Answer · 2021-08-07T07:08:19.250

Returning multiple values through a Tuple<T1,T2,T3> or through a custom class with three properties is equivalent performance-wise. Tuples are more readily available (you don't have to code them), while custom classes are more convenient to use, but both approaches involve instantiating a reference type, which has to be heap-allocated and later garbage collected. If you use these types just for accessing their properties once, you get no added value to compensate for the heap-allocation and garbage-collection overhead. Using out parameters is superior performance-wise in this case. There is a fourth solution though, which combines the advantages of all these approaches: value tuples (available in C# 7.0 and later).

static (double, double, double) Add3(double x)
{
    return (x + 1.0, x + 2.0, x + 3.0);
}

Usage example, demonstrating tuple deconstruction:

(double xp1, double xp2, double xp3) = Add3(13);

...or equivalently using type inference:

var (xp1, xp2, xp3) = Add3(13);

Advantages:

A ValueTuple<T1,T2,T3> is as readily available as a Tuple<T1,T2,T3>.
There is language support for changing the field names of a ValueTuple<T1,T2,T3> to something more meaningful than Item1, Item2 and Item3, making them (almost) as convenient as a custom class.
A ValueTuple<T1,T2,T3> is a value type, so when it is returned into a local variable it is stored on the stack (or in registers), just like the out parameters. No heap allocation and no garbage collection is involved.
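
As an illustration of the second point, here is a minimal sketch of named tuple elements, combined with discards for the case where only one of the values is needed. The class name NamedTupleDemo and the element names Plus1, Plus2 and Plus3 are invented for this example.

```csharp
using System;

static class NamedTupleDemo
{
    // Same computation as Add3 above, but the tuple elements carry names.
    // Plus1/Plus2/Plus3 are hypothetical names chosen for this sketch.
    public static (double Plus1, double Plus2, double Plus3) Add3(double x)
    {
        return (x + 1.0, x + 2.0, x + 3.0);
    }

    public static void Main()
    {
        // Access by name instead of Item1, Item2, Item3.
        var result = Add3(13);
        Console.WriteLine(result.Plus3); // prints 16

        // Deconstruct and discard the elements that are not needed.
        var (_, _, plus3) = Add3(13);
        Console.WriteLine(plus3); // prints 16
    }
}
```

The element names exist only at compile time; the underlying type is still ValueTuple<double, double, double>, so Item1, Item2 and Item3 remain usable as well.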
