
I am optimizing our debug print facilities (a class). The class is fairly straightforward, with a global "enabled" bool and a PrineDebug routine.

I'm investigating the performance of the PrineDebug method in "disabled" mode, trying to create a framework with less run-time impact when no debug prints are needed.

During the exploration I came across the results below, which surprised me, and I wonder what I am missing here.

public class Profiler
{
     private bool isDebug = false;

     public void PrineDebug(string message)
     {
         if (isDebug)
         {
             Console.WriteLine(message);
         }
     }
}

[MemoryDiagnoser]
public class ProfilerBench
{
    private Profiler profiler = new Profiler();
    private int five = 5;
    private int six = 6;

    [Benchmark]
    public void DebugPrintConcat()
    {
        profiler.PrineDebug("sometext_" + five + "_" + six);
    }

    [Benchmark]
    public void DebugPrintInterpolated()
    {
        profiler.PrineDebug($"sometext_{five}_{six}");
    }
}

I ran this benchmark under BenchmarkDotNet. Here are the results:

|                 Method |     Mean |   Error |  StdDev |  Gen 0 | Allocated |
|----------------------- |---------:|--------:|--------:|-------:|----------:|
|       DebugPrintConcat | 149.0 ns | 3.02 ns | 6.03 ns | 0.0136 |      72 B |
| DebugPrintInterpolated | 219.4 ns | 4.13 ns | 6.18 ns | 0.0181 |      96 B |

I thought the Concat approach would be slower, since every + operation actually creates a new string (plus an allocation), but it seems the interpolation caused more allocation and took more time.

Can you explain?

NirMH
  • 1
    string interpolation also creates new strings, and even further it calls string.Format on every "part" of the interpolation... so in the end it's a lot more calls – Jonathan Alfaro Jul 10 '22 at 06:56
  • Does this answer your question? [String Interpolation vs String.Format](https://stackoverflow.com/questions/32342392/string-interpolation-vs-string-format) – GSerg Jul 10 '22 at 07:55
  • @JonathanAlfaro This is false, string interpolation only creates one string in the current .net (see my answer) and in the older .net versions it always called just one single `string.Format()` – Petrusion Jul 10 '22 at 13:32
  • @Petrusion my statement is 100% true for all NET versions before NET 6.0. Even further the use of Interpolation Handlers does not guarantee that for small strings it will be "faster" than string concatenation. – Jonathan Alfaro Jul 11 '22 at 16:22
  • @JonathanAlfaro You said that: 1) "string interpolation creates new strings", presumably meaning that it ends up calling `.ToString()` on every item. 2) "it calls `string.Format()` on every part of the interpolation", possibly meaning that one `$"..."` (which the question was about) gets translated to multiple `string.Format()` (but I might be misunderstanding that). Neither of those are true, `string.Format()` used `ISpanFormattable` on its items even before .Net 6, skipping `.ToString()` – Petrusion Jul 11 '22 at 16:42
  • @Petrusion NET 6 is not the only version of NET in existence... string.Format does create a new string which is returned by the method. So on this string "$"sometext_{five}_{six}"" string.Format will be called at least twice. As for NET 6, it is different than other versions of NET. I am pretty sure most existing applications in NET are not NET 6... since NET has been around for 20 years – Jonathan Alfaro Jul 12 '22 at 12:53
  • 1
    @JonathanAlfaro *"`string.Format` does create a new string which is returned by the method"* - that much is beyond obvious, that is the entire point of the method. Why do you keep saying that I only talk about .Net 6 when I talked about the previous versions many times in this comment chain? If I understand correctly, you are saying `$"sometext_{five}_{six}"` gets translated to more than one `string.Format()` call, that is wrong for any .Net version. Before .Net 6, `$"sometext_{five}_{six}"` got translated to `string.Format("sometext_{0}_{1}", five, six)`, that is a single `string.Format()`. – Petrusion Jul 12 '22 at 14:33

2 Answers


TLDR: Interpolated strings are the best option overall. They only allocate more memory in your benchmarks because you are using an old .Net version and because the strings for small numbers are cached.

There's a lot to talk about here.

First off, a lot of people think string concatenation using + will always create a new string for every +. That might be the case in a loop, but if you use lots of + one after another, the compiler will actually replace those operators with a call to one string.Concat, making the complexity O(n), not O(n^2). Your DebugPrintConcat actually compiles to this:

public void DebugPrintConcat()
{
    profiler.PrineDebug(string.Concat("sometext_", five.ToString(), "_", six.ToString()));
}

It should be noted that in your specific case, you are not benchmarking string allocation for the integers because .Net caches string instances for small numbers, so those .ToString() on five and six end up allocating nothing. The memory allocation would've been much different if you used bigger numbers or formatting (like .ToString("10:0000")).
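To check the caching claim on your own runtime, here is a minimal sketch. The class and method names are hypothetical, and whether small ints come back as the same instance depends on the runtime version, so treat the small-number result as something to observe rather than a guarantee:

```csharp
using System;

// Hypothetical helper, not part of the original benchmark.
public static class SmallIntCacheDemo
{
    // Returns true when two ToString() calls hand back the very same string
    // instance, meaning the second call did not allocate a new string.
    public static bool ReturnsSameInstance(int value)
    {
        string first = value.ToString();
        string second = value.ToString();
        return ReferenceEquals(first, second);
    }
}
```

Calling `SmallIntCacheDemo.ReturnsSameInstance(5)` shows whether your runtime caches the string for 5, while `SmallIntCacheDemo.ReturnsSameInstance(116521345)` returns false because a large number gets a freshly allocated string on every call.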

The three ways of concatenating strings are + (that is, string.Concat()), string.Format() and interpolated strings. Interpolated strings used to be exactly the same as string.Format(), as $"..." was just syntactic sugar for string.Format(), but that has not been the case since .Net 6, when they were redesigned around Interpolated String Handlers.
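As a rough sketch of the two lowerings for the interpolated string from the question: the class and method names are hypothetical, and the second method is a hand-written approximation of what the compiler emits on .Net 6 and later, not its exact output:

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical illustration of how $"sometext_{five}_{six}" is lowered.
public static class InterpolationLowering
{
    // Before .Net 6: one string.Format call, which boxes both ints.
    public static string OldStyle(int five, int six)
    {
        return string.Format("sometext_{0}_{1}", five, six);
    }

    // .Net 6 and later (approximation): the compiler drives a handler struct.
    public static string NewStyle(int five, int six)
    {
        // literalLength: "sometext_" (9 chars) + "_" (1 char) = 10
        // formattedCount: the two holes, {five} and {six}
        var handler = new DefaultInterpolatedStringHandler(literalLength: 10, formattedCount: 2);
        handler.AppendLiteral("sometext_");
        handler.AppendFormatted(five);   // generic call, no boxing
        handler.AppendLiteral("_");
        handler.AppendFormatted(six);
        return handler.ToStringAndClear();
    }
}
```

Both methods produce the same text; the difference is in how many heap objects are created along the way.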

Another myth I think I have to address is the belief that using string.Format() on structs always leads to first boxing the struct and then creating an intermediate string by calling .ToString() on the boxed struct. That is false. For years now, all primitive types have implemented ISpanFormattable, which allows string.Format() to skip creating an intermediate string and write the string representation of the object directly into the internal buffer. ISpanFormattable went public with the release of .Net 6, so you can implement it for your own types too (more on that at the end of this answer).
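For a feel of what implementing it looks like, here is a minimal sketch assuming .Net 6 or later. The `Celsius` type is hypothetical and only exists for illustration:

```csharp
#nullable enable
using System;

// Hypothetical value type that formats itself as "<number>C".
public readonly struct Celsius : ISpanFormattable
{
    public double Degrees { get; }

    public Celsius(double degrees) => Degrees = degrees;

    // Writes the text straight into the caller's buffer, no intermediate string.
    public bool TryFormat(Span<char> destination, out int charsWritten,
                          ReadOnlySpan<char> format, IFormatProvider? provider)
    {
        charsWritten = 0;

        // Let double write its own digits into the buffer first.
        if (!Degrees.TryFormat(destination, out int written, format, provider))
            return false; // buffer too small, the caller grows it and retries

        // Then append the unit suffix if there is still room.
        if (destination.Length - written < 1)
            return false;

        destination[written] = 'C';
        charsWritten = written + 1;
        return true;
    }

    // IFormattable fallback for callers that do need a string.
    public string ToString(string? format, IFormatProvider? formatProvider)
        => Degrees.ToString(format, formatProvider) + "C";

    public override string ToString() => ToString(null, null);
}
```

With this in place, something like `$"{new Celsius(21)}"` goes through `TryFormat` on .Net 6 and later instead of first creating a separate string for the value.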

About memory characteristics of each approach, ordered from worst to best:

  • string.Concat() (the overloads accepting objects, not strings) is the worst because it will always box structs and create intermediate strings (source: decompilation using ILSpy)
  • + and string.Concat() (the overloads accepting strings, not objects) are slightly better than the previous, because while they do use intermediate strings, they don't box structs
  • string.Format() is generally better than the previous two because, as mentioned earlier, it does need to box structs but does not make an intermediate string if the structs implement ISpanFormattable (which was internal to .Net until not too long ago, but the performance benefit was there nevertheless). Furthermore, string.Format() is much more likely to avoid allocating an object[] than the previous methods
  • Interpolated strings are the best because with the release of .Net 6, they don't box structs, and they don't create intermediate strings for types implementing ISpanFormattable. The only allocation you will generally get with them is just the returned string and nothing else.

To support the claims above, I'm adding a benchmark class and benchmark results below, making sure to avoid the situation in the original post where + performs best only because strings are cached for small ints:

[MemoryDiagnoser]
[RankColumn]
public class ProfilerBench
{
    private float pi = MathF.PI;
    private double e = Math.E;
    private int largeInt = 116521345;

    [Benchmark(Baseline = true)]
    public string StringPlus()
    {
        return "sometext_" + pi + "_" + e + "_" + largeInt + "...";
    }

    [Benchmark]
    public string StringConcatStrings()
    {
        // the string[] overload
        // the exact same as StringPlus()
        return string.Concat("sometext_", pi.ToString(), "_", e.ToString(), "_", largeInt.ToString(), "...");
    }

    [Benchmark]
    public string StringConcatObjects()
    {
        // the params object[] overload
        return string.Concat("sometext_", pi, "_", e, "_", largeInt, "...");
    }

    [Benchmark]
    public string StringFormat()
    {
        // the (format, object, object, object) overload
        // note that the methods above had to allocate an array unlike string.Format()
        return string.Format("sometext_{0}_{1}_{2}...", pi, e, largeInt);
    }

    [Benchmark]
    public string InterpolatedString()
    {
        return $"sometext_{pi}_{e}_{largeInt}...";
    }
}

Results are ordered by bytes allocated:

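|              Method |     Mean |   Error |  StdDev | Rank |  Gen 0 | Allocated |
|-------------------- |---------:|--------:|--------:|-----:|-------:|----------:|
| StringConcatObjects | 293.9 ns | 1.66 ns | 1.47 ns |    4 | 0.0386 |     488 B |
|          StringPlus | 266.8 ns | 2.04 ns | 1.91 ns |    2 | 0.0267 |     336 B |
| StringConcatStrings | 278.7 ns | 2.14 ns | 1.78 ns |    3 | 0.0267 |     336 B |
|        StringFormat | 275.7 ns | 1.46 ns | 1.36 ns |    3 | 0.0153 |     192 B |
|  InterpolatedString | 249.0 ns | 1.44 ns | 1.35 ns |    1 | 0.0095 |     120 B |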

If I edit the benchmark class to use more than three format arguments, then the difference between InterpolatedString and string.Format() will be even greater because of the array allocation:

[MemoryDiagnoser]
[RankColumn]
public class ProfilerBench
{
    private float pi = MathF.PI;
    private double e = Math.E;
    private int largeInt = 116521345;
    private float anotherNumber = 0.123456789f;

    [Benchmark]
    public string StringPlus()
    {
        return "sometext_" + pi + "_" + e + "_" + largeInt + "..." + anotherNumber;
    }

    [Benchmark]
    public string StringConcatStrings()
    {
        // the string[] overload
        // the exact same as StringPlus()
        return string.Concat("sometext_", pi.ToString(), "_", e.ToString(), "_", largeInt.ToString(), "...", anotherNumber.ToString());
    }

    [Benchmark]
    public string StringConcatObjects()
    {
        // the params object[] overload
        return string.Concat("sometext_", pi, "_", e, "_", largeInt, "...", anotherNumber);
    }

    [Benchmark]
    public string StringFormat()
    {
        // the (format, object[]) overload
        return string.Format("sometext_{0}_{1}_{2}...{3}", pi, e, largeInt, anotherNumber);
    }

    [Benchmark]
    public string InterpolatedString()
    {
        return $"sometext_{pi}_{e}_{largeInt}...{anotherNumber}";
    }
}

Benchmark results, again ordered by bytes allocated:

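|              Method |     Mean |   Error |  StdDev | Rank |  Gen 0 | Allocated |
|-------------------- |---------:|--------:|--------:|-----:|-------:|----------:|
| StringConcatObjects | 389.3 ns | 2.65 ns | 2.34 ns |    4 | 0.0477 |     600 B |
|          StringPlus | 350.7 ns | 1.88 ns | 1.67 ns |    2 | 0.0329 |     416 B |
| StringConcatStrings | 374.4 ns | 6.90 ns | 6.46 ns |    3 | 0.0329 |     416 B |
|        StringFormat | 390.4 ns | 2.01 ns | 1.88 ns |    4 | 0.0234 |     296 B |
|  InterpolatedString | 332.6 ns | 2.82 ns | 2.35 ns |    1 | 0.0114 |     144 B |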

EDIT: People might still think calling .ToString() on interpolated string handler arguments is a good idea. It is not: performance will suffer if you do it, and Visual Studio even kind of warns you not to. This does not apply only to .Net 6. Below you can see that even with string.Format(), which interpolated strings used to be syntactic sugar for, it is still bad to call .ToString():

[MemoryDiagnoser]
[RankColumn]
public class ProfilerBench
{
    private float pi = MathF.PI;
    private double e = Math.E;
    private int largeInt = 116521345;
    private float anotherNumber = 0.123456789f;

    [Benchmark]
    public string StringFormatGood()
    {
        // the (format, object[]) overload with boxing structs
        return string.Format("sometext_{0}_{1}_{2}...{3}", pi, e, largeInt, anotherNumber);
    }

    [Benchmark]
    public string StringFormatBad()
    {
        // the (format, object[]) overload with pre-converting the structs to strings
        return string.Format("sometext_{0}_{1}_{2}...{3}", 
            pi.ToString(), 
            e.ToString(), 
            largeInt.ToString(), 
            anotherNumber.ToString());
    }
}
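
|           Method |     Mean |   Error |  StdDev | Rank |  Gen 0 | Allocated |
|----------------- |---------:|--------:|--------:|-----:|-------:|----------:|
| StringFormatGood | 389.0 ns | 2.27 ns | 2.12 ns |    1 | 0.0234 |     296 B |
|  StringFormatBad | 442.0 ns | 4.62 ns | 4.09 ns |    2 | 0.0305 |     384 B |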

The explanation for the results is that it is cheaper to box the struct and have string.Format() write the string representation directly into its char buffer than to create an intermediate string explicitly and force string.Format() to copy from it.

If you want to read more about how interpolated string handlers work and how to make your own types implement ISpanFormattable, this is a good read: link

Petrusion

I believe the problem here is just the boxing of the ints. I tried to eliminate the boxing and got the same performance as for concatenation:

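|                        Method |      Mean |    Error |   StdDev |  Gen 0 | Allocated |
|------------------------------ |----------:|---------:|---------:|-------:|----------:|
|              DebugPrintConcat |  41.49 ns | 0.198 ns | 0.185 ns | 0.0046 |      48 B |
|        DebugPrintInterpolated | 103.07 ns | 0.257 ns | 0.227 ns | 0.0092 |      96 B |
| DebugPrintInterpolatedStrings |  41.36 ns | 0.211 ns | 0.198 ns | 0.0046 |      48 B |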

DebugPrintInterpolatedStrings code: I just added explicit ToString

    [Benchmark]
    public void DebugPrintInterpolatedStrings()
    {
        profiler.PrineDebug($"sometext_{five.ToString()}_{six.ToString()}");
    }

We can also note the reduced allocations (precisely because there are no additional boxed objects).

PS. By the way, @GSerg already linked a post with the same explanation in a comment.

Serg
  • 1
    Doing things like `five.ToString()` in string interpolation is a bad idea. Right now it worked well for you only because .Net caches strings to return for small ints. In most cases it is better to just let the boxing happen because internally .Net **won't** create new strings but use `TryFormat()` from `ISpanFormattable` to append to the string being created directly without intermediate string objects. – Petrusion Jul 10 '22 at 11:13
  • @Petrusion, can you please elaborate on your note about the cache? I reran the benchmark with `five` and `six` made random (so I presumably broke any caching, if there is any) and I again get the result that the version with explicit `ToString` is better than the one without, and almost the same as the `Concat` one. – Serg Jul 10 '22 at 13:18
  • 1
    You are using old .net I think. See my answer for more details. Note: I have just rechecked again that with the current .net, interpolated strings are faster and allocate less memory without `.ToString()` – Petrusion Jul 10 '22 at 13:27
  • 1
    addendum: even without the interpolated string performance enhancements of .net 6, `five.ToString()` is still a bad idea. I am going to add why to my answer. – Petrusion Jul 10 '22 at 13:50
  • Also, did you perhaps constrain the random numbers to be small? Creating random numbers is useless in this case if they are going to be small anyway. – Petrusion Jul 10 '22 at 14:05
  • I tried `rnd.Next()` and `rnd.Next(int.MaxValue/2, int.MaxValue)` without noticeable changes in the results. The tests were run on net5. After your comment I reran the tests on net6 and saw a performance boost in all tests. `DebugPrintInterpolated` is almost twice as fast, but still not as fast as the other versions, although the difference is not as large (46ms vs 59ms) as for net5 (64ms vs 106ms). – Serg Jul 10 '22 at 14:42
  • 2
    I just did the same benchmark you described. You are apparently just looking at the differences between speed, you need to look at allocations, those are more important than some 13ms speed difference, especially because those differences can change in the future as the .net team optimizes the string handler ref structs. The whole point of my argument is what allocates less heap memory, not meager speed differences. Also, see the last benchmark of my answer, the speed difference can be in favour of not doing `.ToString()`, along with the memory allocation – Petrusion Jul 10 '22 at 15:20