
Edited for the release of .NET Core 2.1

Repeating the test on .NET Core 2.1, I get results like this:

1000000 iterations of "Concat" took 842ms.

1000000 iterations of "new String" took 1009ms.

1000000 iterations of "sb" took 902ms.

In short, if you are using .NET Core 2.1 or later, Concat is king.


I've edited the question to incorporate the valid points raised in the comments.


I was musing on my answer to a previous question and started to wonder whether this,

return new string(charSequence.ToArray());

is the best way to convert an IEnumerable<char> to a string. I did a little searching and found that the question had already been asked here. That answer asserts that,

string.Concat(charSequence)

is a better choice. Following an answer to this question, a StringBuilder enumeration approach was also suggested,

var sb = new StringBuilder();
foreach (var c in chars)
{
    sb.Append(c);
}

return sb.ToString();

while this may be a little unwieldy, I include it for completeness. I decided to run a small test; the code used is at the bottom.

When built in release mode, with optimizations, and run from the command line without the debugger attached, I get results like this:

1000000 iterations of "Concat" took 1597ms.

1000000 iterations of "new String" took 869ms.

1000000 iterations of "sb" took 748ms.

By my reckoning, new string(...ToArray()) is close to twice as fast as the string.Concat method. The StringBuilder approach is marginally faster still; it is awkward to use, but it could be wrapped in an extension method.
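
As a minimal sketch of that extension idea (the class name `CharSequenceExtensions` and the method name `BuildString` are hypothetical, not part of the framework), the StringBuilder loop could be wrapped like this:

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class CharSequenceExtensions
{
    // Hypothetical helper: wraps the StringBuilder loop from the question
    // so it reads as a single call at the use site.
    public static string BuildString(this IEnumerable<char> chars)
    {
        if (chars == null) throw new ArgumentNullException(nameof(chars));

        var sb = new StringBuilder();
        foreach (var c in chars)
        {
            sb.Append(c);
        }

        return sb.ToString();
    }
}
```

With this in scope, the call site becomes `charSequence.BuildString()`, which is about as short as `string.Concat(charSequence)`. Whether it is actually faster depends on the runtime, as the timings in this question show.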

Should I stick with new string(...ToArray()), or is there something I'm missing?

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Text;

class Program
{
    private static void Main()
    {
        const int iterations = 1000000;
        const string testData = "Some reasonably small test data";

        TestFunc(
            chars => new string(chars.ToArray()),
            TrueEnumerable(testData),
            10,
            "new String");

        TestFunc(
            string.Concat,
            TrueEnumerable(testData),
            10,
            "Concat");

        TestFunc(
            chars =>
            {
                var sb = new StringBuilder();
                foreach (var c in chars)
                {
                    sb.Append(c);
                }

                return sb.ToString();
            },
            TrueEnumerable(testData),
            10,
            "sb");

        Console.WriteLine("----------------------------------------");

        TestFunc(
            string.Concat,
            TrueEnumerable(testData),
            iterations,
            "Concat");

        TestFunc(
            chars => new string(chars.ToArray()),
            TrueEnumerable(testData),
            iterations,
            "new String");

        TestFunc(
            chars =>
            {
                var sb = new StringBuilder();
                foreach (var c in chars)
                {
                    sb.Append(c);
                }

                return sb.ToString();
            },
            TrueEnumerable(testData),
            iterations,
            "sb");

        Console.ReadKey();
    }

    private static TResult TestFunc<TData, TResult>(
            Func<TData, TResult> func,
            TData testData,
            int iterations,
            string stage)
    {
        var dummyResult = default(TResult);

        var stopwatch = Stopwatch.StartNew();
        for (var i = 0; i < iterations; i++)
        {
            dummyResult = func(testData);
        }

        stopwatch.Stop();
        Console.WriteLine(
            "{0} iterations of \"{2}\" took {1}ms.",
            iterations,
            stopwatch.ElapsedMilliseconds,
            stage);

        return dummyResult;
    }

    private static IEnumerable<T> TrueEnumerable<T>(IEnumerable<T> sequence)
    {
        foreach (var t in sequence)
        {
            yield return t;
        }
    }
}
Jodrell
  • 1.) If this was done in Debug mode, the results are inaccurate and have to be tossed. 2.) This sounds like [premature optimization](http://en.wikipedia.org/wiki/Program_optimization). If it hasn't caused a performance hit (and I'm sure it hasn't), what does testing the performance solve? – Dave Zych Feb 08 '13 at 18:24
  • Side note: consider testing on something that is a true `IEnumerable` (e.g. `Enumerable.Repeat('d', 100)`) to avoid potential shortcuts in constructors/conversion methods. – Alexei Levenkov Feb 08 '13 at 18:26
  • To add to @DaveZych's comment, you need to test in release mode *without the debugger attached*: Ctrl+F5 in Visual Studio. – Jim Mischel Feb 08 '13 at 20:15
  • @JimMischel, you make a good point. I've tested in release mode without the debugger attached to get the results I state. – Jodrell Feb 11 '13 at 09:25
  • @DaveZych, 1) see my previous response above. 2) The purpose of the test is to answer the question. Given a number of choices that are all easy to use, is speed or performance a bad differentiator? – Jodrell Feb 11 '13 at 09:30
  • You have a fourth option: `string.Join("", charSequence)`. I expect it to be close to constructor/string builder performance. – nawfal Dec 16 '13 at 01:00

2 Answers


It's worth noting that these results, whilst true for the case of a pure IEnumerable<char>, do not always hold. For example, if you actually have a char array, then even if it is passed to you as an IEnumerable<char>, it is faster to call the string constructor.

The results:

Sending String as IEnumerable<char> 
10000 iterations of "new string" took 157ms. 
10000 iterations of "sb inline" took 150ms. 
10000 iterations of "string.Concat" took 237ms.
======================================== 
Sending char[] as IEnumerable<char> 
10000 iterations of "new string" took 10ms.
10000 iterations of "sb inline" took 168ms.
10000 iterations of "string.Concat" took 273ms.

The Code:

static void Main(string[] args)
{
    TestCreation(10000, 1000);
    Console.ReadLine();
}

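// Note: GetChars is not shown in this answer (any method that yields `length` chars will do),
// and TestFunc is the helper from the question.
private static void TestCreation(int iterations, int length)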
{
    char[] chars = GetChars(length).ToArray();
    string str = new string(chars);
    Console.WriteLine("Sending String as IEnumerable<char>");
    TestCreateMethod(str, iterations);
    Console.WriteLine("===========================================================");
    Console.WriteLine("Sending char[] as IEnumerable<char>");
    TestCreateMethod(chars, iterations);
    Console.ReadKey();
}

private static void TestCreateMethod(IEnumerable<char> testData, int iterations)
{
    TestFunc(chars => new string(chars.ToArray()), testData, iterations, "new string");
    TestFunc(chars =>
    {
        var sb = new StringBuilder();
        foreach (var c in chars)
        {
            sb.Append(c);
        }
        return sb.ToString();
    }, testData, iterations, "sb inline");
    TestFunc(string.Concat, testData, iterations, "string.Concat");
}
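
One way to act on this observation, as a minimal sketch only: check the runtime type of the sequence before choosing an approach. The class name `CharSourceExtensions` and the method name `ToStringFast` are hypothetical, and the fallback used for a genuinely lazy sequence is a placeholder for whichever approach measures best on your runtime.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CharSourceExtensions
{
    // Hypothetical helper, for illustration only.
    public static string ToStringFast(this IEnumerable<char> chars)
    {
        if (chars == null) throw new ArgumentNullException(nameof(chars));

        // Already a string: there is nothing to build.
        if (chars is string s) return s;

        // Already an array: hand it straight to the string constructor.
        if (chars is char[] array) return new string(array);

        // A lazy sequence: fall back to a general-purpose approach.
        // (Placeholder choice; swap in string.Concat or a StringBuilder
        // loop if that measures better on your runtime.)
        return new string(chars.ToArray());
    }
}
```

The type checks only pay off when callers really do pass arrays or strings typed as IEnumerable<char>; for a true iterator the method behaves like the fallback it wraps.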
Baguazhang
  • Your results are similar to my results. Apart from `string.Concat` being marginally easier to type, I see no reason why I should use it in place of `new string(...ToArray())`. – Jodrell Feb 12 '13 at 09:38
  • I guess the key is for people to understand the options. If you genuinely have an IEnumerable and performance is that much of a concern, then I think I would go to the effort of writing an extension method that uses a string builder. But if you know you have an array, then clearly the best option is to use the string constructor. – Baguazhang Feb 12 '13 at 10:02

Well, I just wrote up a little test, trying 3 different ways of creating a string from an IEnumerable:

  1. using StringBuilder and repeated invocations of its Append(char ch) method.
  2. using string.Concat<T>
  3. using the String constructor.

Over 10,000 iterations of generating a random 1,000-character sequence and building a string from it, I see the following timings in a release build:

  • Style=StringBuilder elapsed time is 00:01:05.9687330 minutes.
  • Style=StringConcatFunction elapsed time is 00:02:33.2672485 minutes.
  • Style=StringConstructor elapsed time is 00:04:00.5559091 minutes.

StringBuilder is the clear winner. I'm using a static StringBuilder (singleton) instance, though. Dunno if that makes much difference.

Here's the source code:

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

namespace ConsoleApplication6
{
  class Program
  {

    static readonly RandomNumberGenerator Random = RandomNumberGenerator.Create() ;

    static readonly byte[] buffer = {0,0} ;

    static char RandomChar()
    {
      ushort codepoint ;
      do
      {
        Random.GetBytes(buffer) ;
        codepoint = BitConverter.ToChar(buffer,0) ;
        codepoint &= 0x007F ; // restrict to the 7-bit range (C0 Controls and Basic Latin block)
      } while ( codepoint < 0x0020 ) ;
      return (char) codepoint ;
    }

    static IEnumerable<char> GetRandomChars( int count )
    {
      if ( count < 0 ) throw new ArgumentOutOfRangeException("count") ;

      while ( count-- > 0 ) // was >= 0, which yielded count + 1 characters
      {
        yield return RandomChar() ;
      }
    }

    enum Style
    {
      StringBuilder = 1 ,
      StringConcatFunction = 2 ,
      StringConstructor = 3 ,
    }

    static readonly StringBuilder sb = new StringBuilder() ;
    static string MakeString( Style style )
    {
      IEnumerable<char> chars = GetRandomChars(1000) ;
      string instance ;
      switch ( style )
      {
      case Style.StringConcatFunction :
        instance = String.Concat<char>( chars ) ;
        break ;
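      case Style.StringBuilder :
        // NOTE: sb is shared and never cleared, so each call returns a string
        // that also contains everything appended by earlier calls.
        foreach ( char ch in chars )
        {
          sb.Append(ch) ;
        }
        instance = sb.ToString() ;
        break ;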
      case Style.StringConstructor :
        instance = new String( chars.ToArray() ) ;
        break ;
      default :
        throw new InvalidOperationException() ;
      }
      return instance ;
    }

    static void Main( string[] args )
    {
      Stopwatch stopwatch = new Stopwatch() ;

      foreach ( Style style in Enum.GetValues(typeof(Style)) )
      {
        stopwatch.Reset() ;
        stopwatch.Start() ;
        for ( int i = 0 ; i < 10000 ; ++i )
        {
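          // NOTE: this always passes Style.StringBuilder, so all three passes time the
          // same path; it should be MakeString( style ), as the comments below point out.
          MakeString( Style.StringBuilder ) ;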
        }
        stopwatch.Stop() ;
        Console.WriteLine( "Style={0}, elapsed time is {1}" ,
          style ,
          stopwatch.Elapsed
          ) ;
      }
      return ;
    }
  }
}
Nicholas Carey
  • When the `StringBuilder` is instantiated it needs to allocate memory. This time should be accounted for in the string builder approach. It also looks like you are throwing a random number of expensive exceptions in the string generation. The test could/should be conducted on the same string. While the length could be a significant factor, I don't think randomness of content is relevant here. – Jodrell Feb 11 '13 at 10:03
  • The question still stands, though: *Should [the OP] stick with new string(...ToArray()) or, is there something [he's] missing?* – default Feb 11 '13 at 11:10
  • This code has **huge bugs**. The call to `MakeString(Style.StringBuilder);` should be `MakeString(style);`, otherwise you're just comparing the `StringBuilder` method to itself! You got different timings because each `foreach` iteration gets slower as your shared `StringBuilder` grows ever larger; if you create a new instance for each value of `style` then (with the other bug) the tests yield similar timings. After fixing the `MakeString()` call with Visual Studio 2017 v15.7.5/.NET v4.7.1 I get `StringBuilder=0:59.41`, `StringConcatFunction=6:13.75`, and `StringConstructor=7:56.22`. (cont.) – Lance U. Matthews Jul 29 '18 at 23:44
  • (cont.) Also, it's not fair to use a shared `StringBuilder` because `instance = sb.ToString();` produces an incorrect (cumulative) `string` for every call but the first. To fix this, on each call `MakeString()` should either call `sb.Clear()` (for single-threaded code) or create its own local `StringBuilder` (for multi-threaded code), in which case I get `StringBuilderSingleInstanceCleared=2:44.31` and `StringBuilderInstancePerCall=4:27.73`. Ultimately, your conclusion was still correct, though the `StringBuilder` method is more like 1.4x-2.9x, not 2.4x-3.7x, as fast as other methods. – Lance U. Matthews Jul 29 '18 at 23:44
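
The single-instance fix described in the last comment can be sketched as follows. This is an illustrative reduction rather than the answer's full harness: the class name `SharedBuilder` is hypothetical, and the shared instance is only safe in single-threaded code.

```csharp
using System.Collections.Generic;
using System.Text;

public static class SharedBuilder
{
    // One shared instance, as in the answer; not safe across threads.
    private static readonly StringBuilder Sb = new StringBuilder();

    public static string MakeString(IEnumerable<char> chars)
    {
        // Discard whatever the previous call appended, so the result
        // contains only the characters of this sequence.
        Sb.Clear();

        foreach (char ch in chars)
        {
            Sb.Append(ch);
        }

        return Sb.ToString();
    }
}
```

Without the `Clear()` call, a second invocation would return the first result followed by the second, and each pass would get slower as the buffer grew.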