I heard about the new Span and wrote a small sample application to benchmark the execution times.

using System;
using System.Diagnostics;

public class Program
{
    private static int _times = 20;

    public static void Main(string[] args)
    {
        var stringResponse = "0123456789";
        var byteResponse = new byte[] {0, 1, 2, 3, 4, 5, 6, 7, 8, 9};

        // Benchmark with byte[] and Array.Copy()
        var stopWatch = new Stopwatch();
        stopWatch.Start();
        ParseByteArray(byteResponse);
        stopWatch.Stop();
        var byteArrayTime = stopWatch.ElapsedMilliseconds;

        // Benchmark with Span<byte> and Slice() (labelled Span<byte[]> in the output)
        stopWatch.Restart();
        ParseWithSpanOfByteArray(byteResponse);
        stopWatch.Stop();
        var spanOfByteArrayTime = stopWatch.ElapsedMilliseconds;

        // Benchmark with string and string.Substring()
        stopWatch.Restart();
        ParseWithString(stringResponse);
        stopWatch.Stop();
        var stringTime = stopWatch.ElapsedMilliseconds;

        // Benchmark with ReadOnlySpan<char> via string.AsSpan() (labelled Span<string> in the output)
        stopWatch.Restart();
        ParseWithSpanOfString(stringResponse);
        stopWatch.Stop();
        var spanStringTime = stopWatch.ElapsedMilliseconds;


        Console.ForegroundColor = ConsoleColor.Red;
        Console.WriteLine($"With byte[] and Array.Copy: {byteArrayTime} ms.");
        Console.ForegroundColor = ConsoleColor.Green;
        Console.WriteLine($"With Span<byte[]> and Slice: {spanOfByteArrayTime} ms.");

        Console.ForegroundColor = ConsoleColor.Red;
        Console.WriteLine($"With string and string.SubString: {stringTime} ms.");
        Console.ForegroundColor = ConsoleColor.Green;
        Console.WriteLine($"With Span<string>: {spanStringTime} ms.");

        Console.ReadKey();
    }

    private static void ParseWithSpanOfString(string response)
    {
        var result = 0;
        var span = response.AsSpan();

        for (var i = 0; i < _times; i++)
        {
            var firstHalf = span.Slice(0, 5);
            var secondHalf = span.Slice(5, 5);

            result += firstHalf[0];
            result += secondHalf[0];
        }

        Console.WriteLine(result);
    }

    private static void ParseWithString(string response)
    {
        var result = 0;
        for (var i = 0; i < _times; i++)
        {
            var firstHalf = response.Substring(0, 5);
            var secondHalf = response.Substring(5, 5);

            result += firstHalf[0];
            result += secondHalf[0];
        }

        Console.WriteLine(result);
    }

    private static void ParseByteArray(byte[] response)
    {
        var result = 0;
        for (var i = 0; i < _times; i++)
        {
            var firstHalf = new byte[5];
            var secondHalf = new byte[5];
            Array.Copy(response, firstHalf, 5);
            Array.Copy(response, 5, secondHalf, 0, 5);

            result += firstHalf[0];
            result += secondHalf[0];
        }

        Console.WriteLine(result);
    }

    private static void ParseWithSpanOfByteArray(Span<byte> response)
    {
        var result = 0;

        for (var i = 0; i < _times; i++)
        {
            var firstHalf = response.Slice(0, 5);
            var secondHalf = response.Slice(5, 5);

            result += firstHalf[0];
            result += secondHalf[0];
        }

        Console.WriteLine(result);
    }
}

What I found is that the promised performance boost of Span only shows up at around _times = 10.000.000. I am a little disappointed.

Result for _times = 1000:

With byte[] and Array.Copy: 3 ms.
With Span<byte[]> and Slice: 22 ms.
With string and string.SubString: 0 ms.
With Span<string>: 3 ms.

Result for _times = 10.000:

With byte[] and Array.Copy: 4 ms.
With Span<byte[]> and Slice: 23 ms.
With string and string.SubString: 1 ms.
With Span<string>: 3 ms.

Result for _times = 100.000:

With byte[] and Array.Copy: 19 ms.
With Span<byte[]> and Slice: 26 ms.
With string and string.SubString: 5 ms.
With Span<string>: 6 ms.

Result for _times = 1.000.000:

With byte[] and Array.Copy: 166 ms.
With Span<byte[]> and Slice: 121 ms.
With string and string.SubString: 72 ms.
With Span<string>: 30 ms.

Am I doing something wrong, or should I still prefer byte[] and string manipulation when only a few operations are involved?

[EDIT] As mentioned in the comments, I had to build my little application in Release mode. The results are now in favor of Span, even with _times = 1:

With byte[] and Array.Copy: 3 ms.
With Span<byte[]> and Slice: 1 ms.
With string and string.SubString: 0 ms.
With Span<string>: 2 ms.

Only the Span<string> variant seems to be a bit slower in my case...

[Edit] After a lot of comments I learned that my benchmark made no sense at all and that I would need a little more time to get correct results.

I have now used BenchmarkDotNet to get my results:

[Image: BenchmarkDotNet results]
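For readers who want to reproduce this kind of measurement, here is a minimal sketch of what a BenchmarkDotNet setup for the four variants might look like. It assumes the BenchmarkDotNet NuGet package is referenced; the names `SliceBenchmarks` and `BenchmarkProgram` are hypothetical and this is not necessarily the code behind the results above.

```csharp
using System;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

// Hypothetical benchmark class; names are illustrative, not the original code.
[MemoryDiagnoser] // also reports allocated bytes and GC counts per operation
public class SliceBenchmarks
{
    private readonly string _text = "0123456789";
    private readonly byte[] _bytes = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 };

    [Benchmark(Baseline = true)]
    public int StringSubstring()
    {
        // Each Substring call allocates a new string.
        var firstHalf = _text.Substring(0, 5);
        var secondHalf = _text.Substring(5, 5);
        return firstHalf[0] + secondHalf[0];
    }

    [Benchmark]
    public int StringAsSpan()
    {
        // AsSpan and Slice only create views over the existing string.
        ReadOnlySpan<char> span = _text.AsSpan();
        return span.Slice(0, 5)[0] + span.Slice(5, 5)[0];
    }

    [Benchmark]
    public int ByteArrayCopy()
    {
        // Two new arrays are allocated and filled on every call.
        var firstHalf = new byte[5];
        var secondHalf = new byte[5];
        Array.Copy(_bytes, firstHalf, 5);
        Array.Copy(_bytes, 5, secondHalf, 0, 5);
        return firstHalf[0] + secondHalf[0];
    }

    [Benchmark]
    public int ByteSpanSlice()
    {
        // The span wraps the existing array; slicing does not copy.
        Span<byte> span = _bytes;
        return span.Slice(0, 5)[0] + span.Slice(5, 5)[0];
    }
}

public static class BenchmarkProgram
{
    public static void Main(string[] args)
    {
        // BenchmarkDotNet handles warm-up, iteration counts and statistics.
        BenchmarkRunner.Run<SliceBenchmarks>();
    }
}
```

Returning the computed value from each method keeps the work from being optimized away, and the project has to be run in Release mode (for example with `dotnet run -c Release`) for the numbers to mean anything.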

Thanks for all the input!

selmaohneh
  • When you profiled this, what was the memory consumption of the two approaches? Number of GCs that occurred? Duration of those GCs? – mjwills Nov 05 '18 at 06:48
  • Good point. How can I get that? Or how do I make sure that all examples run with the same memory? – selmaohneh Nov 05 '18 at 06:50
  • Span starts to look good at the `ten million mark` for me – jazb Nov 05 '18 at 06:52
  • Use a benchmarking framework or even better, test it with a real world application. These console app tests usually yield wrong results, because there are many things under the hood that impact performance. – FCin Nov 05 '18 at 06:58
  • "*that the promised performance boost*" Who promised said blanket performance boost? Also, these are unrealistic tests; you could get better performance just using array indexes or pointers, and span is never going to beat those for simple array types. Which raises the old adage: the right tool for the right job and all that. – TheGeneral Nov 05 '18 at 06:58
  • Benchmarking debug builds is generally useless in languages like C++ and C# where lots of inlining and optimizing of generic libraries is essential for good performance. Even in C it's basically useless ([C loop optimization help for final assignment](/a/32001196)). It's not a uniform speedup; the speed cost of a debug build depends greatly on minor details like whether you do a lot of stuff in one expression or whether you use tmp vars which don't matter at all in a normal optimized build. – Peter Cordes Nov 05 '18 at 07:05
  • Possible duplicate of [Performance differences between debug and release builds](https://stackoverflow.com/questions/4043821/performance-differences-between-debug-and-release-builds) (i.e. never benchmark on `debug` - it is pointless) – mjwills Nov 05 '18 at 07:13
  • According to https://msdn.microsoft.com/en-us/magazine/mt814808.aspx, `Span` sounds like it should basically optimize away completely; it's just an interface for doing what C can do where a pointer+length *is* an array, regardless of being in the middle of another array or not. But presumably with a debug build, Span doesn't optimize away. – Peter Cordes Nov 05 '18 at 07:14
  • One thing that occurred to me is that your test data set is way too small. Anything that can comfortably fit in L2 cache (including whatever copy you make) will not yield a noticeable performance boost. – dumetrulo Nov 05 '18 at 09:24
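
Several of the comments above ask about warm-up, allocations and GC counts. As a rough illustration only, the following sketch shows one way to collect those numbers with the standard library alone. `MeasureSketch` and `Measure` are hypothetical names, and `GC.GetAllocatedBytesForCurrentThread` is assumed to be available (it is on recent .NET runtimes). A benchmarking framework remains the more reliable option.

```csharp
using System;
using System.Diagnostics;

public static class MeasureSketch
{
    // Runs the action once as warm-up (so JIT compilation is not timed),
    // then times `iterations` calls and records GC activity around them.
    public static (TimeSpan Elapsed, int Gen0Collections, long AllocatedBytes) Measure(
        Action action, int iterations)
    {
        action(); // warm-up call, not measured

        // Start from a clean heap so earlier garbage is not attributed to this run.
        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();

        var stopwatch = new Stopwatch(); // created before the allocation snapshot
        int gen0Before = GC.CollectionCount(0);
        long bytesBefore = GC.GetAllocatedBytesForCurrentThread();

        stopwatch.Start();
        for (int i = 0; i < iterations; i++)
        {
            action();
        }
        stopwatch.Stop();

        long bytesAfter = GC.GetAllocatedBytesForCurrentThread();
        int gen0After = GC.CollectionCount(0);

        return (stopwatch.Elapsed, gen0After - gen0Before, bytesAfter - bytesBefore);
    }

    public static void Main()
    {
        const string text = "0123456789";
        const int iterations = 1_000_000;
        int sink = 0; // accumulates results so the work cannot be discarded

        var substring = Measure(
            () => sink += text.Substring(0, 5)[0] + text.Substring(5, 5)[0], iterations);
        var span = Measure(
            () => sink += text.AsSpan(0, 5)[0] + text.AsSpan(5, 5)[0], iterations);

        Console.WriteLine($"Substring: {substring.Elapsed.TotalMilliseconds} ms, " +
                          $"{substring.Gen0Collections} gen0 GCs, {substring.AllocatedBytes} bytes");
        Console.WriteLine($"AsSpan:    {span.Elapsed.TotalMilliseconds} ms, " +
                          $"{span.Gen0Collections} gen0 GCs, {span.AllocatedBytes} bytes");
        Console.WriteLine(sink);
    }
}
```

Unlike the original test, nothing is written to the console inside the timed region, and each variant gets a warm-up call before the stopwatch starts.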
