4

Question Background

I read this question that is about how to reverse a string as fast as possible. I found that one of the answers was comparing different methods. In one of them, they just run a loop swapping elements from position i with the one at position string.Length-1-i but they use the known tricky swap via XOR. I was wondering how faster is reversing the string using the swap via XOR in comparison with the same method using the classic swap via a temporal variable. Surprisingly I'm getting almost a 50% improvement over the XOR one.

The Question

Is the compiler doing something magic behind the scenes, why I'm I getting this result?

The modified code with the Benchmarks

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

namespace ContestLibrary
{
    public class Problem
    {
        delegate string StringDelegate(string s);

        static void Benchmark(string description, StringDelegate d, int times, string text)
        {
            Stopwatch sw = new Stopwatch();
            sw.Start();
            for (int j = 0; j < times; j++)
            {
                d(text);
            }
            sw.Stop();
            Console.WriteLine("{0} Ticks {1} : called {2} times.", sw.ElapsedTicks, description, times);
        }

        public static string ReverseXor(string s)
        {
            char[] charArray = s.ToCharArray();
            int len = s.Length - 1;

            for (int i = 0; i < len; i++, len--)
            {
                charArray[i] ^= charArray[len];
                charArray[len] ^= charArray[i];
                charArray[i] ^= charArray[len];
            }

            return new string(charArray);
        }

        public static string ReverseClassic(string s)
        {
            char[] charArray = s.ToCharArray();
            int len = s.Length-1;

            for (int i = 0; i < len; i++, len--)
            {
                char temp = charArray[len];
                charArray[len] = charArray[i];
                charArray[i] = temp;
            }
            return new string(charArray);
         }

        public static string StringOfLength(int length)
        {
            Random random = new Random();
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < length; i++)
            {
                sb.Append(Convert.ToChar(Convert.ToInt32(Math.Floor(26 * random.NextDouble() + 65))));
            }
            return sb.ToString();
        }

        static void Main(string[] args)
        {

            int[] lengths = new int[] {1,10,100,1000, 10000, 100000};

            foreach (int l in lengths)
            {
                int iterations = 10000;
                string text = StringOfLength(l);
                Benchmark(String.Format("Classic (Length: {0})", l), ReverseClassic, iterations, text);
                Benchmark(String.Format("Xor (Length: {0})", l), ReverseXor, iterations, text);
                Console.WriteLine();    
            }
            Console.Read();
        }
    }
}
Raudel Ravelo
  • 648
  • 2
  • 6
  • 24
  • So xor is slower. Why does that surprise you? – Zohar Peled Jan 10 '18 at 05:25
  • People use the XOR trick because it's supposed to be faster, like when they use n>>1 to divide n by 2 instead of the classic n/=2. – Raudel Ravelo Jan 10 '18 at 05:33
  • 4
    @RaudelRavelo XOR swap is not an optimization, that myth was created by people who think that temporary variables are on the stack and inherently costly, while in reality `temp` will be held in some register. `n >> 1` really is an optimization (though not a very big one, since division by a constant, especially a PoT, does not correspond to runtime division). – harold Jan 10 '18 at 05:50
  • Good to know that, it's much more unclear than the simple one. Then this is not one of those scenarios when XOR is really an optimization. – Raudel Ravelo Jan 10 '18 at 05:58
  • Bear in mind [this answer](https://stackoverflow.com/a/15111719/15498) on the other question which points out that *most* of these implementations are woefully incomplete in a modern, international setting. – Damien_The_Unbeliever Jan 10 '18 at 07:22
  • 1
    I've updated my answer with some more details, following @Damien_The_Unbeliever's comment. I know it's not related directly to your question, but I think not enough people are aware of the problem of reversing strings when it comes to "strange" languages like Hebrew or French. – Zohar Peled Jan 10 '18 at 07:56
  • Deleted. glad to help :-) – Zohar Peled Jan 10 '18 at 08:00
  • @harold And even in the case when local variables actually end up in the stack (as when you have many variables) they're still very likely to end up in the CPU caches (specially L1d) anyway, which are not much slower than a register. – Alejandro Jan 24 '18 at 19:50

3 Answers3

4

The reason is very simple, lets check how many operations in IL code each function has. But first, lets see the real difference in time of both functions. You said that the XOR function is almost 50% slower than the other, when I run the code in Debug mode I have that results, but you must run the code in Release mode to fully allow the optimizer do its job :). In release mode the XOR functions is almost 3x slower.

The pictures have the IL code of the part inside the for loop, that is the only piece that changes.

The first picture is the function with the temp variable

Classic code with the temp variable

The second picture is the function with the XOR

XOR function

As you can see the difference in the number of instructions is huge, 14 against 34. The 3x difference in time come from some operations like conv.u2 that are a little expensive.

Rey Cruz
  • 384
  • 3
  • 14
2

I agree with Harold. XOR swap is not faster than using a temporary variable.

I think that the myth of XOR swap dates back to the days where allocating memory for new variables was a time consuming job.

Initially I thought there is a chance that this might have something to do with the type, and maybe using XOR swap in int arrays would give a better results than on char arrays, so I've built upon your benchmark one for int arrays just to text - turns out the same results (more or less) for int XOR swap is just slower than using a temporary variable.

Update:

As Damien_The_Unbeliever wrote in his comment to your question, The answer given by R. Martinho Fernandes in the question you've linked to is actually the only answer in the first page that correctly reverse a string, even with languages other then English.
In fact, Based on that answer and one of it's comments I've written an extension method to correctly reverse a string.

In my native language (Hebrew) we have all kinds of dots and symbols to specify vowels, and the simple reverse of an array (or IEnumerable) is doing it wrong, just like in the French example.

It's significantly slower than the regular swap based implementation (And about 10 times slower than the XOR based swap), but I hardly think that's going to be an issue, since reversing a string is not something you do very often, and reversing many strings in a tight loop even less.
Tested based on your benchmark method, reversing 10,000 string of length 10,000 with this method took approximately 13.5 seconds.
If anyone would ever write a program that actually needs to do so many reverses for such long strings, I would be very surprised.

So here is my international safe string reverse implementation, I hope someone would benefit from it:

public static string Reverse(this string source)
{
    if (string.IsNullOrEmpty(source))
    {
        return source;
    }
    var info = new StringInfo(source);
    var sb = new StringBuilder();

    for (int i = info.LengthInTextElements - 1; i > -1; i--)
    {
        sb.Append(info.SubstringByTextElements(i, 1));
    }
    return sb.ToString();
}
Zohar Peled
  • 79,642
  • 10
  • 69
  • 121
  • Thanks, I edited the question and I added the corrected methods, I'm still getting the same results, classic faster than XOR for strings with 10 chars and on. – Raudel Ravelo Jan 10 '18 at 05:47
  • I recommend you to try it out with the same string's lengths I used, because if not you will get different results than the ones I'm claiming. – Raudel Ravelo Jan 10 '18 at 05:51
  • BTW, the results you published gave you smaller times for the classic when run with length 15 and 20 ;) – Raudel Ravelo Jan 10 '18 at 05:53
  • Glad you updated, now people can see what the question is about. BTW, why the downvote on my question? – Raudel Ravelo Jan 10 '18 at 06:13
0

You can test below code with Array.Reverse, It will more efficient than other approaches which you are mentioned, Array.Reverse is coded natively and it is very simple to maintain and understand.

    public static string ReverseArray(string text)
    {
        if (text == null) return null;

        char[] array = text.ToCharArray();
        Array.Reverse(array);
        return new String(array);
    }

Below is a complete code to demonstrate,

delegate string StringDelegate(string s);

    static void Benchmark(string description, StringDelegate d, int times, string text)
    {
        Stopwatch sw = new Stopwatch();
        sw.Start();
        for (int j = 0; j < times; j++)
        {
            d(text);
        }
        sw.Stop();
        Console.WriteLine("{0} Ticks {1} : called {2} times.", sw.ElapsedTicks, description, times);
    }

    public static string ReverseArray(string text)
    {
        if (text == null) return null;

        // this was posted by petebob as well 
        char[] array = text.ToCharArray();
        Array.Reverse(array);
        return new String(array);
    }

    public static string ReverseXor(string s)
    {
        char[] charArray = s.ToCharArray();
        int len = s.Length - 1;

        for (int i = 0; i < len; i++, len--)
        {
            charArray[i] ^= charArray[len];
            charArray[len] ^= charArray[i];
            charArray[i] ^= charArray[len];
        }

        return new string(charArray);
    }

    public static string ReverseClassic(string s)
    {
        char[] charArray = s.ToCharArray();
        int len = s.Length-1;

        for (int i = 0; i < len; i++, len--)
        {
            char temp = charArray[len];
            charArray[len] = charArray[i];
            charArray[i] = temp;
        }
        return new string(charArray);
    }

    public static string StringOfLength(int length)
    {
        Random random = new Random();
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < length; i++)
        {
            sb.Append(Convert.ToChar(Convert.ToInt32(Math.Floor(26 * random.NextDouble() + 65))));
        }
        return sb.ToString();
    }

    static void Main(string[] args)
    {

        int[] lengths = new int[] { 1, 10, 100, 1000, 10000, 100000 };

        foreach (int l in lengths)
        {
            int iterations = 10000;
            string text = StringOfLength(l);                
            Benchmark(String.Format("Classic (Length: {0})", l), ReverseClassic, iterations, text);
            Benchmark(String.Format("Array (Length: {0})", l), ReverseArray, iterations, text);
            Benchmark(String.Format("Xor (Length: {0})", l), ReverseXor, iterations, text);
            Console.WriteLine();
        }
        Console.Read();
    }

Note: I have corrected your code for reversing the string, it was not correctly reverse the string as your post

programtreasures
  • 4,250
  • 1
  • 10
  • 29
  • Array.Reverse has nothing especial, in fact using it is a little slower than if you do it by your self. This is the link of the .Net implementation. https://referencesource.microsoft.com/#mscorlib/system/array.cs,f5edebc3412b666b – Rey Cruz Feb 08 '18 at 18:54
  • @ReyCruz agreed Array.Reverse casting while assigning to reverse array. – programtreasures Feb 09 '18 at 03:04