8

Is there a faster way to do this than the following?

private void EscapeStringSequence(ref string data)
{
    data = data.Replace("\\", "\\\\"); // Backslash
    data = data.Replace("\r", "\\r");  // Carriage return
    data = data.Replace("\n", "\\n");  // New Line
    data = data.Replace("\a", "\\a");  // Alert (bell)
    data = data.Replace("\b", "\\b");  // Backspace
    data = data.Replace("\f", "\\f");  // Formfeed
    data = data.Replace("\t", "\\t");  // Horizontal tab
    data = data.Replace("\v", "\\v");  // Vertical tab
    data = data.Replace("\"", "\\\""); // Double quotation mark
    data = data.Replace("'", "\\'");   // Single quotation mark
}

-- Edit (added explanation) --
Q1: Is there a reason why you need to speed it up? Is it causing a huge problem?
This part is used in this project: http://mysqlbackuprestore.codeplex.com/
I'm going to pass a large number of strings of varying lengths through this function repeatedly. The whole process takes around 6-15 seconds to finish for millions of rows. Other parts are involved too; I'm trying to speed up every part.

Q2: How slow is it now?
OK, I'll measure the exact time used and post the result here tomorrow.

Update 29-06-2012
I have run the test. These are the results:

Speed Test: String.Replace() - measured in milliseconds
Test 1: 26749.7531 ms
Test 2: 27063.438 ms
Test 3: 27753.8884 ms
Average: 27189.0265 ms
Speed: 100%

Speed Test: Foreach Char and Append - measured in milliseconds
Test 1: 8468.4547 ms
Test 2: 8348.8527 ms
Test 3: 8353.6476 ms
Average: 8390.3183 ms
Speed: 324.05% (about 224% faster)
===================================
Update - Next Test (Another round)
===================================
------
Test Replace String Speed.
Test 1: 26535.6466
Test 2: 26379.6464
Test 3: 26379.6463
Average: 26431.6464333333
Speed: 100%
------
Test Foreach Char String Append.
Test 1: 8502.015
Test 2: 8517.6149
Test 3: 8595.6151
Average: 8538.415
Speed: 309.56%
------
Test Foreach Char String Append (Fix StringBuilder Length).
Test 1: 8314.8146
Test 2: 8330.4147
Test 3: 8346.0146
Average: 8330.41463333333
Speed: 317.29%


Conclusion:
Using Foreach Char Loop and Append is faster than String.Replace().

Thank you very much, guys.

--------
Below is the code that I used to run the test (edited):

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            Console.Write("Press any key to continue...");
            Console.ReadKey();
            Console.Write("\r\nProcess started.");
            Test();
            Console.WriteLine("Done.");
            Console.Read();
        }

        public static Random random = new Random((int)DateTime.Now.Ticks);

        public static string RandomString(int size)
        {
            StringBuilder sb = new StringBuilder();
            char ch;
            for (int i = 0; i < size; i++)
            {
                ch = Convert.ToChar(Convert.ToInt32(Math.Floor(26 * random.NextDouble() + 65)));
                sb.Append(ch);
            }
            return sb.ToString();
        }

        public static void Test()
        {
            string text = "\\_\r\n\a\b\f\t\v\"'" + RandomString(2000) + "\\_\r\n\a\b\f\t\v\"'" + RandomString(2000);

            List<TimeSpan> lstTimeUsed = new List<TimeSpan>();

            int target = 100000;

            for (int i = 0; i < 3; i++)
            {
                DateTime startTime = DateTime.Now;
                for (int j = 0; j < target; j++)
                {
                    if (j.ToString().EndsWith("000"))
                    {
                        Console.Clear();
                        Console.WriteLine("Test " + i.ToString());
                        Console.WriteLine(j.ToString() + " of " + target.ToString());
                    }

                    string data = text;

                    data = data.Replace("\\", "\\\\"); // Backslash
                    data = data.Replace("\r", "\\r");  // Carriage return
                    data = data.Replace("\n", "\\n");  // New Line
                    data = data.Replace("\a", "\\a");  // Alert (bell)
                    data = data.Replace("\b", "\\b");  // Backspace
                    data = data.Replace("\f", "\\f");  // Formfeed
                    data = data.Replace("\t", "\\t");  // Horizontal tab
                    data = data.Replace("\v", "\\v");  // Vertical tab
                    data = data.Replace("\"", "\\\""); // Double quotation mark
                    data = data.Replace("'", "\\'");   // Single quotation mark

                }
                DateTime endTime = DateTime.Now;
                TimeSpan ts = endTime - startTime;
                lstTimeUsed.Add(ts);
            }

            double t1 = lstTimeUsed[0].TotalMilliseconds;
            double t2 = lstTimeUsed[1].TotalMilliseconds;
            double t3 = lstTimeUsed[2].TotalMilliseconds;
            double tOri = (t1 + t2 + t3) / 3;

            System.IO.TextWriter tw = new System.IO.StreamWriter("D:\\test.txt", true);
            tw.WriteLine("------");
            tw.WriteLine("Test Replace String Speed. Test Time: " + DateTime.Now.ToString());
            tw.WriteLine("Test 1: " + t1.ToString());
            tw.WriteLine("Test 2: " + t2.ToString());
            tw.WriteLine("Test 3: " + t3.ToString());
            tw.WriteLine("Average: " + tOri.ToString());
            tw.WriteLine("Speed: 100%");
            tw.Close();

            lstTimeUsed = new List<TimeSpan>();

            for (int i = 0; i < 3; i++)
            {
                DateTime startTime = DateTime.Now;
                for (int j = 0; j < target; j++)
                {
                    if (j.ToString().EndsWith("000"))
                    {
                        Console.Clear();
                        Console.WriteLine("Test " + i.ToString());
                        Console.WriteLine(j.ToString() + " of " + target.ToString());
                    }

                    string data = text;

                    var builder = new StringBuilder();
                    foreach (var ch in data)
                    {
                        switch (ch)
                        {
                            case '\\':
                            case '\r':
                            case '\n':
                            case '\a':
                            case '\b':
                            case '\f':
                            case '\t':
                            case '\v':
                            case '\"':
                            case '\'':
                                builder.Append('\\');
                                break;
                            default:
                                break;
                        }
                        builder.Append(ch);
                    }

                }
                DateTime endTime = DateTime.Now;
                TimeSpan ts = endTime - startTime;
                lstTimeUsed.Add(ts);
            }

            t1 = lstTimeUsed[0].TotalMilliseconds;
            t2 = lstTimeUsed[1].TotalMilliseconds;
            t3 = lstTimeUsed[2].TotalMilliseconds;

            tw = new System.IO.StreamWriter("D:\\test.txt", true);
            tw.WriteLine("------");
            tw.WriteLine("Test Foreach Char String Append. Test Time: " + DateTime.Now.ToString());
            tw.WriteLine("Test 1: " + t1.ToString());
            tw.WriteLine("Test 2: " + t2.ToString());
            tw.WriteLine("Test 3: " + t3.ToString());
            tw.WriteLine("Average: " + ((t1 + t2 + t3) / 3).ToString());
            tw.WriteLine("Speed: " + ((tOri) / ((t1 + t2 + t3) / 3) * 100).ToString("0.00") + "%");
            tw.Close();

            lstTimeUsed = new List<TimeSpan>();

            for (int i = 0; i < 3; i++)
            {
                DateTime startTime = DateTime.Now;
                for (int j = 0; j < target; j++)
                {
                    if (j.ToString().EndsWith("000"))
                    {
                        Console.Clear();
                        Console.WriteLine("Test " + i.ToString());
                        Console.WriteLine(j.ToString() + " of " + target.ToString());
                    }

                    string data = text;

                    var builder = new StringBuilder(data.Length + 20);
                    foreach (var ch in data)
                    {
                        switch (ch)
                        {
                            case '\\':
                            case '\r':
                            case '\n':
                            case '\a':
                            case '\b':
                            case '\f':
                            case '\t':
                            case '\v':
                            case '\"':
                            case '\'':
                                builder.Append('\\');
                                break;
                            default:
                                break;
                        }
                        builder.Append(ch);
                    }

                }
                DateTime endTime = DateTime.Now;
                TimeSpan ts = endTime - startTime;
                lstTimeUsed.Add(ts);
            }

            t1 = lstTimeUsed[0].TotalMilliseconds;
            t2 = lstTimeUsed[1].TotalMilliseconds;
            t3 = lstTimeUsed[2].TotalMilliseconds;

            tw = new System.IO.StreamWriter("D:\\test.txt", true);
            tw.WriteLine("------");
            tw.WriteLine("Test Foreach Char String Append (Fix StringBuilder Length). Test Time: " + DateTime.Now.ToString());
            tw.WriteLine("Test 1: " + t1.ToString());
            tw.WriteLine("Test 2: " + t2.ToString());
            tw.WriteLine("Test 3: " + t3.ToString());
            tw.WriteLine("Average: " + ((t1 + t2 + t3) / 3).ToString());
            tw.WriteLine("Speed: " + ((tOri) / ((t1 + t2 + t3) / 3) * 100).ToString("0.00") + "%");
            tw.Close();

        }
    }
}
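As a side note on the timing itself, here is a minimal sketch of the same kind of measurement using `System.Diagnostics.Stopwatch` instead of `DateTime.Now`, with no console output inside the timed loop. The `TimingSketch` class, the `TimeMilliseconds` helper, and the sample text are made up for illustration; the numbers above were not produced with this code.

```csharp
using System;
using System.Diagnostics;

static class TimingSketch
{
    // Hypothetical helper: runs the action `iterations` times and
    // returns the elapsed wall-clock time in milliseconds.
    public static double TimeMilliseconds(Action action, int iterations)
    {
        action(); // warm-up call so JIT compilation is not part of the measurement

        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            action();
        }
        sw.Stop();
        return sw.Elapsed.TotalMilliseconds;
    }

    static void Main()
    {
        string text = "line1\r\nline2\t'quoted'";
        int sink = 0; // consume the result so the work is not trivially unused

        double ms = TimeMilliseconds(
            () => sink += text.Replace("\r", "\\r").Replace("\n", "\\n").Length,
            100000);

        Console.WriteLine("Replace: " + ms.ToString("0.00") + " ms");
    }
}
```

Passing each variant as a delegate to the same helper keeps all of them timed under the same conditions.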
peterh
mjb
  • possible duplicate of [fastest way to replace string in a template](http://stackoverflow.com/questions/959940/fastest-way-to-replace-string-in-a-template) – KV Prajapati Jun 28 '12 at 01:34
  • Putting the string into a StringBuilder and then using StringBuilder.Replace may be faster. Writing your own one-pass loop that constructs the resulting string should be much faster. – hatchet - done with SOverflow Jun 28 '12 at 01:35
  • 1
    Is there a reason why you need to speed it up? Is it causing a huge problem? – Cody Jun 28 '12 at 01:35
  • 7
    Why do you guys bother commenting with questions like "how slow is it now" or with "reason why it needs to be sped up"? The question is: **"is there any method that is faster than this"** and that should be the only thing worried about. Anything else sounds like trying to avoid the original question. – Marlon Jun 28 '12 at 01:44
  • Because we're trying to help the OP, @StackUnderflow, and sometimes that means questioning premises and assumptions. In this case, we're trying to communicate "*unless you're doing this as a hobby, or are working with a product that's shown a hotspot around this activity, don't worry about it and focus your attentions somewhere more worthy.*" – Michael Petrotta Jun 28 '12 at 01:48
  • 3
    @StackUnderflow: We weren't given any test cases so the rate at which it is currently performing for OP _is_ pertinent here. If the OP has no idea, then it could be symptomatic of the classic developer problem of optimizing the wrong piece (shaving a few micros when you're running in seconds elsewhere). It's a good probing question to see (a) what the dev is thinking and (b) what kind of improvements said dev is looking for. – Austin Salonen Jun 28 '12 at 01:49
  • I'll be very interested to see your results! – Blorgbeard Jun 28 '12 at 03:35
  • @Blorgbeard, I have posted the results in the question. They show that the foreach loop is faster than String.Replace. – mjb Jun 29 '12 at 05:30

2 Answers

11
    var builder = new StringBuilder(data.Length + 20);
    foreach (var ch in data)
    {
      switch (ch)
      {
        case '\\':
        case '\r':
        ...
          builder.Append('\\');
          break;
      }
      builder.Append(ch);
    }
    return builder.ToString();
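
A fuller version of this loop, as a sketch only. The snippet above elides most of the case labels, and appending a backslash before the raw character would leave the control character itself in the output. Assuming the goal is the same output as the chain of `Replace` calls in the question, each character can instead map to its two-character escape sequence. The `EscapeSketch` class and `Escape` method names are made up for illustration.

```csharp
using System;
using System.Text;

static class EscapeSketch
{
    // Hypothetical helper: single pass over the input, one StringBuilder.
    public static string Escape(string data)
    {
        // +20 is the headroom used in the answer; the builder grows if it is not enough.
        var builder = new StringBuilder(data.Length + 20);
        foreach (char ch in data)
        {
            switch (ch)
            {
                case '\\': builder.Append("\\\\"); break; // Backslash
                case '\r': builder.Append("\\r"); break;  // Carriage return
                case '\n': builder.Append("\\n"); break;  // New line
                case '\a': builder.Append("\\a"); break;  // Alert (bell)
                case '\b': builder.Append("\\b"); break;  // Backspace
                case '\f': builder.Append("\\f"); break;  // Form feed
                case '\t': builder.Append("\\t"); break;  // Horizontal tab
                case '\v': builder.Append("\\v"); break;  // Vertical tab
                case '"':  builder.Append("\\\""); break; // Double quotation mark
                case '\'': builder.Append("\\'"); break;  // Single quotation mark
                default:   builder.Append(ch); break;     // Everything else is copied as-is
            }
        }
        return builder.ToString();
    }

    static void Main()
    {
        // prints: a\tb\r\n\'c\'
        Console.WriteLine(Escape("a\tb\r\n'c'"));
    }
}
```

Because every input character is visited exactly once, the backslash case cannot double-escape backslashes added by other cases, so the order of the cases does not matter here.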
Serj-Tm
  • 1
    You probably want to specify a larger initial capacity, because you're replacing single characters with multi-character strings. – Blorgbeard Jun 28 '12 at 01:37
  • 2
    Can be simplified along the lines of: `switch(ch) { case '\\': case '\r': builder.Append('\\'); break; } builder.Append(ch);` – porges Jun 28 '12 at 01:41
  • @DarkGray well, that's a good question - depends on your data I guess. – Blorgbeard Jun 28 '12 at 01:42
  • 1
    Also, although this looks pretty good, I would not assume that it was faster - benchmarking is required :) – Blorgbeard Jun 28 '12 at 01:45
  • @Blorgbeard The code is smaller, so it is much more likely to stay in the processor cache. – Serj-Tm Jun 28 '12 at 01:54
  • @DarkGray it's difficult to say that from the C# source - it's got to be compiled to MSIL and then JIT'd to machine-code first. – Blorgbeard Jun 28 '12 at 02:00
  • 1
    @Blorgbeard: It is probably safe to assume it is *faster*, how much faster I don't know. You are replacing 10 O(n) operations with 1 O(n) operation and drastically reducing the number of string creations as well. – Ed S. Jun 28 '12 at 02:09
  • 1
    O(10n) is the same as O(n).. a lot depends on the implementations of String.Replace and StringBuilder. Like I said, it looks good, it probably *is* faster, but it's the sort of thing I would benchmark before committing the change. – Blorgbeard Jun 28 '12 at 02:16
  • @Blorgbeard, DarkGray: the amount of extra space to reserve for the added '\' characters varies and is unpredictable. I added an explanation to the question. – mjb Jun 28 '12 at 02:46
  • The test has been carried out (Added in the question). This method is faster. Thanks guys. – mjb Jun 29 '12 at 05:21
  • @mjb Set the capacity in the StringBuilder constructor: new StringBuilder(data.Length + 20). It's important. Then retest. – Serj-Tm Jun 29 '12 at 09:53
  • @DarkGray I have run another round of tests and posted the results in the question. It seems that setting the capacity in the StringBuilder constructor gives a slight speed increase. However, we cannot predict how many characters will need to be escaped, so the amount we reserve (+20) might not be sufficient. Thanks for your help. – mjb Jun 29 '12 at 13:37
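
On the capacity point raised in these comments: one option, sketched here under the assumption that a second pass over the string is acceptable, is to count the characters that need escaping first and size the `StringBuilder` exactly. The `CapacitySketch` class and its `NeedsEscape` and `EscapedLength` helpers are made up for illustration, and whether the extra pass pays off would need benchmarking.

```csharp
using System;
using System.Text;

static class CapacitySketch
{
    // Hypothetical helper: true for the ten characters escaped in the question.
    static bool NeedsEscape(char ch)
    {
        switch (ch)
        {
            case '\\': case '\r': case '\n': case '\a': case '\b':
            case '\f': case '\t': case '\v': case '"': case '\'':
                return true;
            default:
                return false;
        }
    }

    // Each escaped character grows by exactly one char (the added backslash),
    // so the final length is the input length plus the number of such characters.
    public static int EscapedLength(string data)
    {
        int extra = 0;
        foreach (char ch in data)
        {
            if (NeedsEscape(ch)) extra++;
        }
        return data.Length + extra;
    }

    static void Main()
    {
        string data = "a\tb'c";
        var builder = new StringBuilder(EscapedLength(data)); // exact size, no guesswork
        Console.WriteLine(EscapedLength(data)); // prints: 7
    }
}
```

An alternative with no extra pass is to reserve `data.Length * 2`, which is the worst case when every character is escaped, at the cost of over-allocating for typical data.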
1

Try using a series of StringBuilder calls.

hythlodayr
  • Assuming you mean replacing the `String.Replace` calls with `StringBuilder.Replace` calls - that tests out a good bit *slower* than the OP's original code (30% to 100% slower for me, depending on the size of the string; a million iterations). Not sure why that would be. – Michael Petrotta Jun 28 '12 at 01:43
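
For reference, one reading of "a series of StringBuilder calls" is chaining `StringBuilder.Replace`, sketched below. This is an interpretation of the answer, not something it spells out, and the `BuilderReplaceSketch` class name is made up for illustration. As the comment above reports, this variant measured slower than plain `String.Replace` for that commenter, so it is shown for comparison only.

```csharp
using System;
using System.Text;

static class BuilderReplaceSketch
{
    // Hypothetical helper: same replacements as the question, applied in place
    // on one StringBuilder instead of creating a new string per call.
    public static string Escape(string data)
    {
        var sb = new StringBuilder(data);
        sb.Replace("\\", "\\\\") // Backslash must go first, or later escapes get doubled
          .Replace("\r", "\\r")
          .Replace("\n", "\\n")
          .Replace("\a", "\\a")
          .Replace("\b", "\\b")
          .Replace("\f", "\\f")
          .Replace("\t", "\\t")
          .Replace("\v", "\\v")
          .Replace("\"", "\\\"")
          .Replace("'", "\\'");
        return sb.ToString();
    }

    static void Main()
    {
        // prints: a\tb\\c
        Console.WriteLine(Escape("a\tb\\c"));
    }
}
```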