1

I am coding a C# application that appends, inserts, replaces and finds string content in string called content. Multiple objects called editObjects are performing these actions on the same string called content.

I am currently passing a StringBuilder object to each editObject object, and then each editObject performs the actions on the StringBuilder object.

This question has two parts:

Am I correct in saying that a StringBuilder is the most efficient way to perform multiple actions on a string?

I have found this post from 2013: Fastest search method in StringBuilder, and would like to know if there is some known code that is a more efficient way to find the index of a string in a StringBuilder?

public static class StringBuilderSearching
{
  public static bool Contains(this StringBuilder haystack, string needle)
  {
    return haystack.IndexOf(needle) != -1;
  }
  public static int IndexOf(this StringBuilder haystack, string needle)
  {
    if(haystack == null || needle == null)
      throw new ArgumentNullException();
    if(needle.Length == 0)
      return 0;//empty strings are everywhere!
    if(needle.Length == 1)//can't beat just spinning through for it
    {
      char c = needle[0];
      for(int idx = 0; idx != haystack.Length; ++idx)
        if(haystack[idx] == c)
          return idx;
      return -1;
    }
    int m = 0;
    int i = 0;
    int[] T = KMPTable(needle);
    while(m + i < haystack.Length)
    {
      if(needle[i] == haystack[m + i])
      {
        if(i == needle.Length - 1)
          return m == needle.Length ? -1 : m;//match -1 = failure to find conventional in .NET
        ++i;
      }
      else
      {
        m = m + i - T[i];
        i = T[i] > -1 ? T[i] : 0;
      }
    }
    return -1;
  }      
  private static int[] KMPTable(string sought)
  {
    int[] table = new int[sought.Length];
    int pos = 2;
    int cnd = 0;
    table[0] = -1;
    table[1] = 0;
    while(pos < table.Length)
      if(sought[pos - 1] == sought[cnd])
        table[pos++] = ++cnd;
      else if(cnd > 0)
        cnd = table[cnd];
      else
        table[pos++] = 0;
    return table;
  }
}
Community
  • 1
  • 1
Simon
  • 7,991
  • 21
  • 83
  • 163

1 Answers1

1

StringBuilder is faster for most string manipulations - That doesn't mean it is the best for all multiple actions done on a string.

In your case, you are willing to find a string inside the StringBuilder, this requires you to do one of two things:

  • Doing a standard search at O(n) iterating the whole StringBuilder, which can be done in a pretty optimized way like the code you have posted in the question.

  • Indexing chars or strings on every addition and removal of data to/from the StringBuilder. Note that indexing means you will need to analyse every string you add or remove to/from the StringBuilder which creates a little overhead for each manipulation action.

    • You need to do the judging of what your application does more and what will disturb your application more - O(n) search instead of an O(log(n)) or O(n + m) manipulations instead of O(n) ones. (Where m is the overhead for each insertion/removal divided by the average inserted/removedstring length)

    • Basic indexing illustration:

    string indexing

    • Indexes which are in between are fine too for balance (maximum 2 chars index/maximum 3 chars/etc...).
Community
  • 1
  • 1
Tamir Vered
  • 10,187
  • 5
  • 45
  • 57