15

I have a StringBuilder named stb_Swap_Tabu used to store Course's Names, I am using the following method to find a course:

stb_Swap_Tabu.ToString.Contains("CourseName")

in my case, Performance is the most important issue. Is there any faster way?

Alaa Alweish
  • 8,904
  • 16
  • 57
  • 84
  • 2
    If you need to use a `StringBuilder` then you'll probably need to call `ToString` each time you want to search, which isn't a great idea performance-wise. `StringBuilder` is used to *build strings*; presumably if you're building strings then you already have the constituent parts; why don't you search within those constituent parts directly instead? – LukeH Sep 04 '12 at 10:19
  • 4
    `StringBuilder` is possibly the least suited data type to store a list of searchable names. Why not use a `List` instead, and use the `Contains` method of the `List`? – Rotem Sep 04 '12 at 10:19
  • 9
    Do you have an example of what this string actually is? You say it stores "Course's Names" - whether or not the "Course's" actually means "Courses", the "Names" suggests more than one name - so presumably this is a delimited string somehow. In which case, switching it to a `List` or `HashSet` of the *individual* names would make a lot of sense – Marc Gravell Sep 04 '12 at 10:21
  • 3
    `HashSet` for bigger number of strings, `list` for smaller ones. – Chibueze Opata Sep 04 '12 at 10:30
  • 2
    I agree it's unlikely to be what's needed here. Still, finding a substring in a StringBuilder does come up. – Jon Hanna Sep 04 '12 at 10:58
  • Thanks for everybody for your contributions, i have test case on Curriculum-Based Course TimeTabling program, I will test it out and post the results here. – Alaa Alweish Sep 04 '12 at 11:31
  • I don't think you did much research on your own for this. You could have easily found out that StringBuilder isn't suited for this purpose if performance or efficiency was to be considered. – Shakti Prakash Singh Sep 04 '12 at 11:36
  • 2
    @Shaks depends on what else you are doing with it. StringBuilder isn't great for searching (even with an approach like in my answer, it's only about as good as string), but if there was a good reason for having a builder due to other work done, then the choice is between searching vs. putting into a more searchable collection, searching it and then perhaps updating or rebuilding the string-builder based on the results. In that case, searching the string-builder can be the best overall, if lots of other work **is** the sort of things string-builder suits. – Jon Hanna Sep 04 '12 at 11:42
  • @JonHanna - Well, I do agree with you, if a lot of other string related tasks were to be performed. But my comment was totally based on the information provided in the question. – Shakti Prakash Singh Sep 05 '12 at 06:25
  • Exactly as @JonHanna mentioned, i have a good reason for having a builder that's why my question title is : **Fastest search method in StringBuilder**. As a result of my test, the answer provided by @JonHanna after some modifications was 28% better than using `Conatins`. i will try to make abstracted code and share it here.Thanks for everybody for your contributions, – Alaa Alweish Sep 05 '12 at 08:01

2 Answers2

29

StringBuilder wasn't really intended for all string purposes. If you really need to search one, you have to write your own method.

There are several string-searching algorithms suited to different cases.

The following is a simple implementation of the Knuth–Morris–Pratt algorithm that only cares about ordinal matches (no case-folding, no culture-related collation, just a plain codepoint to codepoint match). It has some initial Θ(m) overhead where m is the length of the word sought, and then finds in Θ(n) where n is the distance to the word sought, or the length of the whole string-builder if it isn't there. This beats the simple char-by-char compare which is Θ((n-m+1) m) (Where O() notation describes upper-bounds, Θ() describes both upper and lower bounds).

All this said, it does sound like creating a list might be a better approach to the task in hand.

public static class StringBuilderSearching
{
  public static bool Contains(this StringBuilder haystack, string needle)
  {
    return haystack.IndexOf(needle) != -1;
  }
  public static int IndexOf(this StringBuilder haystack, string needle)
  {
    if(haystack == null || needle == null)
      throw new ArgumentNullException();
    if(needle.Length == 0)
      return 0;//empty strings are everywhere!
    if(needle.Length == 1)//can't beat just spinning through for it
    {
      char c = needle[0];
      for(int idx = 0; idx != haystack.Length; ++idx)
        if(haystack[idx] == c)
          return idx;
      return -1;
    }
    int m = 0;
    int i = 0;
    int[] T = KMPTable(needle);
    while(m + i < haystack.Length)
    {
      if(needle[i] == haystack[m + i])
      {
        if(i == needle.Length - 1)
          return m == needle.Length ? -1 : m;//match -1 = failure to find conventional in .NET
        ++i;
      }
      else
      {
        m = m + i - T[i];
        i = T[i] > -1 ? T[i] : 0;
      }
    }
    return -1;
  }      
  private static int[] KMPTable(string sought)
  {
    int[] table = new int[sought.Length];
    int pos = 2;
    int cnd = 0;
    table[0] = -1;
    table[1] = 0;
    while(pos < table.Length)
      if(sought[pos - 1] == sought[cnd])
        table[pos++] = ++cnd;
      else if(cnd > 0)
        cnd = table[cnd];
      else
        table[pos++] = 0;
    return table;
  }
}
Jon Hanna
  • 110,372
  • 10
  • 146
  • 251
  • 2
    Exactly as @JonHanna mentioned, i have a good reason for having a builder that's why my question title is : **Fastest search method in StringBuilder**. As a result of my test, the answer provided by @JonHanna after some modifications was 28% better than using `Conatins`. i will try to make abstracted code and share it here. – Alaa Alweish Sep 05 '12 at 08:00
  • 21
    It doesn't make sense that StringBuilder has a `Replace` function that must inevitably search the string and even takes start and end indexes, yet does not provide an `indexOf` function. Why are we left to redo what's already been done? – Slight Jun 01 '15 at 15:57
  • That code fails for me with haystack "abcde" and needle "cd", it returns false. – Juri Robl May 20 '17 at 09:35
  • 2
    Shouldn't it be just `return m;`, not `return m == needle.Length ? -1 : m;`? – Juri Robl May 20 '17 at 09:44
  • @JuriRobl that bit is to match the .NET convention of -1 for not found. Don't know what's causing that failure though, will look later if I get a chance. – Jon Hanna May 20 '17 at 10:26
  • I agree, this should be just `return m;`. It was returning -1 otherwise. – neilsimp1 Nov 08 '19 at 17:48
0

I know this is an old question but it came up in my search results when I was trying to create a solution for my own project. I thought I needed to search a StringBuilder.ToString's method results but then I realized I could just call methods on the StringBuilder itself. My situation may not be the same as yours, but I though I'd share:

Private Function valueFormatter(ByVal value As String) As String
    ' This will correct any formatting to make the value valid for a CSV format
    '
    ' 1) Any value that as a , in it then it must be wrapped in a " i.e. Hello,World -> "Hello,World"
    ' 2) In order to escape a " in the value add a " i.e. Hello"World -> Hello""World
    ' 3) if the value has a " in it then it must also be wrapped in a " i.e. "Hello World" -> ""Hello World"" -> """Hello World"""
    ' 
    ' VB NOTATAION 
    ' " -> """"
    ' "" -> """"""

    If value.Contains(",") Or value.Contains("""") Then
        Dim sb As New StringBuilder(value)
        If value.Contains("""") Then sb.Replace("""", """""")
        sb.Insert(0, """").Append("""")
        Return sb.ToString
    Else
        Return value
    End If
End Function