How can I efficiently split a string with a character?

An example would be:

inputString = "ABCDEFGHIJ", sectionLength = 4, splitChar = '-', and output = "ABCD-EFGH-IJ"

Here is my first attempt. I want to insert a given character into an input string after every nth character. I am wondering whether there is a more efficient way to do this, or whether I am missing a case that could fail. I believe the if statement at the beginning catches any invalid input except null.

public String SplitString(string inputString, int sectionLength, 
    char splitChar)
{
    if (inputString.Length <= sectionLength || sectionLength < 1)
        return inputString;

    string returnString = "";
    int subStart;
    int end = inputString.Length;

    for (subStart = 0 ; (subStart + sectionLength) < end; 
        subStart += sectionLength)
    {
        returnString = returnString +
            inputString.Substring(subStart,
            sectionLength) + splitChar;
    }

    return returnString + inputString.Substring(subStart, 
        end - subStart);
}
Andrew
  • 3
    I'm voting to close this question as off-topic because it is asking for a [code review](http://codereview.stackexchange.com) – Rowland Shaw Jun 23 '15 at 16:02
  • 3
    Although worded as a code review, if it were worded as "How can I efficiently split a string with a character, here is my first attempt" it would be quite acceptable. Not voting to close. – Eric J. Jun 23 '15 at 16:03
  • @Andrew: Could you provide examples of input and desired output? – Eric J. Jun 23 '15 at 16:04
  • How about someone simply editing it then? I would, but I am not good at writing questions to begin with and I would be afraid to screw it up further. That said, I think the question could use some added details, such as the user asking whether some value would cause it to give an error or something. At which point it becomes a question about needing validation methods. – cluemein Jun 23 '15 at 16:06
  • 1
    I edited your question so that it avoids the "this-is-off-topic-must-be-closed" argument and did a bit of formatting on your code. Please review my changes to make sure I didn't remove anything substantial. – xxbbcc Jun 23 '15 at 16:10
  • @xxbbcc Thanks, I appreciate that! – Andrew Jun 23 '15 at 16:56

1 Answer

Strings in .NET are immutable. That means operations that combine strings end up creating a brand-new string.
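
As a small illustration of that point, the sketch below (the class and method names are hypothetical, chosen only for this example) keeps a second reference to a string, concatenates onto the first, and checks that the original object was left untouched:

```csharp
using System;

public static class ImmutabilityDemo
{
    // Hypothetical helper: returns true if concatenation produced a new
    // string object and left the original one unchanged.
    public static bool ConcatCreatesNewString()
    {
        string original = "ABCD";
        string alias = original;     // second reference to the same string object
        original = original + "-";   // allocates a new string; "ABCD" itself is not modified
        return alias == "ABCD" && !object.ReferenceEquals(alias, original);
    }
}
```

Each pass through a concatenation loop does the same thing: it copies everything accumulated so far into a fresh string.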

This section of code

for (subStart = 0; (subStart + sectionLength) < end; subStart += sectionLength)
{
    returnString = returnString + inputString.Substring(subStart, sectionLength) + splitChar;
}

keeps creating new strings.

Instead, explore the use of StringBuilder.

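int estimatedFinalStringLength = 100; // <-- Your estimate here
StringBuilder returnString = new StringBuilder(estimatedFinalStringLength);
int end = inputString.Length;
int subStart;
for (subStart = 0; (subStart + sectionLength) < end; subStart += sectionLength)
{
    // Append the section and the separator directly, without building temporary strings.
    returnString.Append(inputString, subStart, sectionLength).Append(splitChar);
}

// Append the final (possibly shorter) section, then build the result once.
returnString.Append(inputString, subStart, end - subStart);
return returnString.ToString();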

Doing your best to estimate the total length of the final string will reduce the number of buffer reallocations that StringBuilder does internally.

Eric J.
  • I like this a lot! Exactly the sort of feedback I was looking for. – Andrew Jun 23 '15 at 16:59
  • I'd note that estimates of string length should lean toward the higher end of what is likely. In this case though there's no need to estimate, the final length will always be `(inputString.Length - 1) * (sectionLength + 1) / sectionLength + 1` so use that. – Jon Hanna Jun 23 '15 at 17:10
  • @EricJ. Would it be beneficial to actually calculate the final value? It wouldn't be particularly hard, as you already have the length of the string and a divisor (sectionLength). Edit: Jon you beat me to it. lol – Andrew Jun 23 '15 at 17:11
  • Setting `StringBuilder` size is an optimisation; too high and you waste memory you don't use, too low and you waste time and memory on growing the internal structures. Generally you want it to be toward the high end of what is very likely, but if, as here, you can predict it precisely then it is indeed a good idea to do that. – Jon Hanna Jun 23 '15 at 17:13
  • Yep in this case you can exactly calculate the size based on the original string lengths plus the number of splits you will have. – Eric J. Jun 23 '15 at 17:38
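
Putting the answer and the comments together, one possible complete version might look like the sketch below. It is not the author's final code: the `StringSplitter` class name, the null check, and the choice to return invalid input unchanged are assumptions made for this example. The capacity is computed as `length + (length - 1) / sectionLength`, which for non-empty input works out to the same value as the formula given in the comments.

```csharp
using System;
using System.Text;

public static class StringSplitter
{
    // Inserts splitChar after every sectionLength characters of inputString.
    public static string SplitString(string inputString, int sectionLength, char splitChar)
    {
        // Assumed policy: null, short, or invalid input is returned unchanged.
        if (inputString == null || sectionLength < 1 || inputString.Length <= sectionLength)
            return inputString;

        int length = inputString.Length;

        // Exact final size: every character plus one separator per section boundary.
        int finalLength = length + (length - 1) / sectionLength;
        var result = new StringBuilder(finalLength);

        int subStart;
        for (subStart = 0; subStart + sectionLength < length; subStart += sectionLength)
        {
            // Copy the section straight from the source string, then the separator.
            result.Append(inputString, subStart, sectionLength).Append(splitChar);
        }

        // Final section, which may be shorter than sectionLength.
        result.Append(inputString, subStart, length - subStart);
        return result.ToString();
    }
}
```

With the example from the question, `StringSplitter.SplitString("ABCDEFGHIJ", 4, '-')` returns "ABCD-EFGH-IJ". Because the capacity is exact, the builder should not need to grow its buffer while appending.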