1

I need to repeatedly append strings (for about 50 times), which is a substring of another StringBuilder. I need to do this for around 30k inputs. It takes me a time of around 6 minutes.

 StringBuilder input = new StringBuilder(10000);
    StringBuilder output = new StringBuilder(10000);

//for loop till end of file which reads strings into variable 'input'

{
output.append(input.substring(1, 8));
output.append(input.substring(33, 45));
output.append(input.substring(20, 25)); // and so on
}

This took around 6 minutes.

So, i tried something like

{


 output.append(input.substring(1, 8) + output.append(input.substring(33, 45) + output.append(input.substring20, 25) + .. // and so on );

}

This, has also taking the same time. I know both of these are same.

But, eventhough I use a StringBuilder, why still I have performance lag? Is there something I'm doing wrong?

I referred: StringBuilder vs String concatenation in toString() in Java and String concatenation in Java - when to use +, StringBuilder and concat and some more. Most of them suggest to use a StringBuilder.

Community
  • 1
  • 1
Harbinger
  • 762
  • 2
  • 14
  • 36
  • I suggest monitoring GC activity while your loop is in progress. You may be running into memory limits. – Marko Topolnik Feb 25 '15 at 11:48
  • 2
    " I know both of these are same." No they're not. Your second code converts `output` to a string on each iteration, and uses string concatenation. You definitely don't want to do that. – Jon Skeet Feb 25 '15 at 11:49
  • Maybe this is your issue: "for loop till end of file which reads strings into variable 'input' "? Show us how you set up the reader and how you are reading. – Seelenvirtuose Feb 25 '15 at 11:49
  • 1
    There is no way that appending 50 small substrings, 50K times take much more than a few seconds. If it is taking longer it is more likely to be doing work somewhere you haven't mentioned. I suggest you profile your application. – Peter Lawrey Feb 25 '15 at 11:50
  • @Seelenvirtuose : Actually I read them from queue.. not from a file.. so, I read till the queue size is zero. – Harbinger Feb 25 '15 at 11:52

2 Answers2

3

You can definitely avoid creating so many objects using the overload of append which allows you to specify a subsequence:

for (...)
{
    output.append(input, 1, 8);
    output.append(input, 33, 45);
    output.append(input, 20, 25);
}

That may or may not help you. The string concatenation in your second example should definitely be avoided - I'm surprised that isn't making a huge difference... that suggests it may well not be the appending that's taking the time, but whatever's reading the input.

To test that, you should possibly try an empty for loop, so that you're still reading all the same input, but not appending to output at all.

Jon Skeet
  • 1,421,763
  • 867
  • 9,128
  • 9,194
  • The short-lived strings will just increase minor GC frequency. The chance that this is the problem is vanishingly small, especially given the small size of individual substrings. – Marko Topolnik Feb 25 '15 at 12:22
  • @MarkoTopolnik: Agreed. Basically we really can't tell where the time is going here - hence my suggestion of just cycling through the inputs. – Jon Skeet Feb 25 '15 at 12:55
1

Most likely the performance issue is elsewhere as this shouldn't take more than a second. I suggest you profile your application to determine where it is actually spending it's processing time.

long start = System.currentTimeMillis();
char[] chars = new char[500];
Arrays.fill(chars, '.');
for (int i = 0; i < 30000; i++) {
    String input = new String(chars);
    StringBuilder output = new StringBuilder();
    for (int j = 0; j < 50; j++) {
        output.append(input, j * 10, j * 10 + 9);
    }
    String out = output.toString();
}
System.out.println("Took: " + (System.currentTimeMillis() - start) / 1e3 + " seconds");

prints for 50 substrings of 30,000 strings

Took: 0.058 seconds
Peter Lawrey
  • 525,659
  • 79
  • 751
  • 1,130