
I'm running a small program that processes around 215K records in the database. These records contain XML that JAXB marshals and unmarshals to objects.

The program was trying to find XML documents that, for legacy reasons, can no longer be unmarshalled. Each time I got an unmarshal exception, I saved the exception message, which contains the XML, in an ArrayList. At the end I wanted to send out a mail listing all failed records with the cause exception message, so I used the messages in the ArrayList together with a StringBuilder to compose the email body.

However, there were around 75K failures, and while I was building the body the StringBuilder just stopped appending at a certain point in the for loop and the thread was blocked. I have since changed my approach so that it no longer appends the XML from the exception message, but I'm still not clear why it didn't work.

Could it be that the VM ran out of memory, or can Strings only be of a certain size (which I doubt, certainly in the 64-bit era)? Is there a better way I could have solved this? I contemplated passing the StringBuilder to my service instead of saving the strings in an ArrayList first, but that would make for such a dirty interface :(

Any architectural insights would be appreciated.

EDIT: As requested, here is the code; it's no rocket science. Assume that the failures list contains around 75K entries and that each entry contains an XML document of 500 to 1000 lines on average.

  private String createBodyMessage(List<String> failures) {
    StringBuilder builder = new StringBuilder();
    builder.append("Failed operations\n");
    builder.append("=================\n\n");
    for (String failure : failures) {
      builder.append(failure);      
      builder.append("\n");      
    }
    return builder.toString();
  }
kenny
  • possible duplicate http://stackoverflow.com/questions/1179983/how-many-characters-can-a-java-string-have – Binkan Salaryman Mar 27 '15 at 10:27
  • You could create a compressed attachment, either a zip or one single text file. The latter can be written with GZIPOutputStream as `.txt.gz`. There are no hidden caveats with StringBuilder, though you could give it a sufficiently high initial capacity. Mails do have issues of their own, such as allowed size and the encoding of subject and content. – Joop Eggen Mar 27 '15 at 10:32
  • added the code as requested but well not rocket science – kenny Mar 27 '15 at 10:32
  • Adding it as attachment is probably the best way to go then. It will take a bit of I/O time but considering this job should only run a few months at a time, that isn't a big issue – kenny Mar 27 '15 at 10:35

3 Answers


You might just succeed with

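// Rough initial capacity; raise the per-entry figure if the entries are large.
int sizeEstimate = failures.size() * 20;
StringBuilder builder = new StringBuilder(sizeEstimate);
builder.append("Failed operations\n");
builder.append("=================\n\n");
// Consume the list while appending so each entry can be garbage collected.
// Note: remove(0) is O(n) on an ArrayList, and the list must be modifiable.
while (!failures.isEmpty()) {
    builder.append(failures.remove(0));
    builder.append('\n');
}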

This resizes the internal buffer of the StringBuilder less often and consumes failures as it goes, which frees that memory.

It might not solve the problem if the text is too large.

Compressed attachment however is standard procedure.
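
As a minimal sketch of the compressed attachment idea, assuming the failures are still available as a List<String>: the entries are streamed one at a time into a `.txt.gz` file, so the whole report never has to exist as a single String. The class and method names (FailureReportWriter, writeCompressedReport) are made up for illustration, and attaching the file to the mail is left out.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper, not part of the original program.
class FailureReportWriter {

    /** Streams each failure into a gzip-compressed text file, one entry at a time. */
    static void writeCompressedReport(List<String> failures, Path target) throws IOException {
        try (BufferedWriter writer = new BufferedWriter(new OutputStreamWriter(
                new GZIPOutputStream(Files.newOutputStream(target)), StandardCharsets.UTF_8))) {
            writer.write("Failed operations\n");
            writer.write("=================\n\n");
            for (String failure : failures) {
                writer.write(failure);
                writer.write('\n');
            }
        } // closing the writer finishes the gzip stream and closes the file
    }
}
```

Memory use stays roughly constant regardless of how many failures there are, because only the writer's buffers are held at any time.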

Joop Eggen
  • I'm going to take the compressed attachment approach; it seems the cleanest way. Interesting approach you are taking with the while loop. Not something I've ever encountered – kenny Mar 27 '15 at 10:45

StringBuffer is backed by an array, and the maximum number of elements in an array is 2^31-1.
Reaching this size will normally throw an error on Java 7, but I'm not entirely sure.

The solution is to write your data out to a file before your StringBuffer reaches a fixed size.
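
One way to read this suggestion, as a rough sketch only: keep appending to a buffer, and whenever it passes a threshold, flush it to a file and reset it. The class name, method name and threshold parameter below are hypothetical, and a StringBuilder is used since the question does.

```java
import java.io.IOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical helper illustrating "spill to a file before the buffer gets too big".
class SpillingReportBuilder {

    static void writeReport(List<String> failures, Path target, int thresholdChars)
            throws IOException {
        StringBuilder buffer = new StringBuilder();
        try (Writer out = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            for (String failure : failures) {
                buffer.append(failure).append('\n');
                if (buffer.length() >= thresholdChars) {
                    out.append(buffer);   // spill the accumulated text to the file
                    buffer.setLength(0);  // reuse the buffer for the next batch
                }
            }
            out.append(buffer);           // write whatever is left
        }
    }
}
```

Since the BufferedWriter already buffers on its own, the intermediate StringBuilder is mostly there to mirror the wording of the answer; writing each entry straight to the writer would work as well.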

Halayem Anis

> Could it be that the VM went out of memory,

If you filled up the heap, you would get an OutOfMemoryError.

> or can Strings only be of a certain size (doubtful I believe certainly in the 64 bit era).

Actually, yes. A Java String or StringBuilder can contain at most 2^31-1 characters (see footnote 1).

> Is there a better way I could have solved this? I contemplated sending the StringBuilder to my service instead of saving the strings in an arraylist first ...

That won't help if the real problem is that the concatenation of the strings is too large to hold in a StringBuilder.

Actually, a better approach would be to stream the strings into a PipedOutputStream, and use the corresponding PipedInputStream to construct a MimeBodyPart that you then attach to the email. You could include a compressor in the stream stack too.
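
A hedged sketch of the piped-stream part, using only the JDK: a background thread writes the gzip-compressed failures into a PipedOutputStream while the caller reads them from the connected PipedInputStream. The class and method names are hypothetical, and wiring the returned stream into a MimeBodyPart is deliberately left out because it depends on the mail library in use.

```java
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper, not part of the original program.
class PipedFailureStream {

    /** Returns a stream of gzip-compressed failures produced by a background thread. */
    static PipedInputStream open(List<String> failures) throws IOException {
        PipedInputStream source = new PipedInputStream(64 * 1024);
        PipedOutputStream sink = new PipedOutputStream(source);
        Thread producer = new Thread(() -> {
            // Closing the writer finishes the gzip stream and closes the pipe,
            // which is what signals end-of-stream to the reader.
            try (Writer writer = new OutputStreamWriter(
                    new GZIPOutputStream(sink), StandardCharsets.UTF_8)) {
                for (String failure : failures) {
                    writer.write(failure);
                    writer.write('\n');
                }
            } catch (IOException e) {
                // Sketch only: real code would need to report this to the consumer.
            }
        }, "failure-report-producer");
        producer.setDaemon(true);
        producer.start();
        return source;
    }
}
```

The producer and the consumer have to run on different threads; reading and writing both ends of a pipe from a single thread can deadlock once the pipe's buffer fills up.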

But an even better approach would be not to attempt to send gigabytes of erroneous data as email attachments. Save them as files that can be fetched (or whatever) if the email recipient wants them.


1 - Surprisingly, the javadocs don't seem to state this explicitly. However, String.length() returns an int, and various string manipulation methods take int arguments to specify offsets and lengths. And certainly, the standard implementations of String and StringBuilder use a single char[] as backing store, and arrays are limited to 2^31-1 elements by the JLS and the JVM spec.
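
To see whether this limit is plausibly what was hit, the total length can be added up in a long before any concatenation is attempted. This is only a sketch: the class and method names are hypothetical, and the 40 characters per line used in the note below is an assumed figure, since the question only gives line counts.

```java
import java.util.List;

// Hypothetical helper for a back-of-the-envelope size check.
class ReportSizeCheck {

    /** Sums the lengths in a long so the total itself cannot overflow an int. */
    static long totalChars(List<String> failures) {
        long total = 0;
        for (String failure : failures) {
            total += failure.length() + 1L; // +1 for the '\n' appended after each entry
        }
        return total;
    }

    /** Rough estimate from averages; all three inputs are guesses. */
    static long estimatedChars(long entries, long linesPerEntry, long charsPerLine) {
        return entries * linesPerEntry * charsPerLine;
    }

    /** Upper bound only; the practical limit also depends on the available heap. */
    static boolean fitsInOneString(long chars) {
        return chars <= Integer.MAX_VALUE;
    }
}
```

With 75,000 entries of 750 lines each and an assumed 40 characters per line, the estimate is 75,000 × 750 × 40 = 2,250,000,000 characters, which is above 2^31-1 = 2,147,483,647. The header lines are ignored here because they are negligible at this scale.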

Stephen C