I have to transfer big files (500 MB or more, sometimes up to 1 GB). These files have to be Base64-encoded, and the encoded string has to be put in an XML file. While my code below works well for smaller files (30 - 50 MB), it fails for files greater than 100 MB.

I am using the Base64 encoder from Sun (sun.misc.BASE64Encoder).

public void execute(InputStream inputstream, OutputStream outputstream) throws StreamTransformationException{
        try
        {
            String sourceFileName = "test_file";
            String ReceiverStr = "";
            //2. Convert input data in Base64Encoded string
            BASE64Encoder encoder = new BASE64Encoder();
            byte input[] = new byte[inputstream.available()];

            inputstream.read(input);
            String base64Encoded = encoder.encode(input);
            //3. Build the SOAP request format
            String serverUrl = "http://website/url";
           
            String soapEnvelope = "<soapenv:Envelope xmlns:soapenv=\"http://schemas.xmlsoap.org/soap/envelope/\" xmlns:soap=\"http://schemas.microsoft.com/sharepoint/soap/\">";
            String soapHeader = "<soapenv:Header/><soapenv:Body><soap:CopyIntoItems><soap:SourceUrl>C:\\Users\\Desktop\\test_file.txt</soap:SourceUrl><soap:DestinationUrls><soap:string>" + serverUrl + "</soap:string></soap:DestinationUrls><soap:Fields><soap:FieldInformation " + "Type=" + "\"Text\"" + " DisplayName=\"" + sourceFileName + "\"" + " InternalName=\"" + sourceFileName + "\"" + " Id=\"deff4b5c-b727-414c-893d-c56a8e12455f\"" + " Value=\"" + sourceFileName + "\"/></soap:Fields>";
            String soapStream = "<soap:Stream>" + base64Encoded + "</soap:Stream>";
            ReceiverStr = soapEnvelope + soapHeader + soapStream + "</soap:CopyIntoItems></soapenv:Body></soapenv:Envelope>";
            //4. Write the SOAP request to receiver channel
            outputstream.write(ReceiverStr.getBytes());
        }
        catch(Exception e) {
            throw new StreamTransformationException(e.toString());  
        }
    }

When I try to view the message at run time, the entire message is not displayed: it is truncated partway through the base64Encoded string. Below is the error that I see on my system when executing the Java code.

[Image: screenshot of the error, not available]

Please note that my server settings can otherwise easily transfer 1 GB+ files without any Java heap size error or file truncation.

Can you please let me know how I can process big files using the above logic?

Thanks,

Abhishek.

Sandra Rossi
  • Run Java with the command-line option -Xmx, this option sets the maximum size of the heap. Also have a look at http://stackoverflow.com/questions/37335/how-to-deal-with-java-lang-outofmemoryerror-java-heap-space-error-64mb-heap – Balwinder Singh Sep 30 '15 at 05:53
  • @BalwinderSingh That's worthless advice. You can't just keep increasing memory when you run out like it'll solve everything. – Kayaman Sep 30 '15 at 05:54
  • I have the below settings for eclipse installed on my system: -vm C:\Program Files (x86)\Java\jdk1.6.0_20\bin\javaw.exe -showsplash com.sap.netweaver.developerstudio --launcher.XXMaxPermSize 256m -vmargs -Xmx512m -Xms128m -XX:PermSize=32m -XX:MaxPermSize=256m -Dfile.encoding=UTF-8 -Dosgi.requiredJavaVersion=1.5 – user3932624 Sep 30 '15 at 05:56
  • Please note that the settings which I am giving above are from my "Eclipse" installation. – user3932624 Sep 30 '15 at 06:06

3 Answers

1

There are plenty of things wrong with your code. First of all I recommend switching to OutputStreamWriter instead of OutputStream as your parameter (you're not writing binary data, but character data).

Write out the headers first, then process the input stream in chunks of, say, 8190 bytes (don't ever use inputstream.available() to size a buffer: it only estimates how many bytes can be read without blocking, not the length of the stream, and you won't need it). If you don't know the "standard" way of processing streams, go through Java IO Essentials. Basically, you read a chunk of data, convert it to Base64, write it out, and repeat until the input stream is exhausted. NOTE! Every chunk you encode except the last must have a size divisible by 3 (8192 is not, 8190 is), otherwise padding is applied in the middle of the output and it will mess up the result. Only the last chunk can have the padding. Keep in mind that read() may return fewer bytes than requested, so fill the buffer completely before encoding it.

After that you can write the footers, and the whole process will take barely any memory.
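The steps above can be sketched roughly as follows. This is a minimal sketch, not the asker's actual mapping code: it assumes Java 8 or later for java.util.Base64 (the question uses sun.misc.BASE64Encoder on an older JDK), and the names ChunkedBase64, writeEnvelope, header and footer are hypothetical. UTF-8 is also an assumption; the original code uses the platform default charset.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

final class ChunkedBase64 {
    // A multiple of 3, so only the final chunk can produce '=' padding.
    private static final int CHUNK_SIZE = 3 * 2730; // 8190 bytes

    /** Writes header, then the Base64 form of everything in 'in', then footer. */
    static void writeEnvelope(InputStream in, OutputStream out,
                              String header, String footer) throws IOException {
        Base64.Encoder encoder = Base64.getEncoder(); // basic encoder, no line breaks
        out.write(header.getBytes(StandardCharsets.UTF_8));

        byte[] buffer = new byte[CHUNK_SIZE];
        int filled;
        while ((filled = fill(in, buffer)) > 0) {
            // Only the last chunk can be shorter than CHUNK_SIZE.
            byte[] chunk = (filled == buffer.length) ? buffer : Arrays.copyOf(buffer, filled);
            out.write(encoder.encode(chunk));
        }

        out.write(footer.getBytes(StandardCharsets.UTF_8));
        out.flush();
    }

    /** Reads until the buffer is full or the stream ends; returns the bytes read. */
    private static int fill(InputStream in, byte[] buffer) throws IOException {
        int total = 0;
        while (total < buffer.length) {
            int n = in.read(buffer, total, buffer.length - total);
            if (n < 0) {
                break; // end of stream
            }
            total += n;
        }
        return total;
    }
}
```

With this shape, memory use stays at roughly one chunk plus its encoded form, regardless of the file size. The header and footer would be the SOAP strings built in the question, with the Base64 data written between the opening and closing soap:Stream tags.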

Kayaman
0

Looking at your code, you hold the data in memory three times:

The first time is in the byte array read from the input stream. The second time is in the encoded String, and the third is when you build the complete message and write it to the output stream.

To optimize this a bit, you can separate reading and encoding the input stream from writing the encoded string to the output stream. That way you can let go of the input byte array once it has been encoded, and its memory can be freed. An even better solution would be to have the encoder write directly to the output stream, or to write the data to an encoding output stream.

However, you will still have to decide on the maximum file size that should be processed and adjust the heap setting accordingly: How to deal with "java.lang.OutOfMemoryError: Java heap space" error (64MB heap size)
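As an illustration of writing to an encoding output stream, here is a hedged sketch. It assumes Java 8 or later, where Base64.Encoder.wrap(OutputStream) is available; the class WrappedBase64, the helper KeepOpen and the header and footer parameters are hypothetical names. Closing the wrapping stream normally closes the target stream as well, so the sketch shields the target in order to write a footer afterwards.

```java
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class WrappedBase64 {
    /** Forwards writes but turns close() into flush(), so the target stays open. */
    private static final class KeepOpen extends FilterOutputStream {
        KeepOpen(OutputStream out) {
            super(out);
        }

        @Override
        public void write(byte[] b, int off, int len) throws IOException {
            out.write(b, off, len); // bulk write instead of the byte-by-byte default
        }

        @Override
        public void close() throws IOException {
            flush(); // deliberately do not close the underlying stream
        }
    }

    /** Writes header, the Base64 form of 'in', then footer, all to 'out'. */
    static void writeEnvelope(InputStream in, OutputStream out,
                              String header, String footer) throws IOException {
        out.write(header.getBytes(StandardCharsets.UTF_8));

        OutputStream b64 = Base64.getEncoder().wrap(new KeepOpen(out));
        byte[] buffer = new byte[8192];
        int n;
        while ((n = in.read(buffer)) != -1) {
            b64.write(buffer, 0, n); // the encoding stream carries leftover bytes itself
        }
        b64.close(); // writes the final padded group, then only flushes 'out'

        out.write(footer.getBytes(StandardCharsets.UTF_8));
        out.flush();
    }
}
```

Because the encoding stream buffers any leftover bytes between writes, the read buffer size does not need to be a multiple of 3 here.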

hotzst
  • He can't use `Base64OutputStream` (I was going to suggest the same) since he needs to put the headers and footers in the stream too. – Kayaman Sep 30 '15 at 06:33
  • I changed the code and now I have just one variable that contains the data. This has reduced the overall message size, and hence there is no more Java heap size error. – user3932624 Oct 20 '15 at 10:14
-1

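What kind of heap size did you configure when you started the application? The default maximum heap size depends on the JVM version and the machine, and it can be quite small. Since you encode your complete file at once as Base64, you need a heap of several times your file size: the input byte array, the encoded String (Base64 output is 4/3 of the input length, and on older JVMs each char takes two bytes), the concatenated SOAP string and the result of getBytes() are all held in memory together.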

Check out how to set and use the VM argument "-Xmx".
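To see which limit is actually in effect, a small check like the following may help. It is only a sketch: the class name HeapCheck and the method maxHeapMiB are hypothetical, while Runtime.maxMemory() is the standard call reporting the maximum amount of memory the JVM will attempt to use.

```java
final class HeapCheck {
    /** Returns the JVM's maximum heap, in MiB, as reported by the runtime. */
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Max heap: " + maxHeapMiB() + " MiB");
    }
}
```

Running it once with and once without an option such as -Xmx512m shows whether the setting reaches the JVM that executes the code, which may not be the one Eclipse itself was started with.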

Hansjoerg Wingeier
  • I'm sorry, but offering to increase memory is very poor advice. His code has plenty of issues that can be either fixed, or hidden by increasing memory. The first option should always be preferred. – Kayaman Sep 30 '15 at 06:43