I am trying to convert a large stream (4 MB) to a String, which I eventually convert to a JSON array.

When the stream is small (a few KB) everything works fine, but as soon as it starts to process the 4 MB stream it runs out of memory.

Below is what I use to convert the stream to a String. I've tried almost everything, and I suspect the issue is with the while loop. Can someone please help?

  public String convertStreamToString(InputStream is)
            throws IOException {

        if (is != null) {
            Writer writer = new StringWriter();

            char[] buffer = new char[1024];
            try
            {
                Reader reader = new BufferedReader(
                        new InputStreamReader(is, "UTF-8"));
                int n;
                while ((n = reader.read(buffer)) != -1) 
                {
                    writer.write(buffer, 0, n);
                }
            }
            finally 
            {
                is.close();
            }
            return writer.toString();
        } else {       
            return "";
        }
    }

Update: OK, this is where I am at the moment. Am I on the right track? I think I am close, but I'm not sure what else I can close or flush to regain memory.

public String convertStreamToString(InputStream is)
        throws IOException {

    String encoding = "UTF-8";
    int maxlines = 2000;
    StringWriter sWriter = new StringWriter(7168);
    BufferedWriter writer = new BufferedWriter(sWriter);
    BufferedReader reader = null;
    if (is == null) {
        return "";
    } else {     


        try {
            int count = 0;
            reader = new BufferedReader(new InputStreamReader(is, encoding));
            for (String line; (line = reader.readLine()) != null;) {
                if (count++ % maxlines == 0) {
                    sWriter.close();
                    // not sure what else to close or flush here to regain memory
                    //Log.v("Max Lines Reached", "Max Lines Reached");;
                }

                writer.write(line);


            }
            Log.v("Finished Loop", "Looping over");


    } finally {
        is.close();
        writer.close();

    }
        return writer.toString();
    }
}
trincot
Leon Leony
  • First things first: why not invert the `if` condition at the beginning? `if (is == null) return "";` Also, try using Jackson: it has a streaming API to convert input streams to JSON. – fge Dec 22 '12 at 17:26
  • I looked into Jackson and GSON, but I'm trying to avoid them as they involve more work... This was working fine until the stream became too big to handle. – Leon Leony Dec 22 '12 at 17:30
  • Why not use a StringBuilder instead of a StringWriter? I am not sure whether this solves your memory issue, though. – MrSmith42 Dec 22 '12 at 17:32
  • You either need to increase the memory available to the JVM with the `-Xmx` setting or ... don't load the entire thing into memory. – Brian Roach Dec 22 '12 at 17:32
  • @LeonLeony: "more work"? Jackson is dead simple! Create an ObjectMapper, use its read methods, and that's it. – fge Dec 22 '12 at 17:33
  • This is an Android project. I looked into increasing the heap, but apparently that doesn't really help on Android, or at least it didn't seem to work. – Leon Leony Dec 22 '12 at 17:34
  • When you use a StringWriter and convert it to a String, you use almost twice the memory. Why not use a mutable string class while building the string out of the stream? You can also build the JSON array as you read through the stream. – coder000001 Dec 22 '12 at 17:48

1 Answer

StringWriter writes to a StringBuffer internally. A StringBuffer is basically a wrapper around a char array, and that array has a fixed capacity. When the capacity is insufficient, StringBuffer allocates a new, larger char array and copies the contents of the previous one into it, so both arrays are alive at the same time during the copy. At the end you call toString() on the StringWriter, which copies the contents of the char array once more, into the char array of the resulting String.

If you have any means of knowing beforehand what the needed capacity is, you should use the StringWriter constructor that sets the initial capacity. That avoids needlessly copying arrays to enlarge the buffer.
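
To make the effect of the initial capacity visible, here is a minimal sketch that writes the same amount of data into a StringWriter and counts how often the capacity of its internal StringBuffer changes. The class and method names (`BufferGrowthDemo`, `countReallocations`) are made up for this illustration, and the exact growth policy of StringBuffer may differ between runtimes, so treat the numbers as indicative only.

```java
import java.io.StringWriter;
import java.util.Arrays;

// Hypothetical demo class, not part of the question's code.
final class BufferGrowthDemo {

    /**
     * Writes totalChars characters in 1024-char chunks into a StringWriter
     * created with the given initial capacity, and returns how many times
     * the internal StringBuffer's capacity changed along the way.
     */
    static int countReallocations(int initialCapacity, int totalChars) {
        StringWriter writer = new StringWriter(initialCapacity);
        char[] chunk = new char[1024];
        Arrays.fill(chunk, 'x');

        int reallocations = 0;
        int lastCapacity = writer.getBuffer().capacity();
        int written = 0;
        while (written < totalChars) {
            int n = Math.min(chunk.length, totalChars - written);
            writer.write(chunk, 0, n);
            written += n;
            // A changed capacity means a new array was allocated and filled by copying.
            int capacity = writer.getBuffer().capacity();
            if (capacity != lastCapacity) {
                reallocations++;
                lastCapacity = capacity;
            }
        }
        return reallocations;
    }
}
```

Calling `countReallocations(1024, 4 * 1024 * 1024)` and `countReallocations(4 * 1024 * 1024, 4 * 1024 * 1024)` side by side shows the difference: with a buffer that is large enough from the start, no enlargement should be needed.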

Yet that doesn't avoid the final copy that happens in toString(). If you're dealing with streams that can be large, you may need to reconsider whether you really need the input stream's contents as a String. Using a sufficiently large char array directly would avoid all the copying around, and would greatly reduce memory usage.
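
As a rough sketch of the char array idea: read straight into one array and hand it out wrapped in a CharBuffer, so there is no final toString() copy. This assumes you have some estimate of the size (for example from a length header, which the question does not mention); `CharArrayReaderSketch`, `readAll` and `expectedChars` are hypothetical names. It also uses try-with-resources and `StandardCharsets`, which may not be available on older Android versions.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.CharBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, not part of the question's code.
final class CharArrayReaderSketch {

    /**
     * Reads the whole stream as UTF-8 into a single char array.
     * expectedChars is a caller-supplied estimate (an assumption of this sketch);
     * if it is accurate, the array is never reallocated.
     */
    static CharBuffer readAll(InputStream in, int expectedChars) throws IOException {
        char[] data = new char[Math.max(expectedChars, 16)];
        int length = 0;
        try (Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8)) {
            int n;
            while ((n = reader.read(data, length, data.length - length)) != -1) {
                length += n;
                if (length == data.length) {
                    // Array is full: check for end of stream before growing,
                    // so an exact estimate does not trigger a pointless copy.
                    int next = reader.read();
                    if (next == -1) {
                        break;
                    }
                    data = Arrays.copyOf(data, data.length * 2);
                    data[length++] = (char) next;
                }
            }
        }
        // Wrap instead of copying; the CharBuffer is a CharSequence view of the array.
        return CharBuffer.wrap(data, 0, length);
    }
}
```

The returned CharBuffer can be passed to anything that accepts a CharSequence. Anything that insists on a String (as `new JSONArray(String)` does) would still force one more copy, which is why processing the input incrementally is the more thorough fix.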

The ultimate solution would be to do some of the processing of the input, before all of the input has come in, so the characters that have been processed can be discarded. This way you'll only need to hold as much in memory as what is needed for a processing step.
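
One way this could look for a JSON array is sketched below: the reader is scanned character by character, and each top-level element is handed to a callback as soon as it is complete, so only one element is buffered at a time. This is a simplified illustration, not a JSON parser: it assumes well-formed input whose root is an array, and `JsonArraySplitter` and `forEachElement` are invented names. It uses `java.util.function.Consumer`, which on Android requires a fairly recent API level; a small custom interface would do the same job.

```java
import java.io.IOException;
import java.io.Reader;
import java.util.function.Consumer;

// Hypothetical helper: splits a JSON array into its top-level elements while reading.
final class JsonArraySplitter {

    /** Passes the raw text of each top-level element to handler; returns the element count. */
    static int forEachElement(Reader reader, Consumer<String> handler) throws IOException {
        StringBuilder element = new StringBuilder(); // holds only the current element
        int depth = 0;            // 0 = outside the root array, 1 = directly inside it
        boolean inString = false; // inside a JSON string literal
        boolean escaped = false;  // previous char was a backslash inside a string
        int count = 0;
        char[] buffer = new char[1024];
        int n;
        while ((n = reader.read(buffer)) != -1) {
            for (int i = 0; i < n; i++) {
                char c = buffer[i];
                if (depth == 0) {
                    if (c == '[') depth = 1; // skip everything until the root array opens
                } else if (inString) {
                    element.append(c);
                    if (escaped) escaped = false;
                    else if (c == '\\') escaped = true;
                    else if (c == '"') inString = false;
                } else if (c == '"') {
                    inString = true;
                    element.append(c);
                } else if (c == '[' || c == '{') {
                    depth++;
                    element.append(c);
                } else if (c == ']' || c == '}') {
                    depth--;
                    if (depth == 0) count += emit(element, handler); // root array closed
                    else element.append(c);
                } else if (c == ',' && depth == 1) {
                    count += emit(element, handler); // separator between top-level elements
                } else if (depth > 1 || !Character.isWhitespace(c)) {
                    element.append(c);
                }
            }
        }
        return count;
    }

    /** Hands the buffered element to the handler and clears the buffer for reuse. */
    private static int emit(StringBuilder element, Consumer<String> handler) {
        if (element.length() == 0) return 0; // e.g. an empty array
        handler.accept(element.toString());
        element.setLength(0);
        return 1;
    }
}
```

In the handler, each element string could be parsed on its own (for instance with `new JSONObject(text)` from org.json) and then dropped, so peak memory depends on the largest element rather than on the whole 4 MB payload.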

bowmore
  • Thank you. I just tested this and also doubled the capacity, but it didn't help. I am unable to get out of the while loop without an "out of memory" crash. – Leon Leony Dec 22 '12 at 18:47
  • I tried new StringWriter(1024) and new StringWriter(5120). – Leon Leony Dec 22 '12 at 20:02
  • If you have 4 MB of data it will not fit in 5120 characters. The internal buffer will still need enlarging. And if you've read my answer carefully, you'll notice that proper initialization of the internal buffer solves only half of the problem. – bowmore Dec 22 '12 at 20:36
  • Yes, thank you. I read your answer carefully, and I understand that this solves only half of the problem. I tried increasing the buffer to 102400, but still no luck. – Leon Leony Dec 23 '12 at 17:22
  • Closing a StringWriter has no effect. And in the end you still have the entire input in the StringWriter and copy it to a String => memory usage * 2. Do you really need a String? The entire String? – bowmore Dec 23 '12 at 17:37
  • Yes, that same string is then converted to a JSON array: obj = new JSONArray(thestring); – Leon Leony Dec 23 '12 at 17:43
  • You could consider parsing the elements of the array individually. Or, if you are also the one writing the original stream, consider writing the individual elements. – bowmore Dec 23 '12 at 20:30
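
If the producer side is under your control, "writing the individual elements" could mean, for example, one JSON element per line. That format is an assumption here; nothing in the question says the stream looks like this. Under that assumption the reading side becomes a short loop that never holds more than one line. `LineDelimitedReader` and `forEachLine` are hypothetical names.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.util.function.Consumer;

// Hypothetical helper for a stream that carries one JSON element per line (assumed format).
final class LineDelimitedReader {

    /** Passes each non-blank line to handler and returns how many lines were handled. */
    static int forEachLine(Reader source, Consumer<String> handler) throws IOException {
        int count = 0;
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.trim().isEmpty()) {
                    continue; // tolerate blank lines between elements
                }
                handler.accept(line); // parse and process this one element, then let it go
                count++;
            }
        }
        return count;
    }
}
```

Note that this only works if no element contains a raw line break, which holds for JSON serialized without pretty-printing, since line breaks inside string values are escaped.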