0

I am trying to read data from a snowflake resultset and construct a line by appending all the columns as per my requirement. I have 1300 columns to read from the resultset. I wrote the below approach.

int available  = -1;
while (rs.next()) {
    for ( int i=1; i <= 1300; i++) {
        String value = rs.getString(i);
        if(value == null)
            value = "";
        row.append(value).append("\t");
    }
    byte[] line = (row.toString().trim()+"\n").getBytes(StandardCharsets.UTF_8);
    available = line.length;
    bufferedOutputStream.write(line, 0, available);
    row.setLength(0);
}
bufferedOutputStream.flush();
bufferedOutputStream.close();

This works till 100K records but fails when I tried to write 1 Million records throwing,

Java.lang.OutOfMemoryError: null
at java.base/java.io.ByteArrayOutputStream.hugeCapacity(ByteArrayOutputStream.java:125)
at java.base/java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:119)
at java.base/java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:95)
at java.base/java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:156)
at java.base/java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:81)
at java.base/java.io.BufferedOutputStream.write(BufferedOutputStream.java:127)

Please correct me If I am wrong above. Basically I need to write 10 Million records without having memory issues. Need some help here, Thanks in advance.

Edit: I am writing this into ByteArrayOutputStream to read from ByteArrayInputStream as InputStream. From this input stream will read the data for a validation process.

Akan
  • 270
  • 7
  • 21
  • You seem to be writing to a `ByteArrayOutputStream`. – Kayaman Dec 09 '21 at 09:20
  • You can allocate more heap (assuming you really do need to have all the data in a `byte[]`?) – tgdavies Dec 09 '21 at 09:21
  • What's your `BufferedOutputStream` attached to? To process 10m records you'll want it backed by something that's streaming to disk. – Andy Brown Dec 09 '21 at 09:25
  • @AndyBrown I have added it as part of edit after your question. Thanks!. I am trying write this into a ByteArrayOutputStream so that I can read from a Inputstream. My validation job accept the data as Inputstream only. – Akan Dec 09 '21 at 09:40
  • @Kayaman Yes, I want to convert this into a input stream since my donwstream system only accept as inputstream. – Akan Dec 09 '21 at 09:41
  • Does this answer your question? [How to deal with "java.lang.OutOfMemoryError: Java heap space" error?](https://stackoverflow.com/questions/37335/how-to-deal-with-java-lang-outofmemoryerror-java-heap-space-error) – 9ilsdx 9rvj 0lo Dec 09 '21 at 09:47
  • 1
    Create a temporary file. Open it with a `FileOutputStream` wrapped in your `BufferedOutputStream`. Do your processing. Close the file. Re-open it with a `FileInputStream` wrapped in a `BufferedInputStream` and give that `BufferedInputStream` to your downstream API. Delete the file when it's finished. – Andy Brown Dec 09 '21 at 09:48
  • Or create your implementation of `InputStream` that wraps the logic in your question and pass that downstream if possible. Although it'll be a lot easier to just use a temp file. – Kayaman Dec 09 '21 at 09:50

1 Answers1

2

Remove the row.

try {
    while (rs.next()) {
        for (int i = 1; i <= 1300; i++) {
            String value = rs.getString(i);
            if (value != null && !value.isEmpty()) {
                byte[] b = value.getBytes(StandardCharsets.UTF_8);
                bufferedOutputStream.write(b);
            }
            bufferedOutputStream.write('\t');
        }
        bufferedOutputStream.write('\n');
    }
} finally {
    bufferedOutputStream.close();
}

It could be that you hav CBLOB fields with huge texts. Then rs.getString could be replaced by a reader.

Joop Eggen
  • 107,315
  • 7
  • 83
  • 138
  • Good one but I am still getting the same java.lang.OutOfMemoryError: null error. – Akan Dec 09 '21 at 13:11
  • 1
    Where is that ByteArrayOutputStream used? Hopefully not wrapped in the BufferedOutputStream above? Note: `new ByteArrayOutputStream(400_000_000)` or such might help (less resizing). Using a compression, `GZipOutputStream` too. – Joop Eggen Dec 09 '21 at 14:45
  • Yes, its wrapped around BufferedOutputStream `bufferedOutputStream = new BufferedOutputStream(byteArrayOutputStream);` – Akan Dec 09 '21 at 16:59