
I am parsing a 2GB CSV file using the Apache Commons CSV library, but I am running into a heap memory issue.

The error is: nested exception is java.lang.OutOfMemoryError: Java heap space

Reader reader = new InputStreamReader(inputStream);
List<SiebelRecord> siebelRecords = new ArrayList<>();
CSVParser csvParser = null;
try {
    csvParser = new CSVParser(reader, CSVFormat.DEFAULT
            .withEscape('/')
            .withFirstRecordAsHeader()
            .withDelimiter('|')
            .withIgnoreHeaderCase()
            .withTrim());

    List<CSVRecord> recordList = csvParser.getRecords();
    siebelRecords = recordList.stream().sequential()
            .map(csvRecord -> new SiebelRecord(
                    csvRecord.get("CUSTOMER_ID"), csvRecord.get("CUSTOMER_NAME"),
                    csvRecord.get("CUSTOMER_ORG"), csvRecord.get("CUSTOMER_PIN"),
                    csvRecord.get("CUSTOMER_TYPE"), csvRecord.get("CUSTOMER_STATUS"),
                    csvRecord.get("CUSTOMER_DOM"), csvRecord.get("BILLING_ID"),
                    csvRecord.get("BILLING_NAME"), csvRecord.get("BILLING_NUMBER"),
                    csvRecord.get("BILLING_STATUS"), csvRecord.get("BILLING_PIN"),
                    csvRecord.get("BILLING_ACCOUNT_TYPE"), csvRecord.get("BILLING_METHOD"),
                    csvRecord.get("BILLING_TYPE"), csvRecord.get("SERVICE_ID"),
                    csvRecord.get("SERVICE_TYPE"), csvRecord.get("CONNECTION_STATUS"),
                    csvRecord.get("SERVICE_PIN"), csvRecord.get("PRIMARY_SERVICE_ID"),
                    csvRecord.get("ROOT_ASSET_ID"), csvRecord.get("PRODUCT_NAME"),
                    csvRecord.get("CONNECTION_NAME"), csvRecord.get("SECONDARY_SERVICE_ID")))
            .collect(Collectors.toList());
} finally {
    inputStream.close();
    reader.close();
    if (csvParser != null) {
        csvParser.close();
    }
}

Is there a property I am missing, or is this a library issue?

David

1 Answer


My answer is a little tangential to increasing heap space: rather than loading the whole file into the JVM, I would parse it record by record. Here is a link to a similar question whose last answer demonstrates how to handle huge files with a buffered reader: How can I process a large file via CSVParser?
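The heap pressure in the question comes from csvParser.getRecords(), which reads every record into a List before the stream even starts. CSVParser is Iterable<CSVRecord>, so you can iterate it lazily and hold only one record in memory at a time. Below is a minimal sketch of that approach; the StreamingCsvDemo class, the countRecords helper, and the two-row sample input are my own illustrations standing in for the 2GB file, and the format settings simply mirror the ones in the question:

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVParser;
import org.apache.commons.csv.CSVRecord;

public class StreamingCsvDemo {

    // Streams the input one CSVRecord at a time instead of calling
    // getRecords(); real per-record work (e.g. building a SiebelRecord
    // and writing it somewhere) would replace the body of the loop.
    static long countRecords(Reader reader) throws IOException {
        try (CSVParser parser = new CSVParser(reader, CSVFormat.DEFAULT
                .withDelimiter('|')
                .withFirstRecordAsHeader()
                .withIgnoreHeaderCase()
                .withTrim())) {
            long count = 0;
            for (CSVRecord record : parser) {
                // Only this one record is resident in memory here.
                String id = record.get("CUSTOMER_ID");
                count++;
            }
            return count;
        }
    }

    public static void main(String[] args) throws IOException {
        // Sample stand-in for the 2GB file; in practice you would wrap
        // the file's InputStream in an InputStreamReader instead.
        String csv = "CUSTOMER_ID|CUSTOMER_NAME\n1|Alice\n2|Bob\n";
        System.out.println(countRecords(new StringReader(csv)));
    }
}
```

The try-with-resources block also closes the parser (and the underlying reader) automatically, which removes the need for the manual finally block in the question. Note that streaming only helps if you also avoid collecting all the mapped objects into one List; process or write them out as you go.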

Arpan Kanthal