
I have 1 million records in an Excel sheet. The client requirement is to convert this file to CSV format.

I tried the following code:

File src = new File("C:\\test.xlsx");
File dest = new File("C:\\test.csv");
src.renameTo(dest);

This produces a .csv file, but I get an error when I open it.

I also tried the following code:
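For context, a rename changes only the file name. The contents stay in Excel's format (an .xlsx file is a ZIP archive), so a program expecting comma-separated text cannot read the result. Here is a minimal sketch that checks this on a temporary file. `RenameDemo` and `bytesSurviveRename` are hypothetical names, and it uses `Files.move` rather than `File.renameTo` only because `Files.move` reports failures as exceptions.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Arrays;

// Hypothetical demo class: shows that a rename leaves the file's bytes untouched.
class RenameDemo {

    // Writes `content` to a temp file named test.xlsx, renames it to test.csv,
    // and reports whether the bytes are identical after the rename.
    static boolean bytesSurviveRename(byte[] content) {
        try {
            Path dir = Files.createTempDirectory("rename-demo");
            Path src = dir.resolve("test.xlsx");
            Path dest = dir.resolve("test.csv");
            Files.write(src, content);
            Files.move(src, dest, StandardCopyOption.REPLACE_EXISTING);
            byte[] after = Files.readAllBytes(dest);
            Files.delete(dest);
            Files.delete(dir);
            return Arrays.equals(content, after);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // 'P', 'K', 3, 4 is the signature that starts a ZIP archive such as an .xlsx file
        byte[] fakeXlsx = {'P', 'K', 3, 4, 20, 0};
        System.out.println("bytes unchanged by rename: " + bytesSurviveRename(fakeXlsx));
    }
}
```

The file has to be read as a workbook and its cell values written out as text for the result to be real CSV.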

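import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.Iterator;

import org.apache.poi.hssf.usermodel.HSSFSheet;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.Row;

class XlstoCSV 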
  {
    public static void main(String[] args) 
    {
            File inputFile = new File("C:\\test.xls");
            File outputFile = new File("C:\\output.csv");
              // For storing data into CSV files
    StringBuffer data = new StringBuffer();
    try 
    {
    FileOutputStream fos = new FileOutputStream(outputFile);

    // Get the workbook object for XLS file
    HSSFWorkbook workbook = new HSSFWorkbook(new FileInputStream(inputFile));
    // Get first sheet from the workbook
    HSSFSheet sheet = workbook.getSheetAt(0);
    Cell cell;
    Row row;

    // Iterate through each rows from first sheet
    Iterator<Row> rowIterator = sheet.iterator();
    while (rowIterator.hasNext()) 
    {
            row = rowIterator.next();
            // For each row, iterate through each columns
            Iterator<Cell> cellIterator = row.cellIterator();
            while (cellIterator.hasNext()) 
            {
                    cell = cellIterator.next();

                    switch (cell.getCellType()) 
                    {
                    case Cell.CELL_TYPE_BOOLEAN:
                            data.append(cell.getBooleanCellValue() + ",");
                            break;

                    case Cell.CELL_TYPE_NUMERIC:
                            data.append(cell.getNumericCellValue() + ",");
                            break;

                    case Cell.CELL_TYPE_STRING:
                            data.append(cell.getStringCellValue() + ",");
                            break;

                    case Cell.CELL_TYPE_BLANK:
                            data.append("" + ",");
                            break;

                    default:
                            data.append(cell + ",");
                    }
            }
            // End the CSV line once per row, not once per cell
            data.append('\n');
    }

    fos.write(data.toString().getBytes());
    fos.close();
    }
    catch (FileNotFoundException e) 
    {
            e.printStackTrace();
    }
    catch (IOException e) 
    {
            e.printStackTrace();
    }
    }
  }

But my code fails on this line:

HSSFWorkbook workbook = new HSSFWorkbook(new FileInputStream(inputFile));

I get a heap memory error on that line itself. I am not sure how to process this much data in Java. I even tried Apache POI sample code, but that also fails.

Can anybody help me with this?

2 Answers


Use Apache POI to import the Excel file and MapDB to cache the rows in a disk-based file database.

MHDx

Easiest way (assuming the code works on a smaller file): increase the maximum heap size available to the JVM, for example with the -Xmx option.
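To confirm that a larger heap setting actually took effect, you can print the limit the running JVM reports. This is a small sketch, not part of the conversion itself; `HeapCheck` and `maxHeapMegabytes` are hypothetical names.

```java
// Hypothetical helper: reports the heap ceiling the running JVM was started with.
class HeapCheck {

    // Runtime.maxMemory() returns the maximum amount of memory, in bytes,
    // that the JVM will attempt to use for the heap.
    static long maxHeapMegabytes() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Max heap: " + maxHeapMegabytes() + " MB");
    }
}
```

Run it with the same options as the converter, e.g. `java -Xmx2g HeapCheck` (2g is only an illustrative value; the size needed depends on the workbook).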

Alternatively, you can write the file line by line:

Path outputFile = Paths.get("C:\\output.csv");

// inputFile is the File pointing at the .xls workbook, as in the question
HSSFWorkbook workbook = new HSSFWorkbook(new FileInputStream(inputFile));
HSSFSheet sheet = workbook.getSheetAt(0);

for (Row row : sheet) {
  List<String> csv = new ArrayList<>();
  for (Cell cell : row) {
    //add the logic with csv.add(cell.getXXXValue()); etc., no comma here
  }
  String csvRow = String.join(",", csv) + "\n";
  // CREATE makes the file on the first row; APPEND adds every later row to the end
  Files.write(outputFile, csvRow.getBytes(StandardCharsets.UTF_8), StandardOpenOption.CREATE, StandardOpenOption.APPEND);
}
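The loop above reopens the output file for every row. A variation, sketched here with only the standard library, keeps one buffered writer open for the whole run and quotes any value that contains a comma, a double quote, or a line break. `CsvRows`, `escape`, `toLine`, and `writeAll` are hypothetical names; the rows are passed in as lists of strings, so the POI cell-reading logic is assumed to live elsewhere.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: writes rows to a CSV file one line at a time.
class CsvRows {

    // Wrap the field in double quotes if it contains a comma, a quote, or a
    // line break, and double any embedded quotes.
    static String escape(String field) {
        if (field.contains(",") || field.contains("\"")
                || field.contains("\n") || field.contains("\r")) {
            return "\"" + field.replace("\"", "\"\"") + "\"";
        }
        return field;
    }

    // Join the escaped cells of one row with commas (no trailing comma).
    static String toLine(List<String> cells) {
        List<String> escaped = new ArrayList<>();
        for (String cell : cells) {
            escaped.add(escape(cell));
        }
        return String.join(",", escaped);
    }

    // Stream rows to the file through a single writer, so only the current
    // row's text is held in memory at a time.
    static void writeAll(Path out, Iterable<List<String>> rows) {
        try (BufferedWriter writer = Files.newBufferedWriter(out, StandardCharsets.UTF_8)) {
            for (List<String> row : rows) {
                writer.write(toLine(row));
                writer.write('\n');
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

This only bounds the memory used for the output. The workbook itself is still loaded in full by `new HSSFWorkbook(...)`, which is where the question reports the failure.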
assylias