0

I am using the accepted answer from here. Basically, I am converting a csv to .xlsx, and it looks like the solution pulls everything in individual cells into 1 line using the buffered reader, and then using:

String str[] = currentLine.split(",");

.. the string is split up into separate parts of the array for each column. My problem is that in some of my data, there are commas, so the algorithm gets confused and makes more columns than needed, splitting sentences into different columns which doesn't really work for me. Is there another way I can split the sentences up perhaps? I'd happily split the string up using a different unique character (maybe |?), but I don't know how to replace the comma provided by the bufferedreader. Any help would be great. Code I am using below for reference:

public static void csvToXLSX() {
try {
    String csvFileAddress = "test.csv"; //csv file address
    String xlsxFileAddress = "test.xlsx"; //xlsx file address
    XSSFWorkbook workBook = new XSSFWorkbook();
    XSSFSheet sheet = workBook.createSheet("sheet1");
    String currentLine=null;
    int RowNum=0;
    BufferedReader br = new BufferedReader(new FileReader(csvFileAddress));
    while ((currentLine = br.readLine()) != null) {
        String str[] = currentLine.split(",");
        RowNum++;
        XSSFRow currentRow=sheet.createRow(RowNum);
        for(int i=0;i<str.length;i++){
            currentRow.createCell(i).setCellValue(str[i]);
        }
    }

    FileOutputStream fileOutputStream =  new FileOutputStream(xlsxFileAddress);
    workBook.write(fileOutputStream);
    fileOutputStream.close();
    System.out.println("Done");
} catch (Exception ex) {
    System.out.println(ex.getMessage()+"Exception in try");
}
}
Community
  • 1
  • 1
jimjamian
  • 93
  • 1
  • 10
  • You should follow the naming conventions. That means `rowNum` instead of `RowNum`. Also, `String[] str` is preferred rather than `String str[]`, see [JLS 7](http://docs.oracle.com/javase/specs/jls/se7/jls7.pdf). – MC Emperor Sep 15 '16 at 16:07
  • The conventions used were in the accepted answer, but thanks. – jimjamian Sep 15 '16 at 16:28

1 Answers1

0

Well, CSV is something more than just text file with lines separated with commas.

For example, some fields in CSV can be quoted; this is the way comma is escaped within one field.

Quotes are quoted as well, with double-quotes.

And there also could be newlines within one CSV line, they must also be quoted.

So, to sum up, a CSV lines

1,"2,3","4
5",6,7,""""

should be parsed to array of "1", "2,3", "4\n5", "6", "7","\"" (and that is a single row of a CSV table).

As you can see, you can't just mindlessly split every line by comma. I suggest you to use some library instead of doing this by yourself. http://www.liquibase.org/javadoc/liquibase/util/csv/opencsv/CSVReader.html will work just fine.

Nartallax
  • 123
  • 6
  • Okay, I'm trying to use this library, but I am not sure it is any more obvious to get around this issue. I can read everything in, but I still need to figure out how to escape certain commas and consider others, so that the columns are generated correctly. What else can be done with the library to get around the issue? – jimjamian Sep 15 '16 at 16:20