I have to parse a CSV file whose lines can look like the following:

("FOO, BAR BAZ", 42)

And yield the two fields:

FOO, BAR BAZ  
42

I'm not sure how to do this succinctly using Apache Commons CSV or OpenCSV, so I'm looking for some guidance. It may just be that I don't fully understand the org.apache.commons.csv.CSVFormat property "quoteChar" which is touched on in the documentation but never clearly explained anywhere I could find. If so, it'd be very helpful if you could point me towards better documentation of that feature.

Here's a brief example that shows my problem as well as what I've tried and the results:

        String line = "(\"FOO, BAR BAZ\", 42)";
        int numTries = 5;
        CSVParser[] tries = new CSVParser[numTries];
        tries[0] = CSVParser.parse(line, CSVFormat.DEFAULT.withRecordSeparator("\n"));//BAR BAZ"
        tries[1] = CSVParser.parse(line, CSVFormat.DEFAULT.withQuote('"'));//BAR BAZ"
        tries[2] = CSVParser.parse(line, CSVFormat.DEFAULT.withQuote(null));//BAR BAZ"
        tries[3] = CSVParser.parse(line, CSVFormat.DEFAULT.withQuote('"').withQuoteMode(QuoteMode.NON_NUMERIC));//BAR BAZ"
        tries[4] = CSVParser.parse(line, CSVFormat.DEFAULT.withRecordSeparator(")\n("));//BAR BAZ"

        for(int i = 0; i < numTries; i++){
            CSVRecord record = tries[i].getRecords().get(0);
            System.out.println(record.get(1));//.equals("42"));
        }  

Note that it works fine if you exclude the parentheses from the input.

Eric M.

3 Answers

You can use OpenCSV's CSVReader to read the data and get the data elements as shown below:

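import com.opencsv.CSVReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;

public static void main(String[] args) {
    // try-with-resources closes both the file and the CSV reader
    try (FileReader fr = new FileReader(new File("C:\\Sample.txt"));
         CSVReader csvReader = new CSVReader(fr)) {
        // readNext() returns the fields of the next line as a String[]
        String[] data = csvReader.readNext();
        for (String field : data) {
            System.out.println(field);
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
}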
Vasu

For me, the default format of commons-csv does the right thing for a correctly formatted CSV line:

    Reader in = new StringReader("\"FOO, BAR BAZ\", 42");
    Iterable<CSVRecord> records = CSVFormat.DEFAULT.parse(in);
    for (CSVRecord record : records) {
        for(int i = 0;i < record.size();i++) {
            System.out.println("At " + i + ": " + record.get(i));
        }
    }

Leads to:

At 0: FOO, BAR BAZ
At 1:  42

For the specially formatted lines, you likely need to do a bit more handling to remove those parentheses:

    BufferedReader lineReader = new BufferedReader(
            new StringReader("(\"FOO, BAR BAZ\", 42)\n(\"FOO, BAR FOO\", 44)"));

    while(true) {
        String line = lineReader.readLine();
        if (line == null) {
            break;
        }

        String adjustedLine = line.substring(1, line.length() - 1);
        records = CSVFormat.DEFAULT.parse(new StringReader(adjustedLine));
        for (CSVRecord record : records) {
            for (int i = 0; i < record.size(); i++) {
                System.out.println("At " + i + ": " + record.get(i));
            }
        }
    }
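
As a sketch of a slightly more defensive version of the `substring` call above: the hypothetical helper `stripParens` (my own name, not part of Commons CSV or OpenCSV) removes the parentheses only when both are present, so a line without them passes through instead of losing its first and last characters. As far as I understand it, the parentheses matter because the quote is then no longer the first character of the field, so it is not treated as a quote.

```java
public class ParenStripper {

    /**
     * Removes one pair of wrapping parentheses if both are present.
     * Lines that are not wrapped are returned trimmed but otherwise unchanged.
     * Hypothetical helper for illustration; not a library method.
     */
    public static String stripParens(String line) {
        String trimmed = line.trim();
        if (trimmed.length() >= 2 && trimmed.startsWith("(") && trimmed.endsWith(")")) {
            return trimmed.substring(1, trimmed.length() - 1);
        }
        return trimmed;
    }

    public static void main(String[] args) {
        // Wrapped line: the parentheses are removed before CSV parsing.
        System.out.println(stripParens("(\"FOO, BAR BAZ\", 42)"));
        // Unwrapped line: returned as-is.
        System.out.println(stripParens("\"FOO, BAR BAZ\", 42"));
    }
}
```

The returned string can then be passed to `CSVFormat.DEFAULT.parse(new StringReader(...))` exactly as in the loop above.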
centic
  • That's what I expected, however I get `invalid char between encapsulated token and delimiter` and the solutions [here](http://stackoverflow.com/questions/26729799/invalid-char-between-encapsulated-token-and-delimiter-in-apache-commons-csv-libr) suggest changing _withQuote_ to fix it. – Eric M. Nov 15 '16 at 15:37
  • Can you include a minimal test case in your question? I don't see where your code differs; mine works exactly as posted and does not report any error. At least check which quotes you have in your text; it may fail if they are "typographical quotes", e.g. added by Word. – centic Nov 15 '16 at 17:05
  • Ah, I see the problem. You don't have the enclosing parentheses around your input line. Probably my fault though -- they were initially left out of the question. I've added a test case, though. – Eric M. Nov 15 '16 at 17:48
  • You will need to split the lines and remove those parentheses manually before doing the CSV parsing; likely none of the libraries can parse input like that... – centic Nov 16 '16 at 05:30

You can achieve this with opencsv as follows:

import com.opencsv.CSVReader;
import java.io.FileReader;
import java.io.IOException;

public class NewClass1 {
    public static void main(String[] args) throws IOException {
        String fileName = "C:\\yourFile.csv";
        String [] nextLine;
        // Use the three-arg constructor to tell the reader which delimiter your file uses (2nd arg: here ',').
        // Change this to '\t' for a tab-separated file, or ';', ':', or whatever your delimiter is.
        // 3rd arg: '"' if your fields are double-quoted, '\'' if single-quoted; omit it if the fields are not quoted.
        CSVReader reader = new CSVReader(new FileReader(fileName), ',' ,'"');
        // nextLine[] is an array of values from the line
        // each line represented by String[], and each field as an element of the array
        while ((nextLine = reader.readNext()) != null) {        
            System.out.println("nextLine[0]: " +nextLine[0]);
            System.out.println("nextLine[1]: " +nextLine[1]);
        }
    }
}
Eritrean