
I get the following error while parsing a CSV file with the Apache Commons CSV library.

    Exception in thread "main" java.io.IOException: (line 2) invalid char between encapsulated token and delimiter
        at org.apache.commons.csv.Lexer.parseEncapsulatedToken(Lexer.java:275)
        at org.apache.commons.csv.Lexer.nextToken(Lexer.java:152)
        at org.apache.commons.csv.CSVParser.nextRecord(CSVParser.java:450)
        at org.apache.commons.csv.CSVParser.getRecords(CSVParser.java:327)
        at parse.csv.file.CSVFileParser.main(CSVFileParser.java:29)

What does this error mean?

Basil Bourque
Santhosh Sridhar
  • Can you show line 2 of your CSV file specifically, if the file is long? – Santhosh Nov 04 '14 at 07:35
  • Here is the sample line 2: "---","88104310D64DCG","10-20-2014","10:03 AM","10-20-2014","10:03 AM","00:00:00"," "," ","172.21.128.74"," ","h323",256," ","OUTGOING",45,1,0," "," ","user:---","172.21.128.74"," "," "," "," "," "," "," ","Failed Attempt;"The call has ended.; Rolling Over."",16,0,0.0,0.0,0.0,0.0,0.0,0.0,0,0,0.0,0.0,0,0 – Santhosh Sridhar Nov 04 '14 at 09:12
  • One more observation: if I open the CSV file in Microsoft Excel, make a modification, and save it, the parser program then works. – Santhosh Sridhar Nov 04 '14 at 09:17
  • @SanthoshSridhar Please put that additional information neatly into the Question rather than posting as comments. Use the "edit" link below your Question’s tags (if in a web browser). – Basil Bourque Jun 26 '15 at 20:15

5 Answers


We ran into this issue when our data contained an embedded quote.

0,"020"1,"BS:5252525  ORDER:99999"4

The solution we applied was to disable quote handling: CSVFormat csvFileFormat = CSVFormat.DEFAULT.withQuote(null);

@Cuga's tip helped us resolve this. Thanks, @Cuga.

The full code is:

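    import java.io.FileReader;
    import java.io.IOException;
    import java.util.List;

    import org.apache.commons.csv.CSVFormat;
    import org.apache.commons.csv.CSVParser;
    import org.apache.commons.csv.CSVRecord;

    public class CsvQuoteDemo {
        public static void main(String[] args) throws IOException {
            // Passing null to withQuote() disables quote handling, so embedded
            // quote characters are read as ordinary data.
            CSVFormat csvFileFormat = CSVFormat.DEFAULT.withQuote(null);
            String fileName = "test.csv";

            // try-with-resources closes the reader and parser even if parsing fails.
            try (FileReader fileReader = new FileReader(fileName);
                 CSVParser csvFileParser = new CSVParser(fileReader, csvFileFormat)) {
                List<CSVRecord> csvRecords = csvFileParser.getRecords();
                for (CSVRecord csvRecord : csvRecords) {
                    System.out.println(csvRecord);
                }
            }
        }
    }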

The result is:

CSVRecord [comment=null, mapping=null, recordNumber=1, values=[0, "020"1, "BS:5252525  ORDER:99999"4]]
Anand

That line in the CSV file contains an invalid character between the end of one of your cells and whatever should follow it: the next cell, the end of the line, or the end of the file. A very common cause is a failure to escape your encapsulating character (the character used to "wrap" each cell, so the parser knows where a cell (token) starts and ends).
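
To make the rule concrete, here is a minimal hand-rolled sketch in plain Java (it is not the Commons CSV lexer) that reports where a closing quote is followed by something other than a delimiter or the end of the line. The class and method names are hypothetical. It assumes a single-character delimiter, doubled quotes as the escape, and that whitespace after a closing quote is tolerated.

```java
final class QuoteCheck {
    /**
     * Returns the index of the first character that follows a closing quote
     * without being whitespace or the delimiter, or -1 if no such character
     * is found. Illustrative only; unterminated quoted fields return -1.
     */
    static int firstInvalidCharAfterQuote(String line, char delimiter, char quote) {
        int n = line.length();
        int i = 0;
        while (i < n) {
            if (line.charAt(i) == quote) {
                i++; // step past the opening quote
                // Scan to the closing quote, treating a doubled quote as an escaped one.
                while (i < n) {
                    if (line.charAt(i) == quote) {
                        if (i + 1 < n && line.charAt(i + 1) == quote) {
                            i += 2;
                            continue;
                        }
                        break;
                    }
                    i++;
                }
                if (i >= n) {
                    return -1; // unterminated quoted field: out of scope for this sketch
                }
                i++; // step past the closing quote
                // Skip whitespace between the closing quote and the delimiter.
                while (i < n && line.charAt(i) != delimiter
                        && Character.isWhitespace(line.charAt(i))) {
                    i++;
                }
                if (i < n && line.charAt(i) != delimiter) {
                    return i; // something other than a delimiter follows the quote
                }
            } else {
                // Unquoted field: skip ahead to the next delimiter.
                while (i < n && line.charAt(i) != delimiter) {
                    i++;
                }
            }
            i++; // step past the delimiter
        }
        return -1;
    }
}
```

For the fragment "Failed Attempt;"The call has ended.; Rolling Over."",16 from the question's line 2, this returns 17, the index of the T that directly follows the quote after "Failed Attempt;".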

Steve Siebert

I found the solution to the problem. One of my CSV files has an attribute as follows: "attribute with nested "quote" "

The parser fails because of the nested quote in the attribute.

To avoid the problem, escape each nested quote by doubling it: "attribute with nested ""quote"" "

This is one way to solve the problem.
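
If you control the program that writes the file, the escaping can be applied in code. A minimal sketch, assuming the common CSV convention (as in RFC 4180) that a quote inside a quoted field is written as two quotes; the class and method names are made up for illustration:

```java
final class CsvQuoting {
    /** Wraps a value in double quotes and doubles any quote character it contains. */
    static String quoteField(String value) {
        return "\"" + value.replace("\"", "\"\"") + "\"";
    }
}
```

Calling quoteField("attribute with nested \"quote\" ") produces "attribute with nested ""quote"" ", which a quote-aware parser reads back as the original value.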

Santhosh Sridhar
  • Looks like [the Answer by Steve Siebert](http://stackoverflow.com/a/26730109/642706) was correct. I suggest you accept his Answer (click the big empty checkmark) and delete your own Answer, moving its text to a comment on his Answer. – Basil Bourque Jun 26 '15 at 20:18

We ran into this same error with data containing quotes in otherwise unquoted input, for example:

some cell|this "cell" caused issues|other data

It was hard to find, but Apache's docs mention the withQuote() method, which accepts null as a value to disable quote handling.

We were getting the exact same error message and this (thankfully) ended up fixing the issue for us.

Cuga
  • Thanks @Cuga. We had embedded quotes and had to parse them as part of the data. Your comment helped us. – Anand Mar 31 '16 at 09:37

I ran into this issue when I forgot to call .withNullString("") on my CSVFormat. In general, this exception shows up when the format settings do not match the file, for instance when:

  • your quote symbol is wrong
  • your null string representation is wrong
  • your column separator char is wrong

Make sure you know the details of your format. Also, some programs write a leading byte-order mark (for example, Excel uses \uFEFF) to denote the encoding of the file. This can also trip up your parser.
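
If a byte-order mark is the suspect, one option is to drop it before the parser sees the input. A minimal sketch using only java.io; the helper name skipBom is hypothetical, and it assumes the stream was decoded with a charset (such as UTF-8) in which the mark shows up as the single character U+FEFF:

```java
import java.io.IOException;
import java.io.PushbackReader;
import java.io.Reader;

final class BomSkipper {
    /** Returns a reader positioned after a leading U+FEFF, if one is present. */
    static Reader skipBom(Reader in) throws IOException {
        PushbackReader reader = new PushbackReader(in, 1);
        int first = reader.read();
        // Push back anything that is not a BOM so no data is lost.
        if (first != -1 && first != '\uFEFF') {
            reader.unread(first);
        }
        return reader;
    }
}
```

The returned reader can then be handed to the CSV parser in place of the original one.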

Martin Häusler