
In Java, I am reading a list of values from an Excel sheet. After reading the file, the output looks like this:

12345678,abcdefg,123,"Summer class, embedded",2012

I need to remove commas from the above output.

I used StringUtils.commaDelimitedListToStringArray() and assigned the result to a String[]. With this method, the quoted field "Summer class, embedded" is split into two results.

Is there any way to avoid this?

I want to read it as a single string.

Blake Yarbrough
user3242119
  • Do not use commaDelimitedList... – phil652 Sep 22 '15 at 18:50
  • Parsing CSV files is not as simple as you're hoping. Use a CSV library. http://stackoverflow.com/questions/200609/can-you-recommend-a-java-library-for-reading-and-possibly-writing-csv-files – Ben M. Sep 22 '15 at 18:50

3 Answers


You need to use a parsing method that is more sophisticated than a simple split on a character.

At a minimum, it should have two modes, Splitting and Skipping. The logic would then look like this:

  1. Start in Splitting mode.
  2. Read a character.
  3. If in Skipping mode, and the character is a quote, then shift to Splitting mode.
  4. Otherwise, if in Splitting mode, and the character is a comma, then split.
  5. Otherwise, if in Splitting mode, and the character is a quote, then shift to Skipping mode.
  6. Continue at 2 until all characters are read.
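The steps above can be sketched in Java roughly as follows. This is only a minimal illustration, not a full CSV parser: the class name QuoteAwareSplitter and the method split are made up for this example, and it assumes that any character which does not change the mode is simply appended to the current field. It does not handle escaped quotes ("") or fields that span several lines.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, named only for this sketch.
final class QuoteAwareSplitter {

    private enum Mode { SPLITTING, SKIPPING }

    static List<String> split(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        Mode mode = Mode.SPLITTING;                  // step 1: start in Splitting mode

        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);                 // step 2: read a character
            if (mode == Mode.SKIPPING && c == '"') {
                mode = Mode.SPLITTING;               // step 3: closing quote
            } else if (mode == Mode.SPLITTING && c == ',') {
                fields.add(current.toString());      // step 4: comma outside quotes, so split
                current.setLength(0);
            } else if (mode == Mode.SPLITTING && c == '"') {
                mode = Mode.SKIPPING;                // step 5: opening quote
            } else {
                current.append(c);                   // assumption: everything else belongs to the field
            }
        }
        fields.add(current.toString());              // the last field has no trailing comma
        return fields;
    }

    public static void main(String[] args) {
        // Prints one field per line for the sample row from the question.
        for (String field : split("12345678,abcdefg,123,\"Summer class, embedded\",2012")) {
            System.out.println(field);
        }
    }
}
```

For the sample row in the question this yields five fields, with Summer class, embedded kept together as the fourth one (the surrounding quotes are dropped).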

Learning how to parse is a very useful skill, even though there are plenty of pre-built parsers out there. There are always problems that need just enough parsing that you have to roll a new tool.

With that in mind, I'd first reach for a CSV file parsing tool. Then, in some cases, regex parsing might be a good choice. Finally, rolling your own parser might be advisable, but if you do, please read up on deterministic finite automata (DFA).
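As an illustration of the regex option, here is one commonly used pattern, shown as a sketch rather than a recommendation. It splits on a comma only when the rest of the line contains an even number of quote characters, which means the comma lies outside any quoted field. The class name RegexCsvSplit is hypothetical, and the approach assumes quotes are balanced and never escaped.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, named only for this sketch.
final class RegexCsvSplit {

    // A comma followed by zero or more complete pairs of quotes,
    // and then no further quote up to the end of the line.
    private static final Pattern COMMA_OUTSIDE_QUOTES =
            Pattern.compile(",(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)");

    static List<String> split(String line) {
        // A limit of -1 keeps trailing empty fields instead of discarding them.
        return Arrays.asList(COMMA_OUTSIDE_QUOTES.split(line, -1));
    }

    public static void main(String[] args) {
        for (String field : split("12345678,abcdefg,123,\"Summer class, embedded\",2012")) {
            System.out.println(field);
        }
    }
}
```

Unlike a mode-based splitter, this keeps the surrounding quotes in the field value, so they would still have to be stripped afterwards. The lookahead also rescans the remainder of the line for every comma, which can become slow on very long lines.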

If you learn DFAs, those who don't understand the math behind them will marvel that your parsers work, and a well-built DFA is often very fast.

Edwin Buck

Here is an example using the Apache Commons CSV library:

import java.io.FileReader;
import java.io.IOException;
import java.io.Reader;
import java.util.ArrayList;
import java.util.List;

import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVParser;
import org.apache.commons.csv.CSVRecord;

// pathToFile is the path of the CSV file to read
List<String> results = new ArrayList<String>();
try (Reader rdr = new FileReader(pathToFile); CSVParser parser = CSVFormat.DEFAULT.parse(rdr))
{
    // CSVParser iterates over records, and each CSVRecord iterates over its values
    for (CSVRecord row : parser)
    {
        for (String value : row)
        {
            // a quoted field such as "Summer class, embedded" arrives as one value
            results.add(value);
        }
    }
}
catch (IOException e)
{
    // log the error here
}
JeredM

univocity-parsers allows you to handle this without any trouble.

import com.univocity.parsers.csv.CsvParser;
import com.univocity.parsers.csv.CsvParserSettings;

CsvParserSettings settings = new CsvParserSettings();
CsvParser parser = new CsvParser(settings);
List<String[]> allRows = parser.parseAll(new FileReader(new File("/path/to/your.csv")));

Disclosure: I am the author of this library. It's open-source and free (Apache V2.0 license).

josliber
Jeronimo Backes