Possible Duplicate:
CSV parsing in Java - working example..?

I have a list of names, ages and countries in the format "Name",16,"Canada", and some entries look like "First, Second",21,"Canada". How can I separate these into fields?

I have been using .split, but I cannot get it to work for strings in this format.

orange

3 Answers

I'm using the Java CSV Library. It has two classes, one for reading and one for writing CSV, and both can handle quoted strings.

stacker

I would use OpenCSV and do something like this:

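// CSVReader lives in com.opencsv (au.com.bytecode.opencsv in older releases)
CSVReader reader = new CSVReader(new FileReader("yourfile.csv"));
String[] nextLine;
while ((nextLine = reader.readNext()) != null) {
    // nextLine[] is an array of values from the line; commas inside quotes stay in their field
    // e.g. nextLine[0] is the name, nextLine[1] the age, nextLine[2] the country
}
reader.close();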
Mike K.

There are likely libraries that can do this for you (see the previous answers). However, if you want to code it by hand, you will need to build yourself a finite state machine and examine each character in the string in turn to determine whether or not you are within quotes. You essentially need two states, IN_QUOTES and NO_QUOTE, since the examination rules differ based on your state. If you are within quotation marks, you want to ignore commas. If you are outside quotation marks, you want commas to separate your fields.

A rough sketch off the top of my head would look something like:

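import java.util.ArrayList;
import java.util.List;

public class QuotedLineSplitter {
    private enum State { NO_QUOTE, IN_QUOTES }

    public static List<String> split(String line) {
        List<String> fields = new ArrayList<String>();
        StringBuilder field = new StringBuilder();
        State state = State.NO_QUOTE;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            switch (state) {
                case NO_QUOTE:
                    // check if character is a quote or a comma. If neither, append character to field
                    if (c == '"') {
                        // change state
                        state = State.IN_QUOTES;
                    } else if (c == ',') {
                        // close the field and start a new one
                        fields.add(field.toString());
                        field = new StringBuilder();
                    } else {
                        field.append(c);
                    }
                    break;

                case IN_QUOTES:
                    // only search for a closing quote mark
                    if (c == '"') {
                        // change state
                        state = State.NO_QUOTE;
                    } else {
                        field.append(c);
                    }
                    break;
            }
        }
        // the last field is not followed by a comma, so add it after the loop
        fields.add(field.toString());
        return fields;
    }
}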

All this being said, your examination rules can become tricky and complex (do you need to handle escaped quote marks? What about UTF-8 or other charsets? etc.), and it is probably not worth your effort to reinvent the wheel when several other libraries appear to do this work for you already.
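
To give a feel for how quickly the rules grow, here is a minimal sketch of the same idea extended to one kind of escaped quote: a doubled quote ("") inside a quoted field, which is the convention RFC 4180 describes. The class name CsvLineParser and the method name parse are made up for illustration, and the sketch assumes each record sits on a single line and that backslash escapes are not used.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of any library.
public class CsvLineParser {

    // Splits one CSV line into fields. Inside quotes, "" is read as a literal quote.
    public static List<String> parse(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder field = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                        // doubled quote: keep one quote and skip the second
                        field.append('"');
                        i++;
                    } else {
                        // single quote: the quoted section ends here
                        inQuotes = false;
                    }
                } else {
                    // commas are ordinary characters while inside quotes
                    field.append(c);
                }
            } else if (c == '"') {
                inQuotes = true;
            } else if (c == ',') {
                // field separator: store the field and start a new one
                fields.add(field.toString());
                field.setLength(0);
            } else {
                field.append(c);
            }
        }
        // the last field has no trailing comma, so add it after the loop
        fields.add(field.toString());
        return fields;
    }

    public static void main(String[] args) {
        for (String f : parse("\"First, Second\",21,\"Canada\"")) {
            System.out.println(f);
        }
    }
}
```

With the sample line from the question, this should give three fields: First, Second, then 21, then Canada. Quoted fields that span several lines, stray whitespace around quotes, and other dialect quirks are still not handled, which is really the argument for using a library.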

Eric B.