172

Can anyone recommend a simple API that will allow me to read a CSV input file, do some simple transformations, and then write it back out?

A quick Google search turned up http://flatpack.sourceforge.net/, which looks promising.

I just wanted to check what others are using before I couple myself to this API.

CharlesB
David Turner
  • 1
    Use the sister site [*Software Recommendations Stack Exchange*](https://softwarerecs.stackexchange.com) when asking for suggestions on a software library. Has [several hits for Java & CSV](https://softwarerecs.stackexchange.com/search?q=java+CSV). – Basil Bourque Jan 06 '19 at 22:28
  • In my experience [uniVocity](https://github.com/uniVocity/csv-parsers-comparison) is the fastest and very customizable. – R. Oosterholt Jan 17 '23 at 10:55

10 Answers

87

I've used OpenCSV in the past.

import java.io.FileReader;
import au.com.bytecode.opencsv.CSVReader;

String fileName = "data.csv";
CSVReader reader = new CSVReader(new FileReader(fileName));

// if the first line is the header
String[] header = reader.readNext();
// iterate over reader.readNext() until it returns null
String[] line = reader.readNext();

There were some other choices in the answers to another question.

Community
Jay R.
  • Unfortunately, OpenCSV's latest download (v2.2 at time of comment) does not compile, and they don't provide a pre-built binary. – opyate Mar 21 '11 at 21:24
  • 9
    The package I downloaded from SourceForge had a binary in the deploy folder. – Mike Sickler Jun 01 '11 at 02:29
  • 8
    If you're using maven, please note that the dependency code on official website contains version declaration "2.0" which has some bugs, but there is updated version 2.3 in repositories. – broundee Jul 16 '12 at 06:15
  • this lib doesn't write file in separate thread, no? – Ewoks Oct 09 '14 at 20:04
  • 3
    According to https://github.com/uniVocity/csv-parsers-comparison, it is on average 73% slower than uniVocity. – Ewoks Sep 17 '15 at 13:39
38

Apache Commons CSV

Check out Apache Commons CSV.

This library reads and writes several variations of CSV, including the standard one, RFC 4180. It also reads and writes tab-delimited files. Its predefined formats include:

  • Excel
  • InformixUnload
  • InformixUnloadCsv
  • MySQL
  • Oracle
  • PostgreSQLCsv
  • PostgreSQLText
  • RFC4180
  • TDF
Community
  • I've used the sandboxed Commons CSV for quite some time and never experienced a problem. I really hope they promote it to full standing and get it out of the sandbox. – Alex Marshall Dec 14 '10 at 19:38
  • 3
    @bmatthews68 the sandbox link is defunct - looks like it's moved to [apache commons proper](http://commons.apache.org/proper/commons-csv/) (I edited the link in the answer too) – drevicko Jun 15 '13 at 07:50
34

Update: The code in this answer is for Super CSV 1.52. Updated code examples for Super CSV 2.4.0 can be found at the project website: http://super-csv.github.io/super-csv/index.html


The SuperCSV project directly supports the parsing and structured manipulation of CSV cells. From http://super-csv.github.io/super-csv/examples_reading.html you'll find e.g.

given a class

public class UserBean {
    String username, password, street, town;
    int zip;

    public String getPassword() { return password; }
    public String getStreet() { return street; }
    public String getTown() { return town; }
    public String getUsername() { return username; }
    public int getZip() { return zip; }
    public void setPassword(String password) { this.password = password; }
    public void setStreet(String street) { this.street = street; }
    public void setTown(String town) { this.town = town; }
    public void setUsername(String username) { this.username = username; }
    public void setZip(int zip) { this.zip = zip; }
}

and that you have a CSV file with a header. Let's assume the following content

username, password,   date,        zip,  town
Klaus,    qwexyKiks,  17/1/2007,   1111, New York
Oufu,     bobilop,    10/10/2007,  4555, New York

You can then create UserBean instances and populate them with values from each line after the header with the following code

class ReadingObjects {
  public static void main(String[] args) throws Exception{
    ICsvBeanReader inFile = new CsvBeanReader(new FileReader("foo.csv"), CsvPreference.EXCEL_PREFERENCE);
    try {
      final String[] header = inFile.getCSVHeader(true);
      UserBean user;
      while( (user = inFile.read(UserBean.class, header, processors)) != null) {
        System.out.println(user.getZip());
      }
    } finally {
      inFile.close();
    }
  }
}

using the following "manipulation specification"

final CellProcessor[] processors = new CellProcessor[] {
    new Unique(new StrMinMax(5, 20)),
    new StrMinMax(8, 35),
    new ParseDate("dd/MM/yyyy"),
    new Optional(new ParseInt()),
    null
};
Salix alba
kbg
  • 1
    Your code would not compile so I submitted some corrections. Also, ParseDate() does not work correctly so I replaced it to read a String. It can be parsed later. –  Jul 01 '12 at 18:31
  • 1
    Big limitation: SuperCSV is not threadsafe. I'm going to look into Jackson, although it may be more limited in features. – ZiglioUK Feb 04 '14 at 04:11
  • SuperCsv also doesn't allow using multimaps. Would be nice to see it work with MultiMaps. – Sid Apr 16 '16 at 09:02
18

Reading the CSV format description makes me feel that using a third-party library would be less of a headache than writing one myself.

Wikipedia lists ten or so known libraries.

I compared the listed libraries against a checklist. OpenCSV came out the winner for me (YMMV), with the following results:

+ maven

+ maven - release version   // had some cryptic issues at _Hudson_ with snapshot references => prefer to be on a safe side

+ code examples

+ open source   // as in "can hack myself if needed"

+ understandable javadoc   // as opposed to eg javadocs of _genjava gj-csv_

+ compact API   // YAGNI (note *flatpack* seems to have much richer API than OpenCSV)

- reference to specification used   // I really like it when people can explain what they're doing

- reference to _RFC 4180_ support   // would qualify as simplest form of specification to me

- releases changelog   // absence is quite a pity, given how simple it'd be to get with maven-changes-plugin   // _flatpack_, for comparison, has quite helpful changelog

+ bug tracking

+ active   // as in "can submit a bug and expect a fixed release soon"

+ positive feedback   // Recommended By 51 users at sourceforge (as of now)
Shantha Kumara
gnat
8

We use JavaCSV; it works pretty well.

Mat Mannion
  • 3
    The only issue with this library is that it won't allow you to output CSV files with Windows line terminators (`\r\n`) when not running on Windows. The author has not provided support for years. I had to fork it to allow that missing feature: [JavaCSV 2.2](https://github.com/pupi1985/JavaCSV-Reloaded) – Mosty Mostacho Sep 23 '13 at 20:42
6

For the last enterprise application I worked on that needed to handle a notable amount of CSV -- a couple of months ago -- I used SuperCSV at sourceforge and found it simple, robust and problem-free.

Cheekysoft
  • +1 for SuperCSV, but it has some nasty bugs which aren't fixed yet, new bugs aren't handled currently, and the last release is almost two years old. But we are using a patched/modified version in production without any problems. – MRalwasser Jul 13 '10 at 09:39
  • 2
    @MRalwasser [Super CSV 2.0.0-beta-1](http://supercsv.sourceforge.net/release_notes.html) has recently been released. It includes many bug fixes and new features (including Maven support and a new Dozer extension for mapping nested properties and arrays/Collections) – James Bassett Oct 16 '12 at 02:39
  • 1
    @Hound-Dog Thank you for the update. I had already noticed the new beta and I'm glad to see the project alive, although the frequency of commits still worries me a little (almost all commits fall on just a few days). But I'll take a look. Is there an estimated release date for the final 2.0? – MRalwasser Oct 16 '12 at 06:31
  • 2
    @MRalwasser I'm the only dev at the moment and have full time work, so I tend to work on this whenever I get a free weekend - hence the sporadic commits :) Nearly 1000 SF downloads of the beta now, and no bugs, so looking on track for a final release early next month. If you have any ideas for future features please let us know. – James Bassett Oct 16 '12 at 07:15
  • 1
    SuperCSV is not threadsafe at this stage, that makes it not really robust imho – ZiglioUK Feb 04 '14 at 04:13
6

You can use the CsvReader API (JavaCSV). Download it from the following location:

http://sourceforge.net/projects/javacsv/files/JavaCsv/JavaCsv%202.1/javacsv2.1.zip/download

or

http://sourceforge.net/projects/javacsv/

Use the following code:

/************ For Reading ***************/

import java.io.FileNotFoundException;
import java.io.IOException;

import com.csvreader.CsvReader;

public class CsvReaderExample {

    public static void main(String[] args) {
        try {

            CsvReader products = new CsvReader("products.csv");

            products.readHeaders();

            while (products.readRecord())
            {
                String productID = products.get("ProductID");
                String productName = products.get("ProductName");
                String supplierID = products.get("SupplierID");
                String categoryID = products.get("CategoryID");
                String quantityPerUnit = products.get("QuantityPerUnit");
                String unitPrice = products.get("UnitPrice");
                String unitsInStock = products.get("UnitsInStock");
                String unitsOnOrder = products.get("UnitsOnOrder");
                String reorderLevel = products.get("ReorderLevel");
                String discontinued = products.get("Discontinued");

                // perform program logic here
                System.out.println(productID + ":" + productName);
            }

            products.close();

        } catch (FileNotFoundException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }

    }

}

Write / Append to CSV file

Code:

/************* For Writing ***************************/

import java.io.File;
import java.io.FileWriter;
import java.io.IOException;

import com.csvreader.CsvWriter;

public class CsvWriterAppendExample {

    public static void main(String[] args) {

        String outputFile = "users.csv";

        // before we open the file check to see if it already exists
        boolean alreadyExists = new File(outputFile).exists();

        try {
            // use FileWriter constructor that specifies open for appending
            CsvWriter csvOutput = new CsvWriter(new FileWriter(outputFile, true), ',');

            // if the file didn't already exist then we need to write out the header line
            if (!alreadyExists)
            {
                csvOutput.write("id");
                csvOutput.write("name");
                csvOutput.endRecord();
            }
            // else assume that the file already has the correct header line

            // write out a few records
            csvOutput.write("1");
            csvOutput.write("Bruce");
            csvOutput.endRecord();

            csvOutput.write("2");
            csvOutput.write("John");
            csvOutput.endRecord();

            csvOutput.close();
        } catch (IOException e) {
            e.printStackTrace();
        }

    }
}
DarthJDG
Dhananjay Joshi
3

There is also CSV/Excel Utility. It assumes all data is table-like and delivers it through Iterators.

Frank
2

The CSV format sounds easy enough for StringTokenizer but it can become more complicated. Here in Germany a semicolon is used as a delimiter and cells containing delimiters need to be escaped. You're not going to handle that easily with StringTokenizer.
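
To make the point concrete, here is a minimal sketch using only the JDK. The class name, method names and the sample line are made up for illustration, and the quoting rule assumed is the common one where a cell is wrapped in double quotes and an embedded quote is doubled. It compares a plain StringTokenizer split with a small quote-aware split on a semicolon-delimited line.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical helper class, not part of any library mentioned here.
class SemicolonCsvSketch {

    // Naive approach: split on every delimiter, ignoring quotes.
    // StringTokenizer also skips empty tokens, so empty cells vanish.
    static List<String> naiveSplit(String line, String delimiter) {
        List<String> cells = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(line, delimiter);
        while (tokenizer.hasMoreTokens()) {
            cells.add(tokenizer.nextToken());
        }
        return cells;
    }

    // Quote-aware approach: a delimiter inside double quotes is data,
    // and a doubled quote ("") inside a quoted cell is one literal quote.
    // Assumption: one record per line (no line breaks inside cells).
    static List<String> quoteAwareSplit(String line, char delimiter) {
        List<String> cells = new ArrayList<>();
        StringBuilder cell = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                        cell.append('"');
                        i++; // skip the second quote of the escaped pair
                    } else {
                        inQuotes = false; // closing quote
                    }
                } else {
                    cell.append(c);
                }
            } else if (c == '"') {
                inQuotes = true; // opening quote
            } else if (c == delimiter) {
                cells.add(cell.toString());
                cell.setLength(0);
            } else {
                cell.append(c);
            }
        }
        cells.add(cell.toString()); // last cell has no trailing delimiter
        return cells;
    }

    public static void main(String[] args) {
        String line = "1;\"Mueller; Schmidt GmbH\";;Berlin";
        System.out.println(naiveSplit(line, ";"));
        System.out.println(quoteAwareSplit(line, ';'));
    }
}
```

On the sample line the naive split cuts the quoted company name in two and silently drops the empty third cell, while the quote-aware version returns four cells as intended. Even this sketch ignores line breaks inside quoted cells and other dialect differences, which is a good reason to reach for a library.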

I would go for http://sourceforge.net/projects/javacsv

paul
0

If you intend to read CSV from Excel, then there are some interesting corner cases. I can't remember them all, but Apache Commons CSV was not capable of handling them correctly (for example, with URLs).

Be sure to test Excel output with quotes, commas and slashes all over the place.
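
One way to build such test input without opening Excel each time is to generate it. The following is only a sketch with made-up names and sample values, and it assumes RFC 4180 style quoting (wrap a cell in double quotes when it contains a comma, a quote or a line break, and double any embedded quotes), which is roughly what Excel writes. Real Excel exports should still be checked as well.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical fixture builder for exercising a CSV reader with awkward cells.
class TrickyCsvFixture {

    // Quote a cell only when needed, doubling embedded quotes (RFC 4180 style).
    static String escape(String cell) {
        boolean needsQuotes = cell.contains(",") || cell.contains("\"")
                || cell.contains("\n") || cell.contains("\r");
        if (!needsQuotes) {
            return cell;
        }
        return "\"" + cell.replace("\"", "\"\"") + "\"";
    }

    // Join the escaped cells into one comma-separated record.
    static String toCsvLine(List<String> cells) {
        return cells.stream()
                .map(TrickyCsvFixture::escape)
                .collect(Collectors.joining(","));
    }

    public static void main(String[] args) {
        // Sample values are invented: a URL with a comma, a quoted phrase,
        // a path with backslashes, and a plain cell.
        List<String> tricky = List.of(
                "http://example.com/a?x=1,2",
                "He said \"hi\"",
                "C:\\temp\\file",
                "plain");
        System.out.println(toCsvLine(tricky));
    }
}
```

Feeding the generated line to whichever library is under evaluation and checking that the four original values come back unchanged gives a quick round-trip test.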

daveb
  • The [*Apache Commons CSV*](https://commons.apache.org/proper/commons-csv/) library does offer [a specific variant for Microsoft Excel](http://commons.apache.org/proper/commons-csv/archives/1.6/apidocs/org/apache/commons/csv/CSVFormat.Predefined.html#Excel). I don’t know if that now handles the problems you mention or not. – Basil Bourque Jan 06 '19 at 22:22