How can I skip the first line of a csv in Java?

Question

I want to skip the first line and use the second as header.

I am using classes from apache commons csv to process a CSV file.

The header of the CSV file is in the second row, not the first (which contains coordinates).

My code looks like this:

static void processFile(final File file) throws IOException {
    FileReader filereader = new FileReader(file);
    final CSVFormat format = CSVFormat.DEFAULT.withDelimiter(';');
    CSVParser parser = new CSVParser(filereader, format);
    final List<CSVRecord> records = parser.getRecords();
    //stuff
}

I naively thought,

CSVFormat format = CSVFormat.DEFAULT.withFirstRecordAsHeader().withDelimiter(';');

would solve the problem, as it's different from withFirstRowAsHeader and I thought it would detect that the first row doesn't contain any semicolons and is not a record. It doesn't. I tried to skip the first line (that CSVFormat seems to think is the header) with

CSVFormat format = CSVFormat.DEFAULT.withSkipHeaderRecord().withFirstRecordAsHeader().withDelimiter(';');

but that also doesn't work. What can I do? What's the difference between withFirstRowAsHeader and withFirstRecordAsHeader?

Have you tried reading until newLine before giving the fileReader to the parser? — Fildor, Aug 24 '17 at 12:40

Sully · Answer 1 · 2022-08-16T05:09:04.897

29

The correct way to skip the first line, if it is a header, is to use a different CSVFormat:

CSVFormat format = CSVFormat.DEFAULT.withDelimiter(';').withFirstRecordAsHeader();

Update: June 30 2022

For 1.9+, use

CSVFormat.DEFAULT.builder()
    .setDelimiter(';')
    .setHeader()
    .setSkipHeaderRecord(true)  // skip header
    .build();

edited Aug 16 '22 at 05:09

answered Aug 14 '18 at 14:09

+1 for withFirstRecordAsHeader(), I use it with CSVParser and it skips the header when you iterate over the parser. – keni Aug 21 '18 at 17:42
This should be the accepted answer, since it uses the library, instead of an ad-hoc pure Java solution – jmm Nov 28 '19 at 20:54
This should be the accepted answer. Thanks – A MJ Sep 07 '21 at 08:21
I think the original question was about the first *two* lines, where the second contains the header. – avandeursen Dec 11 '21 at 16:43
for me it currently shows withDelimiter as deprecated. – Maik Jun 29 '22 at 09:29
I think the setHeader() method must be called too: CSVFormat.DEFAULT.builder().setDelimiter(';').setHeader().setSkipHeaderRecord(true) /* skip header */ .build(); – fdm Aug 12 '22 at 11:39
setHeader() will read the first record as the headers. The question says the headers are in the second record. – grigouille Apr 10 '23 at 07:27

score 12 · Accepted Answer · answered Aug 24 '17 at 12:41

You may want to read the first line before passing the reader to the CSVParser:

static void processFile(final File file) throws IOException {
    FileReader filereader = new FileReader(file);
    BufferedReader bufferedReader = new BufferedReader(filereader);
    bufferedReader.readLine(); // consume and discard the first line
    final CSVFormat format = CSVFormat.DEFAULT.withDelimiter(';');
    CSVParser parser = new CSVParser(bufferedReader, format);
    final List<CSVRecord> records = parser.getRecords();
    //stuff
}
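
To make visible what the readLine() call leaves behind for the parser, here is a minimal JDK-only sketch. It does not use commons-csv at all; the class name, the method name and the sample content are made up for illustration. It discards the first physical line and returns the rest of the input, which is what a parser reading from the same BufferedReader would see.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

public class SkipFirstLineDemo {

    // Discards the first physical line and returns everything after it.
    // Hypothetical helper, only meant to show the reader position after readLine().
    static String remainderAfterFirstLine(String content) {
        try (BufferedReader reader = new BufferedReader(new StringReader(content))) {
            reader.readLine(); // consume the non-CSV first line
            StringBuilder rest = new StringBuilder();
            int c;
            while ((c = reader.read()) != -1) {
                rest.append((char) c);
            }
            return rest.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Assumed sample: a coordinates line, then the header, then one record.
        String content = "47.1 8.5\nname;value\nfoo;1\n";
        System.out.print(remainderAfterFirstLine(content));
    }
}
```

With this sample the remaining input starts at the header line "name;value", so a format configured to treat the first record as the header would then pick up the right row.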

Arnaud

In the case of my `,` separated csv file, I need to change `CSVFormat.DEFAULT.withDelimiter(';');` to `CSVFormat.DEFAULT.withDelimiter(',');`. Is this correct? – Suresh Jul 30 '18 at 06:00
readLine ? What if the first record contains "bla\r\nbli". – grigouille Apr 09 '23 at 17:19

score 6 · Answer 3 · answered Sep 10 '21 at 10:09

In version 1.9.0 of org.apache.commons:commons-csv use:

val format = CSVFormat.Builder.create(CSVFormat.DEFAULT)
        .setHeader()
        .setSkipHeaderRecord(true)
        .build()

val parser = CSVParser.parse(reader, format)

Markus Lenger

Or: `CSVFormat.DEFAULT.builder()...`. – MC Emperor Nov 09 '21 at 09:50
The headers are in the second record. – grigouille Apr 10 '23 at 07:28

score 2 · Answer 4 · answered Jul 23 '19 at 06:59

You can skip the first record using a stream:

List<CSVRecord> noHeadersLine = records.stream().skip(1).collect(Collectors.toList());
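
Skipping the first record this way does not make the parser treat the second record as a header, so column names would have to be mapped by hand. The following is a rough JDK-only sketch of that idea. It uses plain List<String> rows as a stand-in for CSVRecord, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SecondRowHeaderDemo {

    // rows: all records of the file, already split into fields
    // (an assumed stand-in for List<CSVRecord>).
    static List<Map<String, String>> toNamedRows(List<List<String>> rows) {
        List<Map<String, String>> result = new ArrayList<>();
        if (rows.size() < 2) {
            return result; // no header row available
        }
        List<String> header = rows.get(1); // the second record carries the column names
        for (List<String> row : rows.subList(2, rows.size())) {
            Map<String, String> named = new LinkedHashMap<>();
            for (int i = 0; i < header.size() && i < row.size(); i++) {
                named.put(header.get(i), row.get(i));
            }
            result.add(named);
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<String>> rows = Arrays.asList(
                Arrays.asList("47.1 8.5"),      // first record: not part of the table
                Arrays.asList("name", "value"), // second record: header
                Arrays.asList("foo", "1"));     // data
        System.out.println(toNamedRows(rows)); // prints [{name=foo, value=1}]
    }
}
```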

Frank Why

Musab Qamri · Answer 5 · 2018-09-14T13:20:41.707

1

You can filter it using Java Streams:

parser.getRecords().stream()
     .filter(record -> record.getRecordNumber() != 1) 
     .collect(Collectors.toList());

edited Sep 14 '18 at 13:20

answered Aug 30 '18 at 10:13

Can you explain your code? What's that `csvRecordToPayerCodeMapping` needed for? – Nico Haase Aug 30 '18 at 10:32
Sorry that is for internal use, you can skip .map(). Will edit the same – Musab Qamri Sep 14 '18 at 13:20

score 1 · Answer 6 · answered Mar 30 '22 at 22:20

I am assuming your file format looks something like:

<garbage line here>
<header data>
<record data starts here>

For version 1.9.0, use the builder approach from the other answers, with one addition: read and discard the first line before handing the reader to the parser.

Reader in = new FileReader(fileName);
BufferedReader bufferedReader = new BufferedReader(in);
System.out.println(bufferedReader.readLine()); // reads (and prints) the garbage line so the parser starts at the header
CSVFormat format = CSVFormat.Builder.create(CSVFormat.DEFAULT)
            .setHeader()
            .setSkipHeaderRecord(true)
            .build();
CSVParser parser = CSVParser.parse(bufferedReader, format);
for (CSVRecord record : parser.getRecords()) {
    // do something with record
}

If you don't skip that first line somehow, parsing can fail with an IllegalArgumentException.

readLine? What if a column contains CR LF? – grigouille Apr 10 '23 at 07:29

Murat Karagöz · Answer 7 · 2017-08-24T12:52:34.233

0

You could consume the first line yourself and then pass the reader to the CSVParser. Other than that, there is a method #withIgnoreEmptyLines, which might solve the issue.

edited Aug 24 '17 at 12:52

answered Aug 24 '17 at 12:42

the problem is the line isn't empty. But using BufferedReader (which has a readLine method) solved it. – Medusa Aug 24 '17 at 13:17

score 0 · Answer 8 · answered Aug 12 '22 at 11:42

The .setHeader() method must be called for .setSkipHeaderRecord(true) to take effect.

CSVFormat.DEFAULT.builder()
    .setDelimiter(';')
    .setHeader()
    .setSkipHeaderRecord(true)  // skip header
    .build();

score 0 · Answer 9 · answered Apr 10 '23 at 07:43

If your first record doesn't contain CR LF characters, you can use the "readLine" method. Otherwise you have to read twice.

First, get the headers:

CSVFormat format = CSVFormat.DEFAULT; // configure the delimiter etc. as needed
List<String> headers = null;
try(Reader reader = getReader()) {
  Iterator<CSVRecord> iter = format.parse(reader).iterator();
  if(iter.hasNext()) iter.next(); // skip the first record
  if(iter.hasNext()) {
    headers = iter.next().toList(); // the second record holds the headers
  }
}

Then read again:

try(Reader reader = getReader()) {
  format = format.builder().setHeader(headers.toArray(new String[0])).build();
  Iterator<CSVRecord> iter = format.parse(reader).iterator();
  if(iter.hasNext()) iter.next(); // skip the first record
  if(iter.hasNext()) iter.next(); // skip the header record
  while(iter.hasNext()) {
    CSVRecord record = iter.next();
    //do stuff
  }
}
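
The CR LF caveat at the top of this answer can be reproduced with a small JDK-only sketch. The class name, method name and sample content are assumptions made for illustration. BufferedReader.readLine() knows nothing about CSV quoting, so it stops at the line break inside the quoted field and hands the parser the tail of a broken record.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

public class ReadLinePitfallDemo {

    // Returns what readLine() considers to be the first line of the input.
    static String firstPhysicalLine(String content) {
        try (BufferedReader reader = new BufferedReader(new StringReader(content))) {
            return reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Assumed sample: the first record has a quoted field containing CR LF.
        String content = "\"bla\r\nbli\";x\r\nname;value\r\nfoo;1\r\n";
        // readLine() stops inside the quoted field and returns only: "bla
        System.out.println(firstPhysicalLine(content));
    }
}
```

In that situation the two-pass approach shown in this answer lets the CSV parser decide where the first record ends.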
