Spring Batch : Parsing a CSV file with quoteCharacter

Question

I'm new to Spring Batch. We know that CSV files come in all shapes and forms, and some of them are syntactically incorrect. I'm trying to parse a CSV file in which each line starts with '"' and ends with '"'. This is my CSV:

"1;Paris;13/4/1992;16/7/2006"
"2;Lyon;31/5/1993;1/8/2009"
"3;Metz;21/4/1990;27/4/2010"

I tried this :

  <bean id="itemReader" class="org.springframework.batch.item.file.FlatFileItemReader">
    <property name="resource" value="data-1.txt" />
    <property name="lineMapper">
      <bean class="org.springframework.batch.item.file.mapping.DefaultLineMapper">
        <property name="fieldSetMapper">
          <!-- Mapper which maps each individual items in a record to properties in POJO -->
          <bean class="com.sam.fourthTp.MyFieldSetMapper" />
        </property>
        <property name="lineTokenizer">
          <!-- A tokenizer class to be used when items in input record are separated by specific characters -->
          <bean class="org.springframework.batch.item.file.transform.DelimitedLineTokenizer">
            <property name="quoteCharacter" value="&quot;" />
            <property name="delimiter" value=";" />
          </bean>
        </property>
      </bean>
    </property>
  </bean>

But this only works when the CSV file looks like this:

"1";"Paris";"13/4/1992";"16/7/2006"
"2";"Lyon";"31/5/1993";"1/8/2009"
"3";"Metz";"21/4/1990";"27/4/2010"

My question is: how can I parse my CSV when each line starts with '"' and ends with '"'?

Did you try to clean the CSV by removing the `"`? — Angelo Immediata, Apr 17 '19 at 16:35

score 1 · Accepted Answer · answered Apr 18 '19 at 07:14

The quoteCharacter is, as you mentioned, applicable to fields, not records.

My question is: how can I parse my CSV when each line starts with '"' and ends with '"'?

What you can do is:

Read lines as raw Strings
Use a composite item processor with two delegates: one that trims the " from the start/end of each record, and another that parses the line and maps it to your domain object

Here is a quick example:

import java.util.Arrays;

import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobParameters;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
import org.springframework.batch.core.configuration.annotation.JobBuilderFactory;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.batch.item.ItemProcessor;
import org.springframework.batch.item.ItemReader;
import org.springframework.batch.item.ItemWriter;
import org.springframework.batch.item.file.mapping.BeanWrapperFieldSetMapper;
import org.springframework.batch.item.file.transform.DelimitedLineTokenizer;
import org.springframework.batch.item.file.transform.FieldSet;
import org.springframework.batch.item.support.CompositeItemProcessor;
import org.springframework.batch.item.support.ListItemReader;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.context.ApplicationContext;
import org.springframework.context.annotation.AnnotationConfigApplicationContext;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
@EnableBatchProcessing
public class MyJob {

    @Autowired
    private JobBuilderFactory jobs;

    @Autowired
    private StepBuilderFactory steps;

    @Bean
    public ItemReader<String> itemReader() {
        return new ListItemReader<>(Arrays.asList(
                "\"1;Paris;13/4/1992;16/7/2006\"",
                "\"2;Lyon;31/5/1993;1/8/2009\"",
                "\"3;Metz;21/4/1990;27/4/2010\"",
                "\"4;Lille;21/4/1980;27/4/2011\""
                ));
    }

    // First delegate: strips the enclosing quote from the start and end of the raw line
    @Bean
    public ItemProcessor<String, String> itemProcessor1() {
        return item -> item.substring(1, item.length() - 1);
    }

    // Second delegate: tokenizes the unquoted line and maps it to a Record
    @Bean
    public ItemProcessor<String, Record> itemProcessor2() {
        DelimitedLineTokenizer lineTokenizer = new DelimitedLineTokenizer();
        lineTokenizer.setNames("id", "ville");
        lineTokenizer.setDelimiter(";");
        // Non-strict mode tolerates lines with more tokens than names (only two of the four fields are mapped)
        lineTokenizer.setStrict(false);
        BeanWrapperFieldSetMapper<Record> fieldSetMapper = new BeanWrapperFieldSetMapper<>();
        fieldSetMapper.setTargetType(Record.class);
        return item -> {
            FieldSet tokens = lineTokenizer.tokenize(item);
            return fieldSetMapper.mapFieldSet(tokens);
        };
    }

    @Bean
    public ItemWriter<Record> itemWriter() {
        return items -> {
            for (Record item : items) {
                System.out.println(item);
            }
        };
    }

    @Bean
    public CompositeItemProcessor<String, Record> compositeItemProcessor() {
        CompositeItemProcessor<String, Record> compositeItemProcessor = new CompositeItemProcessor<>();
        compositeItemProcessor.setDelegates(Arrays.asList(itemProcessor1(), itemProcessor2()));
        return compositeItemProcessor;
    }

    @Bean
    public Step step() {
        return steps.get("step")
                .<String, Record>chunk(2)
                .reader(itemReader())
                .processor(compositeItemProcessor())
                .writer(itemWriter())
                .build();
    }

    @Bean
    public Job job() {
        return jobs.get("job")
                .start(step())
                .build();
    }

    public static class Record {

        private int id;
        private String ville;

        public Record() {
        }

        public int getId() {
            return id;
        }

        public void setId(int id) {
            this.id = id;
        }

        public String getVille() {
            return ville;
        }

        public void setVille(String ville) {
            this.ville = ville;
        }

        @Override
        public String toString() {
            return "Record{" +
                    "id=" + id +
                    ", ville='" + ville + '\'' +
                    '}';
        }
    }

    public static void main(String[] args) throws Exception {
        ApplicationContext context = new AnnotationConfigApplicationContext(MyJob.class);
        JobLauncher jobLauncher = context.getBean(JobLauncher.class);
        Job job = context.getBean(Job.class);
        jobLauncher.run(job, new JobParameters());
    }

}

I used a simple POJO called Record and mapped only two fields. This sample prints:

Record{id=1, ville='Paris'}
Record{id=2, ville='Lyon'}
Record{id=3, ville='Metz'}
Record{id=4, ville='Lille'}
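
A note on the trimming step: `item.substring(1, item.length() - 1)` in the sample assumes every line is wrapped in quotes. On an empty line it throws a StringIndexOutOfBoundsException, and on an unquoted line it silently drops the first and last characters. If the input may be inconsistent, a more defensive variant could look like the sketch below. `RecordQuoteStripper` and `stripEnclosingQuotes` are hypothetical names for illustration, not part of Spring Batch; the method body could be called from the first delegate's lambda.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not part of Spring Batch): removes one pair of enclosing
// double quotes, but only when the line really starts and ends with one.
final class RecordQuoteStripper {

    private RecordQuoteStripper() {
    }

    static String stripEnclosingQuotes(String line) {
        if (line != null
                && line.length() >= 2
                && line.charAt(0) == '"'
                && line.charAt(line.length() - 1) == '"') {
            return line.substring(1, line.length() - 1);
        }
        // Empty, single-character, or unquoted lines are returned unchanged
        return line;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "\"1;Paris;13/4/1992;16/7/2006\"",
                "2;Lyon;31/5/1993;1/8/2009",
                "");
        for (String line : lines) {
            System.out.println("[" + stripEnclosingQuotes(line) + "]");
        }
    }
}
```

Whether unquoted lines should pass through unchanged or be rejected depends on how strict you want the job to be; returning them as-is is only one possible choice.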

Hope this helps.

Hey Mahmoud :), thanks for your reply. I tried your code and got this: `@Bean method ScopeConfiguration.jobScope is non-static and returns an object assignable to Spring's BeanFactoryPostProcessor interface. This will result in a failure to process annotations such as @Autowired, @Resource and @PostConstruct within the method's declaring @Configuration class. Add the 'static' modifier to this method to avoid these container lifecycle issues; see @Bean javadoc for complete details.` — abdallah, Apr 18 '19 at 08:13
That's a warning which has been fixed in Spring Batch 3.0.9/4.0.1. See [BATCH-2161](https://jira.spring.io/browse/BATCH-2161). — Mahmoud Ben Hassine, Apr 18 '19 at 09:02
Thanks again :D, but now I get this: `Exception in thread "main" org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'myJob': Injection of autowired dependencies failed; nested exception is org.springframework.beans.factory.BeanCreationException: Could not autowire field: private org.springframework.batch.core.configuration.annotation.JobBuilderFactory com.drihem.sam.fourthTp.MyJob.jobs; nested exception is org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'jobBuilders'` — abdallah, Apr 18 '19 at 09:25
@Mahmoud thanks again for your reply. I use `spring-batch-core 4.0.1.RELEASE`. — abdallah, Apr 18 '19 at 10:03
To get the dates, I added this method: `public User mapFieldSet(FieldSet fieldSet) throws BindException { User result = new User(); result.setId(fieldSet.readInt(0)); result.setVille(fieldSet.readRawString(1)); result.setDateOfBirth(new LocalDate(fieldSet.readDate(2,"dd/MM/yyyy"))); result.setDateFirstDec(new LocalDate(fieldSet.readDate(3,"dd/MM/yyyy"))); return result; }` Thanks again! — abdallah, Apr 18 '19 at 13:10
In your code you use `Arrays`. My question is: how can I use a .CSV file instead? — abdallah, Apr 18 '19 at 14:58
That's just an example as I can't upload a csv file to the answer. You can change the reader with a `FlatFileItemReader` and it should work in the same way. The most important point is about the composite item processor. — Mahmoud Ben Hassine, Apr 18 '19 at 16:10
Hey Mahmoud, thanks again. I posted this as a separate question; can you explain it [here](https://stackoverflow.com/questions/55749966/spring-batch-reading-a-csv-file-into-array-list) please? — abdallah, Apr 18 '19 at 16:14
If I use `FlatFileItemReader`, will my `@Bean` return a reader of type String? I need more details, please. — abdallah, Apr 19 '19 at 07:49
Yes, it should return a String so that the first processor can trim the leading/trailing `"` characters. I answered your other question https://stackoverflow.com/questions/55749966/spring-batch-reading-a-csv-file-into-array-list with an example. — Mahmoud Ben Hassine, Apr 19 '19 at 07:53
Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/192112/discussion-between-abdallah-and-mahmoud-ben-hassine). — abdallah, Apr 19 '19 at 15:09
Hey @MahmoudBenHassine, I want to turn my ItemProcessor into a class: `public class MyItemProcessorTwo implements ItemProcessor { @Override public User process(String item) throws Exception {DelimitedLineTokenizer lineTokenizer = new DelimitedLineTokenizer() lineTokenizer.setNames("id", "ville", "dateOfBirth", "dateFirstDec"); lineTokenizer.setDelimiter(";"); return item -> { FieldSet tokens = lineTokenizer.tokenize(item); return mapFieldSet(tokens); };} }` but I get `The target type of this expression must be a functional interface` at `item ->`. — abdallah, Apr 26 '19 at 09:58
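
The compiler error in the last comment most likely comes from returning a lambda out of a method declared to return `User`. A lambda needs a functional interface as its target type, whereas inside a class-based processor `process` should return the mapped object itself. The sketch below illustrates the shape under some loudly stated assumptions: `SketchItemProcessor` is a local stand-in with the same shape as Spring Batch's `ItemProcessor<I, O>` so the example compiles without the library, `User` is a hypothetical class mirroring the fields named in the comments, a plain `String.split` stands in for `DelimitedLineTokenizer`, and dates use `java.time` rather than the `LocalDate` constructor from the comment. The pattern `d/M/yyyy` is used because the sample data contains one-digit months such as `13/4/1992`, which a `java.time` pattern with `MM` would reject.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

// Stand-in with the same shape as Spring Batch's ItemProcessor<I, O>,
// declared locally only so this sketch compiles without the library.
interface SketchItemProcessor<I, O> {
    O process(I item) throws Exception;
}

// Hypothetical domain class mirroring the fields named in the comments.
final class User {
    final int id;
    final String ville;
    final LocalDate dateOfBirth;
    final LocalDate dateFirstDec;

    User(int id, String ville, LocalDate dateOfBirth, LocalDate dateFirstDec) {
        this.id = id;
        this.ville = ville;
        this.dateOfBirth = dateOfBirth;
        this.dateFirstDec = dateFirstDec;
    }

    @Override
    public String toString() {
        return "User{id=" + id + ", ville='" + ville + "', dateOfBirth=" + dateOfBirth
                + ", dateFirstDec=" + dateFirstDec + "}";
    }
}

// Class-based processor: process() returns the mapped object, not a lambda.
final class MyItemProcessorTwo implements SketchItemProcessor<String, User> {

    // Accepts one- or two-digit days and months, e.g. 13/4/1992 or 1/8/2009
    private static final DateTimeFormatter DATE = DateTimeFormatter.ofPattern("d/M/yyyy");

    @Override
    public User process(String item) {
        // Assumption: the enclosing quotes were already removed by the first delegate
        String[] tokens = item.split(";");
        if (tokens.length != 4) {
            throw new IllegalArgumentException(
                    "Expected 4 fields but got " + tokens.length + ": " + item);
        }
        return new User(
                Integer.parseInt(tokens[0]),
                tokens[1],
                LocalDate.parse(tokens[2], DATE),
                LocalDate.parse(tokens[3], DATE));
    }

    public static void main(String[] args) {
        System.out.println(new MyItemProcessorTwo().process("3;Metz;21/4/1990;27/4/2010"));
    }
}
```

In the real job, the class would implement Spring Batch's `ItemProcessor<String, User>` and could keep the `DelimitedLineTokenizer` and field set mapping from the answer; only the "return the object directly" structure is the point here.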
