
I am working on a Spring Batch project where I have to load data from a CSV file into a database. I have implemented the batch job and the data is being inserted as expected, but I wonder if there is any way to skip some of the columns in the CSV file, as some of them are irrelevant.

I did a bit of research but I wasn't able to find an answer, unless I missed something.

Sample of my code below.

<bean id="mysqlItemWriter"
      class="org.springframework.batch.item.database.JdbcBatchItemWriter">
    <property name="dataSource" ref="dataSource" />
    <property name="sql">
        <value>
            <![CDATA[
            insert into WEBREPORT.RAWREPORT(CLIENT,CLIENTUSER,GPS,EXTENSION) values (:client, :clientuser, :gps, :extension)
        ]]>
        </value>
    </property>
</bean>
user2342259

2 Answers


You can implement your own FieldSetMapper, which the reader uses to map the fields of each line to your POJO. Any column you do not read in the mapper is simply ignored.

Let's say you have:

name, surname, email
Mike, Evans, test@test.com

And you have a Person model with only name and email, because you are not interested in surname. Here is an example reader:

@Component
@StepScope
public class PersonReader extends FlatFileItemReader<Person> {

    @Override
    public void afterPropertiesSet() throws Exception {
        // csvResource is assumed to be a Resource pointing at the CSV file (loading it is not shown here)
        setResource(csvResource);
        setLineMapper(new DefaultLineMapper<Person>() {
            {
                setLineTokenizer(new DelimitedLineTokenizer());
                setFieldSetMapper(new PersonFieldSetMapper());
            }
        });
        super.afterPropertiesSet();
    }
}

And you can define PersonFieldSetMapper:

@Component
@JobScope
public class PersonFieldSetMapper implements FieldSetMapper<Person> {

    @Override
    public Person mapFieldSet(final FieldSet fieldSet) throws BindException {
        final Person person = new Person();
        person.setName(fieldSet.readString(0));  // column 0: name (indices are zero-based)
        person.setEmail(fieldSet.readString(2)); // column 2: email; column 1 (surname) is never read

        return person;
    }
}

This skips columns, which, if I understood correctly, is what you want. If you want to skip rows, that can be done as well; I explained how to skip blank lines, for example, in this question.
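To make the idea concrete without any Spring dependency, here is a minimal plain-Java sketch of what the mapper above does with each line: split it into fields, then read only the indices you care about. The class name `ColumnSkipSketch`, the nested `Person` stand-in and the naive comma split (no quoting support) are assumptions for illustration only, not part of Spring Batch.

```java
import java.util.Arrays;
import java.util.List;

class ColumnSkipSketch {

    /** Minimal stand-in for the Person model above (name and email only). */
    static final class Person {
        final String name;
        final String email;

        Person(String name, String email) {
            this.name = name;
            this.email = email;
        }
    }

    /** Splits one CSV line on commas and trims each token (no quoting support). */
    static List<String> tokenize(String line) {
        String[] parts = line.split(",", -1);
        for (int i = 0; i < parts.length; i++) {
            parts[i] = parts[i].trim();
        }
        return Arrays.asList(parts);
    }

    /** Reads columns 0 and 2; column 1 (surname) is never touched. */
    static Person mapLine(String line) {
        List<String> fields = tokenize(line);
        return new Person(fields.get(0), fields.get(2));
    }

    public static void main(String[] args) {
        Person p = mapLine("Mike, Evans, test@test.com");
        System.out.println(p.name + " <" + p.email + ">"); // prints: Mike <test@test.com>
    }
}
```

In Spring Batch itself, DelimitedLineTokenizer also has a setIncludedFields(int...) setter that drops unwanted columns before the mapper sees them. If you use it, the FieldSet should contain only the included columns, so double-check the indices you read in the mapper.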

Nenad Bozic

If the skip check is simple and does not need a database round trip, you can use a simple ItemProcessor that returns null for the items you want to skip.

Very simple pseudocode:

public class SkipProcessor implements ItemProcessor<Foo, Foo> {
    @Override
    public Foo process(Foo foo) throws Exception {
        // returning null filters the item out, so it never reaches the writer
        if (skip(foo)) { // skip(...) stands for your own check
            return null;
        }
        return foo;
    }
}
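To show what "return null to skip" amounts to, here is a small plain-Java sketch of that contract. The `Processor` interface is a local stand-in and not the real Spring Batch ItemProcessor, and the blank-string rule in `SKIP_BLANK` is a hypothetical check chosen only to have something to filter on.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class FilterSketch {

    /** Local stand-in for Spring Batch's ItemProcessor; not the real interface. */
    interface Processor<I, O> {
        O process(I item);
    }

    /** Hypothetical rule: blank strings are skipped, everything else passes through. */
    static final Processor<String, String> SKIP_BLANK =
            item -> item.trim().isEmpty() ? null : item;

    /** Runs every item through the processor and keeps only the non-null results. */
    static <I, O> List<O> processAll(List<I> items, Processor<I, O> processor) {
        List<O> kept = new ArrayList<>();
        for (I item : items) {
            O result = processor.process(item);
            if (result != null) { // null means "filtered", so nothing is passed on
                kept.add(result);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("a", " ", "b");
        System.out.println(processAll(input, SKIP_BLANK)); // prints: [a, b]
    }
}
```

Note that this filters whole items (rows), not columns.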

If the skip check is more complex and needs a database round trip, you can still use the item processor, but performance will suffer because the check runs once per item.

If performance is critical, it depends on your setup, requirements and options. I would try it with two steps: the first step loads the CSV into the database without any checks, and the second step reads the data back from the database, with the skip check done by a JOIN in the SQL of the ItemReader.

Michael Pralow