I'm creating a job that will read and process different .csv files based on an input parameter. There are 3 different types of .csv files with different headers. I want to map each line of a file to a POJO using a generic FlatFileItemReader.

Each type of file will have its own POJO implementation, and all "File Specific POJOs" are subclassed from an abstract GenericFilePOJO.

A tasklet will first read the input parameter to decide which file type needs to be read, and construct a LineTokenizer with the appropriate header columns. It places this information in the infoHolder for retrieval at the reader step.

@Bean
public FlatFileItemReader<GenericFilePOJO> reader() {
    FlatFileItemReader<GenericFilePOJO> reader = new FlatFileItemReader<GenericFilePOJO>();
    reader.setLinesToSkip(1); // header

    reader.setLineMapper(new DefaultLineMapper<GenericFilePOJO>() {
        {
            // The infoHolder will contain the file-specific LineTokenizer
            setLineTokenizer(infoHolder.getLineTokenizer());
            setFieldSetMapper(new BeanWrapperFieldSetMapper<GenericFilePOJO>() {
                {
                    setTargetType(GenericFilePOJO.class);
                }
            });
        }
    });
    return reader;
}

Can this reader handle the different File Specific POJOs despite returning the GenericFilePOJO?

Dmytro Maslenko

1 Answer

You wrote:

A tasklet will first read the input parameter to decide which file type needs to be read.

Because the tasklet (or the infoHolder) already knows the file type, it can also create the matching FieldSetMapper instance, just as it creates the LineTokenizer.

Here is a demo of how this can be implemented:

public class Solution<T extends GenericFilePOJO> {
    private InfoHolder infoHolder = new InfoHolder();

    @Bean
    public FlatFileItemReader<T> reader()
    {
        FlatFileItemReader<T> reader = new FlatFileItemReader<T>();
        reader.setLinesToSkip(1);

        reader.setLineMapper(new DefaultLineMapper<T>() {
            {
                setLineTokenizer(infoHolder.getLineTokenizer());
                setFieldSetMapper(infoHolder.getFieldSetMapper());
            }
        });
        return reader;
    }

    private class InfoHolder {
        DelimitedLineTokenizer getLineTokenizer() {
            return <some already existent logic>;
        }

        FieldSetMapper<T> getFieldSetMapper() {
            if (some condition for specific file POJO 1){
                return new BeanWrapperFieldSetMapper<T>() {
                    {
                        setTargetType(FileSpecificPOJO_1.class);
                    }
                };
            } else if (some condition for specific file POJO 2){
                return new BeanWrapperFieldSetMapper<T>() {
                    {
                        setTargetType(FileSpecificPOJO_2.class);
                    }
                };
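            }
            // No mapper is defined for any other file type, so fail fast
            // (the method must not fall through without returning).
            throw new IllegalStateException("Unsupported file type");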
        }
    }
}
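
To make the idea concrete without pulling in Spring Batch, here is a minimal plain-Java sketch. It only imitates what a mapper with a runtime-selected target type does. The `"TYPE_1"`/`"TYPE_2"` keys, the `id` property, and the `targetTypeFor`/`mapLine` helpers are assumptions made for illustration; they are not part of the original code or of the Spring Batch API.

```java
final class TargetTypeSketch {
    // Stand-in for the abstract parent from the question; the id property is hypothetical.
    static abstract class GenericFilePOJO {
        private String id;
        public String getId() { return id; }
        public void setId(String id) { this.id = id; }
    }

    static class FileSpecificPOJO_1 extends GenericFilePOJO { }
    static class FileSpecificPOJO_2 extends GenericFilePOJO { }

    // Picks the concrete class for a file type (the keys are hypothetical).
    static Class<? extends GenericFilePOJO> targetTypeFor(String fileType) {
        switch (fileType) {
            case "TYPE_1": return FileSpecificPOJO_1.class;
            case "TYPE_2": return FileSpecificPOJO_2.class;
            default: throw new IllegalArgumentException("Unknown file type: " + fileType);
        }
    }

    // Instantiates the chosen subclass but hands it back typed as the parent.
    static GenericFilePOJO mapLine(String fileType, String id) {
        try {
            GenericFilePOJO item = targetTypeFor(fileType).getDeclaredConstructor().newInstance();
            item.setId(id);
            return item;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot instantiate target type for " + fileType, e);
        }
    }

    public static void main(String[] args) {
        GenericFilePOJO item = mapLine("TYPE_2", "42");
        // prints "FileSpecificPOJO_2 id=42"
        System.out.println(item.getClass().getSimpleName() + " id=" + item.getId());
    }
}
```

The caller only ever sees `GenericFilePOJO`, yet the object is the concrete subclass chosen for the file type. A reader declared as `FlatFileItemReader<GenericFilePOJO>` relates to the mapper's target type in the same way.
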
Dmytro Maslenko
  • Great, thank you! Right before I saw your answer, I implemented the Reader in such a way that returns the GenericFilePOJO class (parent class). As long as the file specific POJOs are subclassed from GenericFilePOJO, it seems to be working ... Is there a reason why using a generic T would be better? – TheLifeOfParallax Jan 01 '20 at 20:20
  • In your particular case, using `T` and using the parent class are equivalent. – Dmytro Maslenko Jan 02 '20 at 01:21
  • Ah I see. Also, some headers on some files are named differently but ultimately should map to a common POJO field - i.e. file A can have "birth_date" while file B can have "date_of_birth" but each should map to the same field in the parent POJO. How can I achieve this? – TheLifeOfParallax Jan 02 '20 at 23:30
  • The parent `GenericFilePOJO` can hold all the needed fields, say `birthDate`. Each child class then only adds getters/setters named after its own file's headers, say `get/setDateOfBirth()`, which read and write the parent's `birthDate`. The main app works with the parent class only, say via `getBirthDate()`. – Dmytro Maslenko Jan 03 '20 at 02:44
  • Just to follow up on this - I'm using a tasklet to first determine which files should be read and where. I'm storing this info in an infoHolder and retrieving it to set the Resources in the MultiResourceItemReader. However, Spring tries to set the Resources before executing the tasklet -- causing a **The resources must not be null** exception. How can I resolve this? – TheLifeOfParallax Jan 07 '20 at 02:52
  • You can resolve the **The resources must not be null** error by adding `@JobScope` to the tasklet. – Dragoslav Petrovic Dec 22 '21 at 19:38
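
The header-alias suggestion from the comments above can be sketched in plain Java as follows. This is only an illustration of the idea: the class names `FileAPojo` and `FileBPojo`, the `String` type of the field, and the sample date are assumptions, not taken from the original project.

```java
final class HeaderAliasSketch {
    // Parent holds the single canonical field the rest of the app uses.
    static abstract class GenericFilePOJO {
        private String birthDate;
        public String getBirthDate() { return birthDate; }
        public void setBirthDate(String birthDate) { this.birthDate = birthDate; }
    }

    // File A (header "birth_date"): the inherited birthDate property is enough.
    static class FileAPojo extends GenericFilePOJO { }

    // File B (header "date_of_birth"): alias accessors delegate to the parent field.
    static class FileBPojo extends GenericFilePOJO {
        public String getDateOfBirth() { return getBirthDate(); }
        public void setDateOfBirth(String dateOfBirth) { setBirthDate(dateOfBirth); }
    }

    public static void main(String[] args) {
        FileBPojo fromFileB = new FileBPojo();
        // A bean mapper would call this setter for the "date_of_birth" column.
        fromFileB.setDateOfBirth("1990-05-17");

        // Downstream code only knows the parent type and its canonical getter.
        GenericFilePOJO item = fromFileB;
        System.out.println(item.getBirthDate()); // prints "1990-05-17"
    }
}
```

With this layout no state is duplicated: the alias setter writes straight into the parent's field, so processors and writers can stay typed against `GenericFilePOJO`.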