
I want an application where a user can upload a CSV file in the front end (Angular). A REST service should read it and then store the file as a BLOB in an Oracle database.

For the REST service, I will receive a MultipartFile object:

@PostMapping(value = "/upload")
public String processUploadFile(@RequestParam("file") MultipartFile file) {
    // Call to a service
    return "OK";
}

For the entity, it will be something like this:

@Entity
@DynamicUpdate
@Table(name = "FILE_UPLOAD")
public class FileUploadEntity implements Serializable {

    @Id
    @Column(name = "ID")
    private Long id;

    @Column(name = "BLOB")
    @Lob
    private Blob blob;
    
    // ...
}

I saw that I have getInputStream() and getBytes() methods.

What is the best and most efficient way to read the CSV file line by line, process each line, and then store the file only if the processing completes without error?

phildeg31

1 Answer


To process a CSV file line by line, you can use either of the following libraries:

<dependency>
    <groupId>com.opencsv</groupId>
    <artifactId>opencsv</artifactId>
</dependency>

or

<dependency>
    <groupId>com.fasterxml.jackson.dataformat</groupId>
    <artifactId>jackson-dataformat-csv</artifactId>
</dependency>

Let's imagine that each CSV line represents a DTO, YourDtoClass. Here is an example using both libraries (adjust it to your needs):

import com.fasterxml.jackson.dataformat.csv.CsvMapper;
import com.fasterxml.jackson.dataformat.csv.CsvParser;
import com.fasterxml.jackson.dataformat.csv.CsvSchema;
import com.opencsv.CSVParserBuilder;
import com.opencsv.ICSVParser;
...

@PostMapping(value = "/upload", consumes = MediaType.MULTIPART_FORM_DATA_VALUE)
public void processUploadFile(@RequestParam MultipartFile file) throws IOException {
    // option #1: the `opencsv` library
    ICSVParser parser = new CSVParserBuilder()
            .withQuoteChar(ICSVParser.DEFAULT_QUOTE_CHARACTER)
            .build();
    try (BufferedReader bufferedReader = new BufferedReader(
            new InputStreamReader(file.getInputStream(), UTF_8))) {
        bufferedReader.lines()
                .forEach(line -> {
                    try {
                        String[] fields = parser.parseLine(line); // split the line into columns
                        // process fields...
                        log.info("line has been processed");
                    } catch (IOException e) {
                        throw new UncheckedIOException(e);
                    }
                });
    }

    // option #2: the `jackson-dataformat-csv` library
    List<YourDtoClass> list = readCsv(YourDtoClass.class, file.getInputStream());
}

public <T> List<T> readCsv(Class<T> clazz, InputStream stream) throws IOException {
    CsvMapper mapper = new CsvMapper();
    CsvSchema schema = mapper.schemaFor(clazz)
            .withoutHeader()
            .withColumnSeparator(CsvSchema.DEFAULT_COLUMN_SEPARATOR)
            .withArrayElementSeparator(CsvSchema.DEFAULT_ARRAY_ELEMENT_SEPARATOR)
            .withNullValue(StringUtils.EMPTY)
            .withoutEscapeChar();
    return mapper
            .readerFor(clazz)
            .with(CsvParser.Feature.TRIM_SPACES)
            .with(CsvParser.Feature.SKIP_EMPTY_LINES)
            .with(schema)
            .<T>readValues(stream)
            .readAll();
}

// your csv line represents this DTO class
class YourDtoClass {
    private String name;
    private String surname;
    // ...
}

And if you need to persist the CSV file itself, you can convert the InputStream into a byte array and store that in the database.
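A minimal sketch of that conversion (the helper name `toBytes` is mine); for a MultipartFile you can simply call `file.getBytes()`, which does the same thing:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class BlobConversion {

    // Read the full upload into memory as a byte array,
    // ready to be set on a byte[]-mapped @Lob entity field.
    static byte[] toBytes(InputStream in) throws IOException {
        return in.readAllBytes();
    }

    public static void main(String[] args) throws IOException {
        byte[] payload = toBytes(new ByteArrayInputStream("a,b\n1,2\n".getBytes()));
        System.out.println(payload.length); // 8
    }
}
```

Keep in mind this loads the whole file into memory (RAM), which is fine for small CSV uploads but not for very large files.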

Note that an InputStream can't be consumed twice. There are some workarounds, though, and one of them is to copy the InputStream to a temporary file; you can then read the temp file as many times as needed.

File tempFile = File.createTempFile(prefix, suffix);
FileUtils.copyInputStreamToFile(inputStream, tempFile); // from `org.apache.commons.io`

and after you are done with the temp file, make sure you delete it.
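Here is a self-contained sketch of that workaround using only `java.nio.file` (no commons-io dependency), with the cleanup done in a finally block; the helper name `readTwice` is mine:

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class TempFileReuse {

    // Copy the stream to a temp file so its content can be read twice:
    // once for CSV parsing, once for persisting the raw bytes as a BLOB.
    static byte[][] readTwice(InputStream in) throws IOException {
        Path tempFile = Files.createTempFile("upload-", ".csv");
        try {
            Files.copy(in, tempFile, StandardCopyOption.REPLACE_EXISTING);
            byte[] firstPass = Files.readAllBytes(tempFile);  // e.g. parse the CSV
            byte[] secondPass = Files.readAllBytes(tempFile); // e.g. store as BLOB
            return new byte[][] {firstPass, secondPass};
        } finally {
            Files.deleteIfExists(tempFile); // always remove the temp file
        }
    }
}
```

The finally block guarantees the temp file is removed even if parsing or persisting throws.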

Vasyl Sarzhynskyi
  • Thank you for your answer. If I use an InputStream to parse the file, I will need another InputStream to get the bytes to store the file, right? Is it better to use the getBytes() method of MultipartFile, to avoid having two objects to parse and to store the file? Also, doesn't that mean the file is stored in memory (RAM)? – phildeg31 Jun 28 '21 at 06:30
  • I posted an update in my answer. Indeed, you can't process an `InputStream` multiple times, but for that you can use a temp file – Vasyl Sarzhynskyi Jul 02 '21 at 05:25