
I am using the code below to insert records from a large CSV (100,000 records) into Oracle using jOOQ. Here's the snippet:

    CSVReader csvReader;
    String csvError;
    try {
      csvReader = new CSVReader(new FileReader(tempFile));
    } catch (FileNotFoundException e) {
      e.printStackTrace();
      // Without returning here, csvReader stays null and hasNext() below throws an NPE
      return "Unable to read uploaded file";
    }

    // Delete previously uploaded records from the same user
    clearTable(user.getUserId());
    List<BSearchRecord> records = new ArrayList<>();
    boolean isFirst = true;
    while (csvReader.hasNext()) {
      if (isFirst) {
        // Validate headers
        String[] headers = csvReader.next();
        uploadedHeaders = headers;
        csvError = validateHeadersLength(headers);
        if (csvError != null) {
          return csvError;
        }
        for (int i = 0; i < headers.length; i++) {
          csvError = validateEachHeader(i, headers[i]);
          if (csvError != null) {
            return csvError;
          }
        }
        isFirst = false;
      } else {
        String[] row = csvReader.next();
        if (row != null) {
          BSearchRecord bSearchRecord = new BSearchRecord();
          bSearchRecord.set(RET_BSEARCH.UPLOADEDBY, user.getUserId());
          for (int i = 0; i < csvHeaders.length; i++) {
            Field field = bSearchRecord.field(backendColumns[i]);
            bSearchRecord.set(field, row[i]);
          }
          records.add(bSearchRecord);
        }
      }
    }
    db.batchInsert(records).execute(); // IS THIS OKAY? (Is this batch-enabled?)

I came across some suggestions, like: "PostgreSQL/JooQ bulk insertion performance issues when loading from CSV; how do I improve the process?"

However, my use case is a bit different, so I'm asking for a suggestion on whether I am doing this the right way or not.

Also, does the batchInsert(..) implementation in jOOQ support batch execution? (In the docs, I saw the .bind(..) approach, so I'm asking this for clarity.)

1 Answer


jOOQ has an out-of-the-box API for importing CSV data: https://www.jooq.org/doc/latest/manual/sql-execution/importing/importing-csv

ctx.loadInto(BOOK)
   .loadCSV(inputstream, encoding)
   .fields(BOOK.ID, BOOK.AUTHOR_ID, BOOK.TITLE)
   .execute();

It lets you conveniently specify bulk/batch/commit sizes and additional behaviour. I suggest you use that.
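For illustration, a hedged sketch of how those sizes can be tuned with the Loader API's option methods (bulkAfter, batchAfter, commitAfter); the table and field names are taken from the question and would need to match your generated schema, and the snippet assumes an existing DSLContext `ctx`:

```java
// Not a complete program: requires a live DSLContext (ctx), the generated
// RET_BSEARCH table, and an open InputStream over the CSV file.
ctx.loadInto(RET_BSEARCH)
   .bulkAfter(500)     // combine rows into multi-row bulk INSERT statements
   .batchAfter(10)     // combine bulk statements into JDBC batches
   .commitAfter(100)   // commit periodically to keep transactions small
   .loadCSV(inputstream, encoding)
   .fields(RET_BSEARCH.UPLOADEDBY /* , ... the remaining columns, in CSV order */)
   .execute();
```

The exact sizes above are placeholders; reasonable values depend on row width, driver, and database.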

Lukas Eder
  • What if we want to add an extra column and set its value explicitly. For e.g, BOOK.CREATOR is not given in the CSV, but we also want to set a known value for BOOK.CREATOR, because this column is needed in the table. Any way of adding an extra column over-the-head in this API ? – Anmol Deora May 14 '20 at 17:36
  • This isn't possible out of the box yet: https://github.com/jOOQ/jOOQ/issues/4941. But it's relatively easy if you use a CSV library that can pre-process the input stream and append data to it prior to passing it to jOOQ... – Lukas Eder May 15 '20 at 07:15
  • @AnmolDeora: This will now be supported in jOOQ 3.14 via https://github.com/jOOQ/jOOQ/issues/10583 and https://github.com/jOOQ/jOOQ/issues/4941 – Lukas Eder Sep 03 '20 at 11:04
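The pre-processing idea from the comments can be done with plain JDK code before handing the stream to jOOQ. A minimal, stdlib-only sketch (column name "CREATOR" and value "42" are hypothetical; the naive split assumes no quoted commas in the input):

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class CsvAppend {

    // Append a header name to the first line and a constant value to every
    // data row of an in-memory CSV. Assumes no embedded/quoted commas.
    static List<String> appendColumn(List<String> lines, String header, String value) {
        return IntStream.range(0, lines.size())
            .mapToObj(i -> lines.get(i) + "," + (i == 0 ? header : value))
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> csv = List.of("ID,TITLE", "1,Book A", "2,Book B");
        List<String> out = appendColumn(csv, "CREATOR", "42");
        // Each line now carries the extra column:
        // ID,TITLE,CREATOR / 1,Book A,42 / 2,Book B,42
        System.out.println(String.join("\n", out));
    }
}
```

The resulting lines can then be joined and wrapped in a Reader/InputStream for loadCSV, so the extra column is populated without waiting for built-in support.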