
I am trying to read a 4 GB text file, split it into meaningful variables, and insert the information into a database. Reading (and printing, when required) a 1.2 MB sample takes a few hundred milliseconds, but inserting the information into the database takes many seconds (30+). I am using two databases (the file contains two different types of records). My code looks something like this:

    try {
        reader = new BufferedReader(new FileReader(filename));
        openConnection();
        ch = reader.read();
        while (ch != -1) {
            if ((char) ch == '1') {
                // parseCliente() builds the INSERT for a type-1 record
                // (and presumably consumes the rest of it, advancing ch)
                sql = parseCliente();
                st.executeUpdate(sql);
            } else if ((char) ch == '2') {
                sql = parseRegistro();
                st.executeUpdate(sql);
            } else {
                ch = reader.read();
            }
        }
        closeConnection();
    } catch (IOException e) {
        e.printStackTrace();
    } catch (SQLException e) {
        e.printStackTrace();
    }

My code actually works, but I need a faster method or the operation will take days!

CBozanic
  • Check the documentation on batch statements; see https://stackoverflow.com/questions/3784197/efficient-way-to-do-batch-inserts-with-jdbc for an example. –  Jul 19 '17 at 18:24
  • Can you use a transaction? Are you using autocommit? – Sidias-Korrado Jul 19 '17 at 18:24
  • It's hard to tell without knowing anything about your database - is it a large database, are there things like unique and foreign key constraints or indices? These can all influence how fast inserts and updates are. – Jesper Jul 19 '17 at 18:25
  • Use a transaction and group your inserts into batches. Bulk inserting in a transaction reduces the number of I/O operations that need to occur, removing a bottleneck; however, the database will be hit with a single large commit. A transaction is also logically more sound, because you would want to insert all the records from a file you are processing together. Not performing this task in a transaction could mean that part of a file gets inserted in the case of a fatal error. – ebraley Jul 19 '17 at 18:31
  • Export a table and inspect the SQL script it creates. Since this seems like a large one-time shot, you can write your app to create a similar script and then load that SQL script using your favorite mysql tool. – phatfingers Jul 19 '17 at 21:38
  • I changed to using batches and it still took 37 seconds for a small sample; basically no change from using individual insertions! – CBozanic Jul 19 '17 at 22:05
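
To make the batching-and-transaction suggestions from the comments concrete, here is a minimal JDBC sketch. The table and column names (cliente, campo1, campo2) are placeholders, and parseClienteFields() stands in for a hypothetical rework of parseCliente() that returns the parsed values instead of a finished SQL string:

    // Sketch: batched inserts inside a single transaction.
    // Table, columns, and parseClienteFields() are hypothetical stand-ins.
    conn.setAutoCommit(false);                   // one transaction for the whole file
    PreparedStatement ps = conn.prepareStatement(
            "INSERT INTO cliente (campo1, campo2) VALUES (?, ?)");
    int count = 0;
    int ch;
    while ((ch = reader.read()) != -1) {
        if ((char) ch == '1') {
            String[] fields = parseClienteFields();  // parsed values for one record
            ps.setString(1, fields[0]);
            ps.setString(2, fields[1]);
            ps.addBatch();                       // queue the row instead of executing it
            if (++count % 1000 == 0) {
                ps.executeBatch();               // send 1000 rows in one round trip
            }
        }
    }
    ps.executeBatch();                           // flush any remaining rows
    conn.commit();                               // single commit at the end
    conn.setAutoCommit(true);

Note that with MySQL Connector/J, batching alone often changes little unless rewriteBatchedStatements=true is added to the JDBC URL, which lets the driver rewrite a batch into multi-row INSERT statements; that may explain why the batched version still took 37 seconds.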

1 Answer


I changed strategy: I loaded the processed data into a CSV file and from there transferred it to the DB using the LOAD DATA INFILE command. I went from 37 seconds to less than 1 second for my sample.
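
For reference, a minimal sketch of that approach (the CSV path, table, and column names are placeholders; LOAD DATA LOCAL INFILE also requires allowLoadLocalInfile=true on the Connector/J URL and local_infile enabled on the server):

    // Sketch: bulk-load the pre-processed CSV with LOAD DATA INFILE.
    // Path, table, and column names are placeholders.
    try (Statement st = conn.createStatement()) {
        st.execute(
            "LOAD DATA LOCAL INFILE '/tmp/clientes.csv' " +
            "INTO TABLE cliente " +
            "FIELDS TERMINATED BY ',' ENCLOSED BY '\"' " +
            "LINES TERMINATED BY '\\n' " +
            "(campo1, campo2)");
    }

This skips the per-statement overhead entirely: the server parses the file itself, which is why a bulk load is usually the fastest way to import large flat files into MySQL.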

CBozanic